It’s been great to see a conversation developing around how to acquire test devices and how to do so on a budget. But once you have a budget in place, how should you spend it? What makes a good test device, and why should you pick one device over another?
Most people naturally start with two criteria: popular and cheap. Popular is good because it’s (hopefully) representative of devices in use by the general population. Cheap is good for obvious reasons. But are those really the best criteria? When choosing devices, we’ve found it helps to consider a variety of factors.
1. Existing traffic
Often, teams begin a device collection (and later expand it) to suit the requirements of a project. This is great because it bases the decision on real-life data, and as a bonus, the project forces you to really get to know the devices you’ve purchased. So the first thing we recommend is to look at existing client traffic (if some exists). There’s no point ignoring browsers or devices that are already hitting your (or your client’s) site in significant numbers. And if you already know you need to buy an Android 2.3 device, you may as well choose a model that is frequently accessing the site. While it’s not uncommon to see a long tail of 300-500 devices in your analytics, the traffic often follows a classic 80/20 pattern. Focussing on the 20% of devices that produce 80% of the traffic is a great way to begin.
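As a rough illustration of the 80/20 idea, the sketch below walks a list of hit counts per device and returns the most-used devices that together reach a target share of traffic. The function name `top_devices`, the 0.8 threshold and the sample figures are all assumptions made up for this example; they are not real analytics data or part of any analytics package.

```python
def top_devices(hits_by_device, target_share=0.8):
    """Return the most-used devices that together reach target_share of hits."""
    total = sum(hits_by_device.values())
    if total == 0:
        return []
    selected = []
    running = 0
    # Walk devices from most to least traffic until the target share is covered.
    ranked = sorted(hits_by_device.items(), key=lambda item: item[1], reverse=True)
    for device, hits in ranked:
        if running >= target_share * total:
            break
        selected.append(device)
        running += hits
    return selected


# Hypothetical hit counts, for illustration only.
sample_hits = {
    "iPhone 4": 500,
    "HTC Desire": 200,
    "iPad": 150,
    "BlackBerry Curve": 80,
    "Nokia N8": 40,
    "Other A": 20,
    "Other B": 10,
}
shortlist = top_devices(sample_hits)
```

With these invented numbers, three of the seven devices are enough to pass the 80% mark. The rest of the long tail is still worth a glance, as it may hide a platform you would otherwise never test.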
Be aware, however, that analytics can be deceptive. Many packages are still JavaScript-based (or at the very least offer this option as the default), so they may not report traffic from devices with poor JavaScript support (…and will of course completely miss those with no JavaScript support). Even in the US and Europe, you can greatly increase the quality of your analytics results by choosing a server-side package such as the ‘mobile’ version offered by Google. On a recent project we convinced a client to switch, and in less than 10 days the size of our Google Analytics devices list had doubled.
2. Regional traffic and market
Next, review overall market share and traffic in your region (or the regions you operate in) so that you can focus on the platforms that are most likely to access the site. If 80% of your traffic is from the US and Europe, there’s no point, for example, in prioritising Symbian, but you will need to do so if your customers are primarily in APAC (or almost anywhere else). The information you glean from this step will often reinforce the data in the analytics from step 1 (and if it doesn’t…that’s always an interesting realisation).
Good sources for this type of data include the Global Mobile Stats page from MobiThinking, Statcounter’s mobile browser stats and the regular releases of market share statistics published by the likes of Comscore and Nielsen (although these will often be US only). Jason Grigsby has also compiled this huge list of sources.
Where possible, also try to review platform version statistics, as new devices don’t necessarily come with the newest OS and users won’t always upgrade (if that option is even available to them). Your analytics package should be able to provide some version data, and regularly updated platform version stats can also be found on the Android and BlackBerry developer sites. (Apple sadly does not release these statistics, but data released by native app analytics services such as Flurry can often provide an indication of platform version popularity.)
Based on these first two steps, you should be able to devise a list of candidate devices/browsers while also eliminating devices that are completely unrelated to your market or product conditions.
3. Device-specific factors
The next step is to map this device list against the factors that make a good test device. This will help you pick the most useful models rather than simply opting for the cheapest (or sexiest) on the list. A great resource during this step is Device Atlas’ Data Explorer (login required) which enables you to query common device properties across thousands of devices. Another useful tool is GSM Arena which includes comprehensive (consumer-facing) device specifications, a robust advanced search/filter option, and a popularity meter providing a glimpse of the interest level for each device.
Here are the device-specific factors you should consider:
a) Form factor: Touch screens are increasingly popular but a good 30% of smartphones also have a keyboard, trackball or other indirect manipulation device so you want to make sure to test on multiple form factors.
b) Screen size: This is obviously a big one. You want to ensure you can test on a range of representative sizes (and orientations). Android devices are particularly handy as you can easily spoof the default viewport size by adjusting the Zoom Level. This is a great stop-gap if you only have a few devices on hand.
c) Performance: Devices vary greatly in CPU and overall performance (including factors such as quality of touch screen) so you want to ensure you don’t test only on very high-end or only on very low-end devices.
d) DPI: Screen dpi also varies quite a bit and can greatly impact legibility. Although it’s hard to compensate for low-dpi displays, testing on such devices can be hugely useful to get a feel for the many ways your site will look. And sometimes, a small tweak is all it takes to improve legibility on these displays while still maintaining a good balance for everyone else.
e) Screen conditions: This is also one that you can’t do too much about, but is good to keep in mind. Screen condition factors can include overall screen quality (which impacts sensitivity to touch input), variations in colour gamut, and the ability for users to adjust contrast. In general, the cheaper the displays the more potential for this type of variation. (Oddly however, some super cheap devices have a nicer display than more expensive ones…it’s all about where and how the designer chose to differentiate the product).
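One way to map a candidate list against these factors is to note each device’s properties and check which values you want to test are not yet covered. The sketch below is only an illustration: the function `coverage_gaps`, the factor names and the two sample phones are hypothetical, and in practice the property values would come from a source such as the device databases mentioned above.

```python
def coverage_gaps(devices, wanted):
    """For each factor, list the wanted values that no device in the list covers."""
    gaps = {}
    for factor, values in wanted.items():
        # Collect every value of this factor found in the current device list.
        covered = {device.get(factor) for device in devices}
        missing = [value for value in values if value not in covered]
        if missing:
            gaps[factor] = missing
    return gaps


# Hypothetical candidates and factor values, for illustration only.
candidates = [
    {"name": "Phone A", "form_factor": "touch", "performance": "high", "dpi": "high"},
    {"name": "Phone B", "form_factor": "touch", "performance": "low", "dpi": "low"},
]
wanted = {
    "form_factor": ["touch", "qwerty"],
    "performance": ["low", "high"],
    "dpi": ["low", "high"],
}
gaps = coverage_gaps(candidates, wanted)
```

Here the two sample phones cover both ends of performance and dpi, but the result flags that no device with a QWERTY keyboard is on the list yet.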
4. Project-specific factors
Next you want to double-check that the list matches any project-specific factors. If, for example, your app revolves around “things that are nearby”, it’ll likely be important to test various flavours of geolocation implementation.
5. Budget
And of course, rounding this out is budget. In many cases, this will remain a primary consideration but following the preceding steps should enable you to better justify each purchase and convey to stakeholders the value of testing on each browser or device.
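If it helps to make the trade-off between budget and diversity concrete, here is a minimal sketch of one possible approach: a simple greedy heuristic that repeatedly picks the affordable device adding the most not-yet-covered traits per unit of cost. This is an assumption for illustration rather than a recommended method; the function `pick_devices`, the trait labels and the prices are all invented, and a greedy pass will not always find the best possible combination.

```python
def pick_devices(candidates, budget):
    """Greedily pick devices that add the most uncovered traits per unit of cost.

    Assumes every candidate has a price greater than zero and a set of traits.
    """
    remaining = list(candidates)
    chosen, covered = [], set()
    while remaining:
        affordable = [c for c in remaining if c["price"] <= budget]
        # Score each affordable device by new traits gained per unit of price.
        scored = [(len(c["traits"] - covered) / c["price"], c) for c in affordable]
        scored = [(score, c) for score, c in scored if score > 0]
        if not scored:
            break  # nothing affordable adds anything new
        best = max(scored, key=lambda pair: pair[0])[1]
        chosen.append(best["name"])
        covered |= best["traits"]
        budget -= best["price"]
        remaining.remove(best)
    return chosen, covered


# Hypothetical devices, traits and prices, for illustration only.
candidates = [
    {"name": "Tablet A", "price": 400, "traits": {"tablet", "high-dpi"}},
    {"name": "Phone B", "price": 100, "traits": {"touch", "low-cpu", "low-dpi"}},
    {"name": "Phone C", "price": 150, "traits": {"qwerty", "low-cpu"}},
    {"name": "Phone D", "price": 300, "traits": {"touch", "high-dpi"}},
]
chosen, covered = pick_devices(candidates, budget=600)
```

With these made-up numbers the sketch buys the three phones and leaves the tablet out, which is exactly the kind of result worth questioning: a list of what was covered and what was not is a useful starting point for a conversation with stakeholders, not a substitute for judgement.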
One OS, many flavours
Remember as well that there is no such thing as “testing on Android” or “testing on an iPhone”. An iPhone with iOS 5 is a different beast from one with iOS 4.3.n (or one where your user has installed Opera Mini). Be sure to track the OS versions found on your test devices, and think carefully each time you upgrade. Owning four BlackBerry devices with four different versions of the OS is infinitely more valuable than owning four with the same version.
And in the case of Android (…and I presume eventually Windows Phone), we also have the added layer of complexity that is the OEM. Owning five MotoBlur variants of Android is also not as useful as owning five Android devices from multiple manufacturers.
So ideally, you want to end up with a collection of devices that is representative of your audience, of overall market share, but also diverse in multiple ways. Something like this:
- iPhone 3GS, iOS 4.3.n, 320 x 480 px (no retina display)
- iPhone 4, iOS 5, 320 x 480 px (retina display)
- iPad, iOS 5, 1024 x 768 px (10″ tablet, no retina display)
- Android 2.1 – Motorola, 480 x 600 px (popular)
- Android 2.3 – HTC, 480 x 320 px (QWERTY)
- Android 2.3 – Huawei, 320 x 480 px (low CPU)
- Android 3.0 – Samsung, 320 x 480 px (low CPU, low dpi)
- Android 2.3.4 – Kindle Fire, 1024 x 600 px (7″ tablet, proxied browsing)
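Tracking the OS versions in a collection like this can be as simple as a small inventory that is checked for overlap. The sketch below groups devices by platform and OS version and reports any combination owned more than once. The function `duplicate_os_versions` and the record layout are assumptions for illustration; the sample entries are loosely based on the list above.

```python
from collections import defaultdict


def duplicate_os_versions(inventory):
    """Return the (platform, OS version) pairs that more than one device shares."""
    groups = defaultdict(list)
    for device in inventory:
        groups[(device["platform"], device["os_version"])].append(device["name"])
    # Keep only the combinations that appear on two or more devices.
    return {key: names for key, names in groups.items() if len(names) > 1}


# Sample inventory, loosely based on the list above.
inventory = [
    {"name": "iPhone 3GS", "platform": "iOS", "os_version": "4.3"},
    {"name": "iPhone 4", "platform": "iOS", "os_version": "5"},
    {"name": "HTC (QWERTY)", "platform": "Android", "os_version": "2.3"},
    {"name": "Huawei (low CPU)", "platform": "Android", "os_version": "2.3"},
]
duplicates = duplicate_os_versions(inventory)
```

A reported overlap is not necessarily a problem. The two Android 2.3 devices here come from different manufacturers and differ in form factor and CPU, so they still add variety; the check is simply a prompt to think before upgrading or buying another device with the same version.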
And then…?
Once a site is launched, repeat the steps in this article every 3-6 months to ensure you keep abreast of changes in your traffic. This is a useful exercise even if you have no budget at that point. If you suddenly discover unexpectedly high traffic from a new OS or device, you can always use emulators to fill this gap. (As a general rule, don’t bother rushing out to buy brand new models, as it can take 3-6 months before you see any significant traffic from them.)
And be sure to keep an eye out for edge cases that have the potential to become wildly popular. The Kindle and Kindle Fire are my favourite edge cases of late. They are cheap, popular, have a large screen, a good browser, but are in many other ways underpowered.
If all this seems a bit daunting, remember that it’s not that different from other types of decisions that rely on combinations of knowledge and experience. Kind of like when you first learn to cook Thai or Indian food, and have to stock up on spices and ingredients. If you try to do it all at once, the choice seems endless and it’s overwhelming. But after a while, you get to know those things you just can’t do without, what quantities you will need, and what goes well with what. You also learn when it’s safe to improvise.
Despite the seemingly crazy variety, acquiring test devices is not that different. If (as many smaller agencies do) you commonly/primarily build for one region of the world, you will quickly get to know the range of devices you need. You’ll also develop a list of “things you wish you had” and another of “things you have enough of”. All this will make the next round of testing easier and will help you prioritise when a bit of budget comes your way.
Phones are one thing…but what about the zombie apocalypse?
The future is admittedly a bit fuzzy but I think designing for (and testing on) phones will prove a useful transitional stage. Diversity is on the rise and companies are working hard to release products that certainly qualify as complete edge cases today. Many of these products will (I hope…) take a good 3-5 years to become mainstream. These include fridges, in-car displays, devices that pair or share displays with unrelated devices, or maybe even contexts where there is no (visual) display at all.
How we will test for these contexts is less than obvious, but that’s also why we should very soon expand the conversation from the “how” of acquiring devices to some possibly larger questions.
Barring the obvious “does it work” factor, what is the role of testing in determining if a web product is fit for purpose (let alone “a great experience”)? How will that role have to change as design and functionality are further decoupled from a fixed screen or capabilities context? And how will we determine a pass or fail once the ‘experience’ and ‘device’ are no more than just an aggregate of random user agents chosen by the user?
Maybe in the future, our users should determine a pass or fail. (Maybe they already do….)