I was just reading this TechCrunch article and thought I'd write a quick explanation of the philosophy we've generally tried to take here, which is somewhat different from any of the ones listed. Our work is almost exclusively in 3D rendering (with minimal UI, even), and that has informed the way we've handled things.
Basically, we've taken the tack that there are two directions to check: hardware and drivers. This means we want representatives of all the OS versions and of all the hardware chipsets. On the OS side that's 2.1, 2.2, 2.3.3, 3.2, and 4.0. On the hardware side that's Qualcomm Adreno, NVIDIA Tegra, PowerVR, and ARM Mali (as used in some Samsung devices).
This means that at minimum we end up with around 20 devices, in order to get each OS version represented with each chipset. In practice it's not quite so neat and tidy -- there aren't any 2.1 or 2.2 Tegra devices, for example. When in doubt, we get variations of hardware before variations of OS, as our experience thus far is that while the OS may change, the drivers appear to stay the same.
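To make the arithmetic concrete, here is a rough sketch of the grid in Python. The names (OS_VERSIONS, CHIPSETS, UNAVAILABLE, build_grid) are made up for illustration, and the only gaps it knows about are the two Tegra ones mentioned above, so treat the second count as a lower bound on how untidy the real grid is rather than an actual device count.

```python
from itertools import product

# OS versions and GPU families as listed in the post.
OS_VERSIONS = ["2.1", "2.2", "2.3.3", "3.2", "4.0"]
CHIPSETS = ["Adreno", "Tegra", "PowerVR", "Mali"]

# Combinations the post names as not existing on real devices.
# Assumption: this set is illustrative, not a complete list of gaps.
UNAVAILABLE = {("2.1", "Tegra"), ("2.2", "Tegra")}


def build_grid(os_versions, chipsets, unavailable):
    """Return every (os_version, chipset) cell that a real device could fill."""
    return [
        cell
        for cell in product(os_versions, chipsets)
        if cell not in unavailable
    ]


full_grid = list(product(OS_VERSIONS, CHIPSETS))
grid = build_grid(OS_VERSIONS, CHIPSETS, UNAVAILABLE)

print(len(full_grid))  # 20: five OS versions times four chipsets
print(len(grid))       # 18: the same grid minus the two missing Tegra cells
```

Keeping the gaps in a separate set means the ideal grid and the practical one can be compared directly as devices come and go.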
Generally, if we're doing a major update of some kind, we'll make a point of running it on a variety of these devices to make sure everything behaves correctly. For minor updates where nothing risky has changed, we'll usually just run through it on the devices on the individual developer's desk. Those tend to represent the hardware axis of the grid, given the aforementioned driver point.