Monday, August 21 2017

Benchmarks are meaningless when manufacturers game the system

Quadrant Benchmarks

We’ve all seen benchmarks, and used them as a reference for how fast a device is. However, should we really give benchmarks as much weight as we once did? Probably not.

Some manufacturers have been caught (Samsung more than once) manipulating device performance specifically when known benchmarking apps are run on the device. It’s unfortunate to single out Samsung, but as the biggest name in the Android space, almost synonymous with Android itself, its behaviour needs to be pointed out.

The first instance became public back in July, when a string labelled “Benchmark Booster” was discovered in Samsung’s source code for the Galaxy S4.

Fast forward to this week, with the Galaxy Note 3 being released, and lo and behold, Samsung are busted again!

Ausdroid wants to be clear though — it’s not that other manufacturers don’t manipulate the figures. We know they do. It’s known that HTC have gamed benchmarks in the past, and Sony have been caught doing it as well.

How do they do it?

Simple. We’ll use Samsung as the example (because that’s the one we know), but the others would be much the same. Samsung’s devices detect that particular benchmark software is running. Instead of allowing CPU cores to sleep when they’re not needed (the default behaviour), the device keeps all four cores running and applies other optimisations as well, artificially boosting performance to a level that isn’t normally available to other apps.
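As a rough illustration of the behaviour described above, here is a minimal Python sketch of a policy that keys on the app's name rather than on its actual workload. Everything in it is an assumption made for illustration: the package names, the `CpuPolicy` type and the `policy_for` function are hypothetical and are not taken from Samsung's (or anyone's) firmware.

```python
from dataclasses import dataclass

# Hypothetical package names, invented purely for illustration.
BENCHMARK_PACKAGES = frozenset({
    "com.example.benchmark.alpha",
    "com.example.benchmark.beta",
})

TOTAL_CORES = 4  # the article describes all four cores being kept running


@dataclass(frozen=True)
class CpuPolicy:
    min_online_cores: int   # cores that must stay awake
    allow_core_sleep: bool  # whether idle cores may be put to sleep


# Default behaviour: cores may sleep when they are not needed.
DEFAULT_POLICY = CpuPolicy(min_online_cores=1, allow_core_sleep=True)
# Boosted behaviour: every core is kept running.
BOOST_POLICY = CpuPolicy(min_online_cores=TOTAL_CORES, allow_core_sleep=False)


def policy_for(foreground_package: str) -> CpuPolicy:
    """Choose a policy from the app's identity, not from the load it creates."""
    if foreground_package in BENCHMARK_PACKAGES:
        return BOOST_POLICY
    return DEFAULT_POLICY


if __name__ == "__main__":
    for package in ("com.example.benchmark.alpha", "com.example.some.game"):
        print(package, policy_for(package))
```

The point of the sketch is that a demanding game never reaches the boosted branch, because the decision depends on a name lookup and not on how much work the app is doing.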

It’s not really cheating, but it doesn’t represent real-world performance. Unless you fiddle quite extensively with your phone’s settings (and many phones don’t even allow this), you won’t be able to extract the same performance in real-world use, such as games and other apps.

They’re all doing it

The team at AnandTech have done a fabulous write-up of the state of benchmarking which we highly recommend you read, and we’ve reproduced a table they’ve made which shows just how prevalent manipulation of benchmarking is:

Android devices that manipulate benchmarks (Y = cheats in that benchmark, N = does not)

| Device | SoC | 3DM | AnTuTu | AndEBench | Basemark X | Geekbench 3 | GFXB 2.7 | Vellamo |
|---|---|---|---|---|---|---|---|---|
| ASUS Padfone Infinity | Qualcomm Snapdragon 800 | N | Y | N | N | N | N | Y |
| HTC One | Qualcomm Snapdragon 600 | Y | Y | N | N | N | Y | Y |
| HTC One mini | Qualcomm Snapdragon 400 | Y | Y | N | N | N | Y | Y |
| LG G2 | Qualcomm Snapdragon 800 | N | Y | N | N | N | N | Y |
| Moto RAZR i | Intel Atom Z2460 | N | N | N | N | N | N | N |
| Moto X | Qualcomm Snapdragon S4 Pro | N | N | N | N | N | N | N |
| Nexus 4 | Qualcomm APQ8064 | N | N | N | N | N | N | N |
| Nexus 7 | Qualcomm Snapdragon 600 | N | N | N | N | N | N | N |
| Samsung Galaxy S 4 | Qualcomm Snapdragon 600 | N | Y | Y | N | N | N | Y |
| Samsung Galaxy Note 3 | Qualcomm Snapdragon 800 | Y | Y | Y | Y | Y | N | Y |
| Samsung Galaxy Tab 3 10.1 | Intel Atom Z2560 | N | Y | Y | N | N | N | N |
| Samsung Galaxy Note 10.1 (2014 Edition) | Samsung Exynos 5420 | Y(1.4) | Y(1.4) | Y(1.4) | Y(1.4) | Y(1.4) | N | Y(1.9) |
| NVIDIA Shield | Tegra 4 | N | N | N | N | N | N | N |

Table via AnandTech, link above.
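To put rough numbers on the table, the short Python sketch below tallies its Y/N flags. The data is transcribed from the table as reproduced in this article; the parenthesised figures on the Galaxy Note 10.1 row are ignored and each entry is treated as a plain Y. The variable names are our own and are only illustrative.

```python
BENCHMARKS = ["3DM", "AnTuTu", "AndEBench", "Basemark X",
              "Geekbench 3", "GFXB 2.7", "Vellamo"]

# One flag per benchmark, in the column order above (Y = cheats, N = does not).
FLAGS = {
    "ASUS Padfone Infinity": "NYNNNNY",
    "HTC One": "YYNNNYY",
    "HTC One mini": "YYNNNYY",
    "LG G2": "NYNNNNY",
    "Moto RAZR i": "NNNNNNN",
    "Moto X": "NNNNNNN",
    "Nexus 4": "NNNNNNN",
    "Nexus 7": "NNNNNNN",
    "Samsung Galaxy S 4": "NYYNNNY",
    "Samsung Galaxy Note 3": "YYYYYNY",
    "Samsung Galaxy Tab 3 10.1": "NYYNNNN",
    "Samsung Galaxy Note 10.1 (2014 Edition)": "YYYYYNY",
    "NVIDIA Shield": "NNNNNNN",
}

# How many devices are flagged for each benchmark.
cheats_per_benchmark = {
    name: sum(flags[i] == "Y" for flags in FLAGS.values())
    for i, name in enumerate(BENCHMARKS)
}

# How many benchmarks each device is flagged in.
cheats_per_device = {device: flags.count("Y") for device, flags in FLAGS.items()}

# Devices flagged in at least one benchmark.
devices_flagged = sum(count > 0 for count in cheats_per_device.values())

if __name__ == "__main__":
    print(f"{devices_flagged} of {len(FLAGS)} devices flagged in at least one benchmark")
    for name, count in cheats_per_benchmark.items():
        print(f"{name}: {count}")
```

On this data, AnTuTu and Vellamo are the benchmarks most often targeted, and the Moto, Nexus and NVIDIA devices are not flagged at all.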

But they used to be useful…

Benchmarks used to be the gauge of a device’s brute speed and performance, and in the early days of Android they were a somewhat useful way to differentiate devices. The main argument against benchmarking, that it doesn’t mean much for real-world performance, has virtually always applied.

Benchmarking has now turned the race to the top into a race to see who can best optimise their code to best perform under benchmarking conditions: what a disappointment. The reality is that the benchmarking scores can, within a reasonable margin, be manipulated to make a particular product look good, and thus others look worse.

Perhaps unsurprisingly, while some devices manipulate benchmarking scores, the manipulation doesn’t change the underlying picture anyway. It’s no secret that Samsung’s hardware IS very powerful, and probably more powerful than a good portion of the competition. Manipulating benchmarking can’t make a crap device appear to be gold; it’s limited to making an already powerful device just look a bit better on paper.

But does it really matter?

Truthfully, no. The media (Ausdroid included) probably have a bit to answer for in creating this situation, having obsessed over minute differences between devices in days gone by. While we don’t really focus on benchmarks anymore, it is this earlier focus which probably led manufacturers to game the system, to paint a better picture of their products.

Nowadays, we’ve certainly moved away from the raw power aspect (as hardware performance is becoming fairly homogenised amongst the top-tier at least), and we now focus more on usability, design and experience. It’s all well and good to have insanely powerful hardware, but if it isn’t usable, there’s little point.

Fortunately, the Android manufacturers are alive to this move, and the focus on user experience (UX) is only increasing with each major release. We only hope that this becomes the real measure of a device’s performance, and the somewhat dated reliance on benchmarking scores goes the way of the dinosaurs.

 

Phil Tann   Journalist

6 Comments on "Benchmarks are meaningless when manufacturers game the system"

Geoff Fieldew
Valued Guest

Taking another angle…

Benchmarks can be useful to compare iterations of the same device line. You can have some sort of idea about the improvements in performance and efficiency in say, Galaxy S -> Galaxy SII -> Galaxy SIII -> Galaxy S4 ->

I like benchmarks for the above scenario. I like to see the improvements each time I get a new Nexus phone or tablet. It may be rough but it’s somewhat quantifiable. I am still amazed by the wonder of technology. Benchmarks can help facilitate that.

Happy Dog
Valued Guest
They aren’t cheating the scores. They are simply letting the device run at full power when it needs it. I may have a 9 second car but on the road it doesn’t need that power. Take it to the track and I want it all when I need it. Same thing applies to benchmarks. Basic apps don’t need four cores running at 2.2 GHz, but benchmarking is to see how well that chip goes at full power. And don’t tell me that the benchmark is wrong because it has used all the power. It’s like saying, I’ll run a quarter…
JeniSkunk
Valued Guest
The manufacturers are not cheating in showing the absolute maximum brute force scores their devices are capable of. What they ARE cheating in is using that info as a false claim of normal performance. The problem with your car and track scenario, Happy Dog, is that outside of restricted test conditions, all the raw power of your car cannot be accessed at all by you, whether or not you need or want access to it. The manufacturers only deign to permit all the raw power of your car to be accessed when it is being used by specific…
Happy Dog
Valued Guest

That’s because there really aren’t that many applications that could really push an 800 or even a 600/S4 pro+ to the limit. So I can understand why they want to display the full power.

Nobody seems to be getting my point though. The chip is doing the work and it is doing the numbers. So THAT is fair!!!
Cheating is them taking a score and making it seem better.

We are taking away the fact that they have done a good job to display those benchmark scores. We are all so eager to hate and not applaud.

Member

It is cheating as they “detect” the benchmark software and alter their behaviour accordingly. If they detected the need for the other CPUs to be online to handle the load that could be produced by any apps or combination of them, then that wouldn’t be cheating.

If they didn’t want to cheat then there should be a UI to the “full throttle” list where you can remove the benchmark apps and/or add your own apps to it.

The main problem with benchmarks is the cheating.

Member

Interesting. Though the AnandTech graph does show some other manufacturers boost benchmarks, it also shows that Samsung are the worst offenders, probably because they have the most models out there.

It is good to see that Moto and Nexus devices don’t play any shenanigans. I also agree with not paying any heed to benchmarks, as we all know that a S800 with 2GB of RAM is going to be faster than a S4 Pro device.

It should be all about the user experience, rather than marketing huffing and puffing and pointing to artificial results.
