[DevoxxGR2025] Unmasking Benchmarking Fallacies
Georgios Andrianakis, a Quarkus engineer at Red Hat, gave a 46-minute talk at Devoxx Greece 2025 on benchmarking fallacies. The material is based on an earlier talk by performance expert Francisco Negro.
The Benchmarketing Problem
Andrianakis introduced “benchmarketing,” the practice of manipulating benchmarks for marketing purposes. The talk grew out of Negro’s frustration with an article claiming that Helidon outperformed Quarkus in a TechEmpower benchmark; his investigation showed that the comparison behind the claim was unfair. A sound benchmark should be relevant, representative, equitable, repeatable, cost-effective, scalable, and transparent, and the talk uses this case to show how data can be misrepresented when those criteria are ignored.
Dissecting a Flawed Claim
Focusing on equity, Negro analyzed the TechEmpower benchmark, which tests web frameworks on tasks such as JSON serialization and database queries. The claim hinged on a test in which Helidon used a raw database driver (Vert.x for PostgreSQL), while Quarkus used a full object-relational mapper (ORM) such as Hibernate, which adds overhead. When the results were filtered to full-ORM entries only, Quarkus topped the chart and Helidon was absent. When both frameworks were compared without an ORM, Quarkus still came out ahead. The original claim was therefore not an apples-to-apples comparison, and it misled readers.
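The equity check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tooling used in the talk: the framework names are placeholders, the throughput figures are invented, and the `Result` type and `rank_within_category` helper are hypothetical. The point is that results are ranked only against entries that use the same data-access style.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    framework: str           # placeholder name, not a real framework
    data_access: str         # "raw" (plain driver) or "orm" (full ORM)
    requests_per_sec: float  # invented figure, for illustration only


def rank_within_category(results):
    """Group results by data-access style and rank each group separately."""
    groups = defaultdict(list)
    for r in results:
        groups[r.data_access].append(r)
    return {
        category: sorted(rs, key=lambda r: r.requests_per_sec, reverse=True)
        for category, rs in groups.items()
    }


# Invented numbers: they only demonstrate the mechanics of the comparison.
results = [
    Result("framework-a", "raw", 100_000.0),
    Result("framework-b", "raw", 120_000.0),
    Result("framework-b", "orm", 60_000.0),
]

ranked = rank_within_category(results)
for category, rs in ranked.items():
    print(category, [(r.framework, r.requests_per_sec) for r in rs])
```

With these made-up figures, comparing framework-a's raw-driver entry against framework-b's ORM entry would suggest that framework-a is faster, while the like-for-like comparison within the raw category points the other way.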
Critical Thinking in Benchmarks
Andrianakis emphasized skepticism, citing Hitchens’ razor: what is asserted without evidence can be dismissed without evidence. Applying Brendan Gregg’s USE method (utilization, saturation, errors), Negro found that CPU saturation, not database I/O, was the bottleneck, which debunked the assumption that the database was the limiting factor. The talk urged active benchmarking, that is, monitoring errors and resource usage while the benchmark runs, and measuring one level deeper than the headline number to understand where the time actually goes. Evaluating benchmark claims fairly also requires awareness of one’s own biases, such as confirmation bias, and not attributing to malice what incompetence can explain.
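As a rough illustration of active benchmarking, the Python sketch below records errors and CPU use alongside throughput instead of reporting throughput alone. It is a minimal sketch under stated assumptions, not the method used in the talk: `run_active_benchmark` and the three sample workloads are hypothetical, the workload runs inside the measuring process, and only the utilization and errors parts of USE are covered (saturation is not measured).

```python
import time


def run_active_benchmark(workload, iterations):
    """Run `workload` repeatedly while tracking errors and CPU use."""
    errors = 0
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(iterations):
        try:
            workload()
        except Exception:
            errors += 1  # a failed call must not pass as a fast success
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return {
        "iterations": iterations,
        "errors": errors,
        "throughput_per_sec": iterations / wall if wall > 0 else float("inf"),
        # Share of wall-clock time this process spent on the CPU.
        "cpu_utilization": cpu / wall if wall > 0 else 0.0,
    }


def cpu_bound():
    sum(i * i for i in range(20_000))  # pure computation


def waiting():
    time.sleep(0.01)  # stands in for waiting on I/O


def flaky():
    raise RuntimeError("simulated failure")


for name, fn in [("cpu_bound", cpu_bound), ("waiting", waiting), ("flaky", flaky)]:
    print(name, run_active_benchmark(fn, 20))
```

The `flaky` workload will typically report a very high throughput because every call fails immediately, which is why the error count has to be read together with the headline number. The CPU share separates a workload that is busy computing from one that is mostly waiting.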