DB2's latest TPC benchmark transforms into giant robot, beats up on other benchmarks
TPC website: 343,551 queries per hour on 10,000GB of TPC-H data (US$32.89 per QphH)
Perhaps the most surprising omission from the IOD conference keynote sessions was any mention of IBM's October 15th TPC-H benchmark result (published the same day the conference began), in which they fired up a battalion of hardware and software to achieve the fastest ten-terabyte data warehouse benchmark of all time. Thirty-two POWER6 machines spinning a total of more than 3,000 disks served up 343,551 queries per hour, nearly double the previous throughput record (also set by DB2). Such surreal query rates also pushed the price-performance ratio to a record low for that category (well, at least by a couple of pennies, but we'll get to that later). As with all benchmarks, however, there are quite a few things to consider.
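For a sense of what that price-performance ratio implies, you can multiply the headline numbers back out. This is a back-of-the-envelope sketch; the exact total system cost comes from the full disclosure report, not from this arithmetic:

```python
# Rough check on the published TPC-H figures.
qphh = 343_551            # queries per hour (QphH @ 10,000GB)
price_per_qphh = 32.89    # US$ per QphH, as reported
implied_total_cost = qphh * price_per_qphh
print(f"Implied total system cost: ${implied_total_cost:,.2f}")  # ≈ $11.3 million
```

In other words, shaving even a couple of pennies off the ratio represents hundreds of thousands of dollars at this scale.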
As with most data warehousing benchmark submissions, the 3,000+ disks mentioned above are basically 90% empty. Remember how much pushback you got from your VP or CxO when you handed him or her a quote for the storage you needed? Now think of the look you'd get if you requested eleven times as much space purely for performance reasons. If anyone out there has been able to use their Jedi mind trick powers to pull off such a feat, please tell me how you did it. I promise I'll pass your name on to IBM and EMC so they can start a bidding war to hire you as the greatest storage sales rep of all time, and I will spend my referral bonus on a gloriously expensive coffeemaker designed by aerospace engineers.
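To put that overprovisioning in numbers, here is a rough sketch. The per-disk capacity is my assumption (small, fast drives were typical in TPC-H submissions of that era), not a figure from the disclosure report:

```python
# Hypothetical storage math for a ~10TB TPC-H run on 3,000 disks.
disks = 3_000
disk_gb = 36.4            # assumed capacity per disk, NOT from the disclosure
raw_tb = disks * disk_gb / 1_000
data_tb = 10              # TPC-H scale: 10,000GB of raw data
ratio = raw_tb / data_tb
print(f"{raw_tb:.1f}TB raw for {data_tb}TB of data: {ratio:.0f}x overprovisioned")
```

The point of buying that many mostly empty spindles is throughput: spreading I/O across thousands of disk arms, not capacity.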
Another curious aspect of these results is the gap between what IBM recommends to customers and what it actually does in DB2 benchmark submissions. When it comes to the TPC-H data warehouse benchmark, IBM is not eating its own dog food. If you have attended a conference presentation or watched a webcast about DB2 9, you'll know that IBM is (justifiably) proud of several engine features that can profoundly improve performance:
- Self-tuning memory management
- Deep compression
- Multi-dimensional clustering
- Materialized query tables (Okay, this one is understandable, since the TPC have slapped so many restrictions on this area that it might as well be banned altogether)
- Block-based bufferpools (Also understandable, since there's little need to split up a bufferpool for a TPC-H workload that is geared exclusively toward block-based prefetching)
Were any of those features used in IBM's October 15th benchmark? Nope. Does IBM tell customers to exploit those features for their own data warehouses? All the time. The reason for this contradiction has to do primarily with database load time, a metric that is apparently much more significant in the bizarro universe of the TPC than in real-world shops, which rarely load a multi-terabyte warehouse from scratch. MDC tables would have chugged a bit to load so much unsorted data, and deep compression requires a table reorg (and a license for nearly 13,000 value units). What you end up with is a frustrating inconsistency between the real needs of data warehouse decision makers and a skewed technical experiment that purports to help people make those decisions.
That's not to say the benchmarks are pointless. If they were, I wouldn't waste time writing about them. Buried in the disclosure's details is IBM's preference to disable INTRA_PARALLEL in favor of running two DPF partitions per CPU core, resulting in 256 partitions that each manage barely 50GB of data. It's also not surprising that IBM applied table partitioning to the ORDERS and LINEITEM tables. Anyone running DB2 V8.2 Enterprise Edition should be scheming to figure out how to exploit this powerful feature as they plan an upgrade to DB2 9 or 9.5, since table partitioning is included in Enterprise Edition at no extra charge in those releases.
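The per-partition arithmetic is easy to check. This counts raw TPC-H data only; indexes and working space push the real per-partition footprint somewhat higher, which is presumably where the ~50GB figure comes from:

```python
# Raw TPC-H data spread across the benchmark's DPF partitions.
partitions = 256          # two logical partitions per core, per the disclosure
raw_data_gb = 10_000      # TPC-H 10,000GB scale
per_partition_gb = raw_data_gb / partitions
print(f"{per_partition_gb:.1f}GB of raw data per partition")  # 39.1GB
```

Keeping each partition that small is a classic shared-nothing tactic: every partition's share of a scan fits comfortably within one core's I/O and memory budget.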
The last part I wanted to bring up is the price, since the final price-performance ratio carries so much weight. IBM gave themselves a 48% discount on much of the hardware and software, but competing vendors pull the same stunt with their seven- and eight-figure TPC-H configurations, so don't get too worked up over it. Unless you plan to spend millions of dollars up front on such a system, you are unlikely to realize that deep a discount.
Overall, the numbers realized in this benchmark are good news. IBM took their hottest new UNIX hardware and proved that DB2 can exploit all that new hotness to achieve crazy fast performance, even though IBM (for whatever reason) chose not to exploit many of DB2's best features. Maybe one day someone will design a more relevant benchmark, in which the systems running it more closely resemble reality, but in the meantime, we at least have reports like this one that we can dig through for clues.
Photo pun courtesy of B. Shirley