How to choose a database - "Lies, Damn lies and benchmarks"
Date: Wed, 26 Apr 2023 14:54:12 +0000
Message-ID: <CO1PR19MB49847586739A4DB3C9714B529B659_at_CO1PR19MB4984.namprd19.prod.outlook.com>
Oh, I have no doubt the figures are “true”; just like the TPC benchmark numbers vendors still throw around.
But consider how much work goes into designing, building, tuning and maintaining those systems.
Way back in the 1990s, several of the major telcos were processing many thousands of calls/second through various systems (I was involved with one that’s still in use today; I think at the time we peaked around 500K initiated calls/second across the entire US), and there are other examples (Amazon, Google) of transaction volumes that will “blow your mind”.
Systems processing transactions at those sorts of volumes may be built “on top” of databases like Oracle or Yugabyte, but I think if you look “behind the curtain”, you’ll discover that ALL systems at that level are “purpose built” for the specific requirements they’re trying to address.
What Mladen and others are saying is that while performance at this level is achievable, it’s about much more than just picking the “right” database. In fact, I’m gonna go out on a bit of a limb here and assert that the choice of specific database (from among the “top” databases already mentioned) is LESS important than application design and testing.
Clay Jackson
From: Pap <oracle.developer35_at_gmail.com>
Sent: Tuesday, April 25, 2023 8:20 PM
To: Mladen Gogala <gogala.mladen_at_gmail.com>
Cc: Lok P <loknath.73_at_gmail.com>; Clay Jackson (cjackson) <Clay.Jackson_at_quest.com>; Mark W. Farnham <mwf_at_rsiz.com>; pbrunoster_at_gmail.com; dbakevlar_at_gmail.com; Oracle L <oracle-l_at_freelists.org>
Subject: Re: How to choose a database
Thank you Mladen.
I saw the figures from Oracle that somebody posted, showing 1M transactions per second achieved using shards. But I also saw some blogs reporting 20K TPS achieved using the distributed database YugabyteDB. Are those not true figures?
https://www.yugabyte.com/blog/mindgate-scales-payment-infrastructure/
And yes, it's an up-and-running system that serves the business. But the new system is being written completely from scratch (mostly because the complexity of the existing system is increasing day by day), using a modern tech stack, microservices, etc., to cater for future growth and provide the required scalability, resiliency, availability, etc.
On Wed, Apr 26, 2023 at 2:14 AM Mladen Gogala <gogala.mladen_at_gmail.com> wrote:
On 4/25/23 15:07, Lok P wrote:
"For now, I am only aware that the database requirement was for a financial services project which would be hosted on AWS cloud: one RDBMS for storing and processing live user transaction data (retention up to ~3 months, can grow to ~80 TB+ in size, ~500 million transactions/day) and another OLAP database for reporting/analytics on that data and persisting it for longer periods (many years, can grow to petabytes)."
500 million transactions per day? That is about 5787 transactions per second on average. Only Oracle and DB2 can do that reliably, day after day, with no interruptions. You will also need a very large machine, like HP Superdome or IBM LinuxONE. To quote a very famous movie, you'll need a bigger boat. I have never heard of anything else in the PB range. You may want to contact Luca Canali or Jeremiah Wilton, who have both worked with monstrous servers.
Not only will you need a bigger boat, you will also need a very capable SAN device, preferably something like XtremIO or NetApp Flash Array. With almost 6000 TPS, the average time for the entire transaction is 1/6 of a millisecond. In other words, you need I/O time in microseconds. The usual "log file sync" duration of 2 milliseconds will simply not do. You will need log file sync lasting 200 microseconds or less. Those are the physical prerequisites for such a configuration. You will also need to tune the application well. One full table scan or slow range scan and you can kiss 6000 TPS goodbye.
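The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, the daily volume is assumed to be spread evenly over 24 hours (real peaks will be higher), and the 1/6 ms figure assumes transactions are handled strictly one at a time. The last helper applies Little's law (in-flight work = throughput x latency) to show how many transactions would have to overlap at a given per-transaction latency; whether a given system can sustain that concurrency is a separate question.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_tps(transactions_per_day: float) -> float:
    """Average transactions per second, assuming an even load all day."""
    return transactions_per_day / SECONDS_PER_DAY


def serial_budget_ms(tps: float) -> float:
    """Time available per transaction in milliseconds, assuming
    transactions are processed strictly one after another."""
    return 1000.0 / tps


def in_flight(tps: float, latency_ms: float) -> float:
    """Little's law: average number of transactions in progress at once
    for a given throughput and per-transaction latency."""
    return tps * latency_ms / 1000.0


if __name__ == "__main__":
    tps = average_tps(500_000_000)
    print(f"average rate:            {tps:.0f} TPS")
    print(f"serial budget at 6000:   {serial_budget_ms(6000):.3f} ms")
    print(f"in flight at 2 ms:       {in_flight(6000, 2.0):.1f}")
    print(f"in flight at 0.2 ms:     {in_flight(6000, 0.2):.1f}")
```

The first two helpers reproduce the numbers quoted above (about 5787 TPS, and 1/6 ms per transaction at 6000 TPS); the third is an added assumption-laden view, not part of the original estimate.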
Your description is pretty extreme. 6000 TPS is a lot. That is an extreme requirement which can only be achieved by a combination of specialized hardware and highly skilled application architecture. Fortunately, there is oracle-l, which can help with timely quotes from Douglas Adams, Arthur C. Clarke and Monty Python. And of course: all your base are belong to us.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com/
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Apr 26 2023 - 16:54:12 CEST