RE: How to choose a database
Date: Wed, 26 Apr 2023 14:58:15 -0400
Message-ID: <2c2701d97871$11d883a0$35898ae0$_at_rsiz.com>
Before “rightsizing” became a euphemism for laying people off, it applied to getting the correct amount of infrastructure and staff to have a reasonable safety margin on operating goals.
Several companies on the bleeding edge of adoption of “Open Systems” (which included almost all flavors of UNIX, DEC’s two main product lines, Data General, HP, a few things from IBM, Sequent, Sun, and Pyramid) hired Rightsizing, Inc. to be the referee of what later came to be called “proof of concept” demonstrations for a particular purpose. At the time they were called “benchmarks” even though they had nothing to do with standards benchmarks like TPC-x.
Vendors frequently proposed hardware. Sometimes, if they believed they had a paved inside track, they would propose much more than was needed. Sometimes, if the bid was clearly competitive, they would propose the smallest plausible configuration to win it, planning to upsell once things got implemented and couldn't deliver.
Folks bringing things “on premises” off of time sharing had approximately zero experience figuring out what they needed in place of their former timesharing bill.
Because we were experienced and had only our reputation on the line, folks could trust our answers, and pretty quickly the vendors learned to do their best to propose enough but not too much, and to not propose something that certainly wouldn't be used for a production environment.
It was fun while it lasted. We only played favorites in that we knew how to lay down Sun and Sequent environments to run Oracle extremely well, so their teams could build on that, especially to deploy Oracle Applications.
With time sharing and "the cloud" it's different, because most of the vendors will allow you to scale more or less dynamically, so you can run your own trials, usually with each cloud vendor helping you use their stuff in the most cost-effective way to accomplish your needs.
When the volume of transactions being tossed around in this thread gets real, it is probably worth hiring a referee.
This has been a wonderful thread.
From: Mladen Gogala [mailto:gogala.mladen_at_gmail.com]
Sent: Wednesday, April 26, 2023 2:23 PM
To: niall.litchfield_at_gmail.com
Cc: oracle.developer35_at_gmail.com; Lok P; Clay Jackson (cjackson); Mark W. Farnham; pbrunoster_at_gmail.com; dbakevlar_at_gmail.com; Oracle L
Subject: Re: How to choose a database
A long, long time ago in Massachusetts, far, far away, there used to be a company called DEC. DEC was famous for cheating on benchmarks, in particular Dhrystone MIPS.
The moral of the story is that some companies lie and cheat. You gotta ask yourself only one question: do I feel lucky?
If you do, you can trust the marketing papers and purchase whichever database claims to have exceeded the speed of light. Being distrustful, suspicious and paranoid, I prefer testing. Can your vendor of choice show a successful implementation at the required level? If not, you need to test. Faith-based purchases rarely end well.
Mladen Gogala
On Wed, Apr 26, 2023, 11:54 <niall.litchfield_at_gmail.com> wrote:
The question you've got to ask yourself is
No not that one, although it might help :)
It's "What do I mean by a transaction?" Vendors will usually have a very specific definition based on a benchmark, but since vendors are the only people who actually run benchmarks in production, that may not help. 500M transactions per day could indeed mean ~6k commits a second. It might mean recording details of 500 million financial transactions a day in batches of a thousand at a time (so ~6 commits a second). It might mean something else entirely; my money is on the last, especially as I doubt there are 500M contracts exchanged daily (e.g. the Nasdaq did 29m yesterday, and I think the NYSE is about double that). So the vendor figures will be accurate, but they likely all refer to different measures, and they certainly don't refer to your transactions.
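Niall's point can be made concrete with back-of-envelope arithmetic (a sketch, not anything from the vendors' papers): the same "500M transactions per day" headline maps to very different commit rates depending on what a "transaction" covers.

```python
# Back-of-envelope sketch: the same daily headline number yields very
# different commit rates depending on the batch size behind one commit.
ROWS_PER_DAY = 500_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def commits_per_second(rows_per_day, batch_size):
    """Commits/sec if each commit covers batch_size business records."""
    return rows_per_day / batch_size / SECONDS_PER_DAY

print(commits_per_second(ROWS_PER_DAY, 1))     # one row per commit: ~5787/s
print(commits_per_second(ROWS_PER_DAY, 1000))  # batches of 1000: ~5.8/s
```

Three orders of magnitude separate the two readings, which is exactly why the vendor figures can all be "accurate" and still incomparable.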
On Wed, Apr 26, 2023 at 9:05 AM Pap <oracle.developer35_at_gmail.com> wrote:
CockroachDB also shows 100K transactions per second in the blog below.
Now, my apologies if the question is silly, but are they able to achieve these figures because they operate on distributed shards, so the points of contention are distributed? Also, are they doing those transactions in batches to get those figures? Because if someone inserts 500 million rows using a batch size of ~1000, that will be ~500K database calls/transactions, not ~500 million. And it would be easier for a database to handle those 500K batched transactions, as there is less chatter compared to 500 million individual (row-by-row) transactions, each of which involves context switching, network round trips, connection management, parse calls etc. between the application and the database engine. Correct me if my understanding is wrong.
https://www.cockroachlabs.com/docs/stable/performance.html
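The batching effect Pap describes can be sketched with Python's DB-API, using sqlite3 as a stand-in for any client/server driver (the table name and data here are invented for illustration; with a real networked database the round-trip savings would be far larger than in-process SQLite shows):

```python
import sqlite3

# Minimal sketch of row-by-row vs. batched commits.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (id INTEGER, amount REAL)")

rows = [(i, i * 1.5) for i in range(10_000)]

# Row-by-row: one execute + one commit per row -> one round trip each.
for r in rows[:100]:  # only 100 here, to keep the sketch quick
    conn.execute("INSERT INTO txn VALUES (?, ?)", r)
    conn.commit()

# Batched: executemany sends a whole batch; one commit per 1,000 rows,
# so 10,000 rows cost 10 commits instead of 10,000.
BATCH = 1000
for i in range(0, len(rows), BATCH):
    conn.executemany("INSERT INTO txn VALUES (?, ?)", rows[i:i + BATCH])
    conn.commit()
```

Each commit in the first loop pays the full parse/round-trip/log-flush overhead per row; the second loop amortizes it across a thousand rows, which is why "500M rows/day" batched looks nothing like 500M transactions.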
On Wed, Apr 26, 2023 at 8:49 AM Pap <oracle.developer35_at_gmail.com> wrote:
Thank you Mladen.
I saw those figures from Oracle, as somebody posted 1M transactions per second achieved using shards. But I also saw some blogs stating 20K TPS achieved using the distributed database YugabyteDB. Are those not true figures?
https://www.yugabyte.com/blog/mindgate-scales-payment-infrastructure/
And yes, it's an up-and-running system which is catering to the business. But the new system is being completely written from scratch (mostly because the existing system's complexity is increasing day by day) using modern tech stacks, microservices etc., so as to cater to future growth and provide the required scalability, resiliency, availability etc.
On Wed, Apr 26, 2023 at 2:14 AM Mladen Gogala <gogala.mladen_at_gmail.com> wrote:
On 4/25/23 15:07, Lok P wrote:
" For now, I am only aware that the database requirement was for a financial services project which would be hosted on AWS cloud and one RDBMS for storing and processing live users transaction data(retention upto ~3months and can go ~80TB+ in size, ~500million transaction/day) and another OLAP database for doing reporting/analytics on those and persisting those for longer periods(many years, can go till petabytes). "
500 million transactions per day? That is 5,787 transactions per second. Only Oracle and DB2 can do that reliably, day after day, with no interruptions. You will also need a very large machine, like an HPE Superdome or IBM LinuxONE. To quote a very famous movie, you'll need a bigger boat. I have never heard of anything else in the PB range. You may want to contact Luca Canali or Jeremiah Wilton, who have both worked with monstrous servers.
Not only will you need a bigger boat, you will also need a very capable SAN device, preferably something like XtremIO or a NetApp flash array. With almost 6,000 TPS, the average time for an entire transaction is 1/6 of a millisecond. In other words, you need I/O times in microseconds. The usual "log file sync" duration of 2 milliseconds simply will not do. You will need "log file sync" lasting 200 microseconds or less. Those are the physical prerequisites for such a configuration. You will also need to tune the application well. One full table scan or slow range scan and you can kiss 6,000 TPS goodbye.
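Mladen's latency-budget arithmetic, sketched out (this is the single-stream worst case; real systems commit from many concurrent sessions and use group commit, which relaxes the per-transaction budget somewhat):

```python
# Latency budget per transaction at a sustained commit rate.
TPS = 6000
budget_us = 1_000_000 / TPS        # ~167 microseconds per transaction
typical_lfs_us = 2000              # a "usual" 2 ms log file sync

print(budget_us)                   # the whole transaction's time budget
print(typical_lfs_us / budget_us)  # a 2 ms sync alone overshoots it ~12x
```

Hence the requirement for commit (redo write) latency in the low hundreds of microseconds before application tuning even enters the picture.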
Your description is pretty extreme. 6,000 TPS is a lot. That is an extreme requirement which can only be achieved by a combination of specialized hardware and highly skilled application architecture. Fortunately, there is oracle-l, which can help with timely quotes from Douglas Adams, Arthur C. Clarke and Monty Python. And of course: all your base are belong to us.
--
Niall Litchfield
Oracle DBA
http://www.orawin.info

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Apr 26 2023 - 20:58:15 CEST