Re: How to choose a database
Date: Thu, 27 Apr 2023 16:53:03 +0200
Message-ID: <CALSQGrJK_tFvRoMmWgpBF6y8pVaATh1Q2WLzjzjHosrbCfFTwQ_at_mail.gmail.com>
Disclaimer: I work for Oracle.
When I hear "monolithic databases don't scale", my first question is "What
are the scalability goals that you can't meet with \"monolithic\"
databases?"
The first step for scalability in Oracle, IF YOU REACH THE ACTUAL PLATFORM
LIMITS, is Oracle RAC, where you can scale up to 100 nodes (that's the
number I remember).
The next thing I hear is "RAC doesn't scale", and the argument is usually
related to GC (global cache) events.
The problem when you have GC events (and, in general, when you have
contention on distributed systems) is that the application isn't designed
to scale.
This is the real limit where ALL the platforms, including distributed
databases, can't really help.
Having high contention on rows in a distributed system is just worse than on a monolith: data modeling matters far more than the scalability of the platform itself.
If you carefully design your application, you can push the limits of a
monolithic database higher, and scale almost linearly on RAC or
sharded/distributed systems.
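To make the contention point concrete, here is a tiny Python sketch (my own
illustration, not taken from any real workload): writers hammering a single
hot key queue up behind one another, while the same writes spread over many
keys proceed largely in parallel. The locks are stand-ins for row locks /
global cache transfers; the real effect in RAC or a sharded database is
analogous, only far more expensive.

import random
import threading
import time

NUM_WRITERS = 8
OPS_PER_WRITER = 500

def run(num_keys):
    # one lock per key plays the role of a row lock / GC block transfer
    locks = [threading.Lock() for _ in range(num_keys)]
    counters = [0] * num_keys

    def writer():
        for _ in range(OPS_PER_WRITER):
            k = random.randrange(num_keys)  # key chosen by the data model
            with locks[k]:                  # only writers on the same key serialize
                counters[k] += 1
                time.sleep(0.0002)          # pretend work done while holding the "row"

    threads = [threading.Thread(target=writer) for _ in range(NUM_WRITERS)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

print("1 hot key : %.2f s" % run(1))      # everyone waits on the same lock
print("1024 keys : %.2f s" % run(1024))   # same writes, almost no contention

The exact timings don't matter; the point is that the keys you write to
(i.e., the data model) decide whether adding nodes buys you anything.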
E.g., we've managed to run tests internally in Oracle where we could easily
reach 1M TPS on a single Exadata rack. Is it representative of a real
workload? Obviously not. The test was created on purpose to avoid any
contention. The same goes for every vendor benchmark that claims X KTPS or
MTPS.
The difference between in-house benchmarks and the BlueKai use case is that
BlueKai shows real production numbers, taken from production
measurements/dashboards. They have 52 instances running Oracle Sharding.
The theoretical limit is 1000. But they have been incredibly good over the
years at creating an application that scales linearly, using Oracle Sharding
as the underlying (scalable) platform.
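Back-of-envelope only (my arithmetic, not a figure quoted in the BlueKai
write-up):

total_tps = 1_000_000      # headline number from the Oracle blog post linked below
shards = 52                # BlueKai's shard count mentioned above
print(total_tps / shards)  # ~19,230 TPS per shard: a far more modest per-instance
                           # rate once the data model spreads the load evenly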
So yeah, coming to "how to choose a database", scalability is not the first
requirement I think about. I'd rather think about robustness/reliability,
consistency requirements, development features, ease of integration,
and how it fits into the Data Management policies and operations of the
company.
On Wed, Apr 26, 2023 at 05:20 Pap <oracle.developer35_at_gmail.com> wrote:
> Thank you Mladen.
> I saw those figures from Oracle, as somebody posted that 1M transactions
> per second were achieved using shards. But I also saw some blogs stating
> 20K TPS achieved using the distributed database YugabyteDB. Are those not
> the true figures?
>
> https://www.yugabyte.com/blog/mindgate-scales-payment-infrastructure/
>
>
> https://blogs.oracle.com/database/post/oracle-bluekai-data-management-platform-scales-to-1-million-transactions-per-second-with-oracle-database-sharding-deployed-in-oracle-cloud-infrastructure
>
> And yes, it's an up-and-running system which is catering to the business.
> But the new system is being written completely from scratch
> (mostly because the existing system's complexity is increasing day by day)
> using modern tech stacks, microservices, etc., so as to cater to future
> growth and provide the required scalability, resiliency, availability, etc.
>
> On Wed, Apr 26, 2023 at 2:14 AM Mladen Gogala <gogala.mladen_at_gmail.com>
> wrote:
>
>> On 4/25/23 15:07, Lok P wrote:
>>
>> " *For now, I am only aware that the database requirement was for a
>> financial services project which would be hosted on AWS cloud and one RDBMS
>> for storing and processing live users transaction data(retention upto
>> ~3months and can go ~80TB+ in size, ~500million transaction/day) and
>> another OLAP database for doing reporting/analytics on those and persisting
>> those for longer periods(many years, can go till petabytes).* "
>>
>> 500 million transactions per day? That is 5787 transactions per second.
>> Only Oracle and DB2 can do that reliably, day after day, with no
>> interruptions. You will also need a very large machine, like an HP Superdome
>> or IBM LinuxONE. To quote a very famous movie, you'll need a bigger boat. I
>> have never heard of anything else in the PB range. You may want to contact
>> Luca Canali or Jeremiah Wilton, who have both worked with monstrous servers.
>>
>> Not only will you need a bigger boat, you will also need a very capable
>> SAN device, preferably something like XtremIO or a NetApp flash array. With
>> almost 6000 TPS, the average time budget for an entire transaction is 1/6 of
>> a millisecond. In other words, you need I/O times in microseconds. The usual
>> "log file sync" duration of 2 milliseconds will simply not do. You will
>> need log file sync lasting 200 microseconds or less. Those are the physical
>> prerequisites for such a configuration. You will also need to tune the
>> application well. One full table scan or slow range scan and you can kiss
>> 6000 TPS goodbye.
>>
>> Your description is pretty extreme. 6000 TPS is a lot. That is an extreme
>> requirement which can only be achieved by a combination of specialized
>> hardware and highly skilled application architecture. Fortunately, there is
>> oracle-l, which can help with timely quotes from Douglas Adams, Arthur
>> C. Clarke and Monty Python. And of course: all your base are belong to us.
>>
>> --
>> Mladen Gogala
>> Database Consultant
>> Tel: (347) 321-1217
>> https://dbwhisperer.wordpress.com
>>
>>
--
http://www.freelists.org/webpage/oracle-l

Received on Thu Apr 27 2023 - 16:53:03 CEST