Re: Design related question

From: yudhi s <learnerdatabase99_at_gmail.com>
Date: Tue, 12 Dec 2023 10:02:19 +0530
Message-ID: <CAEzWdqcQwqknew+OrnPonW97TZH56RsFZxS6wMbh9okAcYJEBw_at_mail.gmail.com>



Thank you so much.
Yes, I have heard about hybrid tables/Unistore, and we talked to Snowflake partners, but that feature does not appear likely to be production-ready until the last quarter of next year, and the business doesn't want to take a dependency on Snowflake getting it there. So we may end up choosing a separate OLTP database (like Aurora PostgreSQL) for our use case.

I have also come across a few reported issues with PostgreSQL, such as vacuuming overhead in a highly transactional system and a high number of partitions causing performance problems during query parsing/planning. I am not sure whether these same issues exist on Aurora PostgreSQL.
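
For what it's worth, the vacuum pressure on hot tables can often be reduced with per-table autovacuum settings. Below is a minimal sketch, assuming plain PostgreSQL and psycopg2; the table name and threshold values are hypothetical, not from this thread:

    # Hypothetical sketch: tighten autovacuum on a hot transaction table so
    # dead tuples are reclaimed continuously instead of in large, disruptive
    # passes. Table name and thresholds are illustrative only.
    import psycopg2

    conn = psycopg2.connect("dbname=app user=app")  # assumed connection string
    conn.autocommit = True  # ALTER TABLE ... SET works outside a transaction
    with conn.cursor() as cur:
        cur.execute("""
            ALTER TABLE txn_detail SET (
                autovacuum_vacuum_scale_factor = 0.01,  -- vacuum at ~1% dead rows
                autovacuum_vacuum_cost_delay   = 2      -- throttle I/O impact (ms)
            )
        """)
    conn.close()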

On Tue, Dec 12, 2023 at 3:08 AM John Ridgway <john_at_marjohn.com> wrote:

> You can run some OLTP-type loads on Snowflake via hybrid tables. Hybrid
> tables have indexes and are organized like relational tables. This feature
> might still be in private beta, but your Snowflake contact should be able to
> turn it on for you. It might yield better performance for your use case.
>
>
> https://www.snowflake.com/en/data-cloud/workloads/unistore/
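>
> As a rough illustration, a hybrid table is declared like an ordinary table
> but requires a primary key and allows secondary indexes. A minimal sketch,
> assuming the feature is enabled on the account; the table, columns, and
> connection details below are placeholders:
>
>     # Hypothetical sketch using the Snowflake Python connector. The DDL
>     # follows Snowflake's documented hybrid table syntax; all names and
>     # credentials are placeholders.
>     import snowflake.connector
>
>     conn = snowflake.connector.connect(
>         account="my_account", user="my_user", password="...",
>         warehouse="my_wh", database="my_db", schema="public",
>     )
>     conn.cursor().execute("""
>         CREATE HYBRID TABLE customer_txn (
>             txn_id   NUMBER PRIMARY KEY,   -- hybrid tables require a PK
>             cust_id  NUMBER,
>             amount   NUMBER(12,2),
>             INDEX idx_cust (cust_id)       -- secondary index for point lookups
>         )
>     """)
>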
> I would not go down the road of using a cache-type database to try to get
> OLTP performance out of an OLAP database. Caches serve specific use cases:
> speeding up OLTP databases, or distributing the load of a central database
> across different regions.
>
>
> On Tue, 12 Dec 2023 01:25:20 +0530, yudhi s <learnerdatabase99_at_gmail.com>
> wrote:
>
> Thank you so much for the input.
>
> *"We have seen a lot of success with moving our OLAP workload and storing
> historical data in Snowflake. Snowflake is a columnar store database so it
> should not be treated like OLTP."*
>
> Yes, I think Snowflake is really a great option for OLAP use cases.
> However, we saw a few use cases in which sub-second response is required
> for some UI read queries, and it did not perform well there; the same was
> true even for single-row inserts. With queries involving multiple joins, no
> indexing options, and compilation time alone reaching ~500 ms even for
> simple queries, all of this makes it not so suitable for OLTP use cases at
> the current point in time.
>
> So basically we want to cater to our OLTP needs along with the OLAP use
> cases. Some colleagues suggest just keeping the Snowflake database and
> putting a Redis cache on top of it to serve the OLTP use cases, rather than
> having a dedicated OLTP database like Aurora PostgreSQL. Is such a design
> advisable, considering we have ~500 million transactions/day to persist
> across multiple transaction tables, amounting to ~100 TB in size for ~3
> months' worth of data? I don't know much about Redis beyond that it is an
> in-memory database, and I am not sure whether it supports table joins,
> indexes, etc. the way an OLTP database does. Appreciate your input on this.
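>
> From what I gather, the proposal amounts to a cache-aside pattern like the
> sketch below (redis-py; the key scheme and TTL are hypothetical):
>
>     # Hypothetical cache-aside sketch: read from Redis first, fall back to
>     # the backing database on a miss, then populate the cache with a TTL.
>     import json
>     import redis
>
>     r = redis.Redis(host="localhost", port=6379)
>
>     def get_txn(txn_id, fetch_from_db):
>         key = f"txn:{txn_id}"
>         cached = r.get(key)
>         if cached is not None:
>             return json.loads(cached)        # cache hit
>         row = fetch_from_db(txn_id)          # cache miss: query the database
>         r.set(key, json.dumps(row), ex=300)  # keep for 5 minutes
>         return row
>
> Note that such a cache only speeds up repeated point reads; it does nothing
> for the write path or for ad-hoc joins, so it is not a substitute for a real
> OLTP database.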
>
> On Mon, Dec 11, 2023 at 11:15 PM John Ridgway <john_at_marjohn.com> wrote:
>
>> An X9 is the latest and greatest Exadata machine from Oracle with a lot
>> of capability. A properly tuned Exadata machine is incredible. Usually,
>> those beasts are multi-year leasing contracts that can go into the
>> millions. If configured properly, you can use dynamic scaling to achieve
>> some cost savings. Exadata machines must run on-premise or in a CoLo as
>> AWS does not support them. With applications moving to AWS, your database
>> will be further away from your apps even if somehow you get them to be in
>> the same city. In our data center, we saw < 1 ms network latency. Apps
>> moved to AWS saw about 15-20 ms latency. That doesn't sound bad but for
>> chatty apps that make lots of calls, it was disastrous. Some critical apps
>> had to move to the CoLo to be close to the database. Along with network
>> latency which is unavoidable with distance, there are firewall and egress
>> charges to consider. Your cost savings can be eaten up by needing larger
>> than anticipated bandwidth and firewall capacity. It is a costly design
>> mistake to purchase a Lamborghini-type database without considering how it
>> fits into the overall architecture.
>> We have seen a lot of success with moving our OLAP workload and storing
>> historical data in Snowflake. Snowflake is a columnar store database so it
>> should not be treated like OLTP. It is an amazing platform too. Snowflake
>> has also embraced the idea of making OLAP processing run even better by
>> reducing the distance between the app and the database through Snowpark,
>> even though it all runs at the cloud provider. This allows the
>> app/reporting code to run on the same virtual machines as the database.
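>>
>> As a rough illustration, a Snowpark query is written in Python but compiles
>> to SQL and executes on Snowflake's compute, next to the data; only results
>> come back. A minimal sketch; connection details and the table name are
>> placeholders:
>>
>>     # Hypothetical Snowpark sketch: the dataframe operations below are
>>     # pushed down and run inside Snowflake rather than on the client.
>>     from snowflake.snowpark import Session
>>     from snowflake.snowpark.functions import col, sum as sum_
>>
>>     session = Session.builder.configs({
>>         "account": "my_account", "user": "my_user", "password": "...",
>>         "warehouse": "my_wh", "database": "my_db", "schema": "public",
>>     }).create()
>>
>>     daily = (session.table("customer_txn")
>>              .group_by(col("txn_date"))
>>              .agg(sum_(col("amount")).alias("total"))
>>              .collect())  # only the aggregated rows leave Snowflake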
>>
>>
>> On Sat, 9 Dec 2023 15:56:06 +0530, yudhi s <learnerdatabase99_at_gmail.com>
>> wrote:
>>
>> Hello Listers,
>> I understand colleagues here have decades of experience in the database
>> world. We would like some guidance on whether the design below looks okay
>> for our customer use case.
>>
>> We currently have financial-systems transaction data streaming into an
>> on-premises Oracle Exadata (X9). This database supports processing of ~400
>> million transactions per day. A single transaction for us is a combination
>> of 7-8 inserts into different transaction tables, with indexes, unique
>> constraints, etc. defined on them, and the inserts commit in batches
>> (~1000 per batch) in the database. The system persists data for ~6 months.
>> We have all sorts of OLAP applications (daily/monthly batch reports)
>> running on the same database, along with some user-facing UI applications
>> showing customer transactions. So it is currently serving a hybrid
>> workload and is a one-stop solution for all use cases.
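>>
>> For concreteness, that write pattern maps onto something like the batched
>> insert sketch below, shown against PostgreSQL/psycopg2 since that is the
>> candidate target; table and column names are hypothetical, as the real
>> schema is not shown here:
>>
>>     # Hypothetical sketch of the commit pattern: several inserts per
>>     # business transaction, flushed in ~1000-transaction batches with a
>>     # single commit per batch, using psycopg2's execute_values.
>>     import psycopg2
>>     from psycopg2.extras import execute_values
>>
>>     conn = psycopg2.connect("dbname=app user=app")  # assumed DSN
>>
>>     def flush_batch(batch):  # batch: list of dicts, one per transaction
>>         with conn, conn.cursor() as cur:  # one commit for the whole batch
>>             execute_values(cur,
>>                 "INSERT INTO txn_header (id, ts) VALUES %s",
>>                 [(t["id"], t["ts"]) for t in batch])
>>             execute_values(cur,
>>                 "INSERT INTO txn_detail (id, leg) VALUES %s",
>>                 [(t["id"], leg) for t in batch for leg in t["legs"]])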
>>
>> Many of the applications are moving from on-premises to AWS as part of a
>> modernization journey, with AWS as the chosen cloud partner. The product
>> is also expected to expand across more regions, so this system must serve
>> an increase in transaction volume. We also have a requirement to persist
>> transaction data for ~10 years so that it is available for analytics/data
>> science use cases.
>>
>> So the team is thinking of splitting it into two parts:
>> 1) An OLTP-type use case, in which we persist/write the transaction data
>> quickly and show it to the UI-related apps in near real time, or the
>> quickest possible time. This database will store at most 60-90 days of
>> transaction data. I am not sure there is an Oracle Exadata equivalent on
>> AWS, so the team is planning to experiment with Aurora PostgreSQL. Please
>> correct me if there are other options we should consider.
>>
>> 2) Then move the data beyond ~90 days into another database or into S3
>> object storage, which will keep it for ~10 years and make it queryable
>> through the necessary APIs. That is supposed to cater to OLAP/analytics/
>> data science use cases (a rough sketch of how the two parts could connect
>> follows below).
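>>
>> A rough sketch of how points 1 and 2 could fit together on Aurora
>> PostgreSQL: daily range partitions, plus a retention job that detaches
>> partitions older than ~90 days, exports them to S3, and drops them. This
>> assumes the aws_s3 extension is installed; the table, bucket, and region
>> are placeholders:
>>
>>     # Hypothetical retention sketch: detach the daily partition once it
>>     # ages past 90 days, export it to S3 via Aurora's aws_s3 extension,
>>     # then drop it from the OLTP database.
>>     import datetime
>>     import psycopg2
>>
>>     conn = psycopg2.connect("dbname=app user=app")  # assumed DSN
>>     cutoff = datetime.date.today() - datetime.timedelta(days=90)
>>     part = f"txn_detail_{cutoff:%Y%m%d}"  # assumes one partition per day
>>
>>     with conn, conn.cursor() as cur:
>>         cur.execute(f"ALTER TABLE txn_detail DETACH PARTITION {part}")
>>         cur.execute(
>>             "SELECT aws_s3.query_export_to_s3("
>>             f" 'SELECT * FROM {part}',"
>>             "  aws_commons.create_s3_uri('my-archive-bucket',"
>>             f"    'txn/{part}.csv', 'us-east-1'),"
>>             "  options := 'format csv')")
>>         cur.execute(f"DROP TABLE {part}")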
>>
>> Is the above design okay? Also, regarding the second point, i.e.
>> persisting the historical data (in a queryable state), should we go for a
>> database like Snowflake, or just keep it on S3 as-is and make it queryable
>> through APIs? Please advise.
>>
>> Regards
>> Yudhi
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Dec 12 2023 - 05:32:19 CET