Re: Snowflake on Oracle

From: Aditya Allamraju <aditya.allamraju_at_gmail.com>
Date: Tue, 18 Apr 2023 23:36:00 -0700
Message-ID: <CAAcEhbVHtSaY2nvtGZCGNMSFqaD=GTUsM49PRpBR9Ap6-KESLA_at_mail.gmail.com>



_at_Lok P

 "*However in regards to the storage, snowflake has common storage i.e. "shared everything" architecture , which is shared among all the nodes/warehouses (Like Oracle or any other common RDBMS), so how can that be scaled infinitely? Won't that suffer I/O saturation at certain point in time with high read/write activities on storage as because it's common/shared across all the compute nodes?"*

  1. Data storage is built on Cloud blob store(So it's S3 if you go by AWS). It's not limited to a few NIC cards or i/o channels or a bunch of disks. Reads and writes to Cloud blob stores can have different network paths.

"*Regarding the second point, though the name is different, isn't snowflake "search optimization service" indirectly creating indexes under the hood ?*"

2. It's a relatively new feature and yes, almost like an index. But the data structure is not exactly a B-tree like RDBMS.

I see some mention about Snowflake doing FTS. Snowflake tries to avoid FTS as much as possible and does a very good job of doing it. Like Kyle already mentioned, it's not an index. This is called micro-partitioning and helps in pruning very large tables.
Ref:
https://docs.snowflake.com/en/user-guide/tables-clustering-micropartitions#what-is-data-clustering

Thanks
Aditya

On Tue, Apr 18, 2023 at 1:57 PM kyle Hailey <kylelf_at_gmail.com> wrote:

>
> One of the motivations of Snowflake was to do things that were too hard
> for Oracle to implement because of the burden of legacy code.
> Snowflake maps regions of data by the contents they have contain in that
> region. So it's not an index per say but it is a mechanism that prunes what
> data we have to look at. I think Exadata has this, but there were other
> things Snowflake was working on.
> It's been over 10 years since I talked to Benoit (or talked to in flowing
> robes on the playa like we were the only two people alive) but I recall him
> talking about some structures that were like "of course! why doesn't Oracle
> do that" and there reason was because it wasn't viable given Oracles legacy
> code.
> Unfortunately my memories are vague.
>
>
>
> On Tue, Apr 18, 2023 at 1:07 PM Mladen Gogala <gogala.mladen_at_gmail.com>
> wrote:
>
>> On 4/18/23 15:38, yudhi s wrote:
>>
>>
>> *"2. No indexes in Snowflake :)" *
>> Regarding the second point, though the name is different, isn't snowflake
>> "search optimization service" indirectly creating indexes under the hood ?
>>
>> Nope. Snowflake is a columnar storage database. Everything is stored in
>> memory, in compressed format. Indexes are structures that live on disk. In
>> principle, you don't need indexes on Snowflake. Snowflake is not your run
>> of the mill OLTP relational database like Oracle, DB2, SQL Server,
>> PostgreSQL or MySQL. Corollary: when you are searching for stuff in a
>> table, you will read the entire table, most of the time. Here is a good
>> article, albeit not about Snowflake:
>>
>>
>> https://learn.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-design-guidance?view=sql-server-ver16#use-a-nonclustered-columnstore-index-for-real-time-analytics
>>
>> You might not need a columnstore index. Rowstore (or B-tree) tables with
>> heaps or clustered indexes perform best on queries that seek into the data,
>> searching for a particular value, or for queries on a small range of
>> values. Use rowstore indexes with transactional workloads since they tend
>> to require mostly table seeks instead of large range table scans.
>>
>> That's the difference between data warehouse and OLTP. Data warehouse is
>> usually used for computing trends and making predictions. OLTP is a
>> different animal altogether. Oracle, DB2 and SQL Server (aka "the big 3")
>> are jacks of all trades but are primarily geared toward OLTP. DW databases
>> like Snowflake or Vertica usually use full table scans to read and analyze
>> large quantities of data. That is why Oracle has created Exadata. Exadata
>> is a DW machine which speeds up full table scans. EMC Greenplum has similar
>> design, which is no wonder since both Exadata and Greenplum have the same
>> designer, who is the only retired Oracle Ace in existence.
>>
>> --
>> Mladen Gogala
>> Database Consultant
>> Tel: (347) 321-1217https://dbwhisperer.wordpress.com
>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Apr 19 2023 - 08:36:00 CEST

Original text of this message