Re: Question on IO consideration
Date: Wed, 22 Sep 2021 14:08:16 +0530
Message-ID: <CAKna9VYSUB8OnJ_m-L+gu-6T9guz10Bw41T_XwuFnqnNPQ54cA_at_mail.gmail.com>
Thank you Lothar.
We do have big partition tables having historical partitions compressed
with 'query high'. So as you mentioned, ideally they should all be by
default stored as in columnar format and it should not be required to scan
blocks in terms of row store. So then why am I not seeing the statistics ''
'cell physical IO bytes saved by columnar cache'. or is it DB or cell
server version dependent? We have some databases still on 11.2.0.4
version, not sure if that can be a factor.
Can you please shed some more light on what exact statistics can be queried from dba_hist_active_sess_history for sql_ids to link those, if any significant IO can be reduced using such technique?
On Wed, Sep 22, 2021 at 1:18 PM Lothar Flatz <l.flatz_at_bluewin.ch> wrote:
> In Addition once you use hybrid columnar compression (HCC) the data is
> stored in columnar format. (Am I mistaken? I am irritated by the fact that
> nobody mentioned that simple fact.)
> In an ETL Job there are phases where data is checked for valid keys and
> additional data is added (e.g. from dimension tables) to simplify queering.
> In such phases columnar storage is most likely to help. You often use only
> a few columns of a table. You can consider using HCC on selected tables.
> Columnar storage is likely to help less when you are doing aggregation of
> some sort. (Although nothing is sure unless you check it.)
> Some queries on DBA_HIST_ACTIVE_SESS_HISTORY should tell you were the time
> is spend.
>
> Regards
>
> Lothar
>
> Am 21.09.2021 um 23:41 schrieb Tanel Poder:
>
> Exadata storage cells (starting from cellsrv v12.1.2.1.0 / Jan 2015) can
> use fully columnar flash cache for greatly speeding up reads (the cache is
> fully columnar, not hybrid like the datafile storage).
>
> You can look into slides 17-20 in this presentation (from 2015), some
> things may have changed/improved by now:
>
>
> https://tanelpoder.com/2015/03/24/oracle-exadata-performance-latest-improvements-and-less-known-features/
>
> This would speed up queries doing lots of scanning *if* your current SQL
> performance bottleneck is about reading too many datafile blocks (and not
> somewhere else, like having a large fact-fact hash join spill to temp).
> This columnar flash caching won't speed up writes & large data loads.
>
> So, if the goal is to speed up your ETL processing, you should first
> measure (if you haven't done so already) where the response time of your
> ETL jobs is spent... and see if it's in *smart scan data retrieval*,
> where the storage cells can't somehow keep up with the data ingest demand
> of the DB layer and they waif for disk reads a lot (but I'd say it's
> unlikely... depending on how your flash cache allowance is set up. You
> might already be benefitting from the columnar flash cache, the slides have
> metrics that show how much).
>
> Otherwise, just to reduce I/O (which kind of I/O - data retrieval for
> scans or data processing for GROUP BY/HASH JOINs?), you could review your
> top SQLs' execution plans and see if a better partitionings,
> subpartitioning (and maybe even Exadata-aware indexing) scheme would help
> to allow doing less I/O. I would look into fancier things like attribute
> clustering (and zone maps) after done with the basics first.
>
> --
> Tanel Poder
> https://tanelpoder.com
>
>
>
> On Tue, Sep 21, 2021 at 2:48 PM Lok P <loknath.73_at_gmail.com> wrote:
>
>> Hello Listers, We have oracle exadata databases which are performing
>> mostly warehousing or batch type data processing. And few are hybrid i.e
>> combination of both oltp+warehousing/analytics types processing. ETL jobs
>> run on a few of these which move/read/write billions rows daily.
>> The databases are 50TB to ~150TB in size currently. Few architecture
>> team members suggested evaluating if we can use columnar database type of
>> offering for IO reduction and thus better performance considering the
>> future growth. As per my understanding , Oracle stores data in row format
>> only, so is there any other offering from Oracle for columnar datastore
>> format or columnar databases and we should evaluate that? Or is there any
>> clustering kind of technique which can be evaluated which will help reduce
>> IO? Want to understand experts' views here on this.
>>
>> Regards
>> Lok
>>
>
>
-- http://www.freelists.org/webpage/oracle-lReceived on Wed Sep 22 2021 - 10:38:16 CEST