Re: Suggestion on compression
Date: Thu, 12 Jan 2023 00:15:33 +0530
Message-ID: <CAEzWdqdK9hJQQZmhuGo_cKfu-0ECE2m3eFYPAKaEgXnThjRDKQ_at_mail.gmail.com>
Thank you so much for the guidance here.
Each of the points Jonathan mentioned is an eye-opener for me. We will
test these options. And I am still trying to digest the points Mark
posted, which look like a bigger design+code change.
However, regarding 'prefix compression', my understanding was that it
can't be applied to a partial/historical set of index partitions; rather,
it has to be applied at the global index level, i.e. across all the index
partitions. That is why I was thinking of the automatic/advanced index
compression strategy, which has syntax for enabling it on a specific set
of partitions only. But I will test this out; I may be wrong.
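
For example, this is the sort of syntax difference I meant (untested, and
all table/index/partition names below are just placeholders):

    -- advanced compression can target one historical index partition:
    alter index txn_ix1 rebuild partition p_20221101 compress advanced low;

    -- whereas prefix compression, as I understand it, is declared for the
    -- index as a whole (prefix length 2 = first two key columns):
    create index txn_ix2 on txn_tab (acct_id, business_dt, txn_type)
        local compress 2;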
Never tried "attribute clustering" before ,but what i read in docs is that
it will sort the table data on specific columns/attributes, so i can think
of why that will help in increasing efficiency/clustering factor of an
index based on those exact columns/order and also better compression ratio
etc. , but will it not degrade other
With regard to the option of setting "indexing off" for the historical
partitions after declaring a partial index: I was thinking that if any
one-off queries use one of the "most selective" indexes for a scan across
current plus historical partitions, they will take a hit. So, to be on
the safer side, I am planning to check current_obj# in
dba_hist_active_sess_history; any index partition that does not show up
there (i.e. is not being read/accessed) should be a good candidate for
this option. Please correct me if my approach is wrong.
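
This is the sort of check I had in mind (just a sketch; the index name is
a placeholder, and since ASH/AWR is sampled data, an index partition
missing from it is only a hint that it is unused, not proof):

    select o.object_name, o.subobject_name
    from   dba_objects o
    where  o.object_name = 'TXN_IX1'
    and    o.object_type = 'INDEX PARTITION'
    and    o.object_id not in (
               select h.current_obj#
               from   dba_hist_active_sess_history h
               where  h.current_obj# is not null
           );
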
On Tue, Jan 10, 2023 at 10:38 PM Clay Jackson <dmarc-noreply_at_freelists.org>
wrote:
> What JL and MWF said, ESPECIALLY:
>
> “Doing cyclical operational stuff like this strikes some folks as
> complicated. Designing it so that it is relatively simple and routine in
> execution and in particular is relatively simple for operations to always
> get right is important”
>
>
>
> It seems to me that the savings here would provide more than enough
> “incentive” to set up the cycles.
>
>
>
> Also don’t forget to TEST, instrument the processes so that they can be
> monitored, and have contingency plans for failures in place BEFORE they
> happen, e.g. plan for running out of space in the partitioning/unhook
> process.
>
>
>
> Clay Jackson
>
>
>
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
> Behalf Of *Mark W. Farnham
> *Sent:* Tuesday, January 10, 2023 8:54 AM
> *To:* jlewisoracle_at_gmail.com; 'Oracle L' <oracle-l_at_freelists.org>
> *Subject:* RE: Suggestion on compression
>
>
>
> Everything JL wrote, +1.
>
>
>
> Another approach entirely is to unhook each partition as it passes the
> commonly queried period (you might want to do that for 7 days even though
> you only need 5 to facilitate remaining sane on a weekly operational
> frequency.) It is possible that no partitioning at all and a daily cycle is
> practical (probably using the alter synonym trick at midnight), leaving you
> with both a daily and week old schedule (one for shifting where insert
> update delete is possible [daily], the other for changing from indexes
> optimal for recent data queries to 30 day queries).
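>
> The midnight shift could be as simple as repointing the synonym the
> application writes through (a sketch only, with placeholder names):
>
>     -- at midnight, swing inserts over to the new day's table
>     create or replace synonym txn_current for txn_011223;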
>
>
>
> The unhook would be to a cleverly and consistently named table, probably
> <thing>_MMDDYY or something similar. Then your monthly queries become
> union all queries of the seven partitions and the youngest 23 unhooked
> tables, which can run effectively parallel 30, with a serial scan on each
> of the 30 individual segments.
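>
> A sketch of the unhook and the monthly union all (names are placeholders):
>
>     -- 12.2+: create an exactly matching empty table, then swap it in
>     create table txn_011023 for exchange with table txn_tab;
>
>     alter table txn_tab exchange partition p_011023
>         with table txn_011023 without validation;
>
>     -- monthly query source: 7 partitions plus the unhooked daily tables
>     create or replace view txn_last_30 as
>     select * from txn_tab
>     union all select * from txn_011023
>     union all select * from txn_010923
>     -- ... one branch per remaining unhooked day
>     ;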
>
>
>
> A reason WHY you might want to do this is if fewer indexes are needed for
> the “monthly” queries and/or slightly different indexes that might be a
> drag on insert, update, and delete might make queries on the disconnected
> partitions that have become tables much more efficient.
>
>
>
> A reason WHY you might want to do this is if you don’t really need the
> detail for some (or all!) of your monthly queries, so creating permanent
> aggregations as new tables becomes part of the unhook process. IF ALL the
> monthly queries are aggregations, you might consider a schedule of less
> than 365 days for discarding detail tables from the online database.
> (Unloaded into something easy to reload, of course, in line with your
> overall information retention and purging schedule. You might need a
> detail row to prove something you are never routinely interested in
> reporting on.)
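>
> For instance (purely illustrative; the grouping columns are invented):
>
>     create table txn_agg_011023 as
>     select business_dt, acct_id,
>            count(*)    txn_count,
>            sum(amount) total_amount
>     from   txn_011023
>     group  by business_dt, acct_id;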
>
>
>
> A reason WHY you might NOT even think about this is if you have a
> non-period based uniqueness index across all time (which I doubt, since you
> write you are limited to 365-ish days).
>
> A reason WHY you might NOT want to pursue this is if ad hoc 30 day
> window queries that require different aggregations or indexes erupt
> frequently.
>
>
>
> There certainly may be additional pro or con reasons beyond my terse
> enumeration.
>
>
>
> Probably the general health and friendliness (to the CBO) of an object
> with 7 partitions is better than one with 365 partitions.
>
> Possibly the total online storage size required drops, because either
> you don’t need all the indexes, or you might need many fewer “partitions”
> on line, or you only need to keep aggregates older than 7 days on line.
> The “fan-in” of any possible aggregations has an enormous bearing on
> whether or not dramatic reductions in size are possible.
>
> Possibly old investigative queries on offline detail transactions can be
> done on a different hardware resource.
>
>
>
> Doing cyclical operational stuff like this strikes some folks as
> complicated. Designing it so that it is relatively simple and routine in
> execution and in particular is relatively simple for operations to always
> get right is important.
>
>
>
> Good luck,
>
>
>
> mwf
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [
> mailto:oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>] *On
> Behalf Of *Jonathan Lewis
> *Sent:* Tuesday, January 10, 2023 5:32 AM
> *To:* Oracle L
> *Subject:* Re: Suggestion on compression
>
>
>
>
>
> Do you have a decent-sized test system? It doesn't matter what the
> manuals say about how things work (and how well things work), you need to
> test for YOUR specific setup because you may have some extreme case or
> boundary condition that causes the generic implementation to break.
>
>
>
> Having said that (and using 45 days to give you a safety margin over the
> 30 days you say you need as "current"):
>
>
>
> a) Set MOST of your indexes to "indexing partial", then for (table)
> partitions older than 45 days set indexing off.
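>
> e.g. (a sketch; object and partition names are placeholders):
>
>     alter index txn_ix1 indexing partial;
>
>     alter table txn_tab modify partition p_20221126 indexing off;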
>
> b) I wouldn't set index compression to automatic; generally it's a
> declaration that you don't know your data, and if you're worried about the
> impact on this specific table then you should know the data very well. I
> would set specific indexes to fixed-size compression.
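>
> e.g. (placeholder names; the prefix length has to come from knowing the
> data, here compressing on the two leading key columns):
>
>     create index txn_ix2 on txn_tab (acct_id, business_dt, txn_type)
>         local compress 2;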
>
> c) Why "query high" when you've implied that you hardly ever read data
> older than 30 days, why not archive high? You may want to play a little
> safer and say (e.g. archive high over 180 days, query high from 45 to 180
> days). (Query high might be a little CPU intensive, so you might want to
> use query low rather than query high, but you'll have to do the
> cost/benefit analysis for your system.)
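>
> Something like this, per partition (a sketch; names are placeholders):
>
>     alter table txn_tab move partition p_20220601
>         compress for archive high update indexes;
>
>     alter table txn_tab move partition p_20221115
>         compress for query high update indexes;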
>
> d) At the point you move from non-compressed to compressed you could
> consider including an "attribute clustering" clause - this might improve
> the performance of some queries, and might improve the compression
> slightly.
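>
> e.g. (the clustering columns here are invented for illustration):
>
>     alter table txn_tab
>         add clustering by linear order (acct_id, txn_type)
>         yes on load yes on data movement;
>
>     -- partition moves can then re-sort the rows as they compress
>     alter table txn_tab move partition p_20221115
>         compress for query high update indexes;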
>
> e) Monitor dba_tab_modifications for the table (and its partitions) so
> that you can see how many changes are reported for compressed partitions -
> each change is likely to lead to a chained row, so you need to come up with
> an algorithm for choosing to rebuild any partition that has had "too many"
> changes.
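>
> e.g. (owner and table names are placeholders):
>
>     exec dbms_stats.flush_database_monitoring_info
>
>     select partition_name, inserts, updates, deletes, timestamp
>     from   dba_tab_modifications
>     where  table_owner = 'APP_OWNER'
>     and    table_name  = 'TXN_TAB'
>     and    partition_name is not null;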
>
>
>
> Regards
>
> Jonathan Lewis
>
>
>
>
>
>
>
> On Wed, 4 Jan 2023 at 07:01, yudhi s <learnerdatabase99_at_gmail.com> wrote:
>
> Hello Experts, Happy new year to all.
>
>
> Consider Oracle database version 19c (19.15 to be specific, on an
> Exadata X8 machine). The majority of the transaction tables are daily
> range partitioned on a business date column holding truncated date values
> (without any time component). These tables hold ~1 year+ worth of data,
> i.e. ~365+ partitions.
>
> The DMLs/writes only happen on the live partition, i.e. today's data
> (say the Nth partition).
>
> The querying or reading happens mainly on the last ~5 days of data, the
> latest N-5 partitions. However, during the month-end process (which runs
> once a month), it reads/scans a month's worth of data, i.e. all the
> partitions > sysdate-30, or the latest N-30 partitions. The rest of the
> data/partitions are read/written on an ad hoc basis with minimal
> frequency.
>
> Now we are seeing a space crunch on the box, and also considering future
> growth, we have been asked to compress all the possible table/index
> partitions. My question is: in the above scenario, would the following be
> a good approach?
>
> 1) To just HCC-compress all the historical table partitions, i.e. all
> the partitions with partition position < N-30, using the 'compress for
> query high' option.
> 2) And to compress all < N-30 local index partitions with the 'compress
> advanced low' option, as ADVANCED index compression lets Oracle choose
> the best index prefix key at the local partition and leaf block level
> automatically, rather than us having to worry about that.
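>
> In other words, something like this for each historical partition
> (illustrative names only):
>
>     alter table txn_tab move partition p_20221201
>         compress for query high update indexes;
>
>     alter index txn_ix1 rebuild partition p_20221201
>         compress advanced low;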
>
> Or would some other strategy be better in the above situation for these
> transaction tables?
>
> As I understand it, writing will not be a concern as we won't be
> touching the live partitions; however, where reading (index scans, cell
> smart scans etc.) happens on the compressed table/index data, it can
> change the current performance figures. Is my understanding correct
> here?
>
>
>
> Regards
>
> Yudhi
>
>
--
http://www.freelists.org/webpage/oracle-l

Received on Wed Jan 11 2023 - 19:45:33 CET