Re: Suggestion on compression
Date: Thu, 12 Jan 2023 00:15:33 +0530
Message-ID: <CAEzWdqdK9hJQQZmhuGo_cKfu-0ECE2m3eFYPAKaEgXnThjRDKQ_at_mail.gmail.com>
Thank you so much for the guidance here.
Each of the points Jonathan mentioned is an eye-opener for me. We will
test these options. And I am still trying to digest the points Mark
posted, which look like a bigger design+code change.
However, regarding 'prefix compression', my understanding was that it
can't be applied to a partial/historical set of index partitions; rather,
it has to be applied at the global index level, i.e. across all the index
partitions. That is why I was thinking of the automatic/advanced index
compression strategy, which has syntax for enabling it on a specific set
of partitions only. But I will test this out; I may be wrong.
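
For example, this is the sort of syntax difference I meant (untested, and
all table/index/partition names below are just placeholders):

    -- advanced compression can target one historical index partition:
    alter index txn_ix1 rebuild partition p_20221101 compress advanced low;

    -- whereas prefix compression, as I understand it, is declared for the
    -- index as a whole (prefix length 2 = first two key columns):
    create index txn_ix2 on txn_tab (acct_id, business_dt, txn_type)
        local compress 2;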
Never tried "attribute clustering" before ,but what i read in docs is that
it will sort the table data on specific columns/attributes, so i can think
of why that will help in increasing efficiency/clustering factor of an
index based on those exact columns/order and also better compression ratio
etc. , but will it not degrade other
With regard to the option of setting "indexing off" for the historical
partitions after declaring a partial index: I was thinking that if any
one-off queries use one of the "most selective" indexes for a scan across
current plus historical partitions, they will take a hit. So, to be on
the safer side, I am planning to check current_obj# in
dba_hist_active_sess_history; any index partition that does not show up
there (i.e. is not being read/accessed) should be a good candidate for
this option. Please correct me if my approach is wrong.
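
This is the sort of check I had in mind (just a sketch; the index name is
a placeholder, and since ASH/AWR is sampled data, an index partition
missing from it is only a hint that it is unused, not proof):

    select o.object_name, o.subobject_name
    from   dba_objects o
    where  o.object_name = 'TXN_IX1'
    and    o.object_type = 'INDEX PARTITION'
    and    o.object_id not in (
               select h.current_obj#
               from   dba_hist_active_sess_history h
               where  h.current_obj# is not null
           );
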
On Tue, Jan 10, 2023 at 10:38 PM Clay Jackson <dmarc-noreply_at_freelists.org>
wrote:
> What JL and MWF said, ESPECIALLY:
>
> “Doing cyclical operational stuff like this strikes some folks as
> complicated. Designing it so that it is relatively simple and routine in
> execution and in particular is relatively simple for operations to always
> get right is important”
>
>
>
> It seems to me that the savings here would provide more than enough
> “incentive” to set up the cycles.
>
>
>
> Also don’t forget to TEST, instrument the processes so that they can be
> monitored, and have contingency plans for failures in place BEFORE they
> happen, e.g. plan for running out of space in the partitioning/unhook
> process.
>
>
>
> Clay Jackson
>
>
>
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
> Behalf Of *Mark W. Farnham
> *Sent:* Tuesday, January 10, 2023 8:54 AM
> *To:* jlewisoracle_at_gmail.com; 'Oracle L' <oracle-l_at_freelists.org>
> *Subject:* RE: Suggestion on compression
>
>
>
> Everything JL wrote, +1.
>
>
>
> Another approach entirely is to unhook each partition as it passes the
> commonly queried period (you might want to do that for 7 days even though
> you only need 5 to facilitate remaining sane on a weekly operational
> frequency.) It is possible that no partitioning at all and a daily cycle is
> practical (probably using the alter synonym trick at midnight), leaving you
> with both a daily and week old schedule (one for shifting where insert
> update delete is possible [daily], the other for changing from indexes
> optimal for recent data queries to 30 day queries).
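>
> The midnight shift could be as simple as repointing the synonym the
> application writes through (a sketch only, with placeholder names):
>
>     -- at midnight, swing inserts over to the new day's table
>     create or replace synonym txn_current for txn_011223;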
>
>
>
> The unhook would be to a cleverly and consistently named table, probably
> <thing>_MMDDYY or something similar. Then your monthly queries become
> union all queries of the seven partitions and the youngest 23 unhooked
> tables, which can run effectively parallel 30, with a serial scan on each
> of the 30 individual segments.
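>
> A sketch of the unhook and the monthly union all (names are placeholders):
>
>     -- 12.2+: create an exactly matching empty table, then swap it in
>     create table txn_011023 for exchange with table txn_tab;
>
>     alter table txn_tab exchange partition p_011023
>         with table txn_011023 without validation;
>
>     -- monthly query source: 7 partitions plus the unhooked daily tables
>     create or replace view txn_last_30 as
>     select * from txn_tab
>     union all select * from txn_011023
>     union all select * from txn_010923
>     -- ... one branch per remaining unhooked day
>     ;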
>
>
>
> A reason WHY you might want to do this is if fewer indexes are needed for
> the “monthly” queries and/or slightly different indexes that might be a
> drag on insert, update, and delete might make queries on the disconnected
> partitions that have become tables much more efficient.
>
>
>
> A reason WHY you might want to do this is if you don’t really need the
> detail for some (or all!) of your monthly queries, so creating permanent
> aggregations as new tables becomes part of the unhook process. IF ALL the
> monthly queries are aggregations, you might consider a schedule of less
> than 365 days for discarding detail tables from the online database.
> (Unloaded into something easy to reload, of course, in line with your
> overall information retention and purging schedule. You might need a
> detail row to prove something you are never routinely interested in
> reporting on.)
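>
> For instance (purely illustrative; the grouping columns are invented):
>
>     create table txn_agg_011023 as
>     select business_dt, acct_id,
>            count(*)    txn_count,
>            sum(amount) total_amount
>     from   txn_011023
>     group  by business_dt, acct_id;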
>
>
>
> A reason WHY you might NOT even think about this is if you have a
> non-period based uniqueness index across all time (which I doubt, since you
> write you are limited to 365-ish days).
>
> A reason WHY you might NOT want to pursue this is if ad hoc 30 day
> window queries that require different aggregations or indexes erupt
> frequently.
>
>
>
> There certainly may be additional pro or con reasons beyond my terse
> enumeration.
>
>
>
> Probably the general health and friendliness (to the CBO) of an object
> with 7 partitions is better than one with 365 partitions.
>
> Possibly the total online storage size required drops, because either
> you don’t need all the indexes, or you might need many fewer “partitions”
> on line, or you only need to keep aggregates older than 7 days on line.
> The “fan-in” of any possible aggregations has an enormous bearing on
> whether or not dramatic reductions in size are possible.
>
> Possibly old investigative queries on offline detail transactions can be
> done on a different hardware resource.
>
>
>
> Doing cyclical operational stuff like this strikes some folks as
> complicated. Designing it so that it is relatively simple and routine in
> execution and in particular is relatively simple for operations to always
> get right is important.
>
>
>
> Good luck,
>
>
>
> mwf
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [
> mailto:oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>] *On
> Behalf Of *Jonathan Lewis
> *Sent:* Tuesday, January 10, 2023 5:32 AM
> *To:* Oracle L
> *Subject:* Re: Suggestion on compression
>
>
>
>
>
> Do you have a decent-sized test system? It doesn't matter what the
> manuals say about how things work (and how well things work), you need to
> test for YOUR specific setup because you may have some extreme case or
> boundary condition that causes the generic implementation to break.
>
>
>
> Having said that (and using 45 days to give you a safety margin over the
> 30 days you say you need as "current"):
>
>
>
> a) Set MOST of your indexes to "indexing partial", then for (table)
> partitions older than 45 days set indexing off.
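>
> e.g. (a sketch; object and partition names are placeholders):
>
>     alter index txn_ix1 indexing partial;
>
>     alter table txn_tab modify partition p_20221126 indexing off;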
>
> b) I wouldn't set index compression to automatic; generally it's a
> declaration that you don't know your data, and if you're worried about the
> impact on this specific table then you should know the data very well. I
> would set specific indexes to fixed-size compression.
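>
> e.g. (placeholder names; the prefix length has to come from knowing the
> data, here compressing on the two leading key columns):
>
>     create index txn_ix2 on txn_tab (acct_id, business_dt, txn_type)
>         local compress 2;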
>
> c) Why "query high" when you've implied that you hardly ever read data
> older than 30 days, why not archive high? You may want to play a little
> safer and say (e.g. archive high over 180 days, query high from 45 to 180
> days). (Query high might be a little CPU intensive, so you might want to
> use query low rather than query high, but you'll have to do the
> cost/benefit analysis for your system.)
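>
> Something like this, per partition (a sketch; names are placeholders):
>
>     alter table txn_tab move partition p_20220601
>         compress for archive high update indexes;
>
>     alter table txn_tab move partition p_20221115
>         compress for query high update indexes;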
>
> d) At the point you move from non-compressed to compressed you could
> consider including an "attribute clustering" clause - this might improve
> the performance of some queries, and might improve the compression
> slightly.
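>
> e.g. (the clustering columns here are invented for illustration):
>
>     alter table txn_tab
>         add clustering by linear order (acct_id, txn_type)
>         yes on load yes on data movement;
>
>     -- partition moves can then re-sort the rows as they compress
>     alter table txn_tab move partition p_20221115
>         compress for query high update indexes;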
>
> e) Monitor dba_tab_modifications for the table (and its partitions) so
> that you can see how many changes are reported for compressed partitions -
> each change is likely to lead to a chained row, so you need to come up with
> an algorithm for choosing to rebuild any partition that has had "too many"
> changes.
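>
> e.g. (owner and table names are placeholders):
>
>     exec dbms_stats.flush_database_monitoring_info
>
>     select partition_name, inserts, updates, deletes, timestamp
>     from   dba_tab_modifications
>     where  table_owner = 'APP_OWNER'
>     and    table_name  = 'TXN_TAB'
>     and    partition_name is not null;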
>
>
>
> Regards
>
> Jonathan Lewis
>
>
>
>
>
>
>
> On Wed, 4 Jan 2023 at 07:01, yudhi s <learnerdatabase99_at_gmail.com> wrote:
>
> Hello Experts, Happy new year to all.
>
>
> Consider Oracle database version 19c (19.15 to be specific, on an
> Exadata X8 machine). The majority of the transaction tables are daily
> range partitioned on a business date column holding truncated date values
> (without any time component). These tables hold ~1 year+ worth of data,
> i.e. ~365+ partitions.
>
> The DMLs/writes only happen on the live partition, i.e. today's data
> (say the Nth partition).
>
> The querying or reading happens mainly on the last ~5 days of data, the
> latest N-5 partitions. However, during the month-end process (which runs
> once a month), it reads/scans a month's worth of data, i.e. all the
> partitions > sysdate-30, or the latest N-30 partitions. The rest of the
> data/partitions are read/written on an ad hoc basis with minimal
> frequency.
>
> Now we are seeing a space crunch on the box, and also considering future
> growth, we have been asked to compress all the possible table/index
> partitions. My question is: in the above scenario, would the following be
> a good approach?
>
> 1) To just HCC-compress all the historical table partitions, i.e. all
> the partitions with partition position < N-30, using the 'compress for
> query high' option.
> 2) And to compress all < N-30 local index partitions with the 'compress
> advanced low' option, as ADVANCED index compression lets Oracle choose
> the best index prefix key at the local partition and leaf block level
> automatically, rather than us having to worry about that.
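>
> In other words, something like this for each historical partition
> (illustrative names only):
>
>     alter table txn_tab move partition p_20221201
>         compress for query high update indexes;
>
>     alter index txn_ix1 rebuild partition p_20221201
>         compress advanced low;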
>
> Or would some other strategy be better in the above situation for these
> transaction tables?
>
> As I understand it, writing will not be a concern as we won't be
> touching the live partitions; however, where reading (index scans, cell
> smart scans etc.) happens on the compressed table/index data, it can
> change the current performance figures. Is my understanding correct
> here?
>
>
>
> Regards
>
> Yudhi
>
>
--
http://www.freelists.org/webpage/oracle-l

Received on Wed Jan 11 2023 - 19:45:33 CET