Re: Improving query performance further

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Sun, 16 Oct 2022 13:38:08 +0100
Message-ID: <CAGtsp8kUreNRhMazcf-e1YUVoo_HiPp7y4zBhpOifCi9pDc1kA_at_mail.gmail.com>



Another idea that occurs to me following on from your comment about the daily truncate and reload.

Do you update the min_val or max_val columns of the table once you've loaded the data, do you insert or delete rows. (Maybe you've answer the last question already, but I haven't gone back to re-read all the posts).

If the key components of the data are effectively read-only as soon as you've loaded it then you could create two (local) bitmap indexes, one on (min_val) one on (max_val). Since you average about 2 rows per value the size of the indexes would be about the same whether they're B-tree (compreessed) or bitmap, but the bitmap might be a little smaller.

With those indexes in place the reduction in block visits for range scans might be more than enough to offset the CPU time needed to handle the bitmap merges and OR's.

(The indexes should also be built with pctfree zero if they don't change during the day),

Regards
Jonathan Lewis

On Sun, 16 Oct 2022 at 10:41, yudhi s <learnerdatabase99_at_gmail.com> wrote:

> Thank you so much for the suggestions.
> Not sure how it will impact histograms but yes, as Lok mentioned, all the
> data stored in this column min_val and max_val appears to be Number only
> and the bind values i saw were also not holding any characters in them, but
> the input bind coming from application to the query , is of data type
> varchar2 and also the datatype of the columns are defined varchar2 in the
> table. Also, checked the input bind values captured and the data pattern,
> almost 60% of the input binds are closer to min value and ~40% are closer
> to max value but again most of the binds are still resulting >50K rows in
> either of those filters.
>
> And if i get it correct, individual histograms on these two columns should
> suffice (e.g. 'for all columns size repeat, for columns size 254/2048
> min_val, max_val') rather than a column group histogram for the optimizer.
> With regards to caching , as here the majority of io/time is spent only
> while accessing the index partition only, so putting the table in cache may
> not help much but index should. Please correct me if my understanding is
> wrong. Also I was looking for a quick fix as to hint it some way to cache
> the objects , but it looks like we have cache hints available for tables
> but not for Indexes or index partitions. It has to be put in "keep" cache
> which is not possible through any hints etc.
>
> On Sat, Oct 15, 2022 at 2:04 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>>
>> I did point out that the strategy would be fairly fragile, but there are
>> a couple of points in its favour if the optimizer decides to use adaptive
>> cursor sharing.
>>
>> a) This is 19c so the histogram could have 2,048 buckets (though the
>> default of 254 might be sufficient anyway)
>>
>> b) When the low_value and high_value both look like numbers and the input
>> is numeric the optimizer behaves as it's dealing with numbers and that will
>> help with the arithmetic. I've got a note about that on my blog, I think,
>> but can't find it at present.
>>
>> c) Histograms model skew - and in this case the skew is in the size of
>> range for a given number of rows; histograms don't have to be about
>> capturing "rows per value" (see:
>> https://jonathanlewis.wordpress.com/2016/09/26/pk-histogram/ ). In this
>> case I think the OP has indicated that the data is more dense round the
>> extreme values and sparse in the middle.
>>
>>
>> Regards
>> Jonathan Lewis
>>
>>
>> On Fri, 14 Oct 2022 at 22:33, Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Also Jonathan, as the OP mentioned there exists ~120K distinct values
>>> in both the min and max columns and that to those are varchar2 type
>>> holding numbers. So the histograms on each of those with max ~254 buckets
>>> may not be able to give 100% correct estimation for these ranges prdicates
>>> for all the binds and thus endup impacting the query response.
>>>
>>>
>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Oct 16 2022 - 14:38:08 CEST

Original text of this message