Re: Improving query performance further
Date: Tue, 18 Oct 2022 14:15:11 +0530
Message-ID: <CAEzWdqfPPCfWyQw7aaU3cK9h0=Tjcg+h-fuxvDS6-WK22ft3AA_at_mail.gmail.com>
Thank you Jonathan.
Yes, if you are referring to domain indexes, then Andy has also pointed to the same thing earlier in this thread. But I don't have much exposure to domain indexes, and I am still not able to figure out how to frame the filter predicates here as a domain index to help this query.
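For what it's worth, one way the range predicate might be framed for a spatial (domain) index is to treat each (min_val, max_val) pair as a 2-D point, so that "min_val <= :b AND max_val >= :b" becomes a rectangular window query. This is only a sketch under assumptions: the table name part_tab, the conversion of the varchar2 columns to numbers, and the 1e15 upper bound are all invented for illustration.

```sql
-- Hypothetical: store each (min_val, max_val) pair as a 2-D point.
ALTER TABLE part_tab ADD (range_pt SDO_GEOMETRY);

UPDATE part_tab
   SET range_pt = SDO_GEOMETRY(2001, NULL,
         SDO_POINT_TYPE(TO_NUMBER(min_val), TO_NUMBER(max_val), NULL),
         NULL, NULL);

-- Register the dimensions before creating the spatial index
-- (0 .. 1e15 is an assumed upper bound, 0.5 an assumed tolerance).
INSERT INTO user_sdo_geom_metadata (table_name, column_name, diminfo, srid)
VALUES ('PART_TAB', 'RANGE_PT',
        SDO_DIM_ARRAY(
          SDO_DIM_ELEMENT('MIN_VAL', 0, 1e15, 0.5),
          SDO_DIM_ELEMENT('MAX_VAL', 0, 1e15, 0.5)),
        NULL);

CREATE INDEX part_tab_sx ON part_tab (range_pt)
  INDEXTYPE IS MDSYS.SPATIAL_INDEX;

-- "min_val <= :b AND max_val >= :b" is the set of points left of :b
-- on the x-axis and above :b on the y-axis, i.e. an optimized
-- rectangle from (0, :b) to (:b, 1e15).
SELECT *
  FROM part_tab t
 WHERE SDO_ANYINTERACT(t.range_pt,
         SDO_GEOMETRY(2003, NULL, NULL,
           SDO_ELEM_INFO_ARRAY(1, 1003, 3),
           SDO_ORDINATE_ARRAY(0, :b, :b, 1e15))) = 'TRUE';
```

Whether this beats the B-tree approach would depend on the data, and the thread notes that earlier spatial modelling attempts were not faster.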
On Mon, 17 Oct, 2022, 3:38 pm Jonathan Lewis, <jlewisoracle_at_gmail.com> wrote:
>
> I should have predicted that problem with bitmap AND before you posted the
> result. Given the pattern of the data the indexes were going to be roughly
> the same size and the query would have to access a total number of index
> blocks that was roughly the same as the size of one complete index - and do
> a big CPU intensive merge. I forgot to allow for the large number of
> block accesses and was only thinking about the CPU cost of the merge.
>
> Essentially it's very similar to optimizing "see if there's a town
> anywhere south and east of here" - but I think you said you'd already
> tried modelling it through spatial techniques and couldn't get it to run
> any faster.
>
> Regards
> Jonathan Lewis
>
>
>
> On Sun, 16 Oct 2022 at 20:14, yudhi s <learnerdatabase99_at_gmail.com> wrote:
>
>> Yes, it's a truncate-and-load table with no updates or deletes happening
>> on it. We tried it with a sample table and a similar data pattern in a
>> lower environment. The bitmap indexes with pct_free 0 performed worse
>> than the composite B-tree index on (part_col, min_val, max_val). We tried
>> it both with and without the index_join hint.
>>
>> INDEX_TYPE  LEAF_BLOCKS  DISTINCT_KEYS  CLUSTERING_FACTOR  NUM_ROWS
>> NORMAL             1116         150001               4929    150001
>> BITMAP              754         126350             149999    149999
>> BITMAP              754         126352             150001    150001
>>
>> Below is the SQL Monitor comparison of the two plans.
>>
>> https://gist.github.com/databasetech0073/b576a301943ecf6f04bc2ab3249e316a
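For reference, the composite B-tree index used in the comparison above might look like the following sketch (the table name, LOCAL partitioning, and COMPRESS level are assumptions; L30-style compression of the leading partition column is one plausible choice for a truncate-and-load table):

```sql
-- Sketch of the composite B-tree index from the test above.
-- part_tab is an assumed table name; COMPRESS 1 compresses the
-- repeating leading partition-key column, PCTFREE 0 suits a
-- read-only, truncate-and-load table.
CREATE INDEX part_tab_ix ON part_tab (part_col, min_val, max_val)
  LOCAL COMPRESS 1 PCTFREE 0;
```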
>>
>>
>>
>> On Sun, Oct 16, 2022 at 6:08 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
>> wrote:
>>
>>>
>>> Another idea that occurs to me following on from your comment about the
>>> daily truncate and reload.
>>>
>>> Do you update the min_val or max_val columns of the table once you've
>>> loaded the data, or do you insert or delete rows? (Maybe you've answered
>>> the last question already, but I haven't gone back to re-read all the
>>> posts.)
>>>
>>> If the key components of the data are effectively read-only as soon as
>>> you've loaded them then you could create two (local) bitmap indexes, one
>>> on (min_val) and one on (max_val). Since you average about 2 rows per
>>> value the size of the indexes would be about the same whether they're
>>> B-tree (compressed) or bitmap, but the bitmap might be a little smaller.
>>>
>>> With those indexes in place the reduction in block visits for range
>>> scans might be more than enough to offset the CPU time needed to handle the
>>> bitmap merges and OR's.
>>>
>>> (The indexes should also be built with pctfree zero if they don't change
>>> during the day.)
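The two local bitmap indexes described above might be sketched as follows (the table and index names are assumptions; bitmap indexes are only reasonable here because the table is truncate-and-load with no DML during the day):

```sql
-- Two single-column local bitmap indexes, built tightly packed
-- (PCTFREE 0) since they never change after the daily load.
CREATE BITMAP INDEX part_tab_min_bx ON part_tab (min_val)
  LOCAL PCTFREE 0;

CREATE BITMAP INDEX part_tab_max_bx ON part_tab (max_val)
  LOCAL PCTFREE 0;
```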
>>>
>>> Regards
>>> Jonathan Lewis
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sun, 16 Oct 2022 at 10:41, yudhi s <learnerdatabase99_at_gmail.com>
>>> wrote:
>>>
>>>> Thank you so much for the suggestions.
>>>> Not sure how it will impact histograms, but yes, as Lok mentioned, all
>>>> the data stored in the min_val and max_val columns appears to be numbers
>>>> only, and the bind values I saw were also not holding any characters.
>>>> However, the input bind coming from the application to the query is of
>>>> datatype varchar2, and the columns are also defined as varchar2 in the
>>>> table. I also checked the captured input bind values against the data
>>>> pattern: almost 60% of the input binds are closer to the min value and
>>>> ~40% are closer to the max value, but most of the binds still result in
>>>> >50K rows from either of those filters.
>>>>
>>>> And if I understand correctly, individual histograms on these two
>>>> columns should suffice (e.g. 'for all columns size repeat, for columns
>>>> size 254/2048 min_val, max_val') rather than a column-group histogram.
>>>> With regard to caching, as the majority of the IO/time here is spent
>>>> while accessing the index partition, putting the table in the cache may
>>>> not help much, but caching the index should. Please correct me if my
>>>> understanding is wrong. I was also looking for a quick fix, hinting the
>>>> query in some way to cache the objects, but it looks like cache hints
>>>> are available for tables and not for indexes or index partitions; the
>>>> index would have to be put in the "keep" cache, which is not possible
>>>> through hints.
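As a sketch, the stats call for that method_opt might look like the following (the table name is an assumption; note that in 19c histograms larger than 254 buckets are only built when ESTIMATE_PERCENT is left at the default AUTO_SAMPLE_SIZE):

```sql
-- Keep existing histogram settings for all columns, but request
-- large (2048-bucket) histograms on the two range columns.
-- part_tab is an assumed table name.
BEGIN
  dbms_stats.gather_table_stats(
    ownname    => user,
    tabname    => 'PART_TAB',
    method_opt => 'FOR ALL COLUMNS SIZE REPEAT ' ||
                  'FOR COLUMNS SIZE 2048 MIN_VAL MAX_VAL'
  );
END;
/
```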
>>>>
>>>> On Sat, Oct 15, 2022 at 2:04 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>> I did point out that the strategy would be fairly fragile, but there
>>>>> are a couple of points in its favour if the optimizer decides to use
>>>>> adaptive cursor sharing.
>>>>>
>>>>> a) This is 19c so the histogram could have 2,048 buckets (though the
>>>>> default of 254 might be sufficient anyway)
>>>>>
>>>>> b) When the low_value and high_value both look like numbers and the
>>>>> input is numeric, the optimizer behaves as if it's dealing with
>>>>> numbers, and that will help with the arithmetic. I've got a note about
>>>>> that on my blog, I think, but can't find it at present.
>>>>>
>>>>> c) Histograms model skew - and in this case the skew is in the size of
>>>>> range for a given number of rows; histograms don't have to be about
>>>>> capturing "rows per value" (see:
>>>>> https://jonathanlewis.wordpress.com/2016/09/26/pk-histogram/ ). In
>>>>> this case I think the OP has indicated that the data is more dense round
>>>>> the extreme values and sparse in the middle.
>>>>>
>>>>>
>>>>> Regards
>>>>> Jonathan Lewis
>>>>>
>>>>>
>>>>> On Fri, 14 Oct 2022 at 22:33, Lok P <loknath.73_at_gmail.com> wrote:
>>>>>
>>>>>> Also Jonathan, as the OP mentioned, there are ~120K distinct values
>>>>>> in both the min and max columns, and on top of that they are varchar2
>>>>>> columns holding numbers. So histograms on each of those, with a
>>>>>> maximum of ~254 buckets, may not give 100% correct estimates for
>>>>>> these range predicates for all the binds, and may thus end up
>>>>>> impacting the query response.
>>>>>>
>>>>>>
>>>>>>
--
http://www.freelists.org/webpage/oracle-l

Received on Tue Oct 18 2022 - 10:45:11 CEST