Re: Nested loop cost looks too high on 19c

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Tue, 31 May 2022 20:23:17 +0100
Message-ID: <CAGtsp8=xxm24uV6KJVDnZt8GJRQGhyR1ORXL4gPPss5E9rzuBw_at_mail.gmail.com>



You are correct - that global num_distinct is driving the difference in the arithmetic

Your next step would be to check the sample_size reported by the two systems. One may have been using a small sample while the other was using the approximate_ndv with 100% sample. Then check if you're using incremental statistics (and how it's configured) in both systems. If you're using incremental in 19c but not in 11g then the num_distinct in 11g would almost inevitably be larger than that in 19c which would be using synopses to get a better global
num_distinct. Then check rows and num_distinct partition by partition.

Regards
Jonathan Lewis

On Tue, 31 May 2022 at 19:37, Pap <oracle.developer35_at_gmail.com> wrote:

> Thank You Jonathan.
>
> The line no-3 in the plan is the join between table TAB_PART and TSFS and
> the estimation
>
> on 11.2 is coming as 3(cardinality of TAB_PART)*154(cardinality of TSFS)
> =462 which is close to ~455
>
> however on 19c its 176(cardinality of TAB_PART)*209(cardinality of TSFS)=
> 36784 which is close to ~36692
>
> So here the key factor which is influencing the figure drastically is
> cardinality estimation of TAB_PART i.e 3 in 11.2 vs 176 in 19C and which is
> determined by the predicate "TAB_PART.TX_ID=TO_NUMBER(:B2)". Here TAB_PART
> is a range partitioned table and on this table stats is being gathered on
> individual partitions which is getting rolled up by oracle itself. We don't
> use incremental stats collection yet on this table. That is why the
> global_stats column is showing as 'NO'.
>
> Now comparing the table and column statistics of this between 11.2 and 19c
> databases below, the cardinality = density*num_rows, so here the difference
> in num_distinct/density for this column at the global level is playing a
> key role. I am sure the data pattern of column TX_ID is closely the same in
> both , so I was wondering why there is such a big difference in calculation
> of num_distinct/density by stats gathering process in both the versions? Or
> is there really a difference in how the num_distinct/density gets rolled up
> to global level from individual partitions in these two versions and thus
> moving to 19C with this may impact other queries too?
>
> *ON 11.2:- *
>
> *from dba_tables:- *
>
>
> *TABLE_NAME NUM_ROWS BLOCKS*
>
> TAB_PART 68889040371 1132560197
>
> *from dba_tab_col_statistics:- *
>
>
> *TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY
> NUM_NULLS HISTOGRAM GLOBAL_STATS DENSITY*68889040371*
>
> TAB_PART TX_ID 23151767844 4.3193245834968E-11 0
> NONE NO 2.97554125607964
>
> *ON 19C:- *
>
> *from dba_tables:- *
>
> *TABLE_NAME NUM_ROWS BLOCKS*
>
> TAB_PART 64791566202 1055609782
>
> *from dba_tab_col_statistics:-*
>
>
> *TABLE_NAME COLUMN_NAME NUM_DISTINCT DENSITY
> NUM_NULLS HISTOGRAM GLOBAL_STATS DENSITY*64791566202*
>
> TAB_PART TX_ID 368368793 2.71467078374362E-9 0
> NONE NO 175.88777180156
>
>
> On Tue, May 31, 2022 at 10:23 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>> In the 11g plan:
>> Operation 2 has a cost of 768K because operation 3 is telling it that it
>> will call operation 10 455 times at a cost of 1683 each time.
>> 455 * 1683 = 765,765, then add a bit for the line itself using CPU.
>>
>> In the 19c hinted plan
>> Operation 2 has a cost of t1M because operation 3 is telling it that it
>> will call operation 10 36693 times at a cost of 1677 each time.
>> 36693 * 1677 = 61,534,161
>>
>> Thanks to self-caching and other details the multiplication isn't always
>> this close - but it's a good guideline in many cases.
>>
>> Your first step should be to work out why the cardinality estimate of
>> line 3 is so much higher.
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>> On Mon, 30 May 2022 at 16:11, Pap <oracle.developer35_at_gmail.com> wrote:
>>
>>> Hello, While testing the behavior of queries on one of the 19.11.0.0.0
>>> database. Seeing the same query which running in quick time on version
>>> 11.2.0.4 is running longer on 19C with different path. But when forced the
>>> outline of the 11.2 path the cost comes higher in the 19C database , which
>>> i was expecting but not this much high it's 760k vs 61million . And when i
>>> checked the path the cost of nested loop operation is too high on 19C at
>>> plan_line_id-2. So wanted to understand if this is expected behavior or
>>> hitting any bug here?
>>>
>>> Below is the sql and i have added its execution path with version 11.2,
>>> 19c and forced 11.2 path on 19c database. Also i have added the sql monitor
>>> for each of those in below location.
>>>
>>> https://gist.github.com/oracle9999/69651125d9a3942e8fb261669611e7aa
>>>
>>>
>>> select *
>>>
>>> FROM TAB_PART TAB_PART, FEES fees
>>>
>>> WHERE fees.SB_ID IN (SELECT DISTINCT RS_ID FROM TSFS WHERE OS_ID =
>>> TAB_PART.SB_ID)
>>>
>>> AND fees.B_STS = 'XXX'
>>>
>>> AND TAB_PART.SM_ID = fees.OB_ID
>>>
>>> AND TAB_PART.TX_ID = :b2;
>>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue May 31 2022 - 21:23:17 CEST

Original text of this message