Re: Cause behind execution plan change

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Wed, 21 Jul 2021 16:21:47 +0100
Message-ID: <CAGtsp8kQqt7CZ7o2MiZGRBGxv8-18-FCrK+QV6m8OdcDsk4Q3g_at_mail.gmail.com>



In simple terms the reason why you got a plan change is because you told Oracle you wanted it to pick a really bad execution plan, and it wasn't able to tell the difference between a really bad plan and a totally appalling plan.

You've dictated a join order, and insisted on nested loop joins all the way through - and that's exactly what you've got. Look at operation 5 of the plans - in one case you get 7M rows (compared to an estimated 5,558) which drops on the next join to 386K; on the other you get 13M rows (compared to an estimated 259) which drops to 760K on the next join.

Look at the cost of the tablescan of RTNID - it's "free", so it's not surprising that 259 tablescans of RTNID give a lower cost than 5,558 indexed accesses to the table.

First suggestion: get rid of the hints completely and let the optimizer do its own thing.

If that doesn't work well then take note of the fact that both plans report ACTUAL 4,000 rows (approximately) for the tablescan of FT, but the bad plan estimates 197. Get rid of all the hints and add a cardinality hint /*+ cardinality(ft 4000) */ to the query to make the optimizer estimate 4,000 rows for that table. Failing that put dynamic sampling hints up to level 4 for the little tables.
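
As a rough sketch of the second suggestion, this is the query from the original post (quoted below) with the ORDERED and USE_NL hints removed and only the cardinality hint added. The select list, binds and elided column list are copied unchanged; nothing here has been tested against the real tables.

```sql
-- Sketch only: original hints removed, FT row estimate forced to ~4,000.
INSERT INTO RTF(...)
SELECT /*+ cardinality(ft 4000) */
       ND.NE, ND.NID, CUR.SCD,
       FT.FXID, FT.TFXID,
       fun1 (FFT.AMT, FT.STS, FT.PDT, :B3, TRUNC ( :B2), 'S'),
       fun1 (FFT.AMT, FT.STS, FT.PDT, :B3, TRUNC ( :B2), 'F'),
       TRUNC ( :B1),
       ND.MCID
FROM   FT , FFT , RTNID ND, RDCUR CUR
WHERE  FT.FFXID = FFT.FXID
AND    FT.ACK = FFT.CK
AND    FFT.CKEY = ND.NKEY
AND    ND.NE IN ('XX', 'YY', 'ZZ')
AND    FFT.STCD IN ('X', 'Y')
AND    FFT.CKEY = CUR.CKEY

-- Fallback variant (an assumption about which tables count as "little"):
-- replace the hint above with table-level dynamic sampling hints, e.g.
--   /*+ dynamic_sampling(ft 4) dynamic_sampling(nd 4) dynamic_sampling(cur 4) */
```

Whichever variant is tried, compare the estimated and actual row counts for FT in the resulting plan to confirm the optimizer is now working from roughly 4,000 rows.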

Regards
Jonathan Lewis

On Tue, 20 Jul 2021 at 13:37, Pap <oracle.developer35_at_gmail.com> wrote:

> Thank You Jonathan and Lok. Attaching again the query along with the
> outline and note section.
>
> I am seeing one usage of INTERNAL_FUNCTION around FFT.STCD but in the new
> plan(post function change), I am seeing two more usage of INTERNAL_FUNCTION
> around the ND.NE column. These columns are the same with respect to the
> data type in both sides of the predicate, why are these appearing and if
> anyway these are responsible for some wrong estimation?
>
> I had checked the dba_hist_sqlstat but didn't see any profiles attached
> for the old sql and also checked the plan from display_awr and the note
> section was only showing below i.e. usage of dynamic sampling only and
> nothing regarding sql profile or plan baselines either. But then when I
> query dba_sql_plan_baselines manually with the sql_text like '%...sample
> query text...%', I saw one entry there with ACCEPTED and ENABLED both
> columns set as 'YES'. And also the signature is matching with the query
> force_matching_signature. And I can see the last_executed column was also
> showing the date close to when we introduced the new modified sql into
> prod. So it seems this was the one getting used for old sql/query but the
> note section does not state that.
>
> So is it true that it may be possible that the note section of the
> display_awr function won't show the usage of profile/baseline but still it
> may be used by that query internally?
>
> Note
> -----
> - dynamic statistics used: dynamic sampling (level=2)
>
> On Mon, Jul 19, 2021 at 12:47 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>>
>> You shouldn't be using the ORDERED hint, by the way, you should learn how
>> to use the LEADING() hint.
>> And since you've dictated the join order for this query FT does not need
>> to be in the USE_NL() hint because it's the first table in the join order
>> so it's not going to appear as the second table in any of the joins. (See:
>> https://jonathanlewis.wordpress.com/2017/01/13/use_nl-hint/ , and for
>> the equivalent comment on the use_hash() hint see:
>> https://jonathanlewis.wordpress.com/2013/09/07/hash-joins/ )
>>
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>> On Sun, 18 Jul 2021 at 20:35, Pap <oracle.developer35_at_gmail.com> wrote:
>>
>>> Hello listers, It's version 12.1.0.2.0 of oracle. We have done a change
>>> to the code inside the function which gets called from the SELECT query.
>>> But as its just been used in the SELECT part of the query ideally it should
>>> not change sql_id of the query and also the plan, but we also add one new
>>> additional input parameter(i.e. :B3 below) to the function call and thus
>>> sql_id got changed which is understood. But something which we are not able
>>> to understand is, why did the plan change occur after this change?
>>>
>>> Attached is both the plans i.e the one it used to take in the past vs
>>> the current one which it's now taking. From the plan it does look like ,
>>> its cardinality estimation of global temporary table FT which causes the
>>> difference, as it puts table RTNID in index access vs FTS access in a
>>> nested loop. But the old query(before function change) was not taking the
>>> bad plan ever, but it started taking after function change. So wondering
>>> how a new input parameter addition to a function which is not part of the
>>> WHERE clause, can cause this sort of impact and how to fix it?
>>>
>>> In this query, all the tables are global temporary tables except FFT,
>>> which is a list partition table with partition key as CKEY.
>>>
>>> INSERT INTO RTF(...)
>>> SELECT /*+ ordered use_nl(ft FFT nd curr)*/ ND.NE, ND.NID, CUR.SCD,
>>> FT.FXID, FT.TFXID,
>>> fun1 (FFT.AMT, FT.STS, FT.PDT, :B3, TRUNC ( :B2), 'S'),
>>> fun1 (FFT.AMT, FT.STS, FT.PDT, :B3, TRUNC ( :B2), 'F'),
>>> TRUNC ( :B1),
>>> ND.MCID
>>> FROM FT , FFT , RTNID ND, RDCUR CUR
>>> WHERE FT.FFXID = FFT.FXID
>>> AND FT.ACK = FFT.CK
>>> AND FFT.CKEY = ND.NKEY
>>> AND ND.NE IN ('XX', 'YY', 'ZZ')
>>> AND FFT.STCD IN ('X', 'Y')
>>> AND FFT.CKEY = CUR.CKEY
>>>
>>>
>>>
>>>
>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Jul 21 2021 - 17:21:47 CEST
