Re: Cause behind execution plan change

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Wed, 21 Jul 2021 21:53:31 -0400
Message-ID: <d6e2575d-a0c2-09df-1706-c93159a00281_at_gmail.com>


You don't have to use dynamic sampling, it's rather expensive and takes time. You can also use /*+ cardinality(FT, 500)  */ or whatever the number of rows you expect in the FT table. The same for the other GTT. Cardinality hints are a shortcut, they do almost the same thing as the dynamic sampling but using far less processing. The price for that cheaper processing is lower accuracy. However, you get the flexibility of fixing the plans because once given the cardinality hints, the CBO is not going to second guess you and check the number of rows anyway. My personal guideline is to use dynamic sampling for very complex queries and cardinality hint for simple queries.

On 7/21/21 2:23 PM, Pap wrote:
Thank you Jonathan. 
I think , I missed to notice one key point here , i.e. both FT and RTNID table are global temporary tables. So in that case, as you mentioned, may be higher level of dynamic sampling(/*+ dynamic_sampling(4)*/) will be enough making the estimation accurate and to get rid of the hints.  But in any case as you suggest we will first try to just get rid of existing hints and see how it goes with default dynamic sampling i.e. level-2. If it doesn't help we will go for forced dynamic_sampling hint(/*+ dynamic_sampling(4)*/) to the query. 

Regards
Pap 

On Wed, Jul 21, 2021 at 8:52 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:
In simple terms the reason why you got a plan change is because you told Oracle you wanted it to pick a really bad execution plan, and it wasn't table to tell the difference between a really bad plan and a totally appalling plan.

You've dictated a join order, and insisted on nested loop joins all the way through - and that's exactly what you've got.
Look at operation 5 of the plans - in one case you get 7M rows (compared to an estimated 5,558)  which drops on the next join to 386K; on the other you get 13M rows (compared to an estimated 259) which drops to 760K on the next join.

Look at the cost of the tablescan of RTNID - it's "free", so it's not surprising that 259 tablescans of RTNID give a lower cost than 5,558 indexed access to the table.

First suggestion: get rid of the hints completely and let the optimizer do its own thing.

If that doesn't work well then take note of the fact that both plans report ACTUAL 4,000 rows (approximately) for the tabelscan of FT, but the bad plan estimates 197. Get rid of all the hints and add a cardinalit hint /*+ cardinality(ft 4000) */ to the query to make the optimizer estimate 4,000 rows for that table. Failing that put dynamic sampling hints up to level 4 for the little tables.

Regards
Jonathan Lewis





On Tue, 20 Jul 2021 at 13:37, Pap <oracle.developer35_at_gmail.com> wrote:
Thank You Jonathan and Lok. Attaching again the query along with the outline and note section. 

I am seeing one usage of INTERNAL_FUNCTION around FFT.STCD but in the new plan(post function change), I am seeing two more usage of INTERNAL_FUNCTION around the ND.NE column. These columns are the same with respect to the data type in both sides of the predicate, why are these appearing and if anyway these are responsible for some wrong estimation?

I had checked the dba_hist_sqlstat but didn't see any profiles attached for the old sql and als checked the plan from display_awr and the note section was only showing below i.e. usage of dynamic sampling only and nothing regarding sql profile or plan baselines either. But then when I query dba_sql_plan_baselines manually with the sql_text like '%...sample query text...%', I saw one entry there with ACCEPTED and ENABLED both columns set as 'YES'. And also the signature is matching with the query force_matching_signature. And I can see the last_executed column was also showing the date close to when we introduced the new modified sql into prod. So it seems this was the one getting used for old sql/query but the note section does not state that.

 So is it true that it may be possible that the note section of the display_awr function won't show the usage of profile/baseline but still it may be used by that query internally?

 Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

On Mon, Jul 19, 2021 at 12:47 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

You shouldn't be using the ORDERED hint, by the way, you should learn how to use the LEADING() hint.
And since you've dictated the join order  for this query FT does not need to be in the USE_NL() hint because it's the first table in the join order so it's not going to appear as the second table in any of the joins. (See: https://jonathanlewis.wordpress.com/2017/01/13/use_nl-hint/ , and for the equivalent comment on the use_hash() hint see: https://jonathanlewis.wordpress.com/2013/09/07/hash-joins/ )


Regards
Jonathan Lewis



On Sun, 18 Jul 2021 at 20:35, Pap <oracle.developer35_at_gmail.com> wrote:
Hello listers,  It's version 12.1.0.2.0 of oracle. We have done a change to the code inside the function which gets called from the SELECT query. But as its just been used in the SELECT part of the query ideally it should not change sql_id of the query and also the plan, but we also add one new additional input parameter(i.e. :B3 below) to the function call and thus sql_id got changed which is understood. But something which we are not able to understand is , why did the plan change occurred after this change?

Attached is both the plans i.e the one it used to take in the past vs the current one which it's now taking. From the plan it does look like , its cardinality estimation of global temporary table FT which causes the difference, as it puts table RTNID in index access vs FTS access in a nested loop. But the old query(before function change) was not taking the bad plan ever, but it started taking after function change. So wondering how a new input parameter addition to a function which is not part of the WHERE clause, can cause this sort of impact and how to fix it?

In this query, all the tables are global temporary tables except FFT, which is a list partition table with partition key as CKEY.

INSERT INTO RTF(...)
SELECT /*+ ordered use_nl(ft FFT nd curr)*/   ND.NE, ND.NID,  CUR.SCD, FT.FXID, FT.TFXID,
          fun1 (FFT.AMT, FT.STS,  FT.PDT,    :B3, TRUNC ( :B2),   'S'),
          fun1 (FFT.AMT,  FT.STS,  FT.PDT,   :B3,   TRUNC ( :B2), 'F'),
          TRUNC ( :B1),
          ND.MCID
     FROM FT , FFT , RTNID ND, RDCUR CUR
    WHERE     FT.FFXID = FFT.FXID
          AND FT.ACK = FFT.CK
          AND FFT.CKEY = ND.NKEY
          AND ND.NE IN ('XX', 'YY', 'ZZ')
          AND FFT.STCD IN ('X', 'Y')
          AND FFT.CKEY = CUR.CKEY




-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
-- http://www.freelists.org/webpage/oracle-l Received on Thu Jul 22 2021 - 03:53:31 CEST

Original text of this message