Re: ASH reporting on SQL spending most of time on CPU

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Fri, 27 Aug 2021 12:41:26 +0100
Message-ID: <CAGtsp8kc90qnwAdYDAnOi=w0Qn2V_+aQ7uvR1DQD7fFUOB7+uQ_at_mail.gmail.com>



Two basic points

  1. If you use the same table more than once in a single query don't use the same alias every time append a counter to the base alias e.g. msm, msm2, ct, ct2. This makes it easier to read and reduces the risk of a logic error getting into subqueries. It also makes the predicate information easier to read.
  2. There's a very clear indication in the SQL Monitor report that the optimizer has picked a very bad plan because of a very bad cardinality estimate, and as a starting point this means you probably have some stats on the tables that do not represent the truth. I see that you have dynamic_sampling at level 6 so maybe you've deliberately left some tables with no stats, so you may not be allowed to do anything about gathering suitable stats.

The clue about stats is that the hash join at operation 14 that joins to the customer table has an estimate of 4114 that drops to an estimate of one after the join to cust_terr, when the join to cust_terr drops the actual row coutns from 494,000 to 61,000. The estimate of 1 is the reason why the optimizer then thinks a nested loop anti-join is a good idea for the "NOT IN" transformation when (perhaps) a HASH anti-join would be far better.

Another indicator is that the anti join has eliminated all 61,000 rows (and checking furhter down the plan you've got through 6,300 customers out of 5M, so that plan has a long time to run.

SO: a quick and dirty test -
how long would it take to produce the result of the main query if you got rid of the subquery but added ct.cust_id to the select list, and how many rows.
how long would it take to produce the FULL set of data from the subquery if you ran it standalone, again adding the c.cust_id to the list but eliminating the correlation predicate (c.cust_id = ct.cust_id). I think it should be possible to structure the query to run in a little more than the time it takes to produce those two results.

One thing that puzzles me - why do you have a NOT IN subquery with a correlation predicate rather than a NOT EXISTS ?

Regards
Jonathan Lewis

(P.S. as a really quick ahd dirty test, try adding the hints /*+ unnest(UNNEST_INNERJ_DISTINCT_VIEW NO_MERGE NO_PUSH_PRED) */ to the subquery and see if that changes the plan. I think it might not have any effect because of that correlation predicate)

On Wed, 25 Aug 2021 at 12:12, Goti <aryan.goti_at_gmail.com> wrote:

> Thanks Laurentiu for responding. Please find the below link which contains
> the sql monitor report. This was taken 5 minutes after it was started.The
> SQL ran for more than 5 hours and hence we had to cancel that. I don't see
> SQL in RTSM history too. I am not sure how far this will be useful.
>
>
> https://gist.githubusercontent.com/aryangoti/eae0776d88d46361c2c60d0d6b083680/raw/ec1ac6a6359ca3f99c6f7216850dfa09103b6d78/gistfile1.txt
>
> Thanks,
> Goti
>
>
> On Wed, Aug 25, 2021 at 4:07 PM Laurentiu Oprea <
> laurentiu.oprea06_at_gmail.com> wrote:
>
>> Hello,
>>
>> Can you attach a sql monitor report?
>>
>>
>> În mie., 25 aug. 2021 la 12:58, Goti <aryan.goti_at_gmail.com> a scris:
>>
>>> Hi All,
>>>
>>> We have a SQL performance issue and as per ASH the SQL is spending most
>>> of its time on CPU. Not sure which operation is taking more time. Can
>>> someone guide me how to troubleshoot what is causing the SQL to spend time
>>> on the CPU?
>>>
>>> SQL_ID EVENT SESSION MODULE
>>> PCT
>>> ------------------ ------------------------------------ -------
>>> ------------------------------------ -------
>>> fyfnh1pnxjg2d ON CPU PL/SQL
>>> Developer 91.9% <<<<<<<<<<<<
>>>
>>>
>>> OBJ OTYPE OID ONAME
>>> SQL_ID CLASS PCT
>>> --------------------------- --------------------------- ----------
>>> ------------------------------------ ------------------ ------------------
>>> ----------
>>> XIE1CUST_TERR INDEX PARTITION 106842
>>> XIE1CUST_TERR fyfnh1pnxjg2d data block
>>> 0%
>>> XFK1PARTITION_TERR_POSTAL INDEX PARTITION 107430
>>> XFK1PARTITION_TERR_POSTAL fyfnh1pnxjg2d data block
>>> .2%
>>> XFK1PARTITION_TERR_POSTAL INDEX PARTITION 107435
>>> XFK1PARTITION_TERR_POSTAL fyfnh1pnxjg2d
>>> 95.5%
>>> XIE1CUST_TERR INDEX PARTITION 106843
>>> XIE1CUST_TERR fyfnh1pnxjg2d data block
>>> 1.6%
>>> XIE1CUST_TERR INDEX PARTITION 106841
>>> XIE1CUST_TERR fyfnh1pnxjg2d data block
>>> .1%
>>> XIE4CUST_TERR INDEX PARTITION 106911
>>> XIE4CUST_TERR fyfnh1pnxjg2d free list
>>> 0%
>>>
>>> fyfnh1pnxjg2d 0%
>>> XIE1CUST_TERR INDEX PARTITION 106839
>>> XIE1CUST_TERR fyfnh1pnxjg2d data block
>>> 2.3%
>>> XIE1CUST_TERR INDEX PARTITION 106838
>>> XIE1CUST_TERR fyfnh1pnxjg2d data block
>>> .3%
>>>
>>> Execution Plan Details:
>>>
>>> https://gist.githubusercontent.com/aryangoti/7f0bc85cbe6df372e488deecbfdf30ef/raw/40598552b0a5c47c6857c86b06eabbca2d5daf8a/gistfile1.txt
>>>
>>> Thanks,
>>> Goti
>>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Aug 27 2021 - 13:41:26 CEST

Original text of this message