Re: FW: Error related to hash memory
Date: Mon, 6 Dec 2021 09:15:11 +0000
Message-ID: <CAGtsp8n1_KtEEU-55HXq5AM1Kn0stozOFfMfGTUdzvVL7ZaEZg_at_mail.gmail.com>
On Fri, 3 Dec 2021 at 20:40, Lok P <loknath.73_at_gmail.com> wrote:
>
> Thank you so much, Jonathan. I just tried placing that cardinality hint in
> the block immediately above the 'group by' operation, as you pointed out,
> and ran "set autotrace traceonly explain". The plan still shows the same
> "hash group by", but the estimated temp space has increased significantly.
> Does that mean that, even though the plan doesn't change to a 'sort group
> by', the optimizer will now size the hash table adequately (say < 2GB) for
> that 'hash group by' operation at run time, and won't break the query with
> the same ORA error?
>
> Also, Jonathan, is my understanding correct on this point: to make the
> 'hash group by' operate in parallel, if we opt for the 'enable parallel
> DML' option, then along with "alter session enable parallel dml" we will
> also need the function Fun1 (used in the query as a scalar subquery) to be
> defined as PARALLEL_ENABLE?
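For reference, a minimal sketch of that combination. Only the names Pkg1/Fun1 come from the thread; the function signature and parameter are illustrative assumptions:

```sql
-- Hypothetical package spec: PARALLEL_ENABLE lets PX slaves call the
-- function; the signature shown here is an assumption.
CREATE OR REPLACE PACKAGE Pkg1 AS
    FUNCTION Fun1 (p_val IN VARCHAR2)
        RETURN VARCHAR2
        PARALLEL_ENABLE;
END Pkg1;
/

-- And, in the session that runs the INSERT:
ALTER SESSION ENABLE PARALLEL DML;
```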
>
> I have posted only the relevant part of the plan below, to keep it simple:
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 1076833838
>
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> | Id  | Operation                                    | Name     | Rows | Bytes |TempSpc| Cost (%CPU)| Time      | Pstart| Pstop |    TQ |IN-OUT| PQ Distrib |
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> |  14 |  FAST DUAL                                   |          |    1 |       |       |     2   (0)| 00:00:01  |       |       |       |      |            |
> |  15 |  FAST DUAL                                   |          |    1 |       |       |     2   (0)| 00:00:01  |       |       |       |      |            |
> |  16 |  FAST DUAL                                   |          |    1 |       |       |     2   (0)| 00:00:01  |       |       |       |      |            |
> |  17 |  FAST DUAL                                   |          |    1 |       |       |     2   (0)| 00:00:01  |       |       |       |      |            |
> |  18 |  PX COORDINATOR                              |          |      |       |       |            |           |       |       |       |      |            |
> |  19 |   PX SEND QC (RANDOM)                        | :TQ10005 | 200M | 1602G |       |   83M   (1)| 278:05:44 |       |       | Q1,05 | P->S | QC (RAND)  |
> |  20 |    HASH GROUP BY                             |          | 200M | 1602G | 1525G |   83M   (1)| 278:05:44 |       |       | Q1,05 | PCWP |            |
> |  21 |     PX RECEIVE                               |          | 200M | 1602G |       |   83M   (1)| 278:05:44 |       |       | Q1,05 | PCWP |            |
> |  22 |      PX SEND HASH                            | :TQ10004 | 200M | 1602G |       |   83M   (1)| 278:05:44 |       |       | Q1,04 | P->P | HASH       |
> |  23 |       HASH GROUP BY                          |          | 200M | 1602G | 1525G |   83M   (1)| 278:05:44 |       |       | Q1,04 | PCWP |            |
> |  24 |        VIEW                                  |          | 200M | 1602G |       |    16   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |  25 |         NESTED LOOPS OUTER                   |          |    1 |   717 |       |    16   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |  26 |          NESTED LOOPS OUTER                  |          |    1 |   673 |       |    15   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |  27 |           NESTED LOOPS OUTER                 |          |    1 |   626 |       |    13   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |  28 |            NESTED LOOPS OUTER                |          |    1 |   577 |       |     7   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |* 29 |             HASH JOIN                        |          |    1 |   515 |       |     6   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |  30 |              PX RECEIVE                      |          |    1 |   370 |       |     4   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |  31 |               PX SEND HASH                   | :TQ10003 |    1 |   370 |       |     4   (0)| 00:00:01  |       |       | Q1,03 | P->P | HASH       |
> |* 32 |                HASH JOIN BUFFERED            |          |    1 |   370 |       |     4   (0)| 00:00:01  |       |       | Q1,03 | PCWP |            |
> |  33 |                 BUFFER SORT                  |          |      |       |       |            |           |       |       | Q1,03 | PCWC |            |
> |  34 |                  PX RECEIVE                  |          |    1 |   197 |       |     2   (0)| 00:00:01  |       |       | Q1,03 | PCWP |            |
> |  35 |                   PX SEND HASH               | :TQ10000 |    1 |   197 |       |     2   (0)| 00:00:01  |       |       |       | S->P | HASH       |
> |  36 |                    TABLE ACCESS STORAGE FULL | TTSFA    |    1 |   197 |       |     2   (0)| 00:00:01  |       |       |       |      |            |
> |  37 |                 PX RECEIVE                   |          |    1 |   173 |       |     2   (0)| 00:00:01  |       |       | Q1,03 | PCWP |            |
> |  38 |                  PX SEND HASH                | :TQ10002 |    1 |   173 |       |     2   (0)| 00:00:01  |       |       | Q1,02 | P->P | HASH       |
> |  39 |                   PX BLOCK ITERATOR          |          |    1 |   173 |       |     2   (0)| 00:00:01  |       |       | Q1,02 | PCWC |            |
> |* 40 |                    TABLE ACCESS STORAGE FULL | TTTD     |    1 |   173 |       |     2   (0)| 00:00:01  |       |       | Q1,02 | PCWP |            |
> |  41 |              BUFFER SORT                     |          |      |       |       |            |           |       |       | Q1,04 | PCWC |            |
> |  42 |               PX RECEIVE                     |          |    1 |   145 |       |     2   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |  43 |                PX SEND HASH                  | :TQ10001 |    1 |   145 |       |     2   (0)| 00:00:01  |       |       |       | S->P | HASH       |
> |* 44 |                 TABLE ACCESS STORAGE FULL    | TTFE     |    1 |   145 |       |     2   (0)| 00:00:01  |       |       |       |      |            |
> |* 45 |             TABLE ACCESS BY INDEX ROWID      | TLSV     |    1 |    62 |       |     2   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |* 46 |              INDEX RANGE SCAN                | TLSV_IX2 |    1 |       |       |     1   (0)| 00:00:01  |       |       | Q1,04 | PCWP |            |
> |  47 |           PARTITION RANGE ITERATOR           |          |    1 |    49 |       |    22   (0)| 00:00:01  |  KEY  |  KEY  | Q1,04 | PCWP |            |
> ..................
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
>
>
> On Fri, Dec 3, 2021 at 8:24 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>> Comments inline.
>> Regards
>> Jonathan Lewis
>>
>>
>> >>> As you said, there are indeed many DECODEs/CASEs in the query, and
>> scalar subqueries, just before the 'group by' applies. There is a
>> function (Pkg1.Fun1) used which is not parallel enabled. Parallel hints are
>> also used in the SELECT part of the query, and this query does an
>> insert+append into a global temporary table. Also, in the FROM clause all
>> are global temporary tables except TFA, TFMA and TSLV.
>>
>>>
>>> >>> In regards to having scalar subqueries operated after the group by,
>>> would that not make the results different?
>>>
>> In some cases yes, in others no. The example I linked to doesn't
>> highlight the threat you're thinking of, since it doesn't AGGREGATE by
>> department id before joining to get the department name; but if there had
>> been an original "aggregate by department name" and I had aggregated by
>> department ID and then joined to get the department name, the result would
>> have been different if two departments had had the same name. So there is a
>> uniqueness requirement that you have to think about (8 times) in your
>> case. You may find that none of your scalar subqueries can be moved in the
>> same way. Another variation on the theme (which, again, may not be
>> appropriate because it changes the result) is to see if you can aggregate
>> any of the three tables on their join columns (in a no_merge view) before
>> doing the join. (Oracle can do this automatically in later versions, but
>> not - I think - in 11g.)
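A minimal sketch of the aggregate-early idea, using the classic emp/dept names (illustrative, not tables from this query):

```sql
-- Original shape, join then aggregate:
--   SELECT d.dname, SUM(e.sal) FROM emp e, dept d
--   WHERE  d.deptno = e.deptno GROUP BY d.dname;
--
-- Aggregate-early rewrite: group by the join key inside a no_merge view,
-- then join once per group to pick up the display column.
SELECT d.dname, v.total_sal
FROM   (SELECT /*+ no_merge */
               e.deptno, SUM(e.sal) AS total_sal
        FROM   emp e
        GROUP BY e.deptno) v,
       dept d
WHERE  d.deptno = v.deptno;
-- The two forms agree only if dname is unique per deptno:
-- that is the uniqueness requirement Jonathan describes.
```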
>>
>>
>> >>> I am still trying to understand how to find the exact opt_estimate
>>> hints, with a proper query block, to make the optimizer estimate ~168
>>> million before the "hash group by".
>>>
>> I've added a /*+ cardinality() */ hint to the start of your code, in
>> what I think is the correct place. It's easier than remembering the
>> syntax for opt_estimate().
>>
>> >>> And by setting "event 32699", will that make the error disappear,
>>> OR is it for debugging, to find out the reason for the ORA-32690?
>>>
>> This is simply to dump a trace file, not to work around the
>> problem.
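For completeness, a hedged sketch of setting such a numeric event at session level (the level value is illustrative, and events like this are normally set only under Oracle Support guidance):

```sql
-- Turn the diagnostic event on for this session only:
ALTER SESSION SET EVENTS '32699 trace name context forever, level 1';

-- ... reproduce the failing statement, then turn it off again:
ALTER SESSION SET EVENTS '32699 trace name context off';
```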
>>
>>
>>> INSERT /*+ APPEND */
>>> INTO GTT_TABLE (.....)
>>> SELECT ......, SUM (crt), SUM (art), SUM (iamt), SUM
>>> (amt),...... ~46 different columns
>>> FROM (SELECT /*+ cardinality(168000000) */ ....33 columns,
>>>>
>>>>
>>
>>
>>
>> Regards
>> Jonathan Lewis
>>
>>
>> On Fri, 3 Dec 2021 at 13:53, Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Thank You So Much Jonathan and Mark.
>>> As you said, there are indeed many DECODEs/CASEs in the query, and
>>> scalar subqueries, just before the 'group by' applies. There is a
>>> function (Pkg1.Fun1) used which is not parallel enabled. Parallel hints are
>>> also used in the SELECT part of the query, and this query does an
>>> insert+append into a global temporary table. Also, in the FROM clause all
>>> are global temporary tables except TFA, TFMA and TSLV.
>>>
>>> It's a big INSERT query, and I tried not to include all the individual
>>> columns, just the key functions (like DECODE, CASE etc.). It groups ~46
>>> columns in that 'hash group by', but before that it fetches and transforms
>>> ~55+ columns of data. I will try to make that function parallel enabled and
>>> enable parallel DML for the query; hopefully the "hash group by" can then
>>> be operated in parallel.
>>>
>>> Apologies if these are silly questions:
>>> In regards to having scalar subqueries operated after the group by, would
>>> that not make the results different?
>>> I am still trying to understand how to find the exact opt_estimate hints,
>>> with a proper query block, to make the optimizer estimate ~168 million
>>> before the "hash group by". And by setting "event 32699", will that make
>>> the error disappear, OR is it for debugging, to find out the reason for
>>> the ORA-32690?
>>>
>>>
>>> INSERT /*+ APPEND */
>>> INTO GTT_TABLE (.....)
>>> SELECT ......, SUM (crt), SUM (art), SUM (iamt), SUM
>>> (amt),...... ~46 different columns
>>> FROM (SELECT ....33 columns,
>>> DECODE ( Amt1, 1, (SELECT Pkg1.Fun1 (...)
>>> FROM DUAL), NULL),
>>> ..13 columns..,
>>> (SELECT DECODE (SUBSTR (Col_data,1,INSTR (Col_data,
>>> '*',1, 1)- 1),'XX', SUBSTR (Col_data,INSTR (Col_data,'*',1, 1)+ 1,( INSTR
>>> (Col_data,'*',1,2)- INSTR (Col_data,'*', 1,1))- 1))FROM DUAL)/ 100 Amt,
>>>                 (SELECT DECODE (MPCD,'YY', SUBSTR (Col_data,INSTR
>>> (Col_data, '*',1, 2) + 1,( INSTR (Col_data,'*',1, 3)- INSTR (Col_data,
>>> '*',1,2))- 1)) FROM DUAL) aty,
>>> (SELECT DECODE (MPCD, 'YY', SUBSTR (Col_data, INSTR
>>> (Col_data, '*', 1,3) + 1, ( INSTR (Col_data,'*', 1, 4)- INSTR (Col_data,
>>> '*', 1, 3))- 1)) FROM DUAL) vnm,
>>> (SELECT DECODE (SUBSTR (Col_data, 1, INSTR
>>> (Col_data, '*', 1, 1) - 1), 'M1', SUBSTR (Col_data, INSTR (Col_data, '*',
>>> 1, 4)+ 1,( INSTR (Col_data, '*',1,5)- INSTR (Col_data, '*', 1, 4)) - 1),
>>> NULL) FROM DUAL) INM,
>>> (SELECT DECODE (MCID,9999, MDID, 9999, MDID, DECODE
>>> (SUBSTR (Col_data, 1, INSTR (Col_data,'*', 1,1) - 1),'M1', SUBSTR
>>> (Col_data,INSTR (Col_data, '*',1,2) + 1, ( INSTR (Col_data, '*',1,3) -
>>> INSTR (Col_data, '*', 1,2))-1))) FROM DUAL)
>>> FROM (SELECT /*+ parallel(TTTD,4) leading(TTSFA TTTD)*/
>>> DECODE (TTFE.C1, 'BBB', TTFE.CRT, NULL)
>>> .......9 columns..
>>> DECODE (TFMA.MACD, 'XX', TFMA.text1, 'XX',
>>> TFMA.text2, NULL) RDFI,
>>> ... 18 columns..,
>>> DECODE (TFMA.MACD, 'XX', TFMA.text1, NULL)
>>> cbcd,
>>> CASE
>>> WHEN INSTR ( :B7, PTCD) = 0 THEN
>>> TTSFA.ILVL
>>> ELSE
>>> CASE
>>> WHEN (SELECT PSS FROM tmfs FS
>>> WHERE FS.ptsid = TTFE.fsid) = 'N'
>>> THEN CASE
>>> WHEN (SELECT FEE.PSS FROM FEE
>>> FEE WHERE FEE.ptsid = TTFE.fsid AND FEE.PTMCD = TTTD.MPCD) =
>>> 'Y'
>>> THEN TTSFA.ILVL
>>> ELSE
>>> CASE
>>> WHEN INSTR ( :B6, PTCD)
>>> > 0
>>> THEN DECODE
>>> (TTTD.MPCD,'XX', 'XX', 'XX', 'XXX','XX', 'XX','XX', 'XX', NULL)
>>> WHEN INSTR ( :B5, PTCD)
>>> > 0
>>> THEN DECODE (TTTD.MPCD,
>>> 'XX', 'XX', TTSFA.ILVL)
>>> END
>>> END
>>> ELSE
>>> CASE
>>> WHEN (SELECT PSS FROM tmfs
>>> FS WHERE FS.ptsid = TTFE.fsid) = 'Y'
>>> THEN CASE
>>> WHEN (SELECT FEE.PSS
>>> FROM FEE FEE WHERE FEE.ptsid =TTFE.fsid AND FEE.PTMCD = TTTD.MPCD)
>>> ='N'
>>> THEN CASE
>>> WHEN INSTR
>>> (:B6,PTCD) > 0
>>> THEN DECODE
>>> (TTTD.MPCD, 'XX', 'XX','VI', 'XX', 'XX', 'XX', 'XX', 'XX', NULL)
>>> WHEN INSTR (
>>> :B5,PTCD) >0
>>> THEN DECODE
>>> (TTTD.MPCD, 'XX', 'XX', TTSFA.ILVL)
>>> END
>>> ELSE
>>> TTSFA.ILVL
>>> END
>>> END
>>> END
>>> END
>>> ILVL,
>>> DECODE (TTFE.C1, 'XX', TTFE.CRT, 'XX',
>>> TTFE.CRT, 0) CRT,
>>> DECODE (TTFE.C1, 'XX', TTFE.ARATE, 'XX',
>>> TTFE.amrt, 0) ARATE,
>>> CASE
>>> WHEN TTFE.C1 = 'XX' OR TTFE.C1 = 'XX'
>>> THEN ROUND ( TTFE.CRT+ ( NVL
>>> (TTTD.samt, TTTD.TAMT) * TTFE.amrt), 2) * -1
>>> ELSE 0
>>> END
>>> IN_AMT,
>>> CASE
>>> WHEN (SELECT COUNT (*) FROM TLSV
>>> WHERE LNM = :B9 AND DESCR = :B8 AND LV = TTFE.C1) >0 AND
>>> TTFE.TAMT <> 0
>>> THEN TTFE.AMT * ( NVL (TTTD.samt,
>>> TTTD.TAMT)/ TTFE.TAMT)
>>> ELSE 0
>>> END
>>> as_amt,
>>> CASE
>>> WHEN TLSV.LV IS NOT NULL
>>> THEN (TTFE.AMT / TTFE.FCNT)
>>> END
>>> DAMT,
>>> TFA.D_CHG / 100 D_CHG,
>>>                 TFPA.PCD || '*' || TRIM (TFPA.text1) || '*' ||
>>> TFPA.TEXT3 || '*' || TFPA.TEXT4 || '*' || TFPA.TEXT5 || '*' Col_data,
>>> DECODE (TTSFA.CT_AMT, 0, NULL, NULL, NULL,
>>> 1) CO_AMT,
>>> DECODE (MACD, 'XX', TFMA.text2, NULL)
>>> S_CD,
>>> DECODE (MACD, 'XX', TFMA.TEXT5, NULL) ,
>>> DECODE (MACD, 'XX', TFMA.TEXT6, NULL) BC,
>>> DECODE (MACD, 'XX', TFMA.TEXT8, NULL)
>>> MID,
>>> DECODE (MACD, 'XX', TFMA.TEXT7, NULL)
>>> MTYP,
>>> DECODE ( MACD, 'XX', TO_CHAR ( TO_DATE (TRIM
>>> (TFMA.TEXT9),'yyyymmdd'), 'mm/dd/yyyy'), NULL) S_DT,
>>> DECODE (TFMA.text1, NULL, 'N', 'Y') ,
>>> TTTD.DT_MTHD
>>> FROM TTSFA , TTTD , TFA , TFMA, TTFE,
>>> (SELECT .. ~7 columns.. FROM TFA WHERE
>>> PCD IN ('XX', 'XX')) TFPA,
>>> (SELECT LV FROM TLSV WHERE LNM =
>>> 'XXXX' AND DESCR = 'XXXXXXXXX') TLSV
>>> WHERE TTSFA.did = TTTD.did AND TFPA.SDT(+) =
>>> TTTD.SDT AND TFPA.TDID(+) = TTTD.TDID
>>> AND TLSV.LV(+) = TTFE.C1 AND TTSFA.FDT
>>> = TTTD.SDT AND TTSFA.OSID = TTTD.SID AND TTTD.TDID = TFA.TDID(+)
>>> AND TTTD.SDT = TFA.SDT(+) AND TFA.UBD(+) =
>>> :B4 AND TTTD.TDID = TFMA.TDID(+) AND TTTD.SDT = TFMA.SDT(+)
>>> AND TFMA.MACD(+) = DECODE (TTTD.MPCD, 'XX',
>>> :B3, :B2) AND TTTD.DEP_IND = :B1
>>> AND TTSFA.did = TTTD.did AND TTSFA.SMID =
>>> TTTD.SMID AND TTFE.EID = TTSFA.did AND TTFE.IDT = TTSFA.SDT
>>> AND TTFE.SID = TTSFA.SID AND TTFE.OTYP
>>> = 'TTFE' AND TTFE.OID = TTTD.SMID AND TTFE.BST = 'XX'
>>> AND ( ( PTCD = 'XX' AND TTFE.fsid IS
>>> NOT NULL)
>>> OR (PTCD <> 'XX'))
>>> AND ( ( PTCD = 'XX' AND TTFE.fsid IS
>>> NOT NULL)
>>> OR (PTCD <> 'XX'))
>>> AND ( ( PTCD = 'XX' AND TTFE.fsid IS
>>> NOT NULL)
>>> OR (PTCD <> 'XX')) AND ( ( PTCD =
>>> 'XX' AND TTFE.fsid IS NOT NULL)
>>> OR (PTCD <> 'XX'))
>>> AND ( ( PTCD = 'XX' AND TTFE.fsid IS
>>> NOT NULL)
>>> OR (PTCD <> 'XX'))
>>> AND ( ( PTCD = 'XX' AND TTFE.fsid IS
>>> NOT NULL)
>>> OR (PTCD <> 'XX'))))
>>> GROUP BY .... ~46 columns
>>>
>>>
>>>
>>> On Thu, Dec 2, 2021 at 11:17 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:
>>>
>>>> all the stuff JL mentioned, plus a little bit that didn’t make it to
>>>> oracle-l before, probably because my remember to snip circuit is failing:
>>>>
>>>>
>>>>
>>>> Things I would try:
>>>>
>>>>
>>>>
>>>> 1) force parallel local for the query
>>>>
>>>> 2) if 1) alone fails, do 1) and, on the instance of choice, run a
>>>> simple scan of a non-indexed column of the key tables of the query, on
>>>> that same instance
>>>>
>>>> 3) if 1) and 2) fail, try a prelude of enumerating the disjoint
>>>> partitions from "43 | PARTITION RANGE ITERATOR" and generate a union-all
>>>> query over the separate partitions individually, thus making each hash
>>>> memory requirement smaller. Run these serially rather than in parallel
>>>> (which might not be possible in 11 anyway), because your problem is
>>>> memory, not elapsed time. IF a single one of your partitions is nearly
>>>> everything, it will probably still go splat; but if they are relatively
>>>> flat, each one being significantly smaller might avoid the error. IF
>>>> figuring out the enumerated list of partition predicates is fast, this
>>>> also has a **chance** to make everything faster, especially if the CBO
>>>> finds a better plan for a single partition at a time, but also because
>>>> each hash area creation and probe might be a fraction of the current
>>>> single one. IF you have time to experiment, this **might** be a
>>>> consistent winner on both speed and footprint, but your mileage may vary.
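A hedged sketch of idea 3): one serial INSERT ... SELECT per partition, so each HASH GROUP BY builds a much smaller hash table. The table and partition names below are hypothetical, and the elided column lists stay elided:

```sql
-- Run these one at a time (serially); each statement aggregates
-- only one partition's worth of data.
INSERT INTO gtt_table (...)
SELECT ..., SUM (amt)
FROM   tttd PARTITION (p_2021_01), ...
WHERE  ...
GROUP BY ...;

INSERT INTO gtt_table (...)
SELECT ..., SUM (amt)
FROM   tttd PARTITION (p_2021_02), ...
WHERE  ...
GROUP BY ...;
-- ... and so on for each remaining partition.
```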
>>>>
>>>> 4) forcing a sort group by **might** also work and would be easier
>>>> for most folks to code up.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From:* Mark W. Farnham [mailto:mwf_at_rsiz.com]
>>>> *Sent:* Thursday, December 02, 2021 10:04 AM
>>>> *To:* 'loknath.73_at_gmail.com'; 'Pap'
>>>> *Cc:* 'Oracle L'
>>>> *Subject:* RE: Error related to hash memory
>>>>
>>>>
>>>>
>>>> Things I would try:
>>>>
>>>> <snip>
>>>>
>>>>
--
http://www.freelists.org/webpage/oracle-l

Received on Mon Dec 06 2021 - 10:15:11 CET