Re: Tuning Update query
Date: Tue, 4 May 2021 12:43:04 +0530
Message-ID: <CAKna9Va6LADN9J3xY=4TfLGWgey4eaa7_ctGJCQ9UUnfbs9P3A_at_mail.gmail.com>
Thank you William. I have tried combining all five UPDATEs into one MERGE query using CASE as you suggested. I will run a test and compare the performance of the five UPDATE queries below against this single modified MERGE query.
I also want to test the other approach suggested by Jonathan, i.e. the INSERT APPEND one, as that should be fastest considering the almost-zero REDO+UNDO generation. But I am not able to visualize it: will it need five different GTTs to be populated, one for each of the UPDATEs, to reach the final result set, i.e. five different INSERT APPEND queries?
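To make sure I understand the chained-GTT idea, here is a toy sketch using Python's sqlite3 module rather than Oracle (the table and column names are invented stand-ins, and sqlite has no direct-path insert): each UPDATE's lookup becomes a bulk INSERT ... SELECT over an outer join into the next staging table, with COALESCE keeping the old value for unmatched rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Invented mini tables standing in for the GTT and one source table.
cur.execute("CREATE TABLE gtt_1 (tid INTEGER, part_dt TEXT, eastatus TEXT)")
cur.execute("CREATE TABLE product_tab (tid INTEGER, part_dt TEXT, code TEXT, col14 TEXT)")
cur.executemany("INSERT INTO gtt_1 VALUES (?, ?, ?)",
                [(1, "2021-05-01", "OLD"), (2, "2021-05-01", "OLD")])
cur.execute("INSERT INTO product_tab VALUES (1, '2021-05-01', 'YY', 'NEW')")

# Instead of UPDATE gtt_1 SET eastatus = (subquery), build the next stage
# in one bulk pass; rows with no match keep their old value.
cur.execute("""
    CREATE TABLE gtt_2 AS
    SELECT t.tid,
           t.part_dt,
           COALESCE(p.col14, t.eastatus) AS eastatus
    FROM gtt_1 t
    LEFT JOIN product_tab p
      ON p.tid = t.tid AND p.part_dt = t.part_dt AND p.code = 'YY'
""")

rows = sorted(cur.execute("SELECT tid, eastatus FROM gtt_2").fetchall())
print(rows)  # [(1, 'NEW'), (2, 'OLD')]
```

In Oracle this would be INSERT /*+ APPEND */ INTO gtt_2 SELECT ... FROM gtt_1 JOIN ..., which is what (if I read it right) would need one new GTT per step.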
- UPDATE queries combined into one MERGE query ****************
MERGE INTO GTT_TAB TMP
USING (SELECT col1, col2, .. col11, CODE, PART_DT, TID
       FROM PRODUCT_TAB P) P
ON (P.TID = TMP.TID AND P.PART_DT = TMP.PART_DT)
WHEN MATCHED THEN UPDATE SET
  TMP.C1       = CASE WHEN TMP.CIND = 'Y' AND P.CODE = 'XX' THEN P.COL1 ELSE TMP.C1 END,
  .....
  TMP.C11      = CASE WHEN TMP.CIND = 'Y' AND P.CODE = 'XX' THEN P.COL11 ELSE TMP.C11 END,
  TMP.EASTATUS = CASE WHEN P.CODE = 'YY' THEN P.COL14 ELSE TMP.EASTATUS END,
  TMP.RNUM     = CASE WHEN P.CODE = 'MM' THEN P.COL1 ELSE TMP.RNUM END,
  TMP.FS_REQ   = CASE WHEN P.CODE = 'ZZ' THEN 'ZZ' || P.COL4 ELSE TMP.FS_REQ END,
  TMP.FS_RES   = CASE WHEN P.CODE = 'ZZ' THEN P.COL5 ELSE TMP.FS_RES END,
  TMP.ANUM     = CASE WHEN P.CODE = 'AA' AND P.PART_DT BETWEEN :B2 AND :B1
                       AND P.COL5 IS NOT NULL THEN P.COL5 ELSE TMP.ANUM END;
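To convince myself the CASE-per-column pattern really leaves each column untouched unless its own rule fires, I ran a toy version with Python's sqlite3 module (invented mini tables, correlated subqueries instead of a MERGE since sqlite has none — just the semantics, not our real schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Invented stand-ins: two target columns driven by two different codes.
cur.execute("CREATE TABLE gtt_tab (tid INTEGER, cind TEXT, c1 TEXT, eastatus TEXT)")
cur.execute("CREATE TABLE product_tab (tid INTEGER, code TEXT, col1 TEXT, col14 TEXT)")
cur.executemany("INSERT INTO gtt_tab VALUES (?, ?, ?, ?)",
                [(1, "Y", "old_c1", "old_st"), (2, "N", "old_c1", "old_st")])
cur.executemany("INSERT INTO product_tab VALUES (?, ?, ?, ?)",
                [(1, "XX", "new_c1", None), (2, "YY", None, "new_st")])

# Single pass: each column only changes when its own condition matches,
# otherwise it is set back to itself.
cur.execute("""
    UPDATE gtt_tab SET
      c1 = CASE
             WHEN cind = 'Y' AND EXISTS (SELECT 1 FROM product_tab p
                                         WHERE p.tid = gtt_tab.tid AND p.code = 'XX')
             THEN (SELECT p.col1 FROM product_tab p
                   WHERE p.tid = gtt_tab.tid AND p.code = 'XX')
             ELSE c1
           END,
      eastatus = CASE
             WHEN EXISTS (SELECT 1 FROM product_tab p
                          WHERE p.tid = gtt_tab.tid AND p.code = 'YY')
             THEN (SELECT p.col14 FROM product_tab p
                   WHERE p.tid = gtt_tab.tid AND p.code = 'YY')
             ELSE eastatus
           END
""")

rows = cur.execute("SELECT tid, c1, eastatus FROM gtt_tab ORDER BY tid").fetchall()
print(rows)  # [(1, 'new_c1', 'old_st'), (2, 'old_c1', 'new_st')]
```

Row 1 only picks up the 'XX' columns and row 2 only the 'YY' column, which is the behaviour I want from the combined MERGE.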
- Current five UPDATE queries in the existing code ******************
UPDATE GTT_TAB TMP
SET (c1, c2, c3, c4, c5 .. c11) =
    (SELECT col1, col2, .. col11
     FROM PRODUCT_TAB P
     WHERE P.TID = TMP.TID
       AND TMP.CIND = 'Y'
       AND P.PART_DT = TMP.PART_DT
       AND P.CODE = 'XX');
UPDATE GTT_TAB TMP
SET EASTATUS = (SELECT COL14 FROM PRODUCT_TAB P
                WHERE P.TID = TMP.TID
                  AND P.PART_DT = TMP.PART_DT
                  AND P.CODE = 'YY');
UPDATE GTT_TAB
SET RNUM = (SELECT COL1 FROM METHOD_TAB
            WHERE PART_DT = GTT_TAB.PART_DT
              AND TID = GTT_TAB.TID
              AND MCODE = 'MM');
UPDATE GTT_TAB TMP
SET (FS_REQ, FS_RES) =
    (SELECT 'ZZ' || COL4, COL5 FROM PRODUCT_TAB P
     WHERE P.TID = TMP.TID
       AND P.PART_DT = TMP.PART_DT
       AND P.CODE = 'ZZ');
UPDATE GTT_TAB TMP
SET ANUM = (SELECT COL5 FROM PRODUCT_TAB P
            WHERE P.TID = TMP.TID
              AND P.PART_DT = TMP.PART_DT
              AND P.PART_DT BETWEEN :B2 AND :B1
              AND P.CODE = 'AA'
              AND P.COL5 IS NOT NULL)
WHERE EXISTS (SELECT COL5 FROM PRODUCT_TAB P
              WHERE P.TID = TMP.TID
                AND P.PART_DT = TMP.PART_DT
                AND P.PART_DT BETWEEN :B2 AND :B1
                AND P.CODE = 'AA'
                AND P.COL5 IS NOT NULL);
On Mon, May 3, 2021 at 7:41 PM William Robertson <william_at_williamrobertson.net> wrote:
> The ORA-01779 error means you would need a unique index or constraint on
> product_tab (part_dt, tid) to be able to use this syntax. (Strangely, I did
> not get the error in 21c. Maybe there is a new feature I am missing.)
>
> You may be able to combine the updates into a smaller number of merges
> even when different rules apply, using CASE expressions so that each column
> is only set to a new value when the relevant conditions are met, otherwise
> set it to itself.
>
> when matched then update
> set tmp.c1 = case when x = y then p.col1 else tmp.c1 end
>    , tmp.c2 = case when a = b then p.col2 else tmp.c2 end
> ;
>
> William
>
> On 3 May 2021, at 12:57, Lok P <loknath.73_at_gmail.com> wrote:
>
> Thank You Jonathan and Lothar. I was trying to modify one of the sample
> updates (as below) as per your suggested method.
>
> UPDATE GTT_TAB TMP
> SET (c1, c2, c3, c4, c5 .. c11) =
>     (SELECT col1, col2, .. col11
>      FROM PRODUCT_TAB P
>      WHERE P.TID = TMP.TID
>        AND TMP.CIND = 'Y'
>        AND P.PART_DT = TMP.PART_DT
>        AND P.CODE = 'XX')
>
> Jonathan, let me know if my understanding is correct on the suggested
> point:
> "a) See if you can minimise the number of update steps by updating from
> a JOIN of several source tables"
>
> If I get the above point correctly, I was trying to modify the UPDATE
> as below, but I am getting ORA-01779 while running it to see the plan. So
> does it mean that the GTT has to have a unique key present in it for
> this method to work?
>
> UPDATE (SELECT P.COL1 .. P.COL11, TMP.C1 .. TMP.C11
>         FROM PRODUCT_TAB P, GTT_TAB TMP
>         WHERE P.TID = TMP.TID
>           AND TMP.CIND = 'Y'
>           AND P.PART_DT = TMP.PART_DT
>           AND P.CODE = 'XX')
> SET C1 = COL1, ... C11 = COL11;
>
> ERROR at line 10:
> ORA-01779: cannot modify a column which maps to a non key-preserved table
>
> Regarding the point below,
> "b) Rewrite the code to step through a series of GTT doing
>     insert /*+ append */ into next_gtt select from previous_gtt join
>     {source tables}"
>
> Do you mean I have to replace the UPDATEs with INSERT /*+ APPEND */
> queries? But as that can't be done in the same GTT, do I have to create
> multiple GTTs, one per UPDATE, so as to replace them with INSERT APPEND
> queries?
>
> Lothar, as you suggested, I tried modifying the same query as the MERGE
> statement below. I was not able to see the plan on production because
> it's a DML, so I ran it on DEV; as the data volume was not the same, with
> some forced cardinality hints I do see a HASH JOIN operation, but no
> bloom filter right away. Maybe we will see it in prod. I also remember
> there was a bug on 11.2.0.4 which restricted cell smart scan for DML
> queries, and the workaround for that was to set _serial_direct_read=always.
> Not sure if we have to do that here for the MERGE statement to go for cell
> offloading. Also, is it worth modifying each of these 4-5 different
> UPDATEs into one MERGE statement each? I was thinking combining them into
> one is possible and will give benefit.
>
> MERGE INTO GTT_TAB TMP
> USING (SELECT col1, col2, .. col11, CODE, PART_DT, TID
>        FROM PRODUCT_TAB P
>        WHERE P.CODE = 'XX') P
> ON (P.TID = TMP.TID AND P.PART_DT = TMP.PART_DT)
> WHEN MATCHED THEN UPDATE
> SET TMP.C1 = P.COL1,
>     ...
>     TMP.C11 = P.COL11
> WHERE TMP.CIND = 'Y'
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 2005806797
>
> -------------------------------------------------------------------------------------------------------------------------------
> | Id  | Operation                     | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
> -------------------------------------------------------------------------------------------------------------------------------
> |   0 | MERGE STATEMENT               |             |     1 |  1339 |       |   256K  (1)| 00:51:23 |       |       |
> |   1 |  MERGE                        | GTT_TAB     |       |       |       |            |          |       |       |
> |   2 |   VIEW                        |             |       |       |       |            |          |       |       |
> |*  3 |    HASH JOIN                  |             |     1 |   639 |   732M|   256K  (1)| 00:51:23 |       |       |
> |   4 |     PARTITION RANGE ALL       |             |  8000K|   640M|       |  5894   (3)| 00:01:11 |     1 |   331 |
> |*  5 |      TABLE ACCESS STORAGE FULL| PRODUCT_TAB |  8000K|   640M|       |  5894   (3)| 00:01:11 |     1 |   331 |
> |   6 |     TABLE ACCESS STORAGE FULL | GTT_TAB     |  8000K|  4234M|       |   206   (0)| 00:00:03 |       |       |
> -------------------------------------------------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>    3 - access("TID"="TMP"."TID" AND "PART_DT"="TMP"."PART_DT")
>    5 - storage("P"."CODE"='XX')
>        filter("P"."CODE"='XX')
>
> Regards
> Pap
>
> On Mon, May 3, 2021 at 2:25 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>> If you're updating many columns of every row in a large (global
>> temporary) table using lots of separate statements that use the "set ..
>> subquery" strategy you are introducing 4 problems.
>>
>> a) Updating lots of columns means you will probably start doing lots of
>> row migration - and you can't set pctfree on GTTs to avoid this, it WILL
>> be 10%
>>
>> b) You will be generating a huge amount of undo
>>
>> c) If you aren't including a "where exists" predicate on the source table
>> that matches the predicate on the set subquery you will be updating rows
>> to null that don't need to be updated - the monitor report echoed by
>> Mladen showed 17M rows updated, but only 8M executions of the subquery -
>> which probably means 9M rows in the GTT updated to NULL.
>>
>> In your specific case there's (probably) a logic flaw in the update
>> statement. You have a "where tmp.cind='Y'" in the subquery. This means
>> (effectively) set the values to NULL when tmp.cind = 'Y', NOT "don't
>> update the rows where tmp.cind='Y'". If you want the update to be
>> selective then you need the predicate as part of the where clause on the
>> update.
>>
>> Finally -
>> If this is a table of roughly 17M rows (rather than the 8 billion of the
>> estimate) then the fastest strategy would probably be to
>> a) See if you can minimise the number of update steps by updating from a
>> JOIN of several source tables
>> b) Rewrite the code to step through a series of GTTs doing
>>    insert /*+ append */ into next_gtt select from previous_gtt join
>>    {source tables}
>>
>> Using insert /*+ append */ with GTTs will:
>> a) use bulk processing rather than the row-by-row processing for filter
>> subqueries
>> b) minimise BOTH undo and redo.
>>
>> Regards
>> Jonathan Lewis
>>
>> On Sun, 2 May 2021 at 19:53, Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Hello, We have a plsql procedure in which it updates multiple columns
>>> of the global temporary table after loading it in different steps. It's
>>> updated for different filter criteria and each of these UPDATEs is
>>> running for hours even if they are simple ones.
>>> I am trying to see if it's possible to do it in an efficient fashion by
>>> combining all the different updates into one UPDATE statement, or any
>>> other possible method. One initial thought was to do it as part of the
>>> INSERT itself rather than updating it through multiple statements at a
>>> later stage, but then I see the data load/INSERT is happening for
>>> multiple if/else conditions, which may not be straightforward.
>>>
>>> Attached is the sql monitor for a few of the long running UPDATEs. I am
>>> seeing that the time is not really spent updating rows but rather in
>>> finding the eligible rows for update, i.e. the SELECT part is actually
>>> taking the time. So I wanted to understand how these UPDATEs can be done
>>> faster.
>>>
>>> It's version 11.2.0.4 of Oracle, Exadata X5.
>>>
>>> Regards
>>> Lok
--
http://www.freelists.org/webpage/oracle-l

Received on Tue May 04 2021 - 09:13:04 CEST