Re: Tuning Update query

From: Lok P <loknath.73_at_gmail.com>
Date: Mon, 3 May 2021 17:27:57 +0530
Message-ID: <CAKna9Vas8KoDbjf8LcMaUXncYvC1W0C=1NFS6Sirj6ChO4WoyQ_at_mail.gmail.com>



Thank you Jonathan and Lothar. I was trying to modify one of the sample updates (as below) per your suggested method.

UPDATE GTT_TAB TMP
   SET (c1, c2, c3, c4, c5, .. c11) =
       (SELECT col1, col2, .. col11
          FROM PRODUCT_TAB P
         WHERE P.TID = TMP.TID
           AND TMP.CIND = 'Y'
           AND P.PART_DT = TMP.PART_DT
           AND P.CODE = 'XX')
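Applying your points about the cind predicate and the missing "where exists", I think the corrected form would be roughly as below (untested sketch, same placeholder column names as above):

```sql
UPDATE GTT_TAB TMP
   SET (c1, c2, c3, c4, c5, .. c11) =
       (SELECT col1, col2, .. col11
          FROM PRODUCT_TAB P
         WHERE P.TID = TMP.TID
           AND P.PART_DT = TMP.PART_DT
           AND P.CODE = 'XX')
 WHERE TMP.CIND = 'Y'                 -- predicate moved out of the subquery
   AND EXISTS (SELECT NULL           -- so non-matching rows are not set to NULL
                 FROM PRODUCT_TAB P
                WHERE P.TID = TMP.TID
                  AND P.PART_DT = TMP.PART_DT
                  AND P.CODE = 'XX');
```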

Jonathan, let me know if my understanding of the suggested point is correct: *"a) See if you can minimise the number of update steps by updating from a JOIN of several source tables"*

So if I understand the above point correctly, I tried to modify the UPDATE as below, but I am getting ORA-01779 when running it to see the plan. Does this mean the GTT has to have a unique key present in it for this method to work?

UPDATE (SELECT P.COL1, .. P.COL11, TMP.C1, .. TMP.C11
          FROM PRODUCT_TAB P, GTT_TAB TMP
         WHERE P.TID = TMP.TID
           AND TMP.CIND = 'Y'
           AND P.PART_DT = TMP.PART_DT
           AND P.CODE = 'XX')
   SET C1 = COL1,
       ...
       C11 = COL11;

ERROR at line 10:
ORA-01779: cannot modify a column which maps to a non key-preserved table
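From what I have read about ORA-01779, the uniqueness actually has to exist on the source side of the join, i.e. each GTT row must map to at most one PRODUCT_TAB row for GTT_TAB to be key-preserved, so I am guessing something like this hypothetical index would be needed (names invented, untested):

```sql
-- Hypothetical: guarantee each (TID, PART_DT, CODE) combination maps to at
-- most one PRODUCT_TAB row, so GTT_TAB becomes key-preserved in the join view.
CREATE UNIQUE INDEX product_tab_uk ON PRODUCT_TAB (TID, PART_DT, CODE);
```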

Regarding the point below,
*"b) Rewrite the code to step through a series of GTT doing insert /*+ append */ into next_gtt select from previous_gtt join {source tables}"*

Do you mean I have to replace the UPDATEs with INSERT /*+ APPEND */ ... queries? Since that can't be done into the same GTT, would I have to create a separate GTT for each UPDATE, so that each one can be replaced with an INSERT APPEND query?
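If I have the idea right, each step in the chain would look something like this untested sketch (GTT names invented; the CASE expressions keep existing values for rows the old UPDATE would have left alone):

```sql
-- Hypothetical step in the chain: join the previous GTT to a source table
-- and direct-path insert the combined rows into the next GTT.
INSERT /*+ append */ INTO GTT_TAB_2            -- next GTT (invented name)
SELECT g.TID, g.PART_DT, g.CIND,
       CASE WHEN g.CIND = 'Y' THEN p.COL1  ELSE g.C1  END AS C1,
       CASE WHEN g.CIND = 'Y' THEN p.COL11 ELSE g.C11 END AS C11
  FROM GTT_TAB_1 g                             -- previous GTT (invented name)
  LEFT JOIN PRODUCT_TAB p
    ON  p.TID     = g.TID
    AND p.PART_DT = g.PART_DT
    AND p.CODE    = 'XX';
COMMIT;  -- a direct-path insert must be committed before GTT_TAB_2 is read
```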

Lothar, as you suggested, I tried rewriting the same query as the MERGE statement below. I was not able to see the plan on production because it's a DML, so I ran it on DEV; since the data volume there is not the same, I forced the estimates with some cardinality hints, and I do see a HASH JOIN operation, but no bloom filter right away. Maybe we will see it in prod. I also remember there was a bug on 11.2.0.4 which restricted cell smart scan for DML queries, and the workaround was to set _serial_direct_read=always. I am not sure if we have to do that here for the MERGE statement to get cell offloading. Also, is it worth modifying each of these 4-5 different UPDATEs into one MERGE statement each? I was wondering whether combining them into a single statement is possible and would give a benefit.

MERGE INTO GTT_TAB TMP
USING (SELECT col1, col2, .. col11, CODE, PART_DT, TID
         FROM PRODUCT_TAB P
        WHERE P.CODE = 'XX') P
   ON (P.TID = TMP.TID AND P.PART_DT = TMP.PART_DT)
 WHEN MATCHED THEN UPDATE
  SET TMP.C1 = P.COL1,
      .....
      TMP.C11 = P.col11
WHERE TMP.CIND = 'Y'

Execution Plan
--------------
Plan hash value: 2005806797

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                |             |     1 |  1339 |       |  256K  (1)| 00:51:23 |       |       |
|   1 |  MERGE                         | GTT_TAB     |       |       |       |           |          |       |       |
|   2 |   VIEW                         |             |       |       |       |           |          |       |       |
|*  3 |    HASH JOIN                   |             |     1 |   639 |  732M |  256K  (1)| 00:51:23 |       |       |
|   4 |     PARTITION RANGE ALL        |             | 8000K |  640M |       |  5894  (3)| 00:01:11 |     1 |   331 |
|*  5 |      TABLE ACCESS STORAGE FULL | PRODUCT_TAB | 8000K |  640M |       |  5894  (3)| 00:01:11 |     1 |   331 |
|   6 |     TABLE ACCESS STORAGE FULL  | GTT_TAB     | 8000K | 4234M |       |   206  (0)| 00:00:03 |       |       |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("TID"="TMP"."TID" AND "PART_DT"="TMP"."PART_DT")
   5 - storage("P"."CODE"='XX')
       filter("P"."CODE"='XX')
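On combining the 4-5 UPDATEs: if they all key on the same (TID, PART_DT), maybe the sources could be joined once inside the USING clause, along these lines (OTHER_TAB and its columns are invented placeholders, untested):

```sql
-- Hypothetical: one MERGE replacing several UPDATEs by joining
-- all the source tables once in the USING subquery.
MERGE INTO GTT_TAB TMP
USING (SELECT p.TID, p.PART_DT,
              p.COL1, p.COL2,        -- columns from the first source
              q.COLA, q.COLB         -- columns from a second source (invented)
         FROM PRODUCT_TAB p
         JOIN OTHER_TAB   q
           ON  q.TID     = p.TID
           AND q.PART_DT = p.PART_DT
        WHERE p.CODE = 'XX') S
   ON (S.TID = TMP.TID AND S.PART_DT = TMP.PART_DT)
 WHEN MATCHED THEN UPDATE
  SET TMP.C1 = S.COL1,
      TMP.C2 = S.COL2,
      TMP.C3 = S.COLA,
      TMP.C4 = S.COLB
WHERE TMP.CIND = 'Y';
```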

Regards
Pap

On Mon, May 3, 2021 at 2:25 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

> If you're updating many columns of every row in a large (global temporary)
> table using lots of separate statements that use the "set .. subquery"
> strategy you are introducing 4 problems.
>
> a) Updating lots of columns means you will probably start doing lots of
> row migration - and you can't set pctfree on GTTs to avoid this, it WILL be
> 10%
>
> b) You will be generating a huge amount of undo
>
> c) If you aren't including a "where exists" predicate on the source table
> that matches the predicate on the set subquery you will be updating rows to
> null that don't need to be updated - the monitor report echoed by Mladen
> showed 17M rows updated, but only 8M executions of the subquery - which
> probably means 9M rows in the GTT updated to NULL.
>
> In your specific case there's (probably) a logic flaw in the update
> statement. You have a "where tmp.cind='Y'" in the subquery. This means
> (effectively) set the values to NULL when tmp.cind ='Y', NOT 'don't update
> the rows where tmp.cind='Y'. If you want the update to be selective then
> you need the predicate as part of the where clause on the update.
>
> Finally -
> If this is a table of roughly 17M rows (rather than the 8billion of the
> estimate) then the fastest strategy would probably be to
> a) See if you can minimise the number of update steps by updating from a
> JOIN of several source tables
> b) Rewrite the code to step through a series of GTT doing
> insert /*+ append */ into next_gtt select from previous_gtt join
> {source tables}
>
> Using insert /*+ append */ with gtts will:
> a) use bulk processing rather than the row by row processing for filter
> subqueries
> b) minimise BOTH undo and redo.
>
> Regards
> Jonathan Lewis
>
>
>
>
>
> On Sun, 2 May 2021 at 19:53, Lok P <loknath.73_at_gmail.com> wrote:
>
>> Hello, We have a plsql procedure in which it updates multiple columns
>> of the global temporary table after loading it in different steps. It's
>> updated for different filter criteria and each of these UPDATES are running
>> for hours even if they are simple ones. I am trying to see if it's
>> possible to do it in an efficient fashion by combining all different
>> updates into one UPDATE statement or any other possible method? One initial
>> thought was to do it as part of the INSERT itself rather than updating
>> it through multiple statements at the later stage , but then I see the
>> data load/INSERT is happening for multiple if/else conditions which may not
>> be straight enough.
>>
>> Attached is the sql monitor for a few of the long-running UPDATEs. I am
>> seeing that the time is not really spent updating rows but rather in
>> finding the eligible rows for update, i.e. the SELECT part is actually
>> taking the time. So I wanted to understand how these UPDATEs can be done faster?
>>
>> It's version 11.2.0.4 of Oracle Exadata X5.
>>
>> Regards
>>
>> Lok
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon May 03 2021 - 13:57:57 CEST