Re: Tuning Update query
Date: Tue, 4 May 2021 12:43:04 +0530
Message-ID: <CAKna9Va6LADN9J3xY=4TfLGWgey4eaa7_ctGJCQ9UUnfbs9P3A_at_mail.gmail.com>
Thank you William. I have tried combining all five UPDATEs into one MERGE query using CASE as you suggested. I will run a test and compare the performance of the five UPDATE queries below against this single modified MERGE query.
I also want to test the other approach suggested by Jonathan, i.e. the INSERT APPEND one, as that should be fastest considering the almost-zero REDO+UNDO generation. But I am not able to visualize it: will it need five different GTTs to be populated, one for each of the UPDATEs, to reach the final result set, i.e. five different INSERT APPEND queries?
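To make sure I understand the chained-GTT idea, here is a toy sketch using Python's sqlite3 module rather than Oracle (the table and column names are invented stand-ins, and sqlite has no direct-path insert): each UPDATE's lookup becomes a bulk INSERT ... SELECT over an outer join into the next staging table, with COALESCE keeping the old value for unmatched rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Invented mini tables standing in for the GTT and one source table.
cur.execute("CREATE TABLE gtt_1 (tid INTEGER, part_dt TEXT, eastatus TEXT)")
cur.execute("CREATE TABLE product_tab (tid INTEGER, part_dt TEXT, code TEXT, col14 TEXT)")
cur.executemany("INSERT INTO gtt_1 VALUES (?, ?, ?)",
                [(1, "2021-05-01", "OLD"), (2, "2021-05-01", "OLD")])
cur.execute("INSERT INTO product_tab VALUES (1, '2021-05-01', 'YY', 'NEW')")

# Instead of UPDATE gtt_1 SET eastatus = (subquery), build the next stage
# in one bulk pass; rows with no match keep their old value.
cur.execute("""
    CREATE TABLE gtt_2 AS
    SELECT t.tid,
           t.part_dt,
           COALESCE(p.col14, t.eastatus) AS eastatus
    FROM gtt_1 t
    LEFT JOIN product_tab p
      ON p.tid = t.tid AND p.part_dt = t.part_dt AND p.code = 'YY'
""")

rows = sorted(cur.execute("SELECT tid, eastatus FROM gtt_2").fetchall())
print(rows)  # [(1, 'NEW'), (2, 'OLD')]
```

In Oracle this would be INSERT /*+ APPEND */ INTO gtt_2 SELECT ... FROM gtt_1 JOIN ..., which is what (if I read it right) would need one new GTT per step.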
- UPDATE queries combined into one MERGE query ****************
MERGE INTO GTT_TAB TMP
USING (SELECT col1, col2, .. col11, CODE, PART_DT, TID
       FROM PRODUCT_TAB P) P
ON (P.TID = TMP.TID AND P.PART_DT = TMP.PART_DT)
WHEN MATCHED THEN UPDATE SET
  TMP.C1       = CASE WHEN TMP.CIND = 'Y' AND P.CODE = 'XX' THEN P.COL1 ELSE TMP.C1 END,
  .....
  TMP.C11      = CASE WHEN TMP.CIND = 'Y' AND P.CODE = 'XX' THEN P.COL11 ELSE TMP.C11 END,
  TMP.EASTATUS = CASE WHEN P.CODE = 'YY' THEN P.COL14 ELSE TMP.EASTATUS END,
  TMP.RNUM     = CASE WHEN P.CODE = 'MM' THEN P.COL1 ELSE TMP.RNUM END,
  TMP.FS_REQ   = CASE WHEN P.CODE = 'ZZ' THEN 'ZZ' || P.COL4 ELSE TMP.FS_REQ END,
  TMP.FS_RES   = CASE WHEN P.CODE = 'ZZ' THEN P.COL5 ELSE TMP.FS_RES END,
  TMP.ANUM     = CASE WHEN P.CODE = 'AA' AND P.PART_DT BETWEEN :B2 AND :B1
                       AND P.COL5 IS NOT NULL THEN P.COL5 ELSE TMP.ANUM END;
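To convince myself the CASE-per-column pattern really leaves each column untouched unless its own rule fires, I ran a toy version with Python's sqlite3 module (invented mini tables, correlated subqueries instead of a MERGE since sqlite has none — just the semantics, not our real schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Invented stand-ins: two target columns driven by two different codes.
cur.execute("CREATE TABLE gtt_tab (tid INTEGER, cind TEXT, c1 TEXT, eastatus TEXT)")
cur.execute("CREATE TABLE product_tab (tid INTEGER, code TEXT, col1 TEXT, col14 TEXT)")
cur.executemany("INSERT INTO gtt_tab VALUES (?, ?, ?, ?)",
                [(1, "Y", "old_c1", "old_st"), (2, "N", "old_c1", "old_st")])
cur.executemany("INSERT INTO product_tab VALUES (?, ?, ?, ?)",
                [(1, "XX", "new_c1", None), (2, "YY", None, "new_st")])

# Single pass: each column only changes when its own condition matches,
# otherwise it is set back to itself.
cur.execute("""
    UPDATE gtt_tab SET
      c1 = CASE
             WHEN cind = 'Y' AND EXISTS (SELECT 1 FROM product_tab p
                                         WHERE p.tid = gtt_tab.tid AND p.code = 'XX')
             THEN (SELECT p.col1 FROM product_tab p
                   WHERE p.tid = gtt_tab.tid AND p.code = 'XX')
             ELSE c1
           END,
      eastatus = CASE
             WHEN EXISTS (SELECT 1 FROM product_tab p
                          WHERE p.tid = gtt_tab.tid AND p.code = 'YY')
             THEN (SELECT p.col14 FROM product_tab p
                   WHERE p.tid = gtt_tab.tid AND p.code = 'YY')
             ELSE eastatus
           END
""")

rows = cur.execute("SELECT tid, c1, eastatus FROM gtt_tab ORDER BY tid").fetchall()
print(rows)  # [(1, 'new_c1', 'old_st'), (2, 'old_c1', 'new_st')]
```

Row 1 only picks up the 'XX' columns and row 2 only the 'YY' column, which is the behaviour I want from the combined MERGE.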
- Current five UPDATE queries in the existing code ******************
UPDATE GTT_TAB TMP
SET (c1, c2, c3, c4, c5 .. c11) =
    (SELECT col1, col2, .. col11
     FROM PRODUCT_TAB P
     WHERE P.TID = TMP.TID
       AND TMP.CIND = 'Y'
       AND P.PART_DT = TMP.PART_DT
       AND P.CODE = 'XX');
UPDATE GTT_TAB TMP
SET EASTATUS = (SELECT COL14 FROM PRODUCT_TAB P
                WHERE P.TID = TMP.TID
                  AND P.PART_DT = TMP.PART_DT
                  AND P.CODE = 'YY');
UPDATE GTT_TAB
SET RNUM = (SELECT COL1 FROM METHOD_TAB
            WHERE PART_DT = GTT_TAB.PART_DT
              AND TID = GTT_TAB.TID
              AND MCODE = 'MM');
UPDATE GTT_TAB TMP
SET (FS_REQ, FS_RES) =
    (SELECT 'ZZ' || COL4, COL5 FROM PRODUCT_TAB P
     WHERE P.TID = TMP.TID
       AND P.PART_DT = TMP.PART_DT
       AND P.CODE = 'ZZ');
UPDATE GTT_TAB TMP
SET ANUM = (SELECT COL5 FROM PRODUCT_TAB P
            WHERE P.TID = TMP.TID
              AND P.PART_DT = TMP.PART_DT
              AND P.PART_DT BETWEEN :B2 AND :B1
              AND P.CODE = 'AA'
              AND P.COL5 IS NOT NULL)
WHERE EXISTS (SELECT COL5 FROM PRODUCT_TAB P
              WHERE P.TID = TMP.TID
                AND P.PART_DT = TMP.PART_DT
                AND P.PART_DT BETWEEN :B2 AND :B1
                AND P.CODE = 'AA'
                AND P.COL5 IS NOT NULL);
On Mon, May 3, 2021 at 7:41 PM William Robertson <william_at_williamrobertson.net> wrote:
> The ORA-01779 error means you would need a unique index or constraint on
> product_tab (part_dt, tid) to be able to use this syntax. (Strangely, I did
> not get the error in 21c. Maybe there is a new feature I am missing.)
>
> You may be able to combine the updates into a smaller number of merges
> even when different rules apply, using CASE expressions so that each column
> is only set to a new value when the relevant conditions are met, otherwise
> set it to itself.
>
> when matched then update
> set tmp.c1 = case when x = y then p.col1 else tmp.c1 end
>    , tmp.c2 = case when a = b then p.col2 else tmp.c2 end
> ;
>
> William
>
> On 3 May 2021, at 12:57, Lok P <loknath.73_at_gmail.com> wrote:
>
> Thank You Jonathan and Lothar. I was trying to modify one of the sample
> updates (as below) as per your suggested method.
>
> UPDATE GTT_TAB TMP
> SET (c1, c2, c3, c4, c5 .. c11) =
>     (SELECT col1, col2, .. col11
>      FROM PRODUCT_TAB P
>      WHERE P.TID = TMP.TID
>        AND TMP.CIND = 'Y'
>        AND P.PART_DT = TMP.PART_DT
>        AND P.CODE = 'XX')
>
> Jonathan, let me know if my understanding is correct on the suggested
> point:
> "a) See if you can minimise the number of update steps by updating from
> a JOIN of several source tables"
>
> If I get the above point correctly, I was trying to modify the UPDATE
> as below, but I am getting ORA-01779 while running it to see the plan. So
> does it mean that the GTT has to have a unique key present in it for
> this method to work?
>
> UPDATE (SELECT P.COL1 .. P.COL11, TMP.C1 .. TMP.C11
>         FROM PRODUCT_TAB P, GTT_TAB TMP
>         WHERE P.TID = TMP.TID
>           AND TMP.CIND = 'Y'
>           AND P.PART_DT = TMP.PART_DT
>           AND P.CODE = 'XX')
> SET C1 = COL1, ... C11 = COL11;
>
> ERROR at line 10:
> ORA-01779: cannot modify a column which maps to a non key-preserved table
>
> Regarding the point below,
> "b) Rewrite the code to step through a series of GTT doing
>     insert /*+ append */ into next_gtt select from previous_gtt join
>     {source tables}"
>
> Do you mean I have to replace the UPDATEs with INSERT /*+ APPEND */
> queries? But as that can't be done in the same GTT, do I have to create
> multiple GTTs, one per UPDATE, so as to replace them with INSERT APPEND
> queries?
>
> Lothar, as you suggested, I tried modifying the same query as the MERGE
> statement below. I was not able to see the plan on production because
> it's a DML, so I ran it on DEV; as the data volume was not the same, with
> some forced cardinality hints I do see a HASH JOIN operation, but no
> bloom filter right away. Maybe we will see it in prod. I also remember
> there was a bug on 11.2.0.4 which restricted cell smart scan for DML
> queries, and the workaround for that was to set _serial_direct_read=always.
> Not sure if we have to do that here for the MERGE statement to go for cell
> offloading. Also, is it worth modifying each of these 4-5 different
> UPDATEs into one MERGE statement each? I was thinking combining them into
> one is possible and will give benefit.
>
> MERGE INTO GTT_TAB TMP
> USING (SELECT col1, col2, .. col11, CODE, PART_DT, TID
>        FROM PRODUCT_TAB P
>        WHERE P.CODE = 'XX') P
> ON (P.TID = TMP.TID AND P.PART_DT = TMP.PART_DT)
> WHEN MATCHED THEN UPDATE
> SET TMP.C1 = P.COL1,
>     ...
>     TMP.C11 = P.COL11
> WHERE TMP.CIND = 'Y'
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 2005806797
>
> -------------------------------------------------------------------------------------------------------------------------------
> | Id  | Operation                     | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
> -------------------------------------------------------------------------------------------------------------------------------
> |   0 | MERGE STATEMENT               |             |     1 |  1339 |       |   256K  (1)| 00:51:23 |       |       |
> |   1 |  MERGE                        | GTT_TAB     |       |       |       |            |          |       |       |
> |   2 |   VIEW                        |             |       |       |       |            |          |       |       |
> |*  3 |    HASH JOIN                  |             |     1 |   639 |   732M|   256K  (1)| 00:51:23 |       |       |
> |   4 |     PARTITION RANGE ALL       |             |  8000K|   640M|       |  5894   (3)| 00:01:11 |     1 |   331 |
> |*  5 |      TABLE ACCESS STORAGE FULL| PRODUCT_TAB |  8000K|   640M|       |  5894   (3)| 00:01:11 |     1 |   331 |
> |   6 |     TABLE ACCESS STORAGE FULL | GTT_TAB     |  8000K|  4234M|       |   206   (0)| 00:00:03 |       |       |
> -------------------------------------------------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>    3 - access("TID"="TMP"."TID" AND "PART_DT"="TMP"."PART_DT")
>    5 - storage("P"."CODE"='XX')
>        filter("P"."CODE"='XX')
>
> Regards
> Pap
>
> On Mon, May 3, 2021 at 2:25 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>> If you're updating many columns of every row in a large (global
>> temporary) table using lots of separate statements that use the "set ..
>> subquery" strategy you are introducing 4 problems.
>>
>> a) Updating lots of columns means you will probably start doing lots of
>> row migration - and you can't set pctfree on GTTs to avoid this, it WILL
>> be 10%
>>
>> b) You will be generating a huge amount of undo
>>
>> c) If you aren't including a "where exists" predicate on the source table
>> that matches the predicate on the set subquery you will be updating rows
>> to null that don't need to be updated - the monitor report echoed by
>> Mladen showed 17M rows updated, but only 8M executions of the subquery -
>> which probably means 9M rows in the GTT updated to NULL.
>>
>> In your specific case there's (probably) a logic flaw in the update
>> statement. You have a "where tmp.cind='Y'" in the subquery. This means
>> (effectively) set the values to NULL when tmp.cind = 'Y', NOT "don't
>> update the rows where tmp.cind='Y'". If you want the update to be
>> selective then you need the predicate as part of the where clause on the
>> update.
>>
>> Finally -
>> If this is a table of roughly 17M rows (rather than the 8 billion of the
>> estimate) then the fastest strategy would probably be to
>> a) See if you can minimise the number of update steps by updating from a
>> JOIN of several source tables
>> b) Rewrite the code to step through a series of GTTs doing
>>    insert /*+ append */ into next_gtt select from previous_gtt join
>>    {source tables}
>>
>> Using insert /*+ append */ with GTTs will:
>> a) use bulk processing rather than the row-by-row processing for filter
>> subqueries
>> b) minimise BOTH undo and redo.
>>
>> Regards
>> Jonathan Lewis
>>
>> On Sun, 2 May 2021 at 19:53, Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Hello, We have a plsql procedure in which it updates multiple columns
>>> of the global temporary table after loading it in different steps. It's
>>> updated for different filter criteria and each of these UPDATEs is
>>> running for hours even if they are simple ones.
>>> I am trying to see if it's possible to do it in an efficient fashion by
>>> combining all the different updates into one UPDATE statement, or any
>>> other possible method. One initial thought was to do it as part of the
>>> INSERT itself rather than updating it through multiple statements at a
>>> later stage, but then I see the data load/INSERT is happening for
>>> multiple if/else conditions, which may not be straightforward.
>>>
>>> Attached is the sql monitor for a few of the long running UPDATEs. I am
>>> seeing that the time is not really spent updating rows but rather in
>>> finding the eligible rows for update, i.e. the SELECT part is actually
>>> taking the time. So I wanted to understand how these UPDATEs can be done
>>> faster.
>>>
>>> It's version 11.2.0.4 of Oracle, Exadata X5.
>>>
>>> Regards
>>> Lok
--
http://www.freelists.org/webpage/oracle-l

Received on Tue May 04 2021 - 09:13:04 CEST