Re: Tuning Update query
Date: Mon, 3 May 2021 16:16:08 +0100
Message-ID: <CAGtsp8kebJFT39DjVrV2H3MxxqtE8QNxsn893mnA66KG1Q5doQ_at_mail.gmail.com>
William,
I haven't got around to setting up an always-free 21c instance online yet, but if you have a little time it would be interesting to know whether the update at the end of https://jonathanlewis.wordpress.com/2012/10/26/running-total/ (which joins a table to an analytic select from itself) now works, or whether some other restriction appears.
Regards
Jonathan Lewis
On Mon, 3 May 2021 at 15:11, William Robertson <william_at_williamrobertson.net> wrote:
> The ORA-01779 error means you would need a unique index or constraint on
> product_tab (part_dt, tid) to be able to use this syntax. (Strangely, I did
> not get the error in 21c. Maybe there is a new feature I am missing.)
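> Presumably a unique constraint on the join columns of the source table
> (a sketch; the constraint name is invented, and it assumes (part_dt, tid)
> really is unique in product_tab) would make GTT_TAB key-preserved in the
> join view and allow the updatable-join syntax:
>
> ```sql
> alter table product_tab
>   add constraint product_tab_uk unique (part_dt, tid);
> ```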
>
> You may be able to combine the updates into a smaller number of merges
> even when different rules apply, using CASE expressions so that each column
> is only set to a new value when the relevant conditions are met, otherwise
> set it to itself.
>
> when matched then update
> set tmp.c1 = case when x = y then p.col1 else tmp.c1 end
>   , tmp.c2 = case when a = b then p.col2 else tmp.c2 end
> ;
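>
> A fuller sketch of what I mean, using the tables from your example (the
> tmp.dind column and the 'XX' filter are made up for illustration):
>
> ```sql
> merge into gtt_tab tmp
> using (select tid, part_dt, col1, col2
>        from   product_tab
>        where  code = 'XX') p
> on (p.tid = tmp.tid and p.part_dt = tmp.part_dt)
> when matched then update
> set tmp.c1 = case when tmp.cind = 'Y' then p.col1 else tmp.c1 end
>   , tmp.c2 = case when tmp.dind = 'Y' then p.col2 else tmp.c2 end;
> ```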
>
>
> William
>
> On 3 May 2021, at 12:57, Lok P <loknath.73_at_gmail.com> wrote:
>
> Thank you Jonathan and Lothar. I was trying to modify one of the sample
> updates (as below) as per your suggested method.
>
> UPDATE GTT_TAB TMP
> SET (c1, c2, c3, c4, c5.. c11) =
>     (SELECT col1, col2, ..col11
>      FROM   PRODUCT_TAB P
>      WHERE  P.TID = TMP.TID
>      AND    TMP.CIND = 'Y'
>      AND    P.PART_DT = TMP.PART_DT
>      AND    P.CODE = 'XX')
>
> Jonathan, let me know if my understanding is correct on the suggested
> points
> " a) See if you can minimise the number of update ste[ps by updating from
> a JOIN of several source tables"
>
> So if I understand the above point correctly, I tried modifying the UPDATE
> as below, but I am getting ORA-01779 when running it to check the plan.
> Does this mean that the GTT has to have a unique key present in it for
> this method to work?
>
> UPDATE (SELECT P.COL1...P.col11,TMP.C1.. TMP.C11
> FROM PRODUCT_TAB P,GTT_TAB TMP
> WHERE P.TID = TMP.TID
> AND TMP.CIND = 'Y'
> AND P.PART_DT = TMP.PART_DT
> AND P.CODE = 'XX'
> ) SET C1=COL1, ... C11=COL11;
>
> ERROR at line 10:
>
> ORA-01779: cannot modify a column which maps to a non key-preserved table
>
>
> Regarding the point below,
> "b) Rewrite the code to step through a series of GTT doing
> insert /*+ append */ into next_gtt select from previous_gtt join
> {source tables}"
>
> Do you mean I have to replace the UPDATEs with INSERT /*+ append */
> queries? As that can't be done in the same GTT, would I have to create
> multiple GTTs, one for each UPDATE, so as to replace them with INSERT
> APPEND queries?
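>
> If I have understood it correctly, it would be something like this (a
> sketch reduced to a few columns; gtt_tab2 and its column list are
> invented, and the outer join keeps the rows that have no match in
> product_tab):
>
> ```sql
> insert /*+ append */ into gtt_tab2 (tid, part_dt, cind, c1, c2)
> select tmp.tid, tmp.part_dt, tmp.cind,
>        case when tmp.cind = 'Y' then p.col1 else tmp.c1 end,
>        case when tmp.cind = 'Y' then p.col2 else tmp.c2 end
> from   gtt_tab tmp
> left join product_tab p
>        on  p.tid     = tmp.tid
>        and p.part_dt = tmp.part_dt
>        and p.code    = 'XX';
> ```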
>
>
> Lothar, as you suggested, I tried modifying the same query as the MERGE
> statement below. I was not able to see the plan on production because it's
> a DML, so I ran it on DEV; the data volume there is not the same, but with
> some forced cardinality hints I do see a HASH JOIN operation, though no
> bloom filter right away. Maybe we will see it in prod. I also remember
> there was a bug on 11.2.0.4 which restricted cell smart scan for DML
> queries, and the workaround was to set _serial_direct_read=always. I am
> not sure whether we have to do that here for the MERGE statement to get
> cell offloading. Also, is it worth modifying each of these 4-5 different
> UPDATEs into a MERGE statement each? I was wondering whether combining
> them into one is possible and would give a benefit.
>
> MERGE INTO GTT_TAB TMP
> USING (SELECT col1, col2, ..col11, CODE, PART_DT, TID
>        FROM PRODUCT_TAB P WHERE P.CODE = 'XX') P
> ON (P.TID = TMP.TID AND P.PART_DT = TMP.PART_DT)
> WHEN MATCHED THEN UPDATE
> SET TMP.C1 = P.COL1,
>     ...
>     TMP.C11 = P.col11
> WHERE TMP.CIND = 'Y'
>
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 2005806797
>
>
>
> -------------------------------------------------------------------------------------------------------------------------------
> | Id  | Operation                     | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
> -------------------------------------------------------------------------------------------------------------------------------
> |   0 | MERGE STATEMENT               |             |     1 |  1339 |       |   256K  (1)| 00:51:23 |       |       |
> |   1 |  MERGE                        | GTT_TAB     |       |       |       |            |          |       |       |
> |   2 |   VIEW                        |             |       |       |       |            |          |       |       |
> |*  3 |    HASH JOIN                  |             |     1 |   639 |   732M|   256K  (1)| 00:51:23 |       |       |
> |   4 |     PARTITION RANGE ALL       |             |  8000K|   640M|       |  5894   (3)| 00:01:11 |     1 |   331 |
> |*  5 |      TABLE ACCESS STORAGE FULL| PRODUCT_TAB |  8000K|   640M|       |  5894   (3)| 00:01:11 |     1 |   331 |
> |   6 |     TABLE ACCESS STORAGE FULL | GTT_TAB     |  8000K|  4234M|       |   206   (0)| 00:00:03 |       |       |
> -------------------------------------------------------------------------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
> 3 - access("TID"="TMP"."TID" AND "PART_DT"="TMP"."PART_DT")
> 5 - storage("P"."CODE"='XX')
> filter("P"."CODE"='XX')
>
>
> Regards
> Pap
>
> On Mon, May 3, 2021 at 2:25 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>> If you're updating many columns of every row in a large (global
>> temporary) table using lots of separate statements that use the "set ..
>> subquery" strategy you are introducing 4 problems.
>>
>> a) Updating lots of columns means you will probably start doing lots of
>> row migration - and you can't set pctfree on GTTs to avoid this, it WILL be
>> 10%
>>
>> b) You will be generating a huge amount of undo
>>
>> c) If you aren't including a "where exists" predicate on the source table
>> that matches the predicate on the set subquery you will be updating rows to
>> null that don't need to be updated - the monitor report echoed by Mladen
>> showed 17M rows updated, but only 8M executions of the subquery - which
>> probably means 9M rows in the GTT updated to NULL.
>>
>> In your specific case there's (probably) a logic flaw in the update
>> statement. You have a "where tmp.cind = 'Y'" predicate inside the
>> subquery. This means (effectively) "update every row, setting the values
>> to NULL where tmp.cind is not 'Y'"; it does NOT mean "only update the
>> rows where tmp.cind = 'Y'". If you want the update to be selective then
>> you need the predicate as part of the where clause of the update itself.
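>>
>> As a sketch (reduced to two columns) of what I mean:
>>
>> ```sql
>> update gtt_tab tmp
>> set   (c1, c2) = (select p.col1, p.col2
>>                   from   product_tab p
>>                   where  p.tid     = tmp.tid
>>                   and    p.part_dt = tmp.part_dt
>>                   and    p.code    = 'XX')
>> where tmp.cind = 'Y'
>> and   exists (select null
>>               from   product_tab p
>>               where  p.tid     = tmp.tid
>>               and    p.part_dt = tmp.part_dt
>>               and    p.code    = 'XX');
>> ```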
>>
>> Finally -
>> If this is a table of roughly 17M rows (rather than the 8billion of the
>> estimate) then the fastest strategy would probably be to
>> a) See if you can minimise the number of update steps by updating from a
>> JOIN of several source tables
>> b) Rewrite the code to step through a series of GTT doing
>> insert /*+ append */ into next_gtt select from previous_gtt join
>> {source tables}
>>
>> Using insert /*+ append */ with GTTs will:
>> a) use bulk processing rather than the row by row processing for filter
>> subqueries
>> b) minimise BOTH undo and redo.
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>>
>>
>> On Sun, 2 May 2021 at 19:53, Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Hello. We have a PL/SQL procedure that updates multiple columns of a
>>> global temporary table after loading it in different steps. The table is
>>> updated for different filter criteria, and each of these UPDATEs runs
>>> for hours even though they are simple statements. I am trying to see
>>> whether it is possible to do this more efficiently by combining all the
>>> different updates into one UPDATE statement, or by some other method.
>>> One initial thought was to do it as part of the INSERT itself rather
>>> than updating through multiple statements at a later stage, but the data
>>> load/INSERT happens under multiple if/else conditions, which may not be
>>> straightforward.
>>>
>>> Attached is the SQL monitor report for a few of the long-running
>>> UPDATEs. I can see that the time is not really spent updating rows but
>>> rather in finding the eligible rows, i.e. the SELECT part is actually
>>> taking the time. So I wanted to understand how these UPDATEs can be made
>>> faster.
>>>
>>>
>>>
>>> It's version 11.2.0.4 of Oracle, on Exadata X5.
>>>
>>> Regards
>>>
>>> Lok
>>>
>>
>
-- http://www.freelists.org/webpage/oracle-l
Received on Mon May 03 2021 - 17:16:08 CEST