Re: Same query with no plan change or volume but runs for hours vs minutes

From: Chris Taylor <christopherdtaylor1994_at_gmail.com>
Date: Tue, 5 Dec 2023 08:33:05 -0600
Message-ID: <CAP79kiSMdVXGk3u632oNzMnrrfek5xJKOmOWCj7FgX2cOBkK1Q_at_mail.gmail.com>



Any chance this is hospital/patient data? Because I've worked (and ran into) something similar in the past.

Our solution was to change the insert target into a temp table, gather stats on the temp table, and do a partition exchange into the main table. (Rebuild any global index partitions if they exist)

This was tremendously fast as it removes the delete and all the undo for that. And doesn't touch the main table until the partition exchange.

This doesn't explain the "why" of course, but in the long run it should be more maintainable and more stable/performant.

Thanks,
Chris

On Tue, Dec 5, 2023 at 2:41 AM Lok P <loknath.73_at_gmail.com> wrote:

> The sql monitor for the complete execution with DONE status of the bad run
> is as below.
>
> https://gist.github.com/oraclelearner/6720861b373814614fe3b0669aee6901
>
> One thing I was wrong about and just found that the same set of rows
> (almost ~30million) gets inserted as part of another process a day before
> into this same partition. And then as part of the current day's run , first
> there runs a delete query which deletes almost all the rows(~30million)
> from that partition and then this insert query runs and inserts all of the
> rows again with few additional rows (~32million). But again this pattern
> happens daily ,so unsure of the reason why it causes slowness on certain
> days only. Yet to investigate , if it's the correct business logic or we
> can avoid the delete somehow.
>
> And also as "Jonathan" pointed on the "gc current grant busy" stats, i do
> see stats gather was kicking on the same object from another nopde while
> this INSERT was executing and that stats gather finishes in ~10-15minutes.
> But hopefully that won't cause such an anomaly in flagging "L1 bitmaps" for
> the index blocks.
>
> Another thing i observed is, the value of the object_id and
> data_object_ids of table partition are all same but the for the PK index
> i.e dba_ind_partition is pointing to different object_id and
> data_object_ids in dba_objects. So I was thinking of any such structural
> change occurring on the index partition causing the insert to run longer.
> But I see similar differences between object_id and data_object_ids for all
> the index partitions, so that is not adding up to the theory. and also i
> didn't find any job or related query in the dba_hist_sqltext, which must be
> doing index rebuild/shrink/coalesce etc. Also did not find any such INSERT
> which must be happening at same time on the same object.
>
> On Thu, Nov 30, 2023 at 9:24 PM Timur Akhmadeev <timur.akhmadeev_at_gmail.com>
> wrote:
>
>> Hi,
>>
>> Since it is reproducible I suggest you take a few stacks of the database
>> process with oradebug short_stack and then check them at
>> orafun.info/stack
>> If you can't do short stacks then can you post SQL Monitor of the bad run
>> after it is fully completed and its status is DONE.
>>
>>
>>
>> On Mon, Nov 27, 2023 at 11:15 PM Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> The insert query normally runs for <15minutes somedays runs for ~6-7hrs
>>> with exactly the same plan and approx same volume. And all the resources
>>> seem to be spent on step "LOAD TABLE CONVENTIONAL" and its "ON CPU".
>>> Wondering where it's spending time during the long execution?
>>>
>>> While the long run was happening , we tried to fetch the stats from
>>> v$sesstat for a ~5minutes duration and collected the non zero stats. Not
>>> able to find out anything odd. Could you please guide me here, what must be
>>> the cause of this sudden long execution on some days for this same query
>>> with almost the same volume?
>>>
>>> I have posted the sql monitors from both the runs below and also in the
>>> below github linc. And also published the nonzero output from the v$sesstat
>>> for the long execution for a ~5 minutes delta duration.We do see some stats
>>> like "KTFB alloc**" , is that normal or some issue is causing these
>>> stats to pop up?
>>>
>>> https://gist.github.com/oraclelearner/722a77ef5ebce23396d82e2835958303
>>>
>>> ******** Sql monitor from long run ********
>>> Global Information
>>> ------------------------------
>>> Status : EXECUTING
>>>
>>> Instance ID : 1
>>>
>>> SQL Execution ID : 16777216
>>>
>>> Execution Started : 11/27/2023 03:30:16
>>>
>>> First Refresh Time : 11/27/2023 03:30:19
>>>
>>> Last Refresh Time : 11/27/2023 10:06:13
>>>
>>> Duration : 23759s
>>>
>>> PLSQL Entry Ids (Object/Subprogram) : 161298,9
>>>
>>> PLSQL Current Ids (Object/Subprogram) : 161298,15
>>>
>>>
>>> Global Stats
>>>
>>> ===========================================================================================================================================================================
>>> | Elapsed | Cpu | IO | Application | Concurrency | Cluster |
>>> PL/SQL | Other | Buffer | Read | Read | Uncompressed | Offload |
>>> Offload | Cell |
>>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Waits(s) |
>>> Time(s) | Waits(s) | Gets | Reqs | Bytes | Bytes | Elig Bytes |
>>> Returned Bytes | Offload |
>>>
>>> ===========================================================================================================================================================================
>>> | 23758 | 23570 | 4.88 | 1.28 | 4.78 | 58 |
>>> 41 | 119 | 18G | 1M | 719GB | 359GB | 715GB |
>>> 4GB | 99.42% |
>>>
>>> ===========================================================================================================================================================================
>>>
>>> SQL Plan Monitoring Details (Plan Hash Value=557779869)
>>>
>>> ==================================================================================================================================================================================================================================
>>> | Id | Operation | Name
>>> | Rows | Cost | Time | Start | Execs | Rows | Read | Read |
>>> Mem | Activity | Activity Detail |
>>> Progress |
>>> | | |
>>> | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes |
>>> | (%) | (# samples) |
>>> |
>>>
>>> ==================================================================================================================================================================================================================================
>>> | -> 0 | INSERT STATEMENT |
>>> | | | 23761 | +3 | 1 | 0 | | |
>>> . | 0.16 | Cpu (35) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | cell smart table scan (2) |
>>> |
>>> | -> 1 | LOAD TABLE CONVENTIONAL | PBDRA_DTL
>>> | | | 23762 | +2 | 1 | 0 | 499K | 4GB |
>>> . | 98.42 | gc current block 2-way (4) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | gc current grant 2-way (23) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | gc current grant busy (18) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | latch: cache buffers chains (5) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | log file switch completion (1) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | undo segment extension (2) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | Cpu (22695) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | cell single block physical read: RDMA (8) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | cell single block physical read: xrmem cache (34) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | enq: FB - contention (1) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | ges message buffer allocation (2) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | cell single block physical read: flash cache (3) |
>>> |
>>> | -> 2 | SEQUENCE | PBDRA_SEQ
>>> | | | 23761 | +3 | 1 | 18M | | |
>>> . | 0.50 | Cpu (109) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | reliable message (6) |
>>> |
>>> | -> 3 | HASH JOIN |
>>> | 68259 | 1M | 23761 | +3 | 1 | 18M | | |
>>> 5MB | 0.02 | Cpu (5) |
>>> |
>>> | 4 | TABLE ACCESS STORAGE FULL | PRFS
>>> | 78 | 2 | 1 | +3 | 1 | 63 | 1 | 40960 |
>>> . | | |
>>> |
>>> | -> 5 | PARTITION RANGE ITERATOR |
>>> | 152K | 1M | 23761 | +3 | 1 | 18M | | |
>>> . | | |
>>> |
>>> | -> 6 | TABLE ACCESS STORAGE FULL | PBABR_DTL
>>> | 152K | 1M | 23761 | +3 | 215 | 18M | 740K | 715GB |
>>> 7MB | 0.33 | Cpu (73) |
>>> 100% |
>>> | | |
>>> | | | | | | | | |
>>> | | gcs drm freeze in enter server mode (1) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | reliable message (1) |
>>> |
>>> | | |
>>> | | | | | | | | |
>>> | | cell smart table scan (1) |
>>> |
>>>
>>> ==================================================================================================================================================================================================================================
>>>
>>>
>>> ********* Quick Run ****************
>>>
>>> Global Information
>>> ------------------------------
>>> Status : DONE
>>>
>>> Instance ID : 1
>>>
>>> SQL Execution ID : 16777216
>>>
>>> Execution Started : 11/24/2023 04:39:04
>>>
>>> First Refresh Time : 11/24/2023 04:39:07
>>>
>>> Last Refresh Time : 11/24/2023 04:55:37
>>>
>>> Duration : 993s
>>>
>>> PLSQL Entry Ids (Object/Subprogram) : 161298,9
>>>
>>> PLSQL Current Ids (Object/Subprogram) : 161298,15
>>>
>>>
>>> Global Stats
>>>
>>> ===========================================================================================================================================================================
>>> | Elapsed | Cpu | IO | Application | Concurrency | Cluster |
>>> PL/SQL | Other | Buffer | Read | Read | Uncompressed | Offload |
>>> Offload | Cell |
>>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Waits(s) |
>>> Time(s) | Waits(s) | Gets | Reqs | Bytes | Bytes | Elig Bytes |
>>> Returned Bytes | Offload |
>>>
>>> ===========================================================================================================================================================================
>>> | 993 | 942 | 5.27 | 0.05 | 0.01 | 0.02 |
>>> 68 | 46 | 316M | 2M | 867GB | 443GB | 861GB |
>>> 7GB | 99.21% |
>>>
>>> ===========================================================================================================================================================================
>>>
>>> SQL Plan Monitoring Details (Plan Hash Value=557779869)
>>>
>>> =====================================================================================================================================================================================
>>> | Id | Operation | Name |
>>> Rows | Cost | Time | Start | Execs | Rows | Read | Read |
>>> Mem | Activity | Activity Detail |
>>> | | | |
>>> (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes |
>>> (Max) | (%) | (# samples) |
>>>
>>> =====================================================================================================================================================================================
>>> | 0 | INSERT STATEMENT | |
>>> | | 991 | +3 | 1 | 0 | | |
>>> . | | |
>>> | 1 | LOAD TABLE CONVENTIONAL | PBDRA_DTL |
>>> | | 991 | +3 | 1 | 0 | 874K | 7GB |
>>> . | | |
>>> | 2 | SEQUENCE | PBDRA_SEQ |
>>> | | 992 | +2 | 1 | 30M | | |
>>> . | | |
>>> | 3 | HASH JOIN | |
>>> 66851 | 1M | 991 | +3 | 1 | 30M | | |
>>> 5MB | | |
>>> | 4 | TABLE ACCESS STORAGE FULL | PRFS |
>>> 78 | 2 | 1 | +3 | 1 | 63 | 2 | 49152 |
>>> 1MB | | |
>>> | 5 | PARTITION RANGE ITERATOR | |
>>> 149K | 1M | 991 | +3 | 1 | 32M | | |
>>> . | | |
>>> | 6 | TABLE ACCESS STORAGE FULL | PBABR_DTL |
>>> 149K | 1M | 991 | +3 | 286 | 32M | 892K | 861GB |
>>> 7MB | | |
>>>
>>> =====================================================================================================================================================================================
>>>
>>
>>
>> --
>> Regards
>> Timur Akhmadeev
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Dec 05 2023 - 15:33:05 CET

Original text of this message