Re: Increased runtime and 4 xids for one insert
Date: Wed, 10 Apr 2024 15:24:06 +0000
Message-ID: <VI1PR02MB647767128EE0A4985D7E5348A1062_at_VI1PR02MB6477.eurprd02.prod.outlook.com>
select xid, sql_exec_id, sql_exec_start, coalesce(qc_session_id, session_id)
, coalesce(qc_session_serial#, session serial#)
, count(distinct instance_number), count(DISTINCT session_id),count(*)
, to_char(min(sample_time),'dd.mm hh24:mi:ss') mint
, to_char(max(sample_time),'dd.mm hh24:mi:ss') maxt
from dba_hist_active_sess_history
where sql_id='fdq9c3q9jfavc'
and sql_exec_id is not null and sql_exec_start is not null
... etc...
omitted DISTINCT from count of session ids. you said it's non-parallel, non rac, but best to show this in the data.
From: Dominic Brooks <dombrooks_at_hotmail.com> Sent: 10 April 2024 16:11
To: Oracle List <oracle-l_at_freelists.org>; novak_petr_at_yahoo.de <novak_petr_at_yahoo.de> Subject: Re: Increased runtime and 4 xids for one insert
The formatting of the ASH data is hard to digest.
A couple of points of order on ASH queries here -
- I would always include sql_exec_id as
sql_id + sql_exec_id + sql_exec_start should be unique to an instance.
1.
if it's a parallel insert then you should be able to tell from QC_SESSION_ID
2.
I nearly always end up deliberately omitting rows where sql_id and sql_exec_id are null unless I'm looking for a parsing issue.
So my initial aggregation query would be something like this:
select xid, sql_exec_id, sql_exec_start, coalesce(qc_session_id, session_id)
, coalesce(qc_session_serial#, session serial#)
, count(distinct instance_number), count(session_id),count(*)
, to_char(min(sample_time),'dd.mm hh24:mi:ss') mint
, to_char(max(sample_time),'dd.mm hh24:mi:ss') maxt
from dba_hist_active_sess_history
where sql_id='fdq9c3q9jfavc'
and sql_exec_id is not null and sql_exec_start is not null
...
which also tells me whether it's parallel or not and if parallel how many sessions were sampled participating.
if it's a parallel execution then DBMS_XPLAN should be able to tell you why it's parallel if the code is not explicitly requesting parallel.
If it is parallel then perhaps you want the insert to be parallelised as well?
At least part of the answers might be visible in your own (unaggregated) ASH data. If there was blocking, etc and anything which would account for timing discrepancies etc.
11 million rows taking 3.5 hours sound way too long in the first place.
An ever increasing execution time indicates an ever increasing amount of work.
Either a symptom/consequence of what was done before (or not done or failed) Leaving behind consequences for the next.
UNDO can be a strong indicator here of either something like delayed block cleanout or concurrent executions of this (or something else) against the same objects.
However please also note that ASH data columns can be cleared out lazily. So it's highly likely that CURRENT_FILE# is only set/reset when you're waiting on an IO event. So if you join across all ASH data but the event is not an IO event, you might well end up with a large mispresentation of what was going on.
Other things - perhaps some suboptimal implementation elsewhere - e.g. an ETL process which just does a DELETE from T_SOURCE rather than a TRUNCATE leaving more and more empty blocks to be scanned the next time.
What ASH can often not be able to tell you is WHY something is being done. Some samples of session stats can be useful here - particularly in the case of why something is reading UNDO significantly, if that's what was happening.
Try picking an execution out of sql monitoring (preferably in memory from gv$sql_monitor or else from dba_hist_reports) . At least you should get a holistic summary of where the time went.
Hope this helps.
Cheers,
Dominic
From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on behalf of Petr Novak <dmarc-noreply_at_freelists.org> Sent: 10 April 2024 13:34
To: Oracle List <oracle-l_at_freelists.org> Subject: Increased runtime and 4 xids for one insert
Hallo,
there is problem with increasing execution time of simple Insert statement. (insert into T_TARGET select * from T_SOURCE) Execution time increased overproportionally to increased number of rows. 11 millions rows took 3,5 hours , 38 million rows 3 days. T_TARGET is 1,5 TB range partitioned table , 1469 partitions and 5 local indexes (same partitioned key as the table).
for shorter execution time there was one XID, but for long execution time 4 XIDs ??
select xid, SQL_EXEC_START,SESSION_ID,SESSION_SERIAL#,count(*) ,to_char(min(sample_time),'dd.mm hh24:mi:ss') mint, to_char(max(sample_time),'dd.mm hh24:mi:ss') maxt from dba_hist_active_sess_history where sql_id='fdq9c3q9jfavc' and sample_time>to_date('2404052000','yymmddhh24mi') and sample_time< to_date('2404082100','yymmddhh24mi') group by xid,SQL_EXEC_START,SESSION_ID,SESSION_SERIAL# order by SQL_EXEC_START,min(sample_time);
XID SQL_EXEC_START SESSION_ID SESSION_SERIAL# COUNT(*) Min(IntervalEnd) Max(IntervalEnd)
---------------- ------------------- -------------- --------------- -------------- -------------------- -------------------- 05.04.2024 20:13:40 1141 11657 8 05.04 20:13:44 05.04 20:14:15 2E00180051D62200 05.04.2024 20:13:40 1141 11657 12532 05.04 20:14:25 06.04 14:04:03 05.04.2024 20:13:40 1064 35873 12 05.04 20:39:31 06.04 01:42:10 05.04.2024 20:13:40 1016 54350 8 05.04 20:49:35 05.04 22:16:08 05.04.2024 20:13:40 1103 5986 16 05.04 21:00:10 06.04 02:49:36 05.04.2024 20:13:40 986 61332 6 05.04 21:34:39 06.04 04:26:03 20001C0008633B00 06.04.2024 14:04:10 1141 11657 24486 06.04 14:04:13 08.04 00:55:20 06.04.2024 14:04:10 1057 63794 6 06.04 14:08:29 07.04 01:10:29 06.04.2024 14:04:10 1026 41466 8 06.04 14:21:17 06.04 23:58:37 06.04.2024 14:04:10 1103 51053 16 06.04 14:36:49 06.04 22:38:23 15001100CD95C100 08.04.2024 00:55:22 1141 11657 8986 08.04 00:55:31 08.04 13:42:29 08.04.2024 00:55:22 1026 1396 8 08.04 00:59:06 08.04 03:59:52 08.04.2024 00:55:22 1104 28322 8 08.04 01:05:14 08.04 02:51:25 08.04.2024 00:55:22 1005 29042 8 08.04 01:21:07 08.04 03:53:02 08.04.2024 00:55:22 1075 54721 4 08.04 02:05:30 08.04 02:20:00 03001200F4F1B600 08.04.2024 13:42:30 1141 11657 4178 08.04 13:42:39 08.04 19:39:05 08.04.2024 13:42:30 325 35023 10 08.04 13:48:48 08.04 16:16:38 08.04.2024 13:42:30 296 7915 2 08.04 14:30:27 08.04 14:30:27 08.04.2024 13:42:30 369 46540 4 08.04 14:55:12 08.04 16:05:22 08.04.2024 13:42:30 395 14967 8 08.04 15:38:13 08.04 18:20:44 03001200F4F1B600 1141 11657 16 08.04 19:39:15 08.04 19:40:27
In dba_hist_undostat are same 4 execution start times for this query (maxquerylen,maxquerysqlid). During the execution I checked the transaction_start and last_call_et in the session 1141 and it loooked like TX started hours later since last_call_et. TX start was one of these SQL_EXEC_STARTs In dba_hist_sqlstat only one execution is shown, most time was spend with CPU.
most ASH samples are related to UNDO TS.
select tablespace_name,count(*)
from dba_hist_active_sess_history ,dba_data_files where sql_id='fdq9c3q9jfavc'
and CURRENT_FILE#=file_id
and sample_time>to_date('2404052000','yymmddhh24mi') and sample_time< to_date('2404082100','yymmddhh24mi')
group by tablespace_name order by count(*) desc;
TABLESPACE_NAME COUNT(*) ------------------------------ -------------- UNDOTBS1 32746 RB_STAGING_TAB 17576
There are also some intervals with increased commit times (2->100ms) , but for me it looks like another issue.
I have two main questions :
- how it is possible to have for one execution 4 XIDs ?
- where the statements spends the time ?
Any ideas ?
Best Regards,
Petr
-- http://www.freelists.org/webpage/oracle-lReceived on Wed Apr 10 2024 - 17:24:06 CEST