Re: How to debug resource busy error
Date: Tue, 3 May 2022 00:59:09 +0530
Message-ID: <CAEjw_fhvaKdes6KXVjfU7yKTtikrYc9Jg=8QCzq=E4KbaZufGg_at_mail.gmail.com>
Thank You.
_at_Laurentiu, I have executed the dashtop query with the suggested inputs and below is the output from that. But I can't see any query in the output which can take such a lock on the base table and can cause this error.
_at_Stefan, Considering future course of action, i was planning to set the ddl_lock_timeout in the procedure code through the alter session command. However as you mentioned that culprit session can well be captured using trace. So i was wondering , if any of the below tracing is enough to catch the culprit session/sql details or we should specifically go for "systemstate" dump i.e the one as you suggested? I never generated or read the systemstate dump, so trying to understand if other simple traces as below will also help catching the culprit sessions/sqls etc?
alter system set events "00054 trace name systemstate level 266, lifetime 1";
Get the trace....
ALTER system SET EVENTS '00054 TRACE NAME CONTEXT OFF';
****OR
ALTER system SET EVENTS '00054 trace name errorstack level 3';
Get the trace ...
ALTER system SET EVENTS '00054 TRACE NAME CONTEXT OFF';
****OR tracing the exact session
begin
dbms_monitor.session_trace_enable( session_id => &m_sid, serial_num => &m_serial,
waits => true, bind => true, plan_stat => 'all_executions' );
end;
/
- allow the Ora-00054 failure to happen for the session
begin
dbms_monitor.session_trace_disable(session_id => &m_sid,serial_num => &m_serial);
end;
/
[image: User: "image.png"]
On Mon, May 2, 2022 at 6:39 PM Laurentiu Oprea <laurentiu.oprea06_at_gmail.com> wrote:
> Hello Pap,
>
> I do see a probable cause for your error so before doing a guess
> completely on feeling, lets get some data :
>
> check the data of: _at_dashtop
> sql_opname,sql_id,sql_exec_start,in_parse session_type='FOREGROUND'
> starttime endtime
> https://github.com/tanelpoder/tpt-oracle/blob/master/ash/dashtop.sql
> BTW: endtime should be the moment you receive the error, starttime can be
> 15 minutes before
>
> At least I`ll make a guess based on feeling + some data :)
>
> În lun., 2 mai 2022 la 14:16, Pap <oracle.developer35_at_gmail.com> a scris:
> >> Thank you so much Laurentiu and Dominic. >> >> I tried checking the dash_wait_chains script for 15minutes window , just >> by passing first parameter as session_type='FOREGROUND' and but not seeing >> much blocking waits as in below output. And also as i don't have the sql_id >> for the exact ALTER statement captured as it must be very fast and thus not >> getting captured in ASH/AWR, so was not able to pass the second parameter >> i.e sql_id filter. >> >> To Dominic's point , i think this ALTER sql_id wont be captured as its >> very fast and failing immediately when its seeing the lock from other >> session. And also we are struggling to catch the culprit session too. So as >> suggested, will try to set the ddl_lock_timeout to ~300 i.e 5 minutes so as >> to capture enough blocking session against the ALTER statement logged in >> the ASH/AWR to get the details around the culprit session. And considering >> the data-load itself take ~1 hrs+ so this amount of additional wait time(5 >> mins) should be fine. That should possibly also fix the issue if the >> blocking lock/culprit session is just holding it for very small time and >> releasing it immediately after. >> >> >> [image: User: "image.png"] >> >> >> >> >> On Mon, 2 May 2022, 12:03 pm Dominic Brooks, <dombrooks_at_hotmail.com> >> wrote: >> >>> Problem is that the error is likely to be immediate. >>> >>> If you increase/set the param ddl_lock_timeout in your load code before >>> you do enable constraints then >>> a) you might wait sufficient time not to fail, ie your blocker finishes >>> before the timeout end and/or >>> b) you should capture some information in ASH about the blocking >>> scenario. >>> >>> Cheers, >>> Dominic >>> >>> Sent from my iPhone >>> >>> On 2 May 2022, at 06:55, Laurentiu Oprea <laurentiu.oprea06_at_gmail.com> >>> wrote: >>> >>> >>> Hello Pap, >>> >>> To diagnose your issue ASH should do the trick. >>> >>> Here is an example(script dash_wait_chains) : >>> >>> https://tanelpoder.com/2013/11/06/diagnosing-buffer-busy-waits-with-the-ash_wait_chains-sql-script-v0-2/ >>> <https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftanelpoder.com%2F2013%2F11%2F06%2Fdiagnosing-buffer-busy-waits-with-the-ash_wait_chains-sql-script-v0-2%2F&data=05%7C01%7C%7C97ae9142e13241b6ef5308da2c005496%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637870677178768226%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=RXOJ6ean3%2FHVZiEqVguXpBwUZWt7sQi4OjPAPTmACX8%3D&reserved=0> >>> >>> Thank you. >>> >>> În lun., 2 mai 2022 la 05:58, Pap <oracle.developer35_at_gmail.com> a >>> scris: >>> >>>> Hi, We are facing Ora-0054 error on specific time of the month for a >>>> job and from the line number noted in the error log its pointing to the >>>> alter statement which enables reference constraints after a direct path >>>> data load and commit. But we tried monitoring it a few times during >>>> the error period/run time and surprisingly we are not seeing any lock(from >>>> v$locked_object) on the base table(say e.g. MAIN_TAB in below example) but >>>> still the job failed and it succeeded after a couple of rerun attempts. >>>> But it seems like , some process runs and takes lock(for may be very >>>> small time) and makes this 'ALTER' failure and we were unable to find >>>> that out. >>>> >>>> So I wanted to understand , how to get the culprit session/sql/job so >>>> that we can have some dependency set so that this failure won't occur? >>>> Is there any possible trace to set which will help us to debug this >>>> scenario and get the details around the culprit session? Or somehow we >>>> can track it from any of the ASH/AWR views for which we have enough >>>> retention in place? >>>> >>>> *****The error log pointing to one of the below statements >>>> failure inside the procedure ******* >>>> >>>> ALTER TABLE SCHEMA1.MAIN_TAB ENABLE NOVALIDATE CONSTRAINT MAIN_TAB_R01; >>>> >>>> ALTER TABLE SCHEMA1.MAIN_TAB ENABLE NOVALIDATE CONSTRAINT MAIN_TAB_R03; >>>> >>>> ORA-00054: resource busy and acquire with NOWAIT specified or timeout >>>> expired >>>> >>>> ********Table structure*************** >>>> >>>> CREATE TABLE SCHEMA1.MAIN_TAB >>>> >>>> ( FFKEY NUMBER CONSTRAINT MAIN_TAB_C01 NOT NULL, >>>> >>>> SCKEY NUMBER CONSTRAINT MAIN_TAB_C02 NOT NULL, >>>> >>>> ACKEY NUMBER, >>>> >>>> CKEY NUMBER CONSTRAINT MAIN_TAB_C04 NOT NULL, >>>> >>>> CSTKEY NUMBER CONSTRAINT MAIN_TAB_C05 NOT NULL, >>>> >>>> EPC_KEY NUMBER CONSTRAINT MAIN_TAB_C06 NOT NULL, >>>> >>>> CRRKEY NUMBER CONSTRAINT MAIN_TAB_C07 NOT NULL >>>> >>>> ); >>>> >>>> ALTER TABLE SCHEMA1.MAIN_TAB ADD ( CONSTRAINT MAIN_TAB_R01 FOREIGN KEY >>>> (CKEY) REFERENCES SCHEMA1.REF_TAB1 (CKEY) ENABLE NOVALIDATE, >>>> >>>> CONSTRAINT MAIN_TAB_R03 FOREIGN KEY (SCKEY) REFERENCES >>>> SCHEMA1.REF_TAB2 (SCKEY) ENABLE NOVALIDATE); >>>> >>>> ****************** >>>> >>>> CREATE TABLE SCHEMA1.REF_TAB1( CKEY NUMBER CONSTRAINT REF_TAB1_C01 >>>> NOT NULL ); >>>> >>>> CREATE UNIQUE INDEX SCHEMA1.REF_TAB1_PK ON SCHEMA1.REF_TAB1(CKEY); >>>> >>>> ALTER TABLE SCHEMA1.REF_TAB1 ADD ( CONSTRAINT REF_TAB1_PK PRIMARY >>>> KEY (CKEY) USING INDEX SCHEMA1.REF_TAB1_PK ENABLE VALIDATE); >>>> >>>> ******************* >>>> >>>> CREATE TABLE SCHEMA1.REF_TAB2( SCKEY NUMBER CONSTRAINT REF_TAB2_C01 >>>> NOT NULL ); >>>> >>>> CREATE UNIQUE INDEX SCHEMA1.REF_TAB2_PK ON SCHEMA1.REF_TAB2(SCKEY); >>>> >>>> ALTER TABLE SCHEMA1.REF_TAB2 ADD ( CONSTRAINT REF_TAB2_PK PRIMARY >>>> KEY (SCKEY) USING INDEX SCHEMA1.REF_TAB2_PK ENABLE VALIDATE); >>>> >>>> ****************** >>>> >>>> >>>>
-- http://www.freelists.org/webpage/oracle-lReceived on Mon May 02 2022 - 21:29:09 CEST