Re: How to debug resource busy error

From: Pap <oracle.developer35_at_gmail.com>
Date: Tue, 3 May 2022 00:59:09 +0530
Message-ID: <CAEjw_fhvaKdes6KXVjfU7yKTtikrYc9Jg=8QCzq=E4KbaZufGg_at_mail.gmail.com>



Thank You.

_at_Laurentiu, I have executed the dashtop query with the suggested inputs and below is the output from that. But I can't see any query in the output which can take such a lock on the base table and can cause this error.

_at_Stefan, Considering future course of action, i was planning to set the ddl_lock_timeout in the procedure code through the alter session command. However as you mentioned that culprit session can well be captured using trace. So i was wondering , if any of the below tracing is enough to catch the culprit session/sql details or we should specifically go for "systemstate" dump i.e the one as you suggested? I never generated or read the systemstate dump, so trying to understand if other simple traces as below will also help catching the culprit sessions/sqls etc?

alter system set events "00054 trace name systemstate level 266, lifetime 1";

Get the trace....

ALTER system SET EVENTS '00054 TRACE NAME CONTEXT OFF';

****OR

ALTER system SET EVENTS '00054 trace name errorstack level 3';

Get the trace ...

ALTER system SET EVENTS '00054 TRACE NAME CONTEXT OFF';

****OR tracing the exact session

begin

    dbms_monitor.session_trace_enable( session_id => &m_sid, serial_num => &m_serial,

        waits => true, bind => true, plan_stat => 'all_executions' );

end;

/

  • allow the Ora-00054 failure to happen for the session

begin

dbms_monitor.session_trace_disable(session_id => &m_sid,serial_num => &m_serial);

end;

/

[image: User: "image.png"]

On Mon, May 2, 2022 at 6:39 PM Laurentiu Oprea <laurentiu.oprea06_at_gmail.com> wrote:

> Hello Pap,

>

> I do see a probable cause for your error so before doing a guess
> completely on feeling, lets get some data :
>

> check the data of: _at_dashtop
> sql_opname,sql_id,sql_exec_start,in_parse session_type='FOREGROUND'
> starttime endtime
> https://github.com/tanelpoder/tpt-oracle/blob/master/ash/dashtop.sql
> BTW: endtime should be the moment you receive the error, starttime can be
> 15 minutes before
>

> At least I`ll make a guess based on feeling + some data :)
>

> În lun., 2 mai 2022 la 14:16, Pap <oracle.developer35_at_gmail.com> a scris:
>
>> Thank you so much Laurentiu and Dominic.
>>
>> I tried checking the dash_wait_chains script for 15minutes window , just
>> by passing first parameter as session_type='FOREGROUND' and but not seeing
>> much blocking waits as in below output. And also as i don't have the sql_id
>> for the exact ALTER statement captured as it must be very fast and thus not
>> getting captured in ASH/AWR, so was not able to pass the second parameter
>> i.e sql_id filter.
>>
>> To Dominic's point , i think this ALTER sql_id wont be captured as its
>> very fast and failing immediately when its seeing the lock from other
>> session. And also we are struggling to catch the culprit session too. So as
>> suggested, will try to set the ddl_lock_timeout to ~300 i.e 5 minutes so as
>> to capture enough blocking session against the ALTER statement logged in
>> the ASH/AWR to get the details around the culprit session. And considering
>> the data-load itself take ~1 hrs+ so this amount of additional wait time(5
>> mins) should be fine. That should possibly also fix the issue if the
>> blocking lock/culprit session is just holding it for very small time and
>> releasing it immediately after.
>>
>>
>> [image: User: "image.png"]
>>
>>
>>
>>
>> On Mon, 2 May 2022, 12:03 pm Dominic Brooks, <dombrooks_at_hotmail.com>
>> wrote:
>>
>>> Problem is that the error is likely to be immediate.
>>>
>>> If you increase/set the param ddl_lock_timeout in your load code before
>>> you do enable constraints then
>>> a) you might wait sufficient time not to fail, ie your blocker finishes
>>> before the timeout end and/or
>>> b) you should capture some information in ASH about the blocking
>>> scenario.
>>>
>>> Cheers,
>>> Dominic
>>>
>>> Sent from my iPhone
>>>
>>> On 2 May 2022, at 06:55, Laurentiu Oprea <laurentiu.oprea06_at_gmail.com>
>>> wrote:
>>>
>>> 
>>> Hello Pap,
>>>
>>> To diagnose your issue ASH should do the trick.
>>>
>>> Here is an example(script dash_wait_chains) :
>>>
>>> https://tanelpoder.com/2013/11/06/diagnosing-buffer-busy-waits-with-the-ash_wait_chains-sql-script-v0-2/
>>> <https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftanelpoder.com%2F2013%2F11%2F06%2Fdiagnosing-buffer-busy-waits-with-the-ash_wait_chains-sql-script-v0-2%2F&data=05%7C01%7C%7C97ae9142e13241b6ef5308da2c005496%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637870677178768226%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=RXOJ6ean3%2FHVZiEqVguXpBwUZWt7sQi4OjPAPTmACX8%3D&reserved=0>
>>>
>>> Thank you.
>>>
>>> În lun., 2 mai 2022 la 05:58, Pap <oracle.developer35_at_gmail.com> a
>>> scris:
>>>
>>>> Hi, We are facing Ora-0054 error on specific time of the month for a
>>>> job and from the line number noted in the error log its pointing to the
>>>> alter statement which enables reference constraints after a direct path
>>>> data load and commit. But we tried monitoring it a few times during
>>>> the error period/run time and surprisingly we are not seeing any lock(from
>>>> v$locked_object) on the base table(say e.g. MAIN_TAB in below example) but
>>>> still the job failed and it succeeded after a couple of rerun attempts.
>>>> But it seems like , some process runs and takes lock(for may be very
>>>> small time) and makes this 'ALTER' failure and we were unable to find
>>>> that out.
>>>>
>>>> So I wanted to understand , how to get the culprit session/sql/job so
>>>> that we can have some dependency set so that this failure won't occur?
>>>> Is there any possible trace to set which will help us to debug this
>>>> scenario and get the details around the culprit session? Or somehow we
>>>> can track it from any of the ASH/AWR views for which we have enough
>>>> retention in place?
>>>>
>>>> *****The error log pointing to one of the below statements
>>>> failure inside the procedure *******
>>>>
>>>> ALTER TABLE SCHEMA1.MAIN_TAB ENABLE NOVALIDATE CONSTRAINT MAIN_TAB_R01;
>>>>
>>>> ALTER TABLE SCHEMA1.MAIN_TAB ENABLE NOVALIDATE CONSTRAINT MAIN_TAB_R03;
>>>>
>>>> ORA-00054: resource busy and acquire with NOWAIT specified or timeout
>>>> expired
>>>>
>>>> ********Table structure***************
>>>>
>>>> CREATE TABLE SCHEMA1.MAIN_TAB
>>>>
>>>> ( FFKEY  NUMBER CONSTRAINT MAIN_TAB_C01 NOT NULL,
>>>>
>>>>  SCKEY   NUMBER CONSTRAINT MAIN_TAB_C02 NOT NULL,
>>>>
>>>>  ACKEY    NUMBER,
>>>>
>>>>  CKEY     NUMBER CONSTRAINT MAIN_TAB_C04 NOT NULL,
>>>>
>>>>  CSTKEY     NUMBER CONSTRAINT MAIN_TAB_C05 NOT NULL,
>>>>
>>>>  EPC_KEY    NUMBER CONSTRAINT MAIN_TAB_C06 NOT NULL,
>>>>
>>>>  CRRKEY     NUMBER CONSTRAINT MAIN_TAB_C07 NOT NULL
>>>>
>>>>  );
>>>>
>>>> ALTER TABLE SCHEMA1.MAIN_TAB ADD ( CONSTRAINT MAIN_TAB_R01  FOREIGN KEY
>>>> (CKEY)  REFERENCES SCHEMA1.REF_TAB1 (CKEY) ENABLE NOVALIDATE,
>>>>
>>>>  CONSTRAINT MAIN_TAB_R03  FOREIGN KEY (SCKEY)  REFERENCES
>>>> SCHEMA1.REF_TAB2 (SCKEY) ENABLE NOVALIDATE);
>>>>
>>>> ******************
>>>>
>>>>  CREATE TABLE SCHEMA1.REF_TAB1( CKEY NUMBER CONSTRAINT REF_TAB1_C01
>>>> NOT NULL );
>>>>
>>>> CREATE UNIQUE INDEX SCHEMA1.REF_TAB1_PK ON SCHEMA1.REF_TAB1(CKEY);
>>>>
>>>> ALTER TABLE SCHEMA1.REF_TAB1 ADD ( CONSTRAINT REF_TAB1_PK PRIMARY
>>>> KEY (CKEY) USING INDEX SCHEMA1.REF_TAB1_PK ENABLE VALIDATE);
>>>>
>>>> *******************
>>>>
>>>> CREATE TABLE SCHEMA1.REF_TAB2( SCKEY  NUMBER CONSTRAINT REF_TAB2_C01
>>>> NOT NULL );
>>>>
>>>> CREATE UNIQUE INDEX SCHEMA1.REF_TAB2_PK ON SCHEMA1.REF_TAB2(SCKEY);
>>>>
>>>> ALTER TABLE SCHEMA1.REF_TAB2 ADD ( CONSTRAINT REF_TAB2_PK PRIMARY
>>>> KEY (SCKEY) USING INDEX SCHEMA1.REF_TAB2_PK ENABLE VALIDATE);
>>>>
>>>> ******************
>>>>
>>>>
>>>>


--
http://www.freelists.org/webpage/oracle-l
Received on Mon May 02 2022 - 21:29:09 CEST

Original text of this message