Re: How to debug resource busy error

From: Laurentiu Oprea <laurentiu.oprea06_at_gmail.com>
Date: Mon, 2 May 2022 23:15:49 +0300
Message-ID: <CA+riqSUs_joziEYkaScUMfZXoo2jttKQ56qKQPR7sUunXTS8vQ_at_mail.gmail.com>



Thanks. Is strange I don't see any query in parsing phase here based on the chains output. The theory in my mind was that you have a parallel query (can be even a select) for which parsing takes significant time and which include the table you want to enable the constraints thus blocking them but looks like is not the case so tracing should be your friend here.

Good luck.

On Mon, May 2, 2022, 22:29 Pap <oracle.developer35_at_gmail.com> wrote:

> Thank You.
>
> _at_Laurentiu, I have executed the dashtop query with the suggested inputs
> and below is the output from that. But I can't see any query in the output
> which can take such a lock on the base table and can cause this error.
>
> _at_Stefan, Considering future course of action, i was planning to set the
> ddl_lock_timeout in the procedure code through the alter session command.
> However as you mentioned that culprit session can well be captured using
> trace. So i was wondering , if any of the below tracing is enough to
> catch the culprit session/sql details or we should specifically go for
> "systemstate" dump i.e the one as you suggested? I never generated or read
> the systemstate dump, so trying to understand if other simple traces as
> below will also help catching the culprit sessions/sqls etc?
>
> alter system set events "00054 trace name systemstate level 266, lifetime
> 1";
>
> Get the trace....
>
> ALTER system SET EVENTS '00054 TRACE NAME CONTEXT OFF';
>
> ****OR
>
> ALTER system SET EVENTS '00054 trace name errorstack level 3';
>
> Get the trace ...
>
> ALTER system SET EVENTS '00054 TRACE NAME CONTEXT OFF';
>
> ****OR tracing the exact session
>
> begin
>
> dbms_monitor.session_trace_enable( session_id => &m_sid, serial_num
> => &m_serial,
>
> waits => true, bind => true, plan_stat => 'all_executions'
> );
>
> end;
>
> /
>
> -- allow the Ora-00054 failure to happen for the session
>
> begin
>
> dbms_monitor.session_trace_disable(session_id => &m_sid,serial_num =>
> &m_serial);
>
> end;
>
> /
>
>
> [image: User: "image.png"]
>
>
>
>
>
>
>
>
> On Mon, May 2, 2022 at 6:39 PM Laurentiu Oprea <
> laurentiu.oprea06_at_gmail.com> wrote:
>
>> Hello Pap,
>>
>> I do see a probable cause for your error so before doing a guess
>> completely on feeling, lets get some data :
>>
>> check the data of: _at_dashtop
>> sql_opname,sql_id,sql_exec_start,in_parse session_type='FOREGROUND'
>> starttime endtime
>> https://github.com/tanelpoder/tpt-oracle/blob/master/ash/dashtop.sql
>> BTW: endtime should be the moment you receive the error, starttime can be
>> 15 minutes before
>>
>> At least I`ll make a guess based on feeling + some data :)
>>
>> În lun., 2 mai 2022 la 14:16, Pap <oracle.developer35_at_gmail.com> a scris:
>>
>>> Thank you so much Laurentiu and Dominic.
>>>
>>> I tried checking the dash_wait_chains script for 15minutes window , just
>>> by passing first parameter as session_type='FOREGROUND' and but not seeing
>>> much blocking waits as in below output. And also as i don't have the sql_id
>>> for the exact ALTER statement captured as it must be very fast and thus not
>>> getting captured in ASH/AWR, so was not able to pass the second parameter
>>> i.e sql_id filter.
>>>
>>> To Dominic's point , i think this ALTER sql_id wont be captured as its
>>> very fast and failing immediately when its seeing the lock from other
>>> session. And also we are struggling to catch the culprit session too. So as
>>> suggested, will try to set the ddl_lock_timeout to ~300 i.e 5 minutes so as
>>> to capture enough blocking session against the ALTER statement logged in
>>> the ASH/AWR to get the details around the culprit session. And considering
>>> the data-load itself take ~1 hrs+ so this amount of additional wait time(5
>>> mins) should be fine. That should possibly also fix the issue if the
>>> blocking lock/culprit session is just holding it for very small time and
>>> releasing it immediately after.
>>>
>>>
>>> [image: User: "image.png"]
>>>
>>>
>>>
>>>
>>> On Mon, 2 May 2022, 12:03 pm Dominic Brooks, <dombrooks_at_hotmail.com>
>>> wrote:
>>>
>>>> Problem is that the error is likely to be immediate.
>>>>
>>>> If you increase/set the param ddl_lock_timeout in your load code before
>>>> you do enable constraints then
>>>> a) you might wait sufficient time not to fail, ie your blocker finishes
>>>> before the timeout end and/or
>>>> b) you should capture some information in ASH about the blocking
>>>> scenario.
>>>>
>>>> Cheers,
>>>> Dominic
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On 2 May 2022, at 06:55, Laurentiu Oprea <laurentiu.oprea06_at_gmail.com>
>>>> wrote:
>>>>
>>>> 
>>>> Hello Pap,
>>>>
>>>> To diagnose your issue ASH should do the trick.
>>>>
>>>> Here is an example(script dash_wait_chains) :
>>>>
>>>> https://tanelpoder.com/2013/11/06/diagnosing-buffer-busy-waits-with-the-ash_wait_chains-sql-script-v0-2/
>>>> <https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftanelpoder.com%2F2013%2F11%2F06%2Fdiagnosing-buffer-busy-waits-with-the-ash_wait_chains-sql-script-v0-2%2F&data=05%7C01%7C%7C97ae9142e13241b6ef5308da2c005496%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637870677178768226%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=RXOJ6ean3%2FHVZiEqVguXpBwUZWt7sQi4OjPAPTmACX8%3D&reserved=0>
>>>>
>>>> Thank you.
>>>>
>>>> În lun., 2 mai 2022 la 05:58, Pap <oracle.developer35_at_gmail.com> a
>>>> scris:
>>>>
>>>>> Hi, We are facing Ora-0054 error on specific time of the month for a
>>>>> job and from the line number noted in the error log its pointing to the
>>>>> alter statement which enables reference constraints after a direct path
>>>>> data load and commit. But we tried monitoring it a few times during
>>>>> the error period/run time and surprisingly we are not seeing any lock(from
>>>>> v$locked_object) on the base table(say e.g. MAIN_TAB in below example) but
>>>>> still the job failed and it succeeded after a couple of rerun attempts.
>>>>> But it seems like , some process runs and takes lock(for may be very
>>>>> small time) and makes this 'ALTER' failure and we were unable to find
>>>>> that out.
>>>>>
>>>>> So I wanted to understand , how to get the culprit session/sql/job so
>>>>> that we can have some dependency set so that this failure won't
>>>>> occur? Is there any possible trace to set which will help us to debug this
>>>>> scenario and get the details around the culprit session? Or somehow
>>>>> we can track it from any of the ASH/AWR views for which we have
>>>>> enough retention in place?
>>>>>
>>>>> *****The error log pointing to one of the below statements
>>>>> failure inside the procedure *******
>>>>>
>>>>> ALTER TABLE SCHEMA1.MAIN_TAB ENABLE NOVALIDATE CONSTRAINT MAIN_TAB_R01;
>>>>>
>>>>> ALTER TABLE SCHEMA1.MAIN_TAB ENABLE NOVALIDATE CONSTRAINT MAIN_TAB_R03;
>>>>>
>>>>> ORA-00054: resource busy and acquire with NOWAIT specified or timeout
>>>>> expired
>>>>>
>>>>> ********Table structure***************
>>>>>
>>>>> CREATE TABLE SCHEMA1.MAIN_TAB
>>>>>
>>>>> ( FFKEY NUMBER CONSTRAINT MAIN_TAB_C01 NOT NULL,
>>>>>
>>>>> SCKEY NUMBER CONSTRAINT MAIN_TAB_C02 NOT NULL,
>>>>>
>>>>> ACKEY NUMBER,
>>>>>
>>>>> CKEY NUMBER CONSTRAINT MAIN_TAB_C04 NOT NULL,
>>>>>
>>>>> CSTKEY NUMBER CONSTRAINT MAIN_TAB_C05 NOT NULL,
>>>>>
>>>>> EPC_KEY NUMBER CONSTRAINT MAIN_TAB_C06 NOT NULL,
>>>>>
>>>>> CRRKEY NUMBER CONSTRAINT MAIN_TAB_C07 NOT NULL
>>>>>
>>>>> );
>>>>>
>>>>> ALTER TABLE SCHEMA1.MAIN_TAB ADD ( CONSTRAINT MAIN_TAB_R01 FOREIGN
>>>>> KEY (CKEY) REFERENCES SCHEMA1.REF_TAB1 (CKEY) ENABLE NOVALIDATE,
>>>>>
>>>>> CONSTRAINT MAIN_TAB_R03 FOREIGN KEY (SCKEY) REFERENCES
>>>>> SCHEMA1.REF_TAB2 (SCKEY) ENABLE NOVALIDATE);
>>>>>
>>>>> ******************
>>>>>
>>>>> CREATE TABLE SCHEMA1.REF_TAB1( CKEY NUMBER CONSTRAINT REF_TAB1_C01
>>>>> NOT NULL );
>>>>>
>>>>> CREATE UNIQUE INDEX SCHEMA1.REF_TAB1_PK ON SCHEMA1.REF_TAB1(CKEY);
>>>>>
>>>>> ALTER TABLE SCHEMA1.REF_TAB1 ADD ( CONSTRAINT REF_TAB1_PK PRIMARY
>>>>> KEY (CKEY) USING INDEX SCHEMA1.REF_TAB1_PK ENABLE VALIDATE);
>>>>>
>>>>> *******************
>>>>>
>>>>> CREATE TABLE SCHEMA1.REF_TAB2( SCKEY NUMBER CONSTRAINT REF_TAB2_C01
>>>>> NOT NULL );
>>>>>
>>>>> CREATE UNIQUE INDEX SCHEMA1.REF_TAB2_PK ON SCHEMA1.REF_TAB2(SCKEY);
>>>>>
>>>>> ALTER TABLE SCHEMA1.REF_TAB2 ADD ( CONSTRAINT REF_TAB2_PK PRIMARY
>>>>> KEY (SCKEY) USING INDEX SCHEMA1.REF_TAB2_PK ENABLE VALIDATE);
>>>>>
>>>>> ******************
>>>>>
>>>>>
>>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon May 02 2022 - 22:15:49 CEST

Original text of this message