Re: Reading resource busy error system trace

From: Pap <oracle.developer35_at_gmail.com>
Date: Mon, 4 Jul 2022 21:36:39 +0530
Message-ID: <CAEjw_fj8MbfrRByNdQKRwfxJdvXOiQAi-EZbAd8XRdaKtsK2bA_at_mail.gmail.com>



I searched for the string 'Call Stack Trace' but didn't find any such section in any of the captured trace files generated for that ospid.

I double-checked with the team and found that the level 266 trace was somehow not getting generated, so they set the event at level 258 as below. I am now wondering whether the expanded 'Call Stack Trace' detail is missing simply because the dump was generated at level 258 rather than level 266.

ALTER SYSTEM SET EVENTS '54 trace name systemstate level 258';
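
For the next occurrence, a minimal sketch of the two options discussed in
this thread (assuming the event can be set system-wide; the errorstack
variant is the one Jonathan suggests below for capturing the detailed call
stack, and both can be switched off again once the error has been caught):

ALTER SYSTEM SET EVENTS '54 trace name systemstate level 266';
ALTER SYSTEM SET EVENTS '54 trace name errorstack level 1';

-- to disable after the error has been captured:
ALTER SYSTEM SET EVENTS '54 trace name systemstate off';
ALTER SYSTEM SET EVENTS '54 trace name errorstack off';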

On Mon, 4 Jul 2022, 7:52 pm Jonathan Lewis, <jlewisoracle_at_gmail.com> wrote:

>
> You didn't miss anything.
>
> I read your comments about *systemstate level 266*, but my fingers typed *errorstack
> level 1* when I wasn't watching because that's a good first choice to try
> when you see a session raising an error.
>
> This is an extract from my test dump (with a lot of lines near the start
> of the call stack deleted):
>
> ----- Error Stack Dump -----
> <error barrier> at 0x7ffc8604b220 placed dbkda.c_at_296
> ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
> ----- Current SQL Statement for this session (sql_id=dutzuyds7kb59) -----
> alter table child enable novalidate constraint c_fk_p
>
> ----- Call Stack Trace -----
> calling              call     entry                argument values in hex
> location             type     point                (? means dubious value)
> -------------------- -------- -------------------- ----------------------------
> ksedst1()+95         call     kgdsdst()            7FFC8604A680 000000002
>                                                    7FFC860449B0 ?
>                                                    7FFC86044AC8 ?
>                                                    ...
>                                                    000000000 000000082 ?
> ktaiam()+1110        call     ksesecl0()           7FC6171F49C0 ?
>                                                    7FC6170D0050 ?
>                                                    000000036 ? 000000036
>                                                    09BB833E8 000000000
> ktagetg0()+929       call     ktaiam()             00001E6ED 000000004
>                                                    000000000
>                                                    7FFC8604DAB8 ?
>                                                    000000000 ?
>                                                    000000000 ?
> ktagetp_internal()+  call     ktagetg0()           00001E6ED ? 000000004 ?
> 141                                                000000004 ?
>                                                    7FFC8604DAB8 ?
>                                                    000000000 ? 000000000 ?
> ktagetg_ddlX()+323   call     ktagetp_internal()   00001E6ED 000000000
>                                                    000000004
>                                                    000000000 ? 000000000
>                                                    746DC3D600000000
> ktagetg_ddl()+30     call     ktagetg_ddlX()       00001E6ED ? 000000000 ?
>                                                    000000004 ? 000000000 ?
>                                                    000000000 ? 000000000
> kkdllk0()+1551       call     ktagetg_ddl()        00001E6ED ? 000000000 ?
>                                                    000000004 ? 000000000 ?
>                                                    000000000 ? 000000000 ?
>
> In your systemstate dump you should find a shortstack that includes a
> section similar to:
> ...
> ktaiam()+1110<-ktagetg0()+929<-ktagetp_internal()+141<-ktagetg_ddlX()+323<-ktagetg_ddl()+30
> ...
> which gets expanded to the detailed call stack above.
>
> Regards
> Jonathan Lewis
>
>
>
> On Mon, 4 Jul 2022 at 13:26, Pap <oracle.developer35_at_gmail.com> wrote:
>
>> Thank You Jonathan.
>>
>> In the event trace I don't see any 'Call Stack Trace', but I do see a
>> 'Short stack dump' as below. But in that I don't see the object_id in
>> hex format. Am I missing anything?
>>
>> Also, I didn't see any DML on the parent table at the same time in ASH,
>> and the dev team also says they don't perform any DML on the parent
>> objects while these heavy direct path inserts into the child run.
>>
>> https://gist.github.com/oracle9999/95307babd0366cc0c25afd447fa0264d
>>
>>
>> On Mon, 4 Jul 2022, 3:44 pm Jonathan Lewis, <jlewisoracle_at_gmail.com>
>> wrote:
>>
>>>
>>> Your dump file seems to be saying that the session is holding a mode 4
>>> on the "child" table, and since the insert /*+ append */ would have been
>>> holding a mode 6 the commit must have cleared the lock properly before the
>>> following code acquired the mode 4.
>>>
>>> Even though you're enabling with "novalidate" the enable would try to
>>> lock the child and then the parent in mode 4, so the first thing you need
>>> to check is that the problem isn't with the parent table being locked in an
>>> incompatible fashion.
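>>>
>>> A quick way to see whether another session is holding an incompatible TM
>>> lock on the parent at the time of the failure is something like the
>>> following sketch (PARENT_TAB and CHILD_TAB are placeholders for your own
>>> table names):
>>>
>>> select s.sid, s.serial#, s.program, l.lmode, l.request, o.object_name
>>> from   v$lock l, v$session s, dba_objects o
>>> where  l.type = 'TM'
>>> and    s.sid  = l.sid
>>> and    o.object_id = l.id1
>>> and    o.object_name in ('PARENT_TAB', 'CHILD_TAB');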
>>>
>>> I'm not 100% certain, but if you check the call stack trace (headed as
>>> follows)
>>>
>>> ----- Call Stack Trace -----
>>> calling              call     entry                argument values in hex
>>> location             type     point                (? means dubious value)
>>> -------------------- -------- -------------------- ----------------------------
>>>
>>> I think you'll find several lines similar to:
>>> ktagetg0()+929       call     ktaiam()             00001E6ED 000000004
>>>                                                    000000000
>>>                                                    7FFFC6AA87B8 ?
>>>                                                    000000000 ?
>>>                                                    000000000 ?
>>> ktagetp_internal()+  call     ktagetg0()           00001E6ED ? 000000004 ?
>>> 141                                                000000004 ?
>>>                                                    7FFFC6AA87B8 ?
>>>                                                    000000000 ? 000000000 ?
>>> ktagetg_ddlX()+323   call     ktagetp_internal()   00001E6ED 000000000
>>>                                                    000000004
>>>                                                    000000000 ? 000000000
>>>                                                    4DE9A6BF00000000
>>> ktagetg_ddl()+30     call     ktagetg_ddlX()       00001E6ED ? 000000000 ?
>>>                                                    000000004 ? 000000000 ?
>>>                                                    000000000 ? 000000000
>>> kkdllk0()+1551       call     ktagetg_ddl()        00001E6ED ? 000000000 ?
>>>                                                    000000004 ? 000000000 ?
>>>                                                    000000000 ? 000000000 ?
>>>
>>> The first parameter will be the object_id of the parent table where the
>>> attempt to lock failed. (In my sample 0001E6ED).
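>>>
>>> As a sketch, that hex object_id can be translated back to an object name
>>> with something like the following (substituting the value from your own
>>> trace for the 1E6ED above):
>>>
>>> select owner, object_name, object_type
>>> from   dba_objects
>>> where  object_id = to_number('1E6ED', 'XXXXXXXX');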
>>>
>>> This may take you one step forward in your search for the source of the
>>> problem. Is it possible that you've got Informatica doing "insert lots of
>>> parent rows", "insert lots of child rows" nearly concurrently, with the
>>> expectation that the parent insert will complete and commit before the
>>> child insert starts (or completes)? I have seen a couple of cases where
>>> such things happen, and if the parent insert gets delayed the child
>>> insert crashes.
>>>
>>> Regards
>>> Jonathan Lewis
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sat, 2 Jul 2022 at 18:57, Pap <oracle.developer35_at_gmail.com> wrote:
>>>
>>>> Hello All, I followed the blog below and tried debugging the system
>>>> trace to find the blocking session/SQL which caused this resource busy
>>>> error, ORA-00054, but I am encountering strange behaviour: it appears to
>>>> be the same session blocking itself. The session which did the direct
>>>> path load and commit does not seem to release its exclusive lock for some
>>>> time after the commit. Below are the details of the behaviour. Has
>>>> anybody encountered such a scenario?
>>>>
>>>>
>>>> https://nenadnoveljic.com/blog/troubleshooting-ora-00054-resource-busy-and-acquire-with-nowait-specified-or-timeout-expired/
>>>>
>>>> The code which is encountering the error is below. It disables all
>>>> constraints, makes the indexes unusable, then does a direct path load
>>>> into the table, and then, while trying to enable the constraints of the
>>>> same table using ALTER TABLE, it fails with ORA-00054. The sample code
>>>> for the data load and the ALTER TABLE is as below.
>>>>
>>>> ****Sample code *****
>>>> INSERT /*+Append*/ INTO TAB1 (c1,c2.....)
>>>> SELECT (ROWNUM + sqq_key) ,col1, col2, col3..... from....tab2,
>>>> tab3, tab4.....;
>>>> COMMIT;
>>>>
>>>> FOR I IN (SELECT TABLE_NAME, CONSTRAINT_NAME FROM ALL_CONSTRAINTS WHERE
>>>> TABLE_NAME = v_table_nm AND CONSTRAINT_TYPE = 'R' AND STATUS = 'DISABLED')
>>>> LOOP
>>>> EXECUTE IMMEDIATE ('ALTER TABLE ' || v_table_nm || ' ENABLE
>>>> NOVALIDATE CONSTRAINT '|| I.CONSTRAINT_NAME);
>>>> END LOOP;
>>>>
>>>> ********
>>>>
>>>> When I scan through the trace file step by step as per the blog, I see the following:
>>>>
>>>> *Step 1:* I searched for the object_id (101739) of TAB1 and got the
>>>> section below, i.e. pointing towards a DML lock on the table. The
>>>> process state object is proc: 0x243efffc0.
>>>>
>>>> ----------------------------------------
>>>> SO: 0x233cd8bc8, type: DML lock (83), map: 0x237d28a10
>>>> state: LIVE (0x4532), flags: 0x0
>>>> owner: 0x233b9af60, proc: 0x243efffc0
>>>> link: 0x233cd8be8[0x28acf1e58, 0x1d793c780]
>>>> pg: 0
>>>> SOC: 0x237d28a10, type: *DML lock* (83), map: 0x233cd8bc8
>>>> state: LIVE (0x99fc), flags: INIT (0x1)
>>>> DML LOCK: tab=101739 flg=11 chi=0
>>>> his[0]: mod=4 spn=69302
>>>> 2022-07-01 05:44:10.662*:ksq.c_at_10787:ksqdmc(): Enqueue Dump
>>>> (enqueue) TM-00018D6B-00000000-00000000-00000000 DID: ksqlkdid:
>>>> 0001-00CF-000ACD5C
>>>> lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 res_flag: 0x7
>>>> mode: S, lock_flag: 0x20, lock: 0x237d28a38, res: 0x2499f3798
>>>> own: 0x2488605f8, sess: 0x2488605f8, *proc: 0x2403a4510*, prv:
>>>> 0x2499f37a8
>>>> slk: 0x2530967c8
>>>> ----------------------------------------
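>>>>
>>>> As a quick sanity check, the decimal object_id (tab=101739) and the hex
>>>> id in the TM resource name above refer to the same object; something like
>>>> the following shows the correspondence, returning 18D6B and 101739 (i.e.
>>>> TM-00018D6B):
>>>>
>>>> select to_char(101739, 'FMXXXXXXXX') as hex_id,
>>>>        to_number('18D6B', 'XXXXXXXX') as dec_id
>>>> from   dual;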
>>>>
>>>>
>>>> *Step 2:* Then I searched through the trace for the state object
>>>> address, i.e. "*SO: 0x243efffc0*", and got the section below. It points
>>>> to Oracle pid 207.
>>>>
>>>>
>>>> *** 2022-07-01T05:44:10.538156-04:00
>>>> PROCESS 207:
>>>> ----------------------------------------
>>>> *SO: 0x243efffc0*, type: process (2), map: 0x2403a4510
>>>> state: LIVE (0x4532), flags: 0x1
>>>> owner: (nil), proc: 0x243efffc0
>>>> link: 0x243efffe0[0x243efffe0, 0x243efffe0]
>>>> child list count: 15, link: 0x243f00030[0x243b7f190, 0x1d7903b10]
>>>> pg: 0
>>>> SOC: 0x2403a4510, type: process (2), map: 0x243efffc0
>>>> state: LIVE (0x99fc), flags: INIT (0x1)
>>>> (process) *Oracle pid:207,* ser:3445, calls cur/top:
>>>> 0x1a0372f08/0x1d2ee6c70
>>>> flags : (0x0) - icon_uid:0 logon_pdbid=0
>>>> flags2: (0x1000), flags3: (0x510)
>>>> call error: 0, sess error: 0, txn error 0
>>>> intr queue: empty
>>>> (post info) last post received: 0 0 277
>>>> last post received-location: kcl2.h LINE:5703 ID:kclcget
>>>> last process to post me: 0x24c3b2180 1 6
>>>> last post sent: 0 0 33
>>>> last post sent-location: ksa2.h LINE:298 ID:ksasnd
>>>> last process posted by me: 0x24837eaa0 1 6
>>>> waiter on post event: 0
>>>> (latch info) hold_bits=0x0 ud_influx=0x4101b
>>>> (osp latch info) hold_bits=0x0 ud_influx=0x0
>>>> Process Group: DEFAULT, pseudo proc: 0x2484c5d70
>>>> O/S info: user: grid, term: UNKNOWN, ospid: 383076
>>>> OSD pid info:
>>>> Short stack dump:
>>>>
>>>> ksedsts()+426<-ksdxfstk()+58<-ksdxdocmdmultex()+5961<-ksdxdocmdmult()+55<-ksudmp_proc()+1626<-ksudss_main()+1976<-ksudss_opt()+1544<-dbkdaKsdActDriver()+2484<-dbgdaExecuteAction()+354<-dbgdaRunAction()+770<-dbgdRunActions()+83<-dbgdProcessEventActions()+525<-dbgdChkEventKgErr()+394<-dbkdChkEventRdbmsErr()+65<-dbgePostErrorKGE()+1066<-dbkePostKGE_kgsf()+71<-kgeade()+338<-kgeselv()+89<-ksesecl0()+189<-ktaiam()+1110<-ktagetg0()+929<-ktagetp_internal()+141<-ktagetg_ddlX()+323<-ktagetg_ddl()+30<-kkdllk0()+1551<-kkdcprc()+711<-kkdcmco()+3682<-atbadd()+8770<-atbdrv()+14663<-opiexe()+30061<-opiosq0()+4575<-opipls()+14619<-opiodr()+1202<-rpidrus()+198<-skgmstack()+65<-rpidru()+132<-rpiswu2()+614<-rpidrv()+1494<-psddr0()+467<-psdnal()+646<-pevm_EXIM()+276<-pfrinstr_EXIM()+43<-pfrrun_no_tool()+52<-pfrrun()+902<-plsql_run()+755<-peicnt()+279<-kkxexe()+720<-opiexe()+24762<-kpoal8()+2226<-opiodr()+1202<-ttcpip()+1239<-opitsk()+1897<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+245
>>>>
>>>> KGL-UOL (Process state object)
>>>> KGX Atomic Operation Log 0x2403a5690
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> KGX Atomic Operation Log 0x2403a56f8
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> KGX Atomic Operation Log 0x2403a5760
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> KGX Atomic Operation Log 0x2403a57c8
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> KGX Atomic Operation Log 0x2403a5830
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> KGX Atomic Operation Log 0x2403a5898
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> KGX Atomic Operation Log 0x2403a5900
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> KGX Atomic Operation Log 0x2403a5968
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> KGX Atomic Operation Log 0x2403a59d0
>>>> Mutex (nil)(0, 0) idn 0 oper NONE(0)
>>>> FSO mutex uid 262142 efd 0 whr 0 slp 0
>>>> ----------------------------------------
>>>>
>>>>
>>>> *Step 3:* Then, searching for *pid 207* at the start of the trace, I
>>>> found *sid 1072 ser 23*, and it points to the same ALTER TABLE statement.
>>>> The ospid is the same one for which the trace was generated. And the DML
>>>> in this session just before the ALTER is the same insert which had just
>>>> finished and committed the direct path load.
>>>>
>>>> So I am wondering: is it possible that we are hitting a bug due to
>>>> which Oracle does not actually release the exclusive table lock
>>>> immediately after the commit of the direct path load? And that is why,
>>>> even though we run the ALTER after the COMMIT in the same session, we hit
>>>> ORA-00054 because the session blocks itself? And should we then put some
>>>> wait/sleep after the commit, i.e. before the ALTER, to avoid this bug?
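>>>>
>>>> If a workaround is needed while the root cause is investigated, a minimal
>>>> sketch of a retry wrapper around the ALTER might look like the following
>>>> (TAB1 and C_FK_P are placeholder names; DBMS_SESSION.SLEEP needs a recent
>>>> release, older ones can use DBMS_LOCK.SLEEP; alternatively, ALTER SESSION
>>>> SET ddl_lock_timeout makes the DDL wait instead of failing immediately):
>>>>
>>>> DECLARE
>>>>   e_resource_busy EXCEPTION;
>>>>   PRAGMA EXCEPTION_INIT(e_resource_busy, -54);
>>>>   v_ddl VARCHAR2(200) :=
>>>>     'ALTER TABLE TAB1 ENABLE NOVALIDATE CONSTRAINT C_FK_P';
>>>> BEGIN
>>>>   FOR attempt IN 1 .. 5 LOOP
>>>>     BEGIN
>>>>       EXECUTE IMMEDIATE v_ddl;
>>>>       EXIT;                        -- succeeded, stop retrying
>>>>     EXCEPTION
>>>>       WHEN e_resource_busy THEN
>>>>         IF attempt = 5 THEN
>>>>           RAISE;                   -- give up after the last attempt
>>>>         END IF;
>>>>         DBMS_SESSION.SLEEP(5);     -- back off, then retry
>>>>     END;
>>>>   END LOOP;
>>>> END;
>>>> /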
>>>>
>>>> 207: USER ospid 383076 sid 1072 ser 23,
>>>> Cmd: ALTER TABLE
>>>>
>>>> On Mon, Jun 13, 2022 at 4:28 PM Pap <oracle.developer35_at_gmail.com>
>>>> wrote:
>>>>
>>>>> Thank You So much Nenad.
>>>>>
>>>>> Unfortunately, while checking with the team I got to know that the other
>>>>> trace files have been removed from that location. So I will wait for the
>>>>> next occurrence of the error to capture and analyze all the traces for
>>>>> that one ospid.
>>>>>
>>>>>
>>>>> On Sun, 12 Jun 2022, 9:24 pm Noveljic Nenad, <
>>>>> nenad.noveljic_at_vontobel.com> wrote:
>>>>>
>>>>>> Your system dump apparently consists of several files.
>>>>>>
>>>>>>
>>>>>>
>>>>>> At the very beginning of the system dump there must be a section that
>>>>>> identifies the blockers for each process and looks something like this:
>>>>>>
>>>>>> 157: USER ospid 1385877 sid 786 ser 49681, waiting for 'log file
>>>>>> switch (checkpoint incomplete)'
>>>>>>
>>>>>> Cmd: INSERT
>>>>>>
>>>>>> Blocked by inst: 1, sid: 266, ser: 13238
>>>>>>
>>>>>> Final Blocker inst: 1, sid: 266, ser: 13238
>>>>>>
>>>>>> 159: USER ospid 1295896 sid 796 ser 48963, waiting for 'log file
>>>>>> switch (checkpoint incomplete)'
>>>>>>
>>>>>> Cmd: INSERT
>>>>>>
>>>>>> Blocked by inst: 1, sid: 266, ser: 13238
>>>>>>
>>>>>> Final Blocker inst: 1, sid: 266, ser: 13238
>>>>>>
>>>>>>
>>>>>>
>>>>>> First, check who was blocking the Powercenter process with ospid 89092
>>>>>> (Blocked by…). Then search **all** system dump trace files (**89092**.trc)
>>>>>> for the blocker to look at its state objects. Once you know the sid and
>>>>>> serial of the blocker, you can position yourself to its state objects with
>>>>>> the following search expression: sid: xxx ser: xxx . One of the state
>>>>>> objects will be for the cursor that blocked your DDL on MAIN_TAB.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nenad
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Pap <oracle.developer35_at_gmail.com>
>>>>>> *Sent:* Sonntag, 12. Juni 2022 14:06
>>>>>> *To:* Noveljic Nenad <nenad.noveljic_at_vontobel.com>
>>>>>> *Cc:* Oracle L <oracle-l_at_freelists.org>
>>>>>> *Subject:* Re: Reading resource busy error system trace
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thank You Nenad.
>>>>>>
>>>>>> So actually the session/ALTER statement which failed is from the
>>>>>> Informatica PowerCenter session itself. So do you mean we should search
>>>>>> the body of the original trace file from which this trace continued, i.e.
>>>>>> "u01/app/oracle/diag/rdbms/DB1p/DB11/trace/DB11_ora_89092_8.trc",
>>>>>> rather than this one? And in that trace, the SQL statements near the
>>>>>> "Final Blocker" string will be the cause of this error?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Jul 04 2022 - 18:06:39 CEST
