Re: 10046 trace - unaccounted for time
Date: Wed, 6 Aug 2008 14:40:09 -0500 (CDT)
Message-ID: <40196.206.227.160.10.1218051609.squirrel@lady.zephyrus.com>
It's actually an insert that is causing the problems. From tkprof output you
can see the total cpu is only .03 seconds for 19 inserts. Below that is the
raw trace information that shows 78 seconds of unaccounted for time during one
of the inserts.
We are setting up some tests on a non-production database and I will try to get more information and run queries during the actual hang.
BTW, the first column is a unique ID and the 2nd column is a clob but usually small (less than 1K bytes).
Keith
INSERT INTO PWRLINE.LSCHANNELCUTDATA(UIDCHANNELCUT, VALUECODES)
VALUES
(:1,:2) RETURNING "VALUECODES" INTO :3
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 19 0.00 0.00 0 0 0 0 Execute 19 0.03 0.01 0 19 133 19 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 38 0.03 0.01 0 19 133 19 =====================
PARSING IN CURSOR #3 len=109 dep=0 uid=19 oct=2 lid=19 tim=16059356779862 hv=1757063019 ad='13f277e0'
INSERT INTO PWRLINE.LSCHANNELCUTDATA(UIDCHANNELCUT, VALUECODES) VALUES(:1,:2) RETURNING "VALUECODES"
INTO :3
END OF STMT
PARSE #3:c=0,e=169,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=16059356779856
WAIT #3: nam='SQL*Net message to client' ela= 1 p1=1413697536 p2=1 p3=0 EXEC #3:c=0,e=700,p=0,cr=1,cu=7,mis=0,r=1,dep=0,og=4,tim=16059356780754 WAIT #3: nam='SQL*Net message from client' ela= 1270 p1=1413697536 p2=1 p3=0*** 2008-08-03 23:26:47.196
WAIT #10: nam='latch free' ela= 12 p1=16764539576 p2=98 p3=0 *** 2008-08-03 23:27:05.703
WAIT #10: nam='db file sequential read' ela= 15214 p1=121 p2=125734 p3=1 WAIT #10: nam='latch free' ela= 12 p1=16796726312 p2=98 p3=0 *** 2008-08-03 23:28:05.067
WAIT #10: nam='db file sequential read' ela= 44241 p1=232 p2=112428 p3=1 WAIT #10: nam='SQL*Net more data from client' ela= 148 p1=1413697536 p2=120 p3=0 WAIT #10: nam='SQL*Net more data from client' ela= 38 p1=1413697536 p2=131 p3=0 WAIT #10: nam='SQL*Net more data from client' ela= 23 p1=1413697536 p2=142 p3=0 WAIT #10: nam='direct path write (lob)' ela= 2578 p1=113 p2=43160 p3=1 WAIT #0: nam='SQL*Net message to client' ela= 6 p1=1413697536 p2=1 p3=0 WAIT #0: nam='SQL*Net message from client' ela= 13420 p1=1413697536 p2=1 p3=0 WAIT #0: nam='SQL*Net message to client' ela= 2 p1=1413697536 p2=1 p3=0 WAIT #0: nam='SQL*Net message from client' ela= 1652 p1=1413697536 p2=1 p3=0 WAIT #0: nam='SQL*Net message to client' ela= 1 p1=1413697536 p2=1 p3=0 WAIT #0: nam='SQL*Net message from client' ela= 906 p1=1413697536 p2=1 p3=0 WAIT #0: nam='SQL*Net message to client' ela= 1 p1=1413697536 p2=1 p3=0 WAIT #0: nam='SQL*Net more data to client' ela= 148 p1=1413697536 p2=2002 p3=0 WAIT #0: nam='SQL*Net message from client' ela= 4457 p1=1413697536 p2=1 p3=0 =====================
> There is a 4th possible session state. It may be that the session is in
> an uninstrumented event. While this is unlikely, it does happen. I have
> seen this in active sessions where the event state is 'WAITED KNOWN
> TIME', the seconds_in_wait are increasing and the statistic 'CPU used by
> this session' is not increasing.
>
> Check the FETCH entry that encompasses those WAITs. If the c value (cpu
> time) matches up with the 'missing' time, you were likely using a lot of
> CPU. If the cpu time does not match, you could be waiting on cpu or in
> an uninstrumented event.
>
>
> Regards,
> Daniel Fink
>
>
> --
> Daniel Fink
>
> Help me support The Children's Hospital of Denver!
> I'm riding in the 2008 Courage Classic - 157 miles in 3 days
> Help me reach my goal of $2,500.00 in donations.
> Visit my Personal Rider Page http://www.couragetours.com/2008/danielwfink to
> donate
>
> OptimalDBA.com - Oracle Performance, Diagnosis, Data Recovery and Training
>
> OptimalDBA http://www.optimaldba.com
> Oracle Blog http://optimaldba.blogspot.com
>
> Lost Data? http://www.ora600.be/
>
>
> Cary Millsap wrote:
>> I agree with what Riyaj has said. It helps sometimes to realize that
>> "waits" just means essentially "OS calls."
>>
>> I'd expect your process is indeed doing a lot of buffer processing,
>> using data that's already in your database buffer cache. You can
>> confirm or refute this hypothesis quickly with a couple of snaps of
>> v$sess_io.
>>
>> Cary Millsap
>> http://method-r.com
>> http://carymillsap.blogspot.com
>>
>>
>> On Wed, Aug 6, 2008 at 12:46 PM, Riyaj Shamsudeen
>> <riyaj.shamsudeen_at_gmail.com <mailto:riyaj.shamsudeen_at_gmail.com>> wrote:
>>
>> Hi Keith
>> That means that process is either running on CPU or waiting for
>> CPU in the run queue, to be available. May be that all buffers it
>> is trying to access is in buffer cache and process is running on
>> the CPU without any waits.
>>
>> Cheers
>> Riyaj
>> The Pythian Group
>> blog: http://orainternals.wordpress.com
>>
>> Keith Moore wrote:
>>
>> Oracle 9.2.0.4 <http://9.2.0.4>, Solaris
>>
>> We are having a production issue where user processes are very
>> slow. We are
>> seeing 'cache buffers chains' latch events in the 10046 trace
>> file.
>>
>> What I'm not fully understanding is the long periods of
>> unaccounted for time
>> where the process hangs. If I do a 'tail -f' on the trace file
>> I can see it
>> pause for several seconds. Here is a small sample of the output:
>>
>> *** 2008-08-05 15:22:57.748
>> WAIT #23: nam='latch free' ela= 8 p1=16764558008 p2=98 p3=0
>> WAIT #23: nam='latch free' ela= 12 p1=16764645560 p2=98 p3=0
>> *** 2008-08-05 15:23:16.414
>> WAIT #23: nam='latch free' ela= 9 p1=16796730344 p2=98 p3=0
>> WAIT #23: nam='latch free' ela= 10 p1=16764600056 p2=98 p3=0
>> WAIT #23: nam='db file sequential read' ela= 1893 p1=66
>> p2=52489 p3=1
>> *** 2008-08-05 15:23:33.376
>> WAIT #23: nam='latch free' ela= 8 p1=16764602936 p2=98 p3=0
>> *** 2008-08-05 15:23:47.319
>> WAIT #23: nam='latch free' ela= 7 p1=16796666408 p2=98 p3=0
>>
>> If you look at the first four lines, you see 20 seconds have
>> elapsed but the
>> only wait events are the two 'cache buffers chains' waits for
>> a total of 20
>> microseconds.
>>
>> Can someone explain what is going on here and how to troubleshoot?
>>
>> FYI, this process is deleting and the inserting a LOB and the
>> hot blocks seem
>> to be in the LOB index.
>>
>> Thanks
>> Keith
>>
>> --
>> http://www.freelists.org/webpage/oracle-l
>>
>>
>>
>>
>>
>>
>> --
>> http://www.freelists.org/webpage/oracle-l
>>
>>
>>
>
>
>
-- http://www.freelists.org/webpage/oracle-lReceived on Wed Aug 06 2008 - 14:40:09 CDT