A question on Interpreting 10046 trace and the waits [message #291720] |
Sun, 06 January 2008 04:16 |
orausern
Messages: 826 Registered: December 2005
|
Senior Member |
|
|
Hi,
I need help to understand and interpret the output of 10046 trace file (with level 8 ). The problem that we are trying to solve is that: in a batch process where thousands of inserts are being done, at some point application hangs - more correctly, transaction time out occurs. The developers say that: it is the because soemthing happening at the database level and we are analyzing that issue, for that this is a portion of the trace file:
[B]<Query X> [/B]
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 7 0.01 0.00 0 0 0 0
Fetch 14 0.00 0.00 0 63 0 70
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 22 0.01 0.00 0 63 0 70
Misses in library cache during parse: 0
Misses in library cache during execute: 1
Optimizer goal: CHOOSE
Parsing user id: 933
Rows Row Source Operation
------- ---------------------------------------------------
50 SORT UNIQUE
50 NESTED LOOPS OUTER
40 INDEX RANGE SCAN PK_TABLE1 (object id 1581385)
30 TABLE ACCESS BY INDEX ROWID TABLE2
30 INDEX RANGE SCAN NK3_TABLE2 (object id 1581563)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 15 0.00 0.00
[B] SQL*Net message from client 14 29.35 57.40 [/B]
********************************************************************************
It is seen from the trace that maximum wait is "SQL*Net message from the client (54 seconds). In this regard, my question is:
1) Is the <Query X> responsible for this wait? or is the next query in the trace file after this <Query X>, responsible for this ?
2) Developers are asking that where or in what portion of the code should they look to investigate? (based on trace file, they expect me to tell them that.) - Now from the trace file, I found the maximum wait is in the portion copied above. so in that case does it mean that -
a) the <Query X> is reponsible for the wait ? or the next query in trace file just after the <Query X> is responsible or the application code somewhere around <Query X> is responsible?
With thanks,
Nirav
|
|
|
|
Re: A question on Interpreting 10046 trace and the waits [message #291732 is a reply to message #291729] |
Sun, 06 January 2008 05:38 |
orausern
Messages: 826 Registered: December 2005
|
Senior Member |
|
|
Michel,
Yes, you are exactly correct, it is the same question, but actually, the programmers are telling me that:
It is the app server that is waiting on the database, not the other way around. and they ask me the question:
If it is the database that is waiting then, which portion of the code is making it wait? This is a J2EE project. is there any way, by seeing the location of the wait event, I can determine, where the programmers have to look?
To put in their words:
We can’t understand where and why database is waiting to hear back from the application. There is no ongoing interaction between the application and database. Application creates the DML and issues the command to the database to implement and waits for the return value from the database and the transaction times out as it does not hear back from the database on the DML issued.
"So if you can clarify where the DB is waiting to hear back from Application, so we can look at the code" Now how do I find the location where it is waiting?
So I review the entire trace file and find that after one particular query, there is a huge wait. So my question is: Can I tell them that: look in the code around the place where you put this sql (<Query X>)? Would that be a correct interpretation?
Thanks,
Nirav
[Updated on: Sun, 06 January 2008 05:51] Report message to a moderator
|
|
|
|
Re: A question on Interpreting 10046 trace and the waits [message #291816 is a reply to message #291720] |
Mon, 07 January 2008 00:01 |
orafan23
Messages: 13 Registered: December 2005
|
Junior Member |
|
|
As Michel rightly pointed out you have to dig into the trace further.
SQL*Net message from client cannot be attributed to a particular action.Oracle dumps all the latencies into this., say if from the time you turned on trace ,to the time time 1st query was run and then there is a 10secs delay before the next query fires,Oracle rolls up all of these into this wait event.
If there is something you might want to see from the application code, try to remove the nested loop avoiding gathering rows one by one.Also try using array processing feature.
Hope that helps.
Regards
|
|
|
|