SQL*Net message waits

From: Robyn Anderson Sands <robyn_at_iTeamTech.COM>
Date: Tue, 16 Mar 2004 17:23:58 -0500 (EST)
Message-ID: <Pine.GSO.4.05.10403161625190.5401-100000@dizzy.sweettea.org>

Hi Mark,

I've asked the users to let me know which processes need to be evaluated but they haven't been able to tell me thus far. Application speed is an issue with some users, but not others. The only SLA I've been given is 99.98 uptime on a very buggy application (sigh).

I normally use 10046 traces on individual, longer running jobs and I know this isn't the technically correct way to do this, but the only question I'm being asked at the moment is 'Is there a communication problem between the app server and the db?'

Summary of the trace file info looks like this:

call	count	cpu	elapsed	disk	query	current	rows
Parse	9388	2.29	2.75	0	10	0	0
Execute 21338	14.84	18.42	50	46799	12979	2733
Fetch	18272	56.16	61.66	62591	4929531	4158	217312	
Total	48998	73.29	82.83	62641	4976340	17137	220045

Event waited			times   max     total
db file scattered read 		6792	0.02	2.49
db file sequential read		2537	0.03	5.65
latch free			47	0.02	0.12
log file sync			5	0.03	0.06
SQL*Net message from client:	14414	28.86	804.08 *
SQL*Net message to client:	14415	0.01	0.05   *
SQL*Net more data from client:	50	0.01	0.01
SQL*Net more data to client:	4296	0.01	0.23

These numbers are from pooled connection traces and there is lots of user think time that would be included. That's my real question: can I use a 10046 trace to evaluate pooled client connections? Since the database appears to be waiting on a client input, can I ignore the 'from client' waits or could they be hiding other problems?

No problems have been noted on the database or the os. The servers are all sized beyond the vendor specs; client hardware is another story.

If there's a better way to monitor for connection problems between an app server and the database, please let me know ...

Robyn

-----Original Message-----

From: Bobak, Mark
Sent: Tuesday, March 16, 2004 4:32 PM
To: oracle-l_at_freelists.org
Subject: RE: SQL*Net message waits

Robyn,

Ok, of a 30 minute window, 0.05 seconds total was attributed to SQL*Net message waits. How many total SQL*Net waits were in the trace? I'm a little concerned because with the scoping error you admit is present in your tracing, it's going to obfuscate the solution a bit. I'm also a little concerned that "the users can't tell me exactly which actions they believe are slow....". That statement tells me your system has no real measurable SLA. Without that, how will you know when you've succeeded? Ok, ok, I know, it wasn't your idea but you're stuck with it. Which business process is the most business critical and performing the worst. Start there. Try to get an accurately scoped trace of that process. This will reveal much in determining where the problem is.

Now, going back to the trace you already got: What does the profile of the other 29 minutes 59.95 seconds look like? How many database calls are in the trace file? Database calls are PARSE, EXEC and FETCH. Also, how much CPU time is represented in that 30 minutes? How much wait time? What events were you waiting on?

Also, consider what's going on the in the O/S. Are the CPUs hammered? Are there lots of processes waiting on I/O?

Hope that helps,

-Mark

Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.
--

Archives are at http://www.freelists.org/archives/oracle-l/ FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html

Received on Tue Mar 16 2004 - 16:31:06 CST