RE: OT Important: Oracle processes taking lots of CPU
Surprised to see somebody still uses the word 'grok'. I reread the book by
Heinlein last summer and found it still worth reading although it is from
1961 (before computers almost). Not that I read it when it came out, a bit
young then, I first read it late in the 70's (last century, my, my, am I
getting old?).
-----Original Message-----
From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org]On Behalf Of Mark W. Farnham
Sent: Tuesday, November 23, 2004 17:03
To: sfaroult_at_roughsea.com; Tony.Adolph_at_o2.com; ORACLE-L_at_freelists.org;
New DBA
Subject: RE: Important: Oracle processes taking lots of CPU
Give that man a cigar!
But I'll toss in a few bits anyway. The man page section on the system call or library reference "select" is often not loaded on various derivatives of Dennis Ritchie's operating system.
In brief, last time I checked the arguments were (nfds, readfds, writefds,
exceptfds, timeout),
where nfds is the highest-numbered descriptor to check plus one (so it makes
some sense that it is a constant), and the "fds" are pointers to sets of file
descriptors. If you're reading, then read and except should be non-null; if
you're writing, then write and except should be non-null. That looks like a
really big number for timeout.
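For anyone without the man page to hand, here is a minimal sketch of the call's behaviour in Python, whose select.select wraps the same system call. Python works out nfds itself and takes the timeout in seconds rather than as a pointer, so the mapping onto the five C arguments is only approximate. The pipe and the function name poll_once are made up for illustration and have nothing to do with how Oracle uses the call.

```python
import os
import select


def poll_once(read_fd, timeout_seconds):
    """Wait up to timeout_seconds for read_fd to become readable.

    Returns True if data is ready, False if the call timed out.
    A timeout here corresponds to the "= 0" return seen in truss.
    """
    readable, writable, exceptional = select.select(
        [read_fd],   # readfds: descriptors we want to read from
        [],          # writefds: none in this sketch
        [read_fd],   # exceptfds: watch the same descriptor for exceptions
        timeout_seconds,
    )
    return read_fd in readable


if __name__ == "__main__":
    r, w = os.pipe()              # illustrative pipe, not an Oracle connection
    print(poll_once(r, 0.1))      # nothing written yet, so this times out
    os.write(w, b"x")
    print(poll_once(r, 0.1))      # a byte is waiting, so the pipe is readable
    os.close(r)
    os.close(w)
```

A loop that keeps getting the timed-out result and calls select again straight away with a very short timeout would use CPU without doing any useful work, which is one way to end up with the symptom described in this thread.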
I wonder if this is a zombie (shadow process, two_task, pick your name for it) that is disconnected and something else is now inadvertently using the same descriptor, causing this process to repeatedly be falsely woken. Then its activity would cause it to survive any timeout and keep sitting on the select if it is in a spot where Oracle wants it to wait for a reply.
Heh. A truss daemon rotating through Oracle back ends might be a lightweight way to check for candidate zombies to kill.
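The checking half of such a daemon might look roughly like the sketch below. It assumes you have already captured a few seconds of truss output for one back end as a list of text lines, in the same format as the select line quoted further down in this thread; the function names and the 0.9 threshold are hypothetical, and truss output on other platforms may need a different pattern.

```python
import re

# Matches truss lines such as: select(2048, 0x..., 0x..., 0x..., 0x...) = 0
SELECT_LINE = re.compile(r"^\s*select\((?P<args>[^)]*)\)\s*=\s*(?P<ret>-?\d+)")


def select_timeout_ratio(truss_lines):
    """Return the fraction of truss lines that are select() calls returning 0."""
    total = 0
    timeouts = 0
    for line in truss_lines:
        if not line.strip():
            continue  # ignore blank lines in the capture
        total += 1
        match = SELECT_LINE.match(line)
        if match and int(match.group("ret")) == 0:
            timeouts += 1
    return timeouts / total if total else 0.0


def looks_like_spinner(truss_lines, threshold=0.9):
    """Hypothetical rule of thumb: nearly every call is a select() that timed out."""
    return select_timeout_ratio(truss_lines) >= threshold
```

A process flagged this way is only a candidate; you would still want to check whether its session is really disconnected before killing anything.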
Then again, you might really be waiting in a competitive race for access, but I don't grok why both read and write are filled in and a number that looks like a file descriptor is in the timeout argument.
Regards,
mwf
-----Original Message-----
From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org]On Behalf Of Stephane Faroult
Sent: Tuesday, November 23, 2004 9:01 AM
To: Tony.Adolph_at_o2.com; ORACLE-L_at_freelists.org; New DBA
Subject: Re: Important: Oracle processes taking lots of CPU
I know that it may seem confusing on oracle-l, but 'select' doesn't ONLY refer to the SQL language - in this case, it has to do with I/O multiplexing - try 'man select'. Identifying what your file descriptors are pointing to might help. In any case, you are more likely to see things with sar or iostat than in the V$ views, as you pointed out. Regards,
Stephane Faroult
RoughSea Ltd
http://www.roughsea.com
On Tue, 23 Nov 2004 05:25 , New DBA <new_dba_on_the_block_at_yahoo.com> sent:
Tony,
Yes, if Oracle is not waiting but working, no wait will be registered. But it should at least be reflected in the "CPU used by this session" stat. It isn't.
I traced a few processes, but the trace files show no SQL which takes lots of CPU. Moreover, the CPU utilization in the trace file, or in v$sesstat, doesn't match the actual CPU taken by the process as seen from OS commands like "top".
That's why I believe it's some kind of O/S issue.
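To put a number on that mismatch, a small sketch like the one below can help. It assumes the "CPU used by this session" value from v$sesstat is in centiseconds (tens of milliseconds), as Oracle documents it, and that you have read the process's accumulated CPU time in seconds from "top" or "ps". The function name and the figures in the example are made up.

```python
def unaccounted_cpu_seconds(oracle_cpu_centiseconds, os_cpu_seconds):
    """CPU the OS charged to the process that Oracle's statistic does not show.

    oracle_cpu_centiseconds: "CPU used by this session" (assumed centiseconds)
    os_cpu_seconds: accumulated CPU time for the same process as seen by the OS
    """
    oracle_seconds = oracle_cpu_centiseconds / 100.0
    return os_cpu_seconds - oracle_seconds


if __name__ == "__main__":
    # Hypothetical figures: Oracle reports 1500 cs, the OS shows 10 minutes.
    print(unaccounted_cpu_seconds(1500, 600.0))
```

A large positive gap would be consistent with the process burning CPU outside anything Oracle instruments, though it does not by itself say where.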
So I did a truss on the process. And I saw the following line repeating infinitely.
select(2048, 0x800003fffdffb3d0, 0x800003fffdffb4d0, 0x800003fffdffb5d0, 0x800003fffdffb6d0) = 0
I'm not sure how to interpret the output of truss, so I posted it in this forum since there are many experts out here, who might be able to interpret it!
Is there any further information I can gather at the O/S level which throws some light on the problem?
As far as statspack is concerned, we haven't implemented statspack, but I did run utlbstat/utlestat and uploaded the output to oraperf.com. It didn't suggest or detect excess CPU/LIOs, since those stats are pretty acceptable in the trace files.
Regards
New DBA
>I'm no expert here, but here *may be* a few things
>to think about:
>
>When Oracle is actually doing something it isn't
>recorded as a wait event,
>e.g. getting a datablock that is in cache doesn't
>generate a wait event.
>If your query is "horrible" you could be using loads
>of CPU without
>generating many wait events.
>
>A little more dodgy info: "db file sequential
>read" is normally
>associated with datafile access by rowid, ie. after
>an index lookup.
>
>I think I'd try to find out which queries are
>running during the
>performance problem times and explaining the
>queries.
>
>Also, have you run spreport for this time period?
>
>Told you I wasn't an expert, but I hope that prompts
>other readers to fill
>in the gaps and give you better hints,
>
>Good luck
>Tony
-- http://www.freelists.org/webpage/oracle-l
Received on Wed Nov 24 2004 - 02:21:16 CST