Excessive cpu usage on 10.2.0.3 linux 64bit

From: Giovanni Cuccu <giovanni.cuccu_at_gmail.com>
Date: Wed, 25 Jun 2008 13:19:20 +0200
Message-ID: <23e0d1170806250419p4c22c3f8s2a54ef34f8229cfe@mail.gmail.com>

Hi all,

I'm facing a problem on Oracle 10.2.0.3 linux 64bit (Centos 5) I'm issuing a simple delete from command, the table has 4M rows and is referenced by some fk (but there is and index for every fk on the other tables).
the command takes ages to complete and I cannot figure why since two days ago there were no problems.
Since it'is a development box I hadn't collected any performance data before the issue.
This is what I've done in order to collect some information. I issued a query on v$session_longops that shows a very slow progress in getting a single block from the table I enabled sql_trace with binds and here is an excerpt

WAIT #2: nam='free buffer waits' ela= 10728 file#=2 block#=59589 set-id#=5 obj#=46865 tim=1185928593541132 WAIT #2: nam='free buffer waits' ela= 10726 file#=2 block#=59589 set-id#=5 obj#=46865 tim=1185928593551874 WAIT #2: nam='free buffer waits' ela= 10729 file#=2 block#=59589 set-id#=5 obj#=46865 tim=1185928593562617
*** 2008-06-25 12:47:59.815

WAIT #2: nam='free buffer waits' ela= 6959 file#=2 block#=59589 set-id#=5 obj#=46865 tim=1185928593569595 WAIT #2: nam='free buffer waits' ela= 221 file#=2 block#=59589 set-id#=5 obj#=46865 tim=1185928593569854

WAIT #2: nam='db file scattered read' ela= 105346 file#=7
block#=106821 blocks=16 obj#=46828 tim=1185928594430683
WAIT #2: nam='db file sequential read' ela= 6993 file#=8 block#=934757

blocks=1 obj#=46865 tim=1185928595079089 WAIT #2: nam='db file sequential read' ela= 4705 file#=8 block#=131725 blocks=1 obj#=46865 tim=1185928597295791 WAIT #2: nam='db file sequential read' ela= 6222 file#=8 block#=118799 blocks=1 obj#=46864 tim=1185928597498068 WAIT #2: nam='db file sequential read' ela= 9507 file#=8 block#=131805 blocks=1 obj#=46865 tim=1185928599659415 WAIT #2: nam='db file sequential read' ela= 9257 file#=8 block#=118559 blocks=1 obj#=46864 tim=1185928599885609 WAIT #2: nam='db file sequential read' ela= 15414 file#=8 block#=764020 blocks=1 obj#=46864 tim=1185928602481449
*** 2008-06-25 12:48:12.280

WAIT #2: nam='latch free' ela= 36 address=4318780640 number=127
tries=1 obj#=-1 tim=1185928605742858
WAIT #2: nam='db file sequential read' ela= 38090 file#=8

block#=118736 blocks=1 obj#=46864 tim=1185928605981000

WAIT #2: nam='latch free' ela= 23 address=4318780640 number=127
tries=1 obj#=-1 tim=1185928609090528
WAIT #2: nam='latch free' ela= 44 address=4318780640 number=127
tries=1 obj#=-1 tim=1185928610275061
WAIT #2: nam='db file sequential read' ela= 3839 file#=8 block#=761139

blocks=1 obj#=46864 tim=1185928612233425 WAIT #2: nam='db file sequential read' ela= 24 file#=8 block#=934722 blocks=1 obj#=46865 tim=1185928612798930
*** 2008-06-25 12:48:23.585

WAIT #2: nam='db file sequential read' ela= 17800 file#=8 block#=763028 blocks=1 obj#=46864 tim=1185928616782421

I noticed the 10 second pauses and i tried to do some other investigations

select * from v$session_wait where sid=145 usually shows db file sequential read event

select * from v$session_event where sid=145 shows the i/o events as the most time consuming (high values in the time_waited_micro)

select * from v$sess_time_model where sid=145 shows

145	3649082374	DB time	2410134327
145	2748282437	DB CPU	2328552497

......

So far it seems that the process is consuming a lot of cpu

the ps -fe confirmed that the unix process corresponing to the command executed is eating 94% of the CPU
I did a strace of the unix process and this is an excerpt: poll([{fd=13, events=POLLIN|POLLRDNORM}], 1, 0) = 0

times(NULL)                             = 498061066
times(NULL)                             = 498061066
times(NULL)                             = 498061066
times(NULL)                             = 498061066
times(NULL)                             = 498061066
times(NULL)                             = 498061066
times(NULL)                             = 498061068
times(NULL)                             = 498061068
times(NULL)                             = 498061068
times(NULL)                             = 498061068
times(NULL)                             = 498061068
times(NULL)                             = 498061068
times(NULL)                             = 498061069
times(NULL)                             = 498061069
times(NULL)                             = 498061069
times(NULL)                             = 498061069
times(NULL)                             = 498061069
times(NULL)                             = 498061069
poll([{fd=13, events=POLLIN|POLLRDNORM}], 1, 0) = 0
times(NULL)                             = 498061071
times(NULL)                             = 498061071
times(NULL)                             = 498061071
times(NULL)                             = 498061071
times(NULL)                             = 498061071
times(NULL)                             = 498061071
times(NULL)                             = 498061072
times(NULL)                             = 498061072

pread(12, "\6\242\0\0\215\320\1\2\6\361)#\0\0\f\4B\335\0\0\2\0.\0\20\267\0\0\25\26\35#"..., 8192, 974233600) = 8192

times(NULL)                             = 498061073
times(NULL)                             = 498061073
times(NULL)                             = 498061073
times(NULL)                             = 498061073
times(NULL)                             = 498061075
times(NULL)                             = 498061075
times(NULL)                             = 498061075
times(NULL)                             = 498061075
times(NULL)                             = 498061075
times(NULL)                             = 498061075

poll([{fd=13, events=POLLIN|POLLRDNORM}], 1, 0) = 0

I'm not linux an expert but it seems a polling process. the command iostat 5 -k shows very low rates (some days ago I saw values 10 times higher while executing the same sql command)

Can anyone tell me if there is some other investigation I can do? does anyopne has any advice?

Thanks a lot,

Giovanni

-- 
--------------------------------------------------------------------
"You don't know the power of dark side" - Darth Vader
--
http://www.freelists.org/webpage/oracle-l

Received on Wed Jun 25 2008 - 06:19:20 CDT

This message: [ Message body ]
Next message: Mercadante, Thomas F (LABOR): "RE: Recreating a trigger in a 'live' system"
Previous message: freek D'Hooge: "RE: Off -Topic - Netbackup Command Help -Solved"
Next in thread: Dennis Williams: "Re: Excessive cpu usage on 10.2.0.3 linux 64bit"
Reply: Dennis Williams: "Re: Excessive cpu usage on 10.2.0.3 linux 64bit"
Maybe reply: Giovanni Cuccu: "Re: Excessive cpu usage on 10.2.0.3 linux 64bit"
Maybe reply: Yong Huang: "Re: Excessive cpu usage on 10.2.0.3 linux 64bit"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message