Re: Oracle 10g hangs intermittently waiting for I/O

From: Yechiel Adar <adar666_at_inter.net.il>
Date: Tue, 19 May 2009 08:35:54 +0300
Message-id: <4A12453A.9010704_at_inter.net.il>



I read the whole thread and want to add:
Do you have a live backup of your storage to a remote location? We had a problem with slow I/O that was solved after they stopped the remote copy of these disks.
Maybe a big update to the disks is causing a slowdown, as the controller is busy writing to the remote location.

In our case they decided to stop the remote copy on the redo logs :-) Wise guys.
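
To check whether the stalls line up with remote copy activity, it may help to log exactly when they happen. Below is a minimal sketch in Python that scans vmstat output for the symptom described in the quoted message (bi and bo both 0 while the procs/b column is above 0). The function name find_stalls and the SAMPLE text are made up for illustration; the numbers in SAMPLE are not from a real system.

```python
def find_stalls(vmstat_text):
    """Return (sample_index, blocked_procs) for every vmstat sample
    where bi == 0 and bo == 0 while processes are blocked on I/O (b > 0)."""
    columns = None
    stalls = []
    sample = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column header line names the fields; vmstat repeats it periodically.
        if "bi" in fields and "bo" in fields:
            columns = {name: i for i, name in enumerate(fields)}
            continue
        # Skip the "procs ---memory--- ..." banner and anything non-numeric.
        if columns is None or not fields[0].isdigit():
            continue
        if len(fields) < len(columns):
            continue
        blocked = int(fields[columns["b"]])
        blocks_in = int(fields[columns["bi"]])
        blocks_out = int(fields[columns["bo"]])
        if blocks_in == 0 and blocks_out == 0 and blocked > 0:
            stalls.append((sample, blocked))
        sample += 1
    return stalls


# Illustrative input only: hand-written numbers in the usual vmstat layout.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 100000  2000 300000    0    0  5120  2048 1200 3000 10  5 60 25  0
 0  6      0 100000  2000 300000    0    0     0     0  300  500  0  1 50 49  0
 0  4      0 100000  2000 300000    0    0     0     0  280  480  0  1 50 49  0
 2  1      0 100000  2000 300000    0    0  4096  1024 1100 2900  9  4 62 25  0
"""

for index, blocked in find_stalls(SAMPLE):
    print(f"sample {index}: no block I/O, {blocked} processes waiting")
```

Sample indices count data lines from 0. When vmstat is run with a fixed interval, the index can be turned into a rough timestamp and compared against the times the storage replication was running.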

Adar Yechiel
Rechovot, Israel

Paweł Kotlarz wrote:
> Hello all.
>
> I have oracle 10.2.0.3 data warehouse database on 11.1.0.7 ASM with
> asmlib. RHEL 4.7. Proliant DL585 G2 with MSA70 storage.
>
> The problem I face is an 'I/O hiccup'. The database can work properly
> for a week or two and then suddenly keep stalling for no apparent
> reason. Users complain that their selects take 2x or 3x more time.
> vmstat shows I/O activity (bi, bo columns) for half a minute and for
> another half a minute shows no activity (bi and bo columns equal to 0)
> and a number of processes waiting for I/O (procs/b column). strace on an
> oracle process waiting for I/O shows it is waiting for a completion of
> 'read' call. The only thing that helps is rebooting the box.
>
> I can isolate the problem to specific disks using iostat. These disks
> are the same on a day the problem occurs but they are different on
> another occurrence of the problem. Storage / Linux admins do not see any
> problem on their side.
>
> I have several one-off patches recommended by Oracle support:
>
> Bug 5452672: Hung database instance if linux kernel miss aio request
> Bug 6656824: LNX-10204-TC6 SIGSEGV AT SKGFR_REAP64()+281, IN DBW0
> Bug 6087207: WARNING:ORACLE PROCESS RUNNING OUT OF OS KERNEL I/O
> RESOURCES
> Bug 6882513 - MERGE LABEL REQUEST ON TOP OF 10.2.0.3 FOR BUGS 6801535
> 5576584
> Bug 5576584 (4880399): ASM PARALLEL READS PERFORMANCE NOT ACCEPTABLE
>
> I plan to upgrade to 10.2.0.4 but need first to sort out some hash join
> bugs (yet unknown to Oracle) that break our large queries with ora-600
> errors.
>
> What would you recommend to do to narrow down the problem to Oracle /
> ASM / asmlib / Linux / storage fault?
>
> Do you know of any other bugs that can show such a behaviour?
>
> Thanks.
>
>
> Pawel Kotlarz
> --
> http://www.freelists.org/webpage/oracle-l

--
http://www.freelists.org/webpage/oracle-l
Received on Tue May 19 2009 - 00:35:54 CDT
