Usenet, c.d.o.server: Re: RAC unexpected reboot of nodes
alek wrote:
> Hi,
>
> I'm quite new to RAC and I want to know whether the following
> behavior is normal for such a configuration:
>
> A few weeks ago we succeeded in configuring an Oracle 10.2.0.1 cluster.
> The configuration comprised 2 nodes and the underlying OS was
> Redhat AS4. The installation went well, following all the installation
> steps given in the official Oracle documentation. The OCR and the
> voting disks were configured using NFS. At that time we noticed that
> from time to time one of the nodes (not always the same one) was
> unexpectedly rebooted. The system and Oracle logs didn't offer any
> clues, so our conclusion was that NFS might be causing the problems.
> To test this we decided to configure RAC on a single node
> just for testing purposes. The OCR, voting disks and the Oracle
> software were installed on OCFS2 partitions, so no NFS was
> involved. On this node we configured 2 Oracle instances, which worked
> fine for a while, but from time to time, or when the server is stressed
> with intensive SQL, the entire server is rebooted. After some searching
> on Metalink we found Bug 4741921/4556989 (36) INSTANCE
> RESTARTED AFTER SHUTDOWN ABORT IN RAC ENVIRONMENT, which is fixed in
> the 10.2.0.2 patch. We downloaded and installed the patch but it seems
> that the strange behavior is still there. We do notice that the
> frequency of the server reboots is lower now, but we have no explanation
> for what really causes them.
> Has anyone noticed the same behavior on a 10.2.0.x RAC configuration?
> Are there any workarounds for this?
Unexpected node reboots are unfortunately a fairly common occurrence under Linux, usually associated with some kind of lockup or access problem against the shared disk system.
There are some timeout parameters you can try setting higher so that the clusterware tolerates longer periods of being unable to access the storage.
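To make the effect of a "higher" setting concrete, here is a rough sketch in Python. It is not Oracle code and does not call any Oracle tool: the function name, the stall durations, and the two timeout values are all made up for illustration. It only models the idea that a node gets rebooted when a storage stall lasts longer than the configured tolerance, so raising the tolerance means fewer stalls end in a reboot.

```python
def stalls_causing_reboot(stall_seconds, timeout_seconds):
    """Return the stalls that last longer than the tolerated timeout.

    Hypothetical model only: each such stall is assumed to end in a
    node reboot; shorter stalls are assumed to be ridden out.
    """
    return [s for s in stall_seconds if s > timeout_seconds]


# Hypothetical storage stalls, in seconds, seen over some period.
observed_stalls = [5, 42, 75, 130]

# Illustrative timeout values, not recommendations.
with_low_timeout = stalls_causing_reboot(observed_stalls, 60)
with_high_timeout = stalls_causing_reboot(observed_stalls, 120)

print("reboots with 60s tolerance: ", with_low_timeout)
print("reboots with 120s tolerance:", with_high_timeout)
```

Note the trade-off this implies: a longer tolerance also means a node that is genuinely hung takes longer to be removed from the cluster, so raising the value hides storage stalls rather than fixing them.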
Jeffrey Hunter's site, www.idevelopment.info, has some long writeups on RAC Linux configuration that may be a step in the right direction.
Opening a TAR with Oracle Support is, for better or worse, probably the other direction you need to go in.
Received on Thu Mar 09 2006 - 12:08:38 CST