Re: RAC on OCFS2 acceptance testing
A customer ran into similar problems with OCFS2 on RHEL4 Update 4
(smp kernel).
Heavy db updates or mixed I/O (cp from ocfs to ext3, Oracle export to
ext3) would cause the cluster to become unresponsive and crash a node.
cp and exp caused a high load average and heavy swapping. We couldn't
even ssh to the host.
I didn't understand the heavy swapping because there was 3GB of cache
memory available (shown by free -m).
It had something to do with ocfs and low-memory usage. I never got a
clear answer on it.
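One way to look at low memory separately from the totals that free -m reports is to read the LowTotal/LowFree fields in /proc/meminfo. This is only a sketch, and it assumes a 32-bit kernel built with highmem support, which is not confirmed for the customer's hosts; on other kernels those fields are simply absent. The function names are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name:   12345 kB" style lines.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def low_memory_summary(fields):
    """Return (low_free_kb, low_total_kb), or None if the kernel
    does not report a low/high memory split."""
    if "LowFree" in fields and "LowTotal" in fields:
        return fields["LowFree"], fields["LowTotal"]
    return None


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    summary = low_memory_summary(info)
    if summary is None:
        print("No LowTotal/LowFree fields on this kernel")
    else:
        low_free, low_total = summary
        print(f"Low memory free: {low_free} kB of {low_total} kB")
```

If low memory is nearly exhausted while the overall cache figure still looks healthy, that would at least be consistent with the swapping described above, though it would not prove the cause.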
They ended up setting "vm.lower_zone_protection=100", which helped the swapping issue.
The fencing problem was attributed to the following init.ora parms:
filesystemio_options = asynch
disk_asynch_io = TRUE
They were changed to:
disk_asynch_io=FALSE
filesystemio_options='DIRECTIO'
Things have improved since.
I asked Oracle for a good document for OCFS2 and RAC and still
haven't got a response.
I also asked for optimal kernel parameter settings for OCFS2.
The closest I got was the following list, but no values:
- vm.swappiness
- vm.lower_zone_protection
- vm.vfs_cache_pressure
- vm.dirty_ratio
- vm.dirty_background_ratio
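Since no values came with that list, a starting point is to record what each host is currently running with. A minimal sketch, assuming the tunables are exposed under /proc/sys/vm as usual on Linux; the function name is made up, and nothing here suggests what the values should be. As far as I know vm.lower_zone_protection exists only on older 2.6 kernels such as RHEL4's, so a missing file is reported rather than treated as an error.

```python
from pathlib import Path

# Tunables named in the list above; no recommended values are implied.
VM_TUNABLES = [
    "swappiness",
    "lower_zone_protection",
    "vfs_cache_pressure",
    "dirty_ratio",
    "dirty_background_ratio",
]


def read_vm_tunables(base="/proc/sys/vm", names=VM_TUNABLES):
    """Return {name: current value as a string, or None if the
    running kernel does not expose that tunable}."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_vm_tunables().items():
        shown = value if value is not None else "(not present on this kernel)"
        print(f"vm.{name} = {shown}")
```

Running it on each node before and after a change makes it easier to compare settings across the cluster.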
I'm not sure about the "unbreakable" Oracle/Linux combo. I'd be happy if they focused on "stable" Oracle/Linux.
It comes back to "You get what you pay for". Customers think that Oracle spends as much money on the "freebies" (i.e. OCFS) as they do on the database.
My 2¢
P.S. I spend as much time on Bugzilla as I do on Metalink these days.
On Dec 28, 2006, at 11:14 AM, Kevin Closson wrote:
>
> And to point out that I'm not being obtuse,
> here is a snippet from
> http://oss.oracle.com/bugzilla/show_bug.cgi?id=822 :
>
>
> Environment:
> Linux x86-64 Redhat 4.0 Update 3
> OCFS2 1.2.3 3-node cluster.
> Problem:
> After installation, created two filesystems to be used for
> software.
> To limit timeout problems, increased the
> O2CB_HEARTBEAT_THRESHOLD TO
> 31.
>
> During maintenance window, decided to use the OCFS2 filesystem
> to store a large backup file (about 5-10 gig file).
> SCP'ed the file from an outside server to node1 of the cluster
> using command "scp $file oracle_at_sachlp10:/ocfs2_fs1/.
>
> After a few minutes, node1 crashed.
> Did not find error messages on node1, but found them in
> /var/log/messages
> on node2:
>
> ...wow, sounds like a pretty aggressive workload, right?
> --
> http://www.freelists.org/webpage/oracle-l
>
>
-- http://www.freelists.org/webpage/oracle-l
Received on Sat Dec 30 2006 - 11:17:57 CST