RE: clustering
We used to experience problems in our RAC environment whenever there was an
interconnect failure. There is a workaround for that problem that worked
for us:
Create a directory under $ORACLE_HOME/rdbms called ".aixopt". Create (touch) a file called SUSTAIN_IPC_FAILURE (uppercase - 0 byte file).
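Those two steps can be sketched in Python. This is only a sketch under assumptions: it assumes the zero-byte file goes inside the new ".aixopt" directory (the post does not say so explicitly), that $ORACLE_HOME/rdbms already exists, and the function name create_ipc_flag is made up for illustration.

```python
import os
from pathlib import Path


def create_ipc_flag(oracle_home: str) -> Path:
    """Create rdbms/.aixopt/SUSTAIN_IPC_FAILURE under oracle_home.

    Assumption: the flag file lives inside the .aixopt directory.
    """
    opt_dir = Path(oracle_home) / "rdbms" / ".aixopt"
    # Do not create rdbms itself; a missing rdbms means a wrong ORACLE_HOME.
    opt_dir.mkdir(exist_ok=True)
    flag = opt_dir / "SUSTAIN_IPC_FAILURE"
    # Same effect as "touch": creates an empty file if none exists.
    flag.touch(exist_ok=True)
    return flag


if __name__ == "__main__":
    home = os.environ.get("ORACLE_HOME")
    if home:
        print(create_ipc_flag(home))
    else:
        print("ORACLE_HOME is not set")
```

Running it a second time is harmless: the directory and file are left as they are.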
We're using a 9.2.0.3 two-node RAC on AIX 5L / HACMP 4.4.
Do Sun or Tru64 have similar workarounds, or does it work flawlessly there without the workaround? Having this workaround tells RAC to make sure that at least one instance in the cluster survives, instead of all instances crashing. Here's the section from the alert log file with an example of handling failures of all 3 interconnects.
Marking down Network with IP 192.168.17.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:29 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:29 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:29 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:30 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:33 2003
Marking down Network with IP 192.168.19.11
WARNING!!! NO COMMON NETWORKS FOR ALL NODES TO COMMUNICATE
SUSTAINING IPC FAILURE
THIS SHOULD BE THE ONLY INSTANCE RUNNING IN THIS CLUSTER
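To check an excerpt like this one, a small Python sketch can collect the distinct networks that were marked down and detect whether the instance reported that it is sustaining the failure. The message strings are taken from the excerpt above; the function name and the returned dictionary are hypothetical, and other Oracle versions or platforms may word these messages differently.

```python
import re

# Pattern copied from the alert log lines in the excerpt above.
MARK_DOWN = re.compile(r"Marking down Network with IP (\d+\.\d+\.\d+\.\d+)")


def summarize_alert_excerpt(text: str) -> dict:
    """Return the distinct networks marked down and the sustain flag."""
    networks = []
    sustained = False
    for line in text.splitlines():
        match = MARK_DOWN.search(line)
        if match:
            ip = match.group(1)
            if ip not in networks:  # keep first-seen order, drop repeats
                networks.append(ip)
        elif "SUSTAINING IPC FAILURE" in line:
            sustained = True
    return {"networks_down": networks, "sustained": sustained}


if __name__ == "__main__":
    sample = """Marking down Network with IP 192.168.17.11
Thu Apr 10 23:29:28 2003
Marking down Network with IP 192.168.18.11
Thu Apr 10 23:29:33 2003
Marking down Network with IP 192.168.19.11
WARNING!!! NO COMMON NETWORKS FOR ALL NODES TO COMMUNICATE
SUSTAINING IPC FAILURE
THIS SHOULD BE THE ONLY INSTANCE RUNNING IN THIS CLUSTER"""
    print(summarize_alert_excerpt(sample))
```

For the excerpt above this reports three networks down (192.168.17.11, 192.168.18.11, 192.168.19.11) and the sustain flag set.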
-----Original Message-----
Sent: Tuesday, July 29, 2003 9:29 AM
To: Multiple recipients of list ORACLE-L
Hrrrmm - well, we've never seen the problem you describe, and we've got a pretty big RAC environment here (clusters from two to six nodes, and we combine dev clusters to build bigger ones as we need). The situation you describe sounds like what happens when there's an interconnect failure. Each node independently concludes that it's been separated from the rest of the cluster and (effectively) shoots itself in the head. This causes every instance to hang. This is why the crafty RAC Jedi designs well their interconnect architecture.
But yes, if you're willing to take the "completely 2n capacity" cluster route and have two databases, double the Oracle licenses, two storage arrays, two fibre channel networks, etc., that is the highest availability/reliability cluster you can have, although at the highest cost and complexity.
Which clustering solution is right for you? Cheap and inelegant? Expensive and bullet-proof? Well, that's why we get paid the big bucks, right? :)
Thanks,
Matt
--
Matthew Zito
GridApp Systems
Email: mzito_at_gridapp.com
Cell: 646-220-3551
Phone: 212-358-8211 x 359
http://www.gridapp.com <http://www.gridapp.com/>
-----Original Message-----
Tanel Poder
Sent: Monday, July 28, 2003 7:05 PM
To: Multiple recipients of list ORACLE-L
However, failed transactions must be handled from the client side. Queries may migrate to surviving nodes transparently. Also, RAC currently has many problems, such as all nodes hanging when one node dies. Completely separate systems are still (and will always be) the most available solution.
Tanel.
----- Original Message -----
To: Multiple recipients of list <mailto:ORACLE-L_at_fatcity.com> ORACLE-L
Sent: Monday, July 28, 2003 7:49 PM
Another important difference is that RAC is the best high-availability solution in case of system or instance failure. With an HP or Veritas cluster, all of the resources get stopped on the live node of the cluster and then get started on the second node, so users are affected. With RAC, the user session moves over seamlessly when a system or instance fails.
Indy Johal
"Ron Rogers" <RROGERS_at_galottery.org>
Sent by: ml-errors_at_fatcity.com
07/28/03 12:29 PM
Please respond to ORACLE-L
To: Multiple recipients of list ORACLE-L <ORACLE-L_at_fatcity.com>
cc:
Subject: Re: clustering
ak,
As I understand it, an HP cluster is 2 boxes that have the capability to access the same disks and data, but only one can have the Oracle instance running and accessing the datafiles (active). Sort of like a high availability option. With RAC both boxes can access the instance and datafiles at the same time.
List, correct me if I need it.
Ron
Received on Tue Jul 29 2003 - 17:09:28 CDT
>>> oramagic_at_hotmail.com 07/28/03 12:14PM >>>
Hi guys,
I am new to this clustering concept and am just trying to understand a few basics. I need your help. What is the difference between Oracle running on a Sun/HP cluster with 2 nodes and Oracle with RAC running on 2 nodes?
thanks,
-ak
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Ron Rogers
INET: RROGERS_at_galottery.org
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
--
Author: Balakrishnan, Ashok - VSCM
INET: Balakrishnan.Ashok_at_vectorscm.com