Re: Oracle 10g RAC on AIX
Date: Mon, 28 Jan 2008 17:35:15 -0500
Message-ID: <611ad3510801281435v22eb8a9cx683b47338290fd29@mail.gmail.com>
Off the top of my head:
1) On each node, do you get the same results from "opatch lsinventory" for
each ORACLE_HOME (crs, asm and db)? Does each node show that the same
patchsets and one-offs are applied to all relevant OH's?
2) Are there any errors or unusual messages in the clusterware logs?
Particularly:
--- $ORA_CRS_HOME/log/`hostname`/cssd/ocssd.log
--- $ORA_CRS_HOME/log/`hostname`/crsd/crsd.log
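
If it helps, here is a rough sketch of how the two checks above could be
scripted. It is only an illustration under some assumptions: the function
names (patch_ids, compare_inventories, scan_log) are made up, it expects the
"opatch lsinventory" output from each node to have been saved to a text file
first, it assumes each applied patch shows up on a line containing "Patch"
followed by a number (worth verifying against your opatch version), and the
error|fail|fatal pattern is just a guess at what is worth a closer look.

```shell
#!/bin/sh
# Sketch only: helper names and match patterns are assumptions, not Oracle tooling.

# Print the sorted, unique patch numbers found in a saved copy of
# "opatch lsinventory" output. Assumes each applied patch appears on a line
# containing "Patch" followed by spaces and a number.
patch_ids() {
    sed -n 's/.*Patch  *\([0-9][0-9]*\).*/\1/p' "$1" | sort -u
}

# Compare the patch numbers in two saved inventories (for example the same
# home on two different nodes). Returns 0 if they match; otherwise prints
# both lists and returns 1.
compare_inventories() {
    a=$(patch_ids "$1")
    b=$(patch_ids "$2")
    if [ "$a" = "$b" ]; then
        return 0
    fi
    echo "Patch lists differ:"
    echo "$1: $(echo $a)"
    echo "$2: $(echo $b)"
    return 1
}

# Print the lines (with line numbers) of a log file that mention an error.
# The keyword list is an assumption; widen or narrow it as needed.
scan_log() {
    grep -inE 'error|fail|fatal' "$1"
}
```

With the inventories saved under hypothetical names such as node1_crs.txt and
node2_crs.txt, "compare_inventories node1_crs.txt node2_crs.txt" would flag a
mismatch, and "scan_log $ORA_CRS_HOME/log/`hostname`/crsd/crsd.log" would list
candidate lines in the crsd log. Note that this only compares patch numbers,
so the patchset level of each home still needs to be checked by eye.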
On 1/28/08, Sanjay Mishra <smishra_97_at_yahoo.com> wrote:
>
> Jeremy
>
> This is a dev environment. I patched CRS/RDBMS to 10.2.0.3 and then
> applied merge patch 6440669. I have now rebooted all nodes and all the
> daemons are back. ocrcheck also shows everything is fine, but now
> crs_stat -t is hanging on all of them. After the reboot it came up fine,
> but when I tried again 15-20 minutes later, it was hanging.
>
>
>
> The last part of applying the 6440669 patch has a special instruction
> for AIX:
>
> To try and identify the likely cause please execute the following commands
>
> # and provide the output to your support representative, who will be able to
>
> # identify the corrective steps.
>
> # genld -l | grep <CRS_HOME>
>
> # genkld | grep <CRS_HOME> ( full or partial path will do )
>
>
> I got a few rows in the output. I even ran /usr/sbin/slibclean again and
> reran the above gen... commands, but I still see output like this:
> $genkld | grep "/u01/app/crs/10.2"
> 900000003d75100 103e0 /u01/app/crs/10.2/lib/libdbcfg10.a[shr_dbcfg10.o]
> 900000002691100 162fbbd /u01/app/crs/10.2/lib/libttsh10.a[shr_ttsh10.o]
> 900000003d2c100 40de1 /u01/app/crs/10.2/lib/libocrb10.a[shr_ocrb10.o]
> 900000003d28100 162d /u01/app/crs/10.2/lib/libskgxn2.a[shr_skgxn2.o]
> 900000003d1d100 a1cc /u01/app/crs/10.2/lib/libocrutl10.a[shr_ocrutl10.o]
> 900000003cc1100 5b130 /u01/app/crs/10.2/lib/libocr10.a[shr_ocr10.o]
> 9000000025eb100 a5459 /u01/app/crs/10.2/lib/libhasgen10.a[shr_hasgen10.o]
> $genld -l | grep "/u01/app/crs/10.2"
> 900000003d28100 162d /u01/app/crs/10.2/lib/libskgxn2.a[shr_skgxn2.o]
> 900000003d2c100 40de1 /u01/app/crs/10.2/lib/libocrb10.a[shr_ocrb10.o]
> 900000003d1d100 a1cc /u01/app/crs/10.2/lib/libocrutl10.a[shr_ocrutl10.o]
> 900000002691100 162fbbd /u01/app/crs/10.2/lib/libttsh10.a[shr_ttsh10.o]
> 9000000025eb100 a5459 /u01/app/crs/10.2/lib/libhasgen10.a[shr_hasgen10.o]
> 900000003cc1100 5b130 /u01/app/crs/10.2/lib/libocr10.a[shr_ocr10.o]
> 900000003d2c100 40de1 /u01/app/crs/10.2/lib/libocrb10.a[shr_ocrb10.o]
> 900000003d28100 162d /u01/app/crs/10.2/lib/libskgxn2.a[shr_skgxn2.o]
> 900000003d1d100 a1cc /u01/app/crs/10.2/lib/libocrutl10.a[shr_ocrutl10.o]
> 900000003cc1100 5b130 /u01/app/crs/10.2/lib/libocr10.a[shr_ocr10.o]
> 900000002691100 162fbbd /u01/app/crs/10.2/lib/libttsh10.a[shr_ttsh10.o]
> 9000000025eb100 a5459 /u01/app/crs/10.2/lib/libhasgen10.a[shr_hasgen10.o]
> 900000003d1d100 a1cc /u01/app/crs/10.2/lib/libocrutl10.a[shr_ocrutl10.o]
> 900000003d2c100 40de1 /u01/app/crs/10.2/lib/libocrb10.a[shr_ocrb10.o]
> 900000003cc1100 5b130 /u01/app/crs/10.2/lib/libocr10.a[shr_ocr10.o]
> 900000003d28100 162d /u01/app/crs/10.2/lib/libskgxn2.a[shr_skgxn2.o]
> 900000002691100 162fbbd /u01/app/crs/10.2/lib/libttsh10.a[shr_ttsh10.o]
> 9000000025eb100 a5459 /u01/app/crs/10.2/lib/libhasgen10.a[shr_hasgen10.o]
> 900000003d1d100 a1cc /u01/app/crs/10.2/lib/libocrutl10.a[shr_ocrutl10.o]
> 900000003d2c100 40de1 /u01/app/crs/10.2/lib/libocrb10.a[shr_ocrb10.o]
> 900000003d28100 162d /u01/app/crs/10.2/lib/libskgxn2.a[shr_skgxn2.o]
> 900000003cc1100 5b130 /u01/app/crs/10.2/lib/libocr10.a[shr_ocr10.o]
> 900000002691100 162fbbd /u01/app/crs/10.2/lib/libttsh10.a[shr_ttsh10.o]
> 9000000025eb100 a5459 /u01/app/crs/10.2/lib/libhasgen10.a[shr_hasgen10.o]
> I don't know what this means, so, as per the patch instructions, I am
> opening an issue with Oracle support.
>
> There were no errors during patch application on any node.
> ----- Original Message ----
> From: Jeremy Schneider <jeremy.schneider_at_ardentperf.com>
> To: smishra_97_at_yahoo.com
> Cc: oracle-l_at_freelists.org
> Sent: Monday, January 28, 2008 4:36:09 PM
> Subject: Re: Oracle 10g RAC on AIX
>
> Is this a production system? If it's not production, then I'm curious -
> does the system consistently come up in this state when you reboot
> everything? Also, I couldn't tell from your email - did you run those
> commands (ps, crsctl, init.d stop) on node 1 or on the other nodes?
>
>
> On 1/28/08, Sanjay Mishra <smishra_97_at_yahoo.com> wrote:
> >
> > Hi
> >
> > I am working on 10g RAC on AIX and having a strange problem. I applied
> > Patchset 2 and am now having trouble accessing CRS.
> >
> > On node 1, crs_stat -t shows that all resources (gsd/ons/vip) are up on
> > all 5 nodes. On all the other nodes the processes are running, but
> > crs_stat -t gives the following error:
> > CRS-0184: Cannot communicate with the CRS daemon.
> >
> > ps -ef|grep crs gives the output as
> > root 565460 716832 0 14:12:33 - 0:22 /u01/app/crs/10.2/bin/crsd.bin reboot
> > root 589936 1 0 14:12:46 - 0:00 /u01/app/crs/10.2/bin/racgmain ora.gkd122.vip rundetach ora.gkd122.vip 1 startorp gkd122
> > oracle 614456 860378 0 14:12:41 - 0:00 /u01/app/crs/10.2/bin/evmlogger.bin -o /u01/app/crs/10.2/evm/log/evmlogger.info -l /u01/app/crs/10.2/evm/log/evmlogger.log
> > oracle 635124 426140 0 14:12:36 - 0:05 /u01/app/crs/10.2/bin/ocssd.bin
> > oracle 675854 1 0 14:12:47 - 0:00 /u01/app/crs/10.2/opmn/bin/ons -d
> > root 716832 1 0 14:12:31 - 0:00 /bin/sh /etc/init.crsd run
> > oracle 733338 684146 0 14:12:34 - 0:00 /bin/sh -c cd /u01/app/crs/10.2/log/gkd122/cssd/oclsomon; ulimit -c unlimited; /u01/app/crs/10.2/bin/oclsomon || exit $?
> > oracle 737488 675854 0 14:12:47 - 0:00 /u01/app/crs/10.2/opmn/bin/ons -d
> > root 843828 667806 0 14:12:34 - 0:00 /u01/app/crs/10.2/bin/oprocd.bin run -t 1000 -m 500 -f
> > oracle 860378 761930 0 14:12:33 - 0:00 /u01/app/crs/10.2/bin/evmd.bin
> > oracle 864454 733338 0 14:12:34 - 0:01 /u01/app/crs/10.2/bin/oclsomon.bin
> >
> > I ran crsctl check crs:
> > $crsctl check crs
> > CSS appears healthy
> > Try againEVM appears healthy
> > I tried to stop crs using crsctl stop crs or /etc/init.crs stop as root
> > but the processes are still running.
> >
> > Any advice on what to look into?
> >
> > Sanjay
> >
> >
>
>
>
> --
> Jeremy Schneider
> Chicago, IL
> http://www.ardentperf.com/category/technical
>
>
>
--
Jeremy Schneider
Chicago, IL
http://www.ardentperf.com/category/technical
--
http://www.freelists.org/webpage/oracle-l
Received on Mon Jan 28 2008 - 16:35:15 CST