Cluster Installation Fails [message #603735]
Thu, 19 December 2013 00:13
burasami
Messages: 20 Registered: April 2010
Junior Member
Hi All,
I started a fresh installation. At one stage the installer asked me to run root.sh, but then the following error appeared.
INFO: /opt/app/oracle/product/10.2.0/db_1/root.sh #On nodes rubikon120,rubikon121
INFO: To execute the configuration scripts:
1. Open a terminal window
2. Log in as "root"
3. Run the scripts in each cluster node
4. Return to this window and click "OK" to continue
Note: Do not run the scripts simultaneously on the listed nodes.
INFO: Starting to execute configuration assistants
INFO: Command = /opt/app/oracle/product/10.2.0/db_1/bin/racgons add_config rubikon120.xxx.com.yyy:6200 rubikon121.xxx.com.yyy:6200
Command = /opt/app/oracle/product/10.2.0/db_1/bin/racgons has failed
Execution Error : WARNING: rubikon120.xxx.com.yyy:6200 already configured.
WARNING: rubikon121.xxx.com.yyy:6200 already configured.
INFO: Configuration assistant "Oracle Notification Server Configuration Assistant" failed
-----------------------------------------------------------------------------
*** Starting OUICA ***
Oracle Home set to /opt/app/oracle/product/10.2.0/db_1
Configuration directory is set to /opt/app/oracle/product/10.2.0/db_1/cfgtoollogs. All xml files under the directory will be processed
INFO: The "/opt/app/oracle/product/10.2.0/db_1/cfgtoollogs/configToolFailedCommands" script contains all commands that failed, were skipped or were cancelled. This file may be used to run these configuration assistants outside of OUI. Note that you may have to update this script with passwords (if any) before executing the same.
-----------------------------------------------------------------------------
SEVERE: OUI-25031:Some of the configuration assistants failed. It is strongly recommended that you retry the configuration assistants at this time. Not successfully running any "Recommended" assistants means your system will not be correctly configured.
1. Check the Details panel on the Configuration Assistant Screen to see the errors resulting in the failures.
2. Fix the errors causing these failures.
3. Select the failed assistants and click the 'Retry' button to retry them.
INFO: User Selected: Yes/OK
INFO: Starting to execute configuration assistants
INFO: Command = /opt/app/oracle/product/10.2.0/db_1/bin/racgons add_config rubikon120.xxx.com.yyy:6200 rubikon121.xxx.com.yyy:6200
Command = /opt/app/oracle/product/10.2.0/db_1/bin/racgons has failed
Execution Error : WARNING: rubikon120.xxx.com.yyy:6200 already configured.
WARNING: rubikon121.xxx.com.yyy:6200 already configured.
I stopped the installation midway, searched Google, and tried several suggestions from the web, such as checking the permissions of the voting disk and the OCR, but both files have the correct permissions:
OCR has root:oinstall rw-r----
Voting disk has oracle:oinstall rw-r----
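The ownership and permission check described above can also be scripted so it can be repeated on each node. This is only a minimal Python sketch: the function names `describe` and `mismatches` are made up for illustration, and it assumes the mode meant in the post is 640 (written out in full as rw-r-----).

```python
import grp
import os
import pwd
import stat


def describe(path):
    """Return (owner, group, permission string) for a file."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:  # uid has no passwd entry
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:  # gid has no group entry
        group = str(st.st_gid)
    # stat.filemode() returns e.g. "-rw-r-----"; drop the file-type character.
    return owner, group, stat.filemode(st.st_mode)[1:]


def mismatches(path, owner, group, perms):
    """List the attributes of path that differ from the expected values."""
    actual = dict(zip(("owner", "group", "perms"), describe(path)))
    expected = {"owner": owner, "group": group, "perms": perms}
    return [
        f"{name}: expected {expected[name]}, found {actual[name]}"
        for name in expected
        if actual[name] != expected[name]
    ]
```

For example, `mismatches(ocr_path, "root", "oinstall", "rw-r-----")` returns an empty list when the OCR file matches what the post reports; `ocr_path` is a placeholder, since the OCR location is not given here.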
After that, I ran the following command:
oracle@rubikon120:~/software/clusterware/clusterware/cluvfy> ./runcluvfy.sh stage -post crsinst -n rubikon120,rubikon121 -verbose
Performing post-checks for cluster services setup
Checking node reachability...
Check: Node reachability from node "rubikon120"
Destination Node Reachable?
------------------------------------ ------------------------
rubikon121 yes
rubikon120 yes
Result: Node reachability check passed from node "rubikon120".
Checking user equivalence...
Check: User equivalence for user "oracle"
Node Name Comment
------------------------------------ ------------------------
rubikon121 passed
rubikon120 passed
Result: User equivalence check passed for user "oracle".
Checking Cluster manager integrity...
Checking CSS daemon...
Node Name Status
------------------------------------ ------------------------
rubikon121 running
rubikon120 running
Result: Daemon status check passed for "CSS daemon".
Cluster manager integrity check passed.
Checking cluster integrity...
Node Name
------------------------------------
rubikon120
rubikon121
Cluster integrity check failed. This check did not run on the following nodes(s):
rubikon121
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
WARNING:
CSS is probably working with a non-clustered, local-only configuration on nodes:
rubikon121
Verification will proceed with nodes:
rubikon120
Uniqueness check for OCR device passed.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check failed.
Checking CRS integrity...
Checking daemon liveness...
Check: Liveness for "CRS daemon"
Node Name Running
------------------------------------ ------------------------
rubikon121 no
rubikon120 no
Result: Liveness check failed for "CRS daemon".
Checking daemon liveness...
Check: Liveness for "CSS daemon"
Node Name Running
------------------------------------ ------------------------
rubikon121 yes
rubikon120 yes
Result: Liveness check passed for "CSS daemon".
Checking daemon liveness...
Check: Liveness for "EVM daemon"
Node Name Running
------------------------------------ ------------------------
rubikon121 yes
rubikon120 no
Result: Liveness check failed for "EVM daemon".
Liveness of all the daemons
Node Name CRS daemon CSS daemon EVM daemon
------------ ------------------------ ------------------------ ----------
rubikon121 no yes yes
rubikon120 no yes no
CRS integrity check failed.
Checking node application existence...
Checking existence of VIP node application
Node Name Required Status Comment
------------ ------------------------ ------------------------ ----------
rubikon121 yes unknown failed
rubikon120 yes unknown failed
Result: Check failed.
Checking existence of ONS node application
Node Name Required Status Comment
------------ ------------------------ ------------------------ ----------
rubikon121 no unknown ignored
rubikon120 no unknown ignored
Result: Check ignored.
Checking existence of GSD node application
Node Name Required Status Comment
------------ ------------------------ ------------------------ ----------
rubikon121 no unknown ignored
rubikon120 no unknown ignored
Result: Check ignored.
Post-check for cluster services setup was unsuccessful on all the nodes.
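The "Liveness of all the daemons" summary above can be reduced to a per-node list of daemons that are down, which is handy when the check is rerun after each fix. A minimal Python sketch, assuming the data rows keep the column order shown in the header (CRS, CSS, EVM); the name `daemons_down` is made up for illustration.

```python
def daemons_down(summary):
    """Map each node name to the daemons reported as not running."""
    daemons = ("CRS", "CSS", "EVM")  # assumed column order, taken from the header
    down = {}
    for line in summary.splitlines():
        parts = line.split()
        # Data rows look like: <node> <yes|no> <yes|no> <yes|no>
        if len(parts) == 4 and all(p in ("yes", "no") for p in parts[1:]):
            down[parts[0]] = [d for d, s in zip(daemons, parts[1:]) if s == "no"]
    return down


SUMMARY = """\
Node Name CRS daemon CSS daemon EVM daemon
------------ ------------------------ ------------------------ ----------
rubikon121 no yes yes
rubikon120 no yes no
"""

print(daemons_down(SUMMARY))
# {'rubikon121': ['CRS'], 'rubikon120': ['CRS', 'EVM']}
```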
Now I am stuck with no way to proceed. Kindly help me resolve this RAC installation problem.
Thanks & Regards
Sami
Re: Cluster Installation Fails [message #603743 is a reply to message #603735]
Thu, 19 December 2013 01:30
trantuananh24hg
Messages: 744 Registered: January 2007 Location: Ha Noi, Viet Nam
Senior Member
First, please post more information:
- What is the platform type?
- What is the shared-storage type?
- List the network configuration (/etc/hosts; /etc/hostname.NIC for Solaris; /etc/network-config for Linux)
- Are you using multipathing? If yes, which kind: for shared storage, for the network (bonding), or both? For a bonded network, which mode: Active-Active or Active-Passive?
- List the devices used for the OCR and the voting disk. On Solaris they might be slices; on Linux they might be slices, with or without raw binding.
- Are you using mknod?
- Finally, please post the contents of the error log here.
Re: Cluster Installation Fails [message #603760 is a reply to message #603743]
Thu, 19 December 2013 03:13
burasami
Messages: 20 Registered: April 2010
Junior Member
Hi All,
Thanks for your reply.
What's platform type?
SUSE Enterprise Edition 10
What's shared-storage type?
OCFS2
mkfs.ocfs2
Shared-storage, bonded network or both of them? SSH between the nodes works in both directions:
oracle@rubikon120:/> ssh rubikon121 date
Thu Dec 19 14:49:16 NPT 2013
oracle@rubikon120:/>
oracle@rubikon121:~> ssh rubikon120 date
Thu Dec 19 14:50:02 NPT 2013
oracle@rubikon121:~>
ocssd log
[ CSSD]2013-12-16 15:10:19.477 [1199630656] >TRACE: clssgmReconfigThread: completed for reconfig(1), with status(1)
[ CSSD]2013-12-16 15:10:19.588 [1140881728] >TRACE: clssgmClientConnectMsg: Connect from con(0x2aaaaad22250) proc(0x2aaaaad271c0) pid() proto(10:2:1:1)
[ CSSD]2013-12-16 15:10:19.589 [1140881728] >TRACE: clssgmClientConnectMsg: Connect from con(0x2aaaaad27bf0) proc(0x2aaaaad2a2d0) pid() proto(10:2:1:1)
[ CSSD]2013-12-16 15:10:19.589 [1140881728] >TRACE: clssgmClientConnectMsg: Connect from con(0x2aaaaad24db0) proc(0x2aaaaad2a000) pid() proto(10:2:1:1)
[ CSSD]2013-12-16 15:10:20.393 [1191237952] >TRACE: clssnmWaitForAcks: done, msg type(15)
[ CSSD]2013-12-16 15:10:20.393 [1191237952] >TRACE: clssnmDoSyncUpdate: Sync Complete!
[ CSSD]2013-12-16 15:10:20.393 [1132489024] >TRACE: clssnmSendFatalOn: req to syncLeader(1)
[ CSSD]2013-12-16 15:10:20.414 [1124096320] >TRACE: clssnmFatalThread: Fatal mode enabled
[ CSSD]2013-12-16 15:13:27.239 [1107310912] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(1) wrtcnt(1) LATS(5101196) Disk lastSeqNo(1)
[ CSSD]2013-12-16 15:13:28.825 [1132489024] >TRACE: clssnmConnComplete: connected to node 2 (con 0x752c20), state 1 birth 0, unique 1387186106/1387186106 prevConuni(0)
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmDoSyncUpdate: Initiating sync 2
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmSetupAckWait: Ack message type (11)
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmSetupAckWait: node(1) is ALIVE
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmSendSync: syncSeqNo(2)
[ CSSD]2013-12-16 15:13:28.949 [1132489024] >TRACE: clssnmHandleSync: Acknowledging sync: src[1] srcName[rubikon120] seq[6] sync[2]
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(2)
[ CSSD]2013-12-16 15:13:29.020 [2131051872] >USER: NMEVENT_SUSPEND [00][00][00][02]
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmWaitForAcks: done, msg type(11)
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmDoSyncUpdate: node(0) missCount(193) state(0)
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmDoSyncUpdate: node(2) is transitioning from joining state to active state
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmSetupAckWait: Ack message type (13)
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmSetupAckWait: node(1) is ACTIVE
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmSendVote: syncSeqNo(2)
[ CSSD]2013-12-16 15:13:29.953 [1132489024] >TRACE: clssnmSendVoteInfo: node(1) syncSeqNo(2)
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(1)
[ CSSD]2013-12-16 15:13:30.956 [1191237952] >TRACE: clssnmWaitForAcks: done, msg type(13)
[ CSSD]2013-12-16 15:13:30.957 [1191237952] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmEvict: Start
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmWaitOnEvictions: Start
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmWaitOnEvictions: Node(0) down, LATS(0),timeout(5105916)
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmSetupAckWait: Ack message type (15)
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmSetupAckWait: node(1) is ACTIVE
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmSendUpdate: syncSeqNo(2)
[ CSSD]2013-12-16 15:13:31.961 [1132489024] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2013-12-16 15:13:31.961 [1132489024] >TRACE: clssnmDeactivateNode: node 0 () left cluster
[ CSSD]2013-12-18 02:19:06.104 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
[ CSSD]2013-12-18 02:19:08.112 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
[ CSSD]2013-12-18 11:59:52.844 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
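The last lines of the ocssd log above show rubikon121 repeatedly missing check-ins. To see how often and when this happens across a full log, the entries can be pulled out with a small script. A rough Python sketch, assuming every such entry follows exactly the line format shown above; `missed_checkins` is an illustrative name, not part of any Oracle tool.

```python
import re
from collections import Counter
from datetime import datetime

# Matches e.g.
# [ CSSD]2013-12-18 02:19:06.104 [...] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
MISSED = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"clssnmPollingThread: node (\S+) \((\d+)\) missed\((\d+)\) checkin"
)


def missed_checkins(log_text):
    """Return (timestamp, node name, missed count) for each missed-checkin line."""
    events = []
    for line in log_text.splitlines():
        m = MISSED.search(line)
        if m:
            when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            events.append((when, m.group(2), int(m.group(4))))
    return events


SAMPLE = """\
[ CSSD]2013-12-18 02:19:06.104 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
[ CSSD]2013-12-18 02:19:08.112 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
[ CSSD]2013-12-18 11:59:52.844 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
"""

events = missed_checkins(SAMPLE)
print(Counter(node for _, node, _ in events))  # Counter({'rubikon121': 3})
```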
alert log from node 1
2013-12-16 15:10:16.273
[cssd(23486)]CRS-1605:CSSD voting file is online: /oracrsfiles/oracrs/vote.crs. Details in /opt/app/oracle/product/10.2.0/db_1/log/rubikon120/cssd/ocssd.log.
2013-12-16 15:10:19.477
[cssd(23486)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rubikon120 .
2013-12-16 15:10:20.832
[crsd(23116)]CRS-1012:The OCR service started on node rubikon120.
2013-12-16 15:10:20.898
[evmd(23359)]CRS-1401:EVMD started on node rubikon120.
2013-12-16 15:10:21.277
[crsd(23116)]CRS-1201:CRSD started on node rubikon120.
2013-12-16 15:13:32.060
[cssd(23486)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rubikon120 rubikon121 .
2013-12-16 17:33:13.782
[evmd(11079)]CRS-1401:EVMD started on node rubikon120.
2013-12-16 17:33:14.333
[crsd(10814)]CRS-1012:The OCR service started on node rubikon120.
2013-12-16 17:41:19.945
[cssd(16116)]CRS-1605:CSSD voting file is online: /oracrsfiles/oracrs/vote.crs. Details in /opt/app/oracle/product/10.2.0/db_1/log/rubikon120/cssd/ocssd.log.
OCFS2 status from Node 1
rubikon120:/ # /etc/init.d/o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
rubikon120:/ #
OCFS2 status from Node 2
rubikon121:/ # /etc/init.d/o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
rubikon121:/ #
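It can help to confirm mechanically that both nodes report the same O2CB settings, rather than comparing the two status listings by eye. A rough Python sketch; `parse_o2cb` and `differences` are illustrative names, and the parser assumes each status line has the `name: value` or `name = value` shape seen above.

```python
import re

LINE = re.compile(r"^(.*?)\s*[:=]\s*(.*)$")


def parse_o2cb(text):
    """Turn 'name: value' / 'name = value' status lines into a dict."""
    settings = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            settings[m.group(1)] = m.group(2)
    return settings


def differences(a, b):
    """Return {name: (value_a, value_b)} for every setting that differs."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


NODE1 = """\
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
"""
NODE2 = NODE1  # the output posted for rubikon121 is identical

print(differences(parse_o2cb(NODE1), parse_o2cb(NODE2)))  # {}
```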
Thanks & Regards
Sami