Re: Cluster Installation Fails [message #603743 is a reply to message #603735] |
Thu, 19 December 2013 01:30   |
trantuananh24hg
Messages: 744 Registered: January 2007 Location: Ha Noi, Viet Nam
|
Senior Member |
|
|
First, please post more information:
- What is the platform type?
- What is the shared-storage type?
- List the network configuration (/etc/hosts; /etc/hostname.NIC for Solaris; /etc/network-config for Linux).
- Are you using multipathing? If yes, what is it applied to: shared storage, bounce network, or both? With bounce network, what type of multipathing is it: Active-Active or Active-Passive?
- Which devices are used for the OCR and the voting disk? On Solaris they might be slices; on Linux they might be slices, with or without raw binding.
- Are you using mknod?
- Finally, please post the content of the error log here.
|
|
|
Re: Cluster Installation Fails [message #603760 is a reply to message #603743] |
Thu, 19 December 2013 03:13   |
burasami
Messages: 20 Registered: April 2010
|
Junior Member |
|
|
Hi All,
Thanks for your reply.
What's platform type?
SUSE Enterprise Edition 10
What's shared-storage type?
OCFS2
mkfs.ocfs2
shared-storage, bounce network or both of them? With bounce network,
oracle@rubikon120:/> ssh rubikon121 date
Thu Dec 19 14:49:16 NPT 2013
oracle@rubikon120:/>
oracle@rubikon121:~> ssh rubikon120 date
Thu Dec 19 14:50:02 NPT 2013
oracle@rubikon121:~>
ocssd log
[ CSSD]2013-12-16 15:10:19.477 [1199630656] >TRACE: clssgmReconfigThread: completed for reconfig(1), with status(1)
[ CSSD]2013-12-16 15:10:19.588 [1140881728] >TRACE: clssgmClientConnectMsg: Connect from con(0x2aaaaad22250) proc(0x2aaaaad271c0) pid() proto(10:2:1:1)
[ CSSD]2013-12-16 15:10:19.589 [1140881728] >TRACE: clssgmClientConnectMsg: Connect from con(0x2aaaaad27bf0) proc(0x2aaaaad2a2d0) pid() proto(10:2:1:1)
[ CSSD]2013-12-16 15:10:19.589 [1140881728] >TRACE: clssgmClientConnectMsg: Connect from con(0x2aaaaad24db0) proc(0x2aaaaad2a000) pid() proto(10:2:1:1)
[ CSSD]2013-12-16 15:10:20.393 [1191237952] >TRACE: clssnmWaitForAcks: done, msg type(15)
[ CSSD]2013-12-16 15:10:20.393 [1191237952] >TRACE: clssnmDoSyncUpdate: Sync Complete!
[ CSSD]2013-12-16 15:10:20.393 [1132489024] >TRACE: clssnmSendFatalOn: req to syncLeader(1)
[ CSSD]2013-12-16 15:10:20.414 [1124096320] >TRACE: clssnmFatalThread: Fatal mode enabled
[ CSSD]2013-12-16 15:13:27.239 [1107310912] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(1) wrtcnt(1) LATS(5101196) Disk lastSeqNo(1)
[ CSSD]2013-12-16 15:13:28.825 [1132489024] >TRACE: clssnmConnComplete: connected to node 2 (con 0x752c20), state 1 birth 0, unique 1387186106/1387186106 prevConuni(0)
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmDoSyncUpdate: Initiating sync 2
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmSetupAckWait: Ack message type (11)
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmSetupAckWait: node(1) is ALIVE
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmSetupAckWait: node(2) is ALIVE
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmSendSync: syncSeqNo(2)
[ CSSD]2013-12-16 15:13:28.949 [1132489024] >TRACE: clssnmHandleSync: Acknowledging sync: src[1] srcName[rubikon120] seq[6] sync[2]
[ CSSD]2013-12-16 15:13:28.949 [1191237952] >TRACE: clssnmWaitForAcks: Ack message type(11), ackCount(2)
[ CSSD]2013-12-16 15:13:29.020 [2131051872] >USER: NMEVENT_SUSPEND [00][00][00][02]
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmWaitForAcks: done, msg type(11)
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmDoSyncUpdate: node(0) missCount(193) state(0)
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmDoSyncUpdate: node(2) is transitioning from joining state to active state
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmSetupAckWait: Ack message type (13)
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmSetupAckWait: node(1) is ACTIVE
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmSendVote: syncSeqNo(2)
[ CSSD]2013-12-16 15:13:29.953 [1132489024] >TRACE: clssnmSendVoteInfo: node(1) syncSeqNo(2)
[ CSSD]2013-12-16 15:13:29.953 [1191237952] >TRACE: clssnmWaitForAcks: Ack message type(13), ackCount(1)
[ CSSD]2013-12-16 15:13:30.956 [1191237952] >TRACE: clssnmWaitForAcks: done, msg type(13)
[ CSSD]2013-12-16 15:13:30.957 [1191237952] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmEvict: Start
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmWaitOnEvictions: Start
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmWaitOnEvictions: Node(0) down, LATS(0),timeout(5105916)
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmSetupAckWait: Ack message type (15)
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmSetupAckWait: node(1) is ACTIVE
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmSetupAckWait: node(2) is ACTIVE
[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmSendUpdate: syncSeqNo(2)
[ CSSD]2013-12-16 15:13:31.961 [1132489024] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2013-12-16 15:13:31.961 [1132489024] >TRACE: clssnmDeactivateNode: node 0 () left cluster
[ CSSD]2013-12-18 02:19:06.104 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
[ CSSD]2013-12-18 02:19:08.112 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
[ CSSD]2013-12-18 11:59:52.844 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)
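With a long ocssd.log it can help to pull out only the missed check-in lines like the last three above. The following is a rough Python sketch of that idea, not an Oracle tool: the regular expressions are assumptions based only on the lines pasted in this post, and the names parse_line and missed_checkins are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout, taken from the pasted trace:
# [ CSSD]2013-12-18 02:19:08.112 [1174452544] >TRACE: clssnmPollingThread: ...
# The leading "[" is optional because it can get lost when copying.
LINE_RE = re.compile(
    r"^\[?\s*CSSD\](?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<thread>\d+)\] >(?P<level>\w+):\s+(?P<msg>.*)$"
)
MISSED_RE = re.compile(
    r"clssnmPollingThread: node (?P<node>\S+) \((?P<num>\d+)\) "
    r"missed\((?P<count>\d+)\) checkin"
)

def parse_line(line):
    """Split one CSSD trace line into its fields, or return None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f"),
        "thread": int(m["thread"]),
        "level": m["level"],
        "msg": m["msg"],
    }

def missed_checkins(lines):
    """Count how many missed check-in lines were logged per node name."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # not a CSSD trace line
        m = MISSED_RE.search(rec["msg"])
        if m:
            counts[m["node"]] += 1
    return counts

# Lines copied from the trace above.
SAMPLE = [
    "[ CSSD]2013-12-16 15:13:31.961 [1191237952] >TRACE: clssnmEvict: Start",
    "[ CSSD]2013-12-18 02:19:06.104 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)",
    "[ CSSD]2013-12-18 02:19:08.112 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)",
    "[ CSSD]2013-12-18 11:59:52.844 [1174452544] >TRACE: clssnmPollingThread: node rubikon121 (2) missed(2) checkin(s)",
]
```

Calling missed_checkins(SAMPLE) counts three missed check-in lines for rubikon121. On a real log you would pass the open file object instead of SAMPLE.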
alert log from node 1
2013-12-16 15:10:16.273
[cssd(23486)]CRS-1605:CSSD voting file is online: /oracrsfiles/oracrs/vote.crs. Details in /opt/app/oracle/product/10.2.0/db_1/log/rubikon120/cssd/ocssd.log.
2013-12-16 15:10:19.477
[cssd(23486)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rubikon120 .
2013-12-16 15:10:20.832
[crsd(23116)]CRS-1012:The OCR service started on node rubikon120.
2013-12-16 15:10:20.898
[evmd(23359)]CRS-1401:EVMD started on node rubikon120.
2013-12-16 15:10:21.277
[crsd(23116)]CRS-1201:CRSD started on node rubikon120.
2013-12-16 15:13:32.060
[cssd(23486)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rubikon120 rubikon121 .
2013-12-16 17:33:13.782
[evmd(11079)]CRS-1401:EVMD started on node rubikon120.
2013-12-16 17:33:14.333
[crsd(10814)]CRS-1012:The OCR service started on node rubikon120.
2013-12-16 17:41:19.945
[cssd(16116)]CRS-1605:CSSD voting file is online: /oracrsfiles/oracrs/vote.crs. Details in /opt/app/oracle/product/10.2.0/db_1/log/rubikon120/cssd/ocssd.log.
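The CRS-1601 entries in the alert log show when the cluster membership changed. As a rough sketch only, the Python below lists those changes in order; it assumes the two-line layout pasted above (a timestamp line followed by a "[daemon(pid)]CRS-nnnn:" line), and the function name membership_changes is hypothetical.

```python
import re

TS_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")
ENTRY_RE = re.compile(
    r"^\[(?P<daemon>\w+)\((?P<pid>\d+)\)\](?P<code>CRS-\d+):(?P<text>.*)$"
)
# The node list ends with " ." in the pasted log.
ACTIVE_RE = re.compile(r"Active nodes are (?P<nodes>.*?)\s*\.\s*$")

def membership_changes(alert_text):
    """Return a list of (timestamp, [node names]) for each CRS-1601 entry."""
    changes = []
    last_ts = None
    for raw in alert_text.splitlines():
        line = raw.strip()
        if TS_RE.fullmatch(line):
            last_ts = line  # remember the timestamp for the next entry
            continue
        m = ENTRY_RE.match(line)
        if m and m["code"] == "CRS-1601":
            nodes = ACTIVE_RE.search(m["text"])
            if nodes:
                changes.append((last_ts, nodes["nodes"].split()))
    return changes

# Entries copied from the alert log above.
SAMPLE_ALERT = """2013-12-16 15:10:19.477
[cssd(23486)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rubikon120 .
2013-12-16 15:10:20.832
[crsd(23116)]CRS-1012:The OCR service started on node rubikon120.
2013-12-16 15:13:32.060
[cssd(23486)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rubikon120 rubikon121 .
"""
```

For SAMPLE_ALERT this gives two entries: rubikon120 alone at 15:10:19.477, then rubikon120 and rubikon121 at 15:13:32.060.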
OCFS2 status from Node 1
rubikon120:/ # /etc/init.d/o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
rubikon120:/ #
OCFS2 status from Node 2
rubikon121:/ # /etc/init.d/o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
rubikon121:/ #
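Comparing the two status outputs by eye is easy to get wrong, so here is a small Python sketch that turns each one into a dictionary and reports any label whose value differs between the nodes. It assumes the "label: value" and "label = value" layout shown above; parse_o2cb_status and diff_settings are hypothetical names, and the sample is shortened to a few of the pasted lines.

```python
import re

# Shell prompt lines such as "rubikon120:/ # /etc/init.d/o2cb status"
PROMPT_RE = re.compile(r"^\S+:\S* #")

def parse_o2cb_status(text):
    """Turn pasted o2cb status output into a {label: value} dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or PROMPT_RE.match(line):
            continue  # skip blank lines and shell prompts
        # Most lines use "label: value"; the dead threshold uses "label = value".
        sep = " = " if " = " in line else ": "
        key, found, value = line.rpartition(sep)
        if found:
            settings[key] = value
    return settings

def diff_settings(a, b):
    """Return {label: (value_a, value_b)} for labels whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# A few lines copied from the node 1 output above.
NODE1 = """rubikon120:/ # /etc/init.d/o2cb status
Module "configfs": Loaded
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 31
Network idle timeout: 30000
Checking O2CB heartbeat: Active
rubikon120:/ #
"""
# The node 2 output above is the same apart from the host name in the prompt.
NODE2 = NODE1.replace("rubikon120", "rubikon121")
```

With the two outputs pasted here, diff_settings(parse_o2cb_status(NODE1), parse_o2cb_status(NODE2)) is an empty dict, meaning the O2CB settings match on both nodes.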
Thanks & Regards
Sami