12.1.0.2 root.sh fails to start after deconfiguring clusterware (12.1.0.2, Red Hat 7.3) [message #668119]
Wed, 07 February 2018 17:28
juniordbanewbie
Dear all,
Because of the customer's additional requirements on bonding, the CRS daemon would not start. As a result, I deconfigured the whole clusterware according to
https://docs.oracle.com/database/121/CWLIN/rem_orcl.htm#CWLIN349
Unfortunately, I did not zero out the disk used to store the OCR and voting disk after deconfiguring.
When I configured the clusterware again, ora.asm did not even start; in fact, root.sh did not even complete on the first node.
So I tried to deconfigure again, but this time the deconfiguration itself failed.
Here is the output of the deconfigure:
PRCR-1068 : Failed to query resources
CRS-0184 : Cannot communicate with the CRS daemon.
PRCR-1068 : Failed to query resources
CRS-0184 : Cannot communicate with the CRS daemon.
PRCR-1070 : Failed to check if resource ora.net1.network is registered
CRS-0184 : Cannot communicate with the CRS daemon.
PRCR-1070 : Failed to check if resource ora.helper is registered
CRS-0184 : Cannot communicate with the CRS daemon.
PRCR-1070 : Failed to check if resource ora.ons is registered
CRS-0184 : Cannot communicate with the CRS daemon.
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.evmd' on 'dwhdb1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'dwhdb1'
CRS-2676: Start of 'ora.mdnsd' on 'dwhdb1' succeeded
CRS-2676: Start of 'ora.evmd' on 'dwhdb1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'dwhdb1'
CRS-2676: Start of 'ora.gpnpd' on 'dwhdb1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'dwhdb1'
CRS-2672: Attempting to start 'ora.gipcd' on 'dwhdb1'
CRS-2676: Start of 'ora.cssdmonitor' on 'dwhdb1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'dwhdb1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'dwhdb1'
CRS-2672: Attempting to start 'ora.diskmon' on 'dwhdb1'
CRS-2676: Start of 'ora.diskmon' on 'dwhdb1' succeeded
CRS-2676: Start of 'ora.cssd' on 'dwhdb1' succeeded
2018/02/07 16:15:32 CLSRSC-115: Start of resource 'ora.asm' failed
2018/02/07 16:15:32 CLSRSC-558: failed to deconfigure ASM
Died at /u01/app/12.1.0.2/grid/crs/install/crsdeconfig.pm line 1039.
The command '/u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib -I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -deconfig -force -lastnode' execution failed
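With this much output, it can help to see which message codes dominate. The following is a minimal sketch, not part of the original post: summarize_codes is a hypothetical helper name, and it assumes the output above has been saved to a text file.

```shell
# Count the CRS-, PRCR- and CLSRSC- message codes in a saved log file.
# Prints "CODE COUNT", most frequent first; ties are ordered by code.
# Usage: summarize_codes <logfile>
summarize_codes() {
    grep -oE '(CRS|PRCR|CLSRSC)-[0-9]+' "$1" \
        | sort | uniq -c \
        | sort -k1,1nr -k2,2 \
        | awk '{ printf "%s %d\n", $2, $1 }'
}
```

Called as summarize_codes deconfig.out (the file name is an assumption), it gives a quick view of which messages repeat and which occur only once.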
Here is the alert log:
2018-02-07 16:41:17.993 [OCTSSD(22351)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 22351 is exiting
2018-02-07 16:41:19.107 [ORAROOTAGENT(22641)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 22641
2018-02-07 16:41:19.137 [OCTSSD(22654)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 22654
2018-02-07 16:41:20.220 [OCTSSD(22654)]CRS-2407: The new Cluster Time Synchronization Service reference node is host dwhdb1.
2018-02-07 16:41:20.220 [OCTSSD(22654)]CRS-2401: The Cluster Time Synchronization Service started on host dwhdb1.
2018-02-07 16:42:19.117 [ORAROOTAGENT(22641)]CRS-5818: Aborted command 'start' for resource 'ora.cluster_interconnect.haip'. Details at (:CRSAGF00113:) {0:9:4} in /u01/app/grid/diag/crs/dwhdb1/crs/trace/ohasd_orarootagent_root.trc.
2018-02-07 16:42:19.225 [ORAROOTAGENT(22641)]CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
2018-02-07 16:42:19.225+Start action for HAIP aborted. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/dwhdb1/crs/trace/ohasd_orarootagent_root.trc".
2018-02-07 16:42:23.119 [OHASD(21751)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.cluster_interconnect.haip'. Details at (:CRSPE00163:) {0:9:4} in /u01/app/grid/diag/crs/dwhdb1/crs/trace/ohasd.trc.
2018-02-07 16:42:23.148 [OCTSSD(22654)]CRS-2405: The Cluster Time Synchronization Service on host dwhdb1 is shutdown by user
2018-02-07 16:42:23.148 [OCTSSD(22654)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 22654 is exiting
2018-02-07 16:42:24.153 [OHASD(21751)]CRS-2878: Failed to restart resource 'ora.asm'
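The alert log entries above name the resources whose start was aborted, timed out, or could not be restarted. A rough sketch for pulling those resource names out of a saved copy of the alert log follows. failed_resources is a hypothetical helper, and the matched phrases are simply the ones that appear in the entries above, not a complete list of failure messages.

```shell
# List the distinct 'ora.*' resources mentioned in failure-type alert log lines.
# The three phrases are taken from the CRS-5818, CRS-2757 and CRS-2878 lines above.
# Usage: failed_resources <alert-log-file>
failed_resources() {
    grep -E 'Aborted command|timed out|Failed to restart' "$1" \
        | grep -oE "'ora\.[^']+'" \
        | tr -d "'" \
        | sort -u
}
```

Run against the entries shown above, this should print ora.asm and ora.cluster_interconnect.haip, which suggests looking at the HAIP start failure before the ASM one.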
Here is the output of ohasd_orarootagent_root.trc:
2018-02-07 16:42:23.148079 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] (:CLSN00108:) clsn_agent::stop {
2018-02-07 16:42:23.148118 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/12.1.0.2/grid
2018-02-07 16:42:23.148124 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] Utils::getOracleHomeAttrib oracle_home:/u01/app/12.1.0.2/grid
2018-02-07 16:42:23.148344 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] PID 22654 from /u01/app/12.1.0.2/grid/ctss/init/dwhdb1.pid
2018-02-07 16:42:23.148351 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] CLSDM Based stop action
2018-02-07 16:42:23.148365 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] Using Timeout value of 18000 for stop message
2018-02-07 16:42:23.148670 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] ClsdmClient::sendMessage clsdmc_respget return: status=0, ecode=0
2018-02-07 16:42:23.151393 : USRTHRD:2424530688: {0:9:4} Thread:[DaemonCheck:ctssd] Thread exiting
2018-02-07 16:42:23.151405 : USRTHRD:2424530688: {0:9:4} Thread:[DaemonCheck:ctssd] Skipping Agent Initiated a check action
2018-02-07 16:42:23.151410 : USRTHRD:2424530688: {0:9:4} Thread:[DaemonCheck:ctssd] isRunning is reset to false here
2018-02-07 16:42:24.149128 :GIPCXCPT:2439239424: gipcInternalSend: connection not valid for send operation endp 0x7f1c7c051180 [000000000000046b] { gipcEndpoint : localAddr 'ipc', remoteAddr 'ipc://dwhdb1_DBG_CTSSD', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 22654, readyRef (nil), ready 0, wobj 0x7f1c7c05d7b0, sendp 0x7f1c7c05d570 status 0flags 0x2000a61e, flags-2 0x1, usrFlags 0x20020 }, ret gipcretConnectionLost (12)
2018-02-07 16:42:24.149160 :GIPCXCPT:2439239424: gipcSendF [clsdmc_send : clsdmc.c : 728]: EXCEPTION[ ret gipcretConnectionLost (12) ] failed to send on endp 0x7f1c7c051180 [000000000000046b] { gipcEndpoint : localAddr 'ipc', remoteAddr 'ipc://dwhdb1_DBG_CTSSD', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 22654, readyRef (nil), ready 0, wobj 0x7f1c7c05d7b0, sendp 0x7f1c7c05d570 status 0flags 0x2000a61e, flags-2 0x1, usrFlags 0x20020 }, addr 0000000000000000, buf 0x7f1c740232d0, len 65, cookie (nil), flags 0x0
CLSDMC:2439239424: Failed to send dynamic control message to connection [ipc://dwhdb1_DBG_CTSSD][12]
2018-02-07 16:42:24.149186 : CLSDMC:2439239424: gipcWait gets wrong msg from connection [ipc://dwhdb1_DBG_CTSSD][0] with type gipcreqtypeDisconnect
2018-02-07 16:42:24.149240 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-10 errbuf:CRS-02004: error 0 encountered when sending messages to CTSSD
2018-02-07 16:42:24.150469 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [stop] (:CLSN00108:) clsn_agent::stop }
2018-02-07 16:42:24.150479 : AGFW:2439239424: {0:9:4} Command: stop for resource: ora.ctssd 1 1 completed with status: SUCCESS
2018-02-07 16:42:24.150761 : AGFW:2435036928: {0:9:4} Agent sending reply for: RESOURCE_STOP[ora.ctssd 1 1] ID 4099:868
2018-02-07 16:42:24.150908 : CLSDMC:2439239424: Connecting to ipc://dwhdb1_DBG_CTSSD
2018-02-07 16:42:24.151086 : CLSDMC:2439239424: Error: gipcWait for gipcConnect - ret_gipcreqinfo=gipcretConnectionRefused, type_gipcreqinfo=gipcreqtypeConnect
2018-02-07 16:42:24.151135 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-7 errbuf:
2018-02-07 16:42:24.151163 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] Calling PID check for daemon
2018-02-07 16:42:24.151201 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] Process id 22654 translated to
2018-02-07 16:42:24.151229 : CLSDMC:2439239424: Connecting to ipc://dwhdb1_DBG_CTSSD
2018-02-07 16:42:24.151363 : CLSDMC:2439239424: Error: gipcWait for gipcConnect - ret_gipcreqinfo=gipcretConnectionRefused, type_gipcreqinfo=gipcreqtypeConnect
2018-02-07 16:42:24.151421 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-7 errbuf:
2018-02-07 16:42:24.151463 :CLSDYNAM:2439239424: [ora.ctssd]{0:9:4} [check] Check return = 1, state detail = NULL
2018-02-07 16:42:24.151655 : AGFW:2435036928: {0:9:4} ora.ctssd 1 1 state changed from: STOPPING to: OFFLINE
2018-02-07 16:42:24.151731 : AGFW:2435036928: {0:9:4} Agent sending last reply for: RESOURCE_STOP[ora.ctssd 1 1] ID 4099:868
2018-02-07 16:42:24.151781 : AGFW:2435036928: {0:9:4} Agent has no resources to be monitored, Shutting down ..
2018-02-07 16:42:24.151816 : AGFW:2435036928: {0:9:4} Agent sending message to PE: AGENT_SHUTDOWN_REQUEST[Proxy] ID 20486:63
2018-02-07 16:42:24.152954 : AGFW:2435036928: {0:9:4} Agent is shutting down.
2018-02-07 16:42:24.152963 : AGENT:2435036928: {0:9:4} Agfw calling user exitCB, will exit on return
2018-02-07 16:42:24.152968 : AGENT:2435036928: {0:9:4} returned from user exitCB, exiting
2018-02-07 16:42:24.152987 : AGFW:2435036928: {0:9:4} Agent is exiting with exit code: 1
How should I proceed from here?
Should I delete the GPnP profiles, then zero out the ASM disk used for storing the OCR and voting disk?
Many thanks in advance!
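For reference, zeroing out the start of a disk is usually done with dd. The sketch below is only an illustration of that step and is not taken from the thread: zero_disk_header is a hypothetical helper, and the default of 100 MiB is an assumption. The operation is destructive, so the device name should be double-checked and the procedure confirmed with Oracle Support before running anything like this.

```shell
# Overwrite the first N MiB of a device (or file) with zeros.
# DESTRUCTIVE: anything stored in that region is lost.
# Usage: zero_disk_header <device> [mebibytes]   (the default of 100 is an assumption)
zero_disk_header() {
    if [ -z "$1" ] || [ ! -e "$1" ]; then
        echo "usage: zero_disk_header <existing-device> [mebibytes]" >&2
        return 1
    fi
    # conv=notrunc keeps a regular file's length intact; it is harmless on block devices.
    dd if=/dev/zero of="$1" bs=1M count="${2:-100}" conv=notrunc
}
```

An invocation such as zero_disk_header /dev/oracleasm/ocr_vote_1 would have to be run as root on one node only, after the clusterware stack is down.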
Re: 12.1.0.2 root.sh fails to start after deconfiguring clusterware [message #668297 is a reply to message #668142]
Sat, 17 February 2018 03:38
juniordbanewbie
Dear all,
This is how I resolved the issue.
Note, however, that this should be used only as a last resort; it was suggested by MOS.
# /bin/rm -rf /u01/app/12.1.0.2/grid
# /bin/rm -rf /u01/app/12.1.0.2/grid_corrupted
# /bin/rm -rf /u01/app/oraInventory
# /bin/rm -rf /u01/app/oraInventory_corrupted
# /bin/rm -f /etc/oraInst.loc
# /bin/rm -rf /etc/oracle
# /bin/rm -f /etc/oratab
# /bin/rm -rf /usr/tmp/.oracle
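Since these removals cannot be undone, it may be worth listing what actually exists on each node before deleting anything. A minimal sketch of such a dry run follows; list_existing is a hypothetical helper, not something from MOS or the Oracle documentation.

```shell
# Report, for each path given, whether it currently exists. Nothing is deleted.
# Usage: list_existing <path> [<path> ...]
list_existing() {
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "exists:  $p"
        else
            echo "missing: $p"
        fi
    done
}

# Example, using the paths from the removal commands above:
# list_existing /u01/app/12.1.0.2/grid /u01/app/oraInventory \
#     /etc/oraInst.loc /etc/oracle /etc/oratab /usr/tmp/.oracle
```

Reviewing this output first makes a mistyped path visible before it reaches rm -rf.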
Then install the binaries.
Next, run runcluvfy.sh from the grid subfolder of the unzipped Grid installer location:
./runcluvfy.sh stage -pre crsinst -n dwhdb1,dwhdb2 -q /dev/oracleasm/ocr_vote_1 \
  -osdba asmdba -asm -presence local -asmgrp asmadmin \
  -crshome /u01/app/12.1.0.2/grid \
  -networks bond0:10.10.30.0:public/bond1:192.168.2.24:cluster_interconnect \
  -fixup -fixupnoexec -verbose \
  -asmdev /dev/oracleasm/ocr_vote_1,/dev/oracleasm/ocr_mirror_1,/dev/oracleasm/data_1,/dev/oracleasm/fra_1
https://docs.oracle.com/database/121/CWLIN/crsunix.htm#CWLIN490
cd $GRID_HOME/crs/config
./config.sh -silent -responseFile <responsefile full path> -showprogress -executePrereqs
./config.sh -silent -responseFile <responsefile full path> -showprogress -debug -waitforcompletion