
CRS stuff (part 1)

From: Henry Poras <henry_at_itasoftware.com>
Date: Wed, 23 Nov 2005 09:48:03 -0500
Message-ID: <007101c5f03c$e8ba3ab0$3800040a@itasoftware.com>


OK, I tried to send this a couple of times and it never made it. Maybe it was just too long. I'll try breaking it into two parts and see what happens.

Continuing on from a thread of last week, I looked at some of the CRS boot scripts to see what they did. This is a summary of my first cut at the logic. (Wish me luck with the formatting.) At boot time, the scripts that are run are:

init.evmd run (from inittab)
init.cssd fatal (from inittab)
init.crsd run (from inittab)
init.crs (from an rc directory)

The three scripts run from inittab are all run using 'respawn' (init restarts the process whenever it terminates).
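
To check which entries init will respawn on a given node, the inittab can be filtered on its action field. This is only a rough sketch: the function name list_respawn is made up for illustration, and it assumes the usual id:runlevels:action:process layout of an inittab line.

```shell
#!/bin/sh
# list_respawn FILE   (hypothetical helper, not part of CRS)
# Print the process field of every inittab entry whose action is "respawn".
# inittab fields are id:runlevels:action:process.
list_respawn() {
    awk -F: '!/^[ \t]*#/ && $3 == "respawn" {
        # the process field may itself contain colons, so rebuild it
        out = $4
        for (i = 5; i <= NF; i++) out = out ":" $i
        print out
    }' "$1"
}
```

Calling list_respawn /etc/inittab on a CRS node should list the three scripts above, each with its argument (run, fatal, run).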

init.crs start (from rc)
  runs init.cssd autostart
    if the AUTOSTARTFILE (/etc/oracle/scls_scr/$HOST/root/crsstart) = disable
    then
         init.cssd norun (this just sets the cssrun file in the
                          above dir to 'norun')
    if AUTOSTARTFILE = enable
    then
         run init.cssd manualstart
           get the boot time of the server (init.cssd booted) 
           and put this into cssrun

# so far cssrun is either norun, or the boottime of the server
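
The autostart decision above can be sketched as a small shell function. This is my paraphrase, not the text of init.cssd: the function name and both parameters are hypothetical, SCR_DIR stands in for /etc/oracle/scls_scr/$HOST/root, and treating any other crsstart value as "do nothing" is an assumption.

```shell
#!/bin/sh
# css_autostart SCR_DIR BOOTTIME   (hypothetical helper)
# SCR_DIR  : directory holding crsstart and cssrun
# BOOTTIME : whatever string identifies the current boot
css_autostart() {
    scr_dir=$1
    boottime=$2
    # A missing crsstart file is treated as empty (assumption of this sketch).
    state=$(cat "$scr_dir/crsstart" 2>/dev/null || true)
    case "$state" in
        disable) echo norun > "$scr_dir/cssrun" ;;        # what 'init.cssd norun' does
        enable)  echo "$boottime" > "$scr_dir/cssrun" ;;  # what 'init.cssd manualstart' does
        *)       return 1 ;;                              # unknown state: leave cssrun alone
    esac
}
```

Either way, cssrun ends up holding 'norun' or the boot time, which is what startcheck looks at later.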

init.evmd run (from inittab)
  run init.cssd startcheck (I will digress in a moment to detail startcheck)
    check every 30 sec. until the exit status of init.cssd startcheck = 0
    # what if it errors out? Does this loop ever exit?
    once init.cssd startcheck succeeds, run $CRS_HOME/bin/evmd run (as oracle)

    # lockfiles, flagfiles and pidfiles are also cleaned up and
    # apparently recreated
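
The "check every 30 sec. until the exit status is 0" pattern shows up in all three inittab scripts. A minimal sketch of it, with a made-up function name (wait_until_ok); the real scripts may well structure the loop differently:

```shell
#!/bin/sh
# wait_until_ok INTERVAL COMMAND [ARGS...]   (hypothetical helper)
# Re-run COMMAND every INTERVAL seconds until it exits with status 0.
# There is no retry limit, so a check that keeps failing means the
# loop never ends.
wait_until_ok() {
    interval=$1
    shift
    until "$@"; do
        sleep "$interval"
    done
}
```

Something like wait_until_ok 30 init.cssd startcheck would correspond to the loop described above. Written this way, it also suggests an answer to my own question: every non-zero status is treated alike, so the loop only ends once startcheck returns 0.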

init.cssd startcheck
  # this is called by just about every other script. According to
  # internal comments it:

  # returns 0 if we should start
  # returns 1 on a non-cluster boot (i.e. ASM, no RAC)
  # returns 2 if disabled by admin
  # returns 3 on error
  #
  # I am skipping third party vendor clustering logic and non-cluster stuff
  if cssrun does not exist
  or
  if cssrun is not equal to the boottime (see init.crs start)
  then exit with status of 3
  wait for crsctl to be readable
  wait for Voting disk and OCR to come up
  run crsctl check boot (as oracle) #what does this do?
  loop until exit status of crsctl is 0
  exit init.cssd startcheck with status of 0 (OK)
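
The cssrun test at the top of startcheck can be sketched like this. It covers only the status 0 and status 3 paths from the summary; the function name and parameters are hypothetical, and the vendor clusterware, non-cluster (1) and disabled-by-admin (2) branches, as well as the waits on crsctl, the voting disk and OCR, are deliberately left out.

```shell
#!/bin/sh
# startcheck_status CSSRUN_FILE BOOTTIME   (hypothetical helper)
# Returns 3 when cssrun is missing or does not hold this boot's time,
# 0 otherwise.
startcheck_status() {
    cssrun_file=$1
    boottime=$2
    [ -f "$cssrun_file" ] || return 3                          # cssrun does not exist
    [ "$(cat "$cssrun_file")" = "$boottime" ] || return 3      # e.g. cssrun is 'norun'
    return 0
}
```

Note that a cssrun of 'norun' fails the second test, which is how 'init.cssd norun' makes every later startcheck return 3.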

# Back to inittab stuff

# init.cssd fatal is next, but the logic here is by far the longest, so
# I will skip it and handle it last

init.crsd run
  run init.cssd startcheck
    check every 30 seconds until exit status = 0
  check if this is the first running of init.crsd run after server boot

     if crsdboot doesn't contain the boottime, this is the first running
     then
          FIRST=true
          echo boottime>crsdboot
          #do some PIDFILE, LOCKFILE, FLAGFILE stuff
          run $CRS_HOME/bin/crsd -1 &
          # not sure what this binary and flag do
     fi
     # for every run of init.crsd run (boot and respawn)
     run $CRS_HOME/bin/crsd run
     # start the crs daemon. We can guess what that does
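
The "first run after boot" test relies on a marker file that records the boot time. A minimal sketch of that idea, assuming a plain equality comparison (the real script may compare differently) and using a hypothetical function name:

```shell
#!/bin/sh
# is_first_run CRSDBOOT_FILE BOOTTIME   (hypothetical helper)
# Succeeds (status 0) only on the first call after a boot: if the file does
# not already hold BOOTTIME, record it and report "first run".
is_first_run() {
    crsdboot_file=$1
    boottime=$2
    if [ -f "$crsdboot_file" ] && [ "$(cat "$crsdboot_file")" = "$boottime" ]; then
        return 1    # already ran since this boot, so this is a respawn
    fi
    echo "$boottime" > "$crsdboot_file"
    return 0
}
```

Because the marker holds the boot time rather than a simple flag, a stale file left over from the previous boot does not need to be cleaned up: it just fails the comparison.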

# Back to init.cssd fatal
# init.cssd fatal calls init.cssd daemon as a background
# process, and then continues to loop to make sure the
# daemon script is still there.
# init.cssd daemon calls ocssd. If ocssd fails, or a
# duplicate one is started, the server reboots (Metalink Note 265769.1).
#
init.cssd fatal
  run init.cssd startcheck
    check every 30 seconds until exit status = 0
  run init.cssd daemon &
    run $CRS_HOME/bin/ocssd (as oracle)
    # what happens when this css daemon (ocssd) fails?
    if cssrun is 'norun'
    or
    if /etc/oracle/scls_scr/$HOST/oracle/cssfatal = disable
    then
       do nothing (exit out to loop in init.cssd fatal)
    else # css daemon dies, cssrun = 'boottime', cssfatal='enable'
       reboot -n -f
       init.cssd norun
       #disable respawn. init.cssd startcheck returns 3
  # Return to init.cssd fatal
  # check every second if the daemon script still exists
  loop (infinite)
    run init.cssd startcheck
      check exit status. If non-zero, exit (e.g. if cssrun = norun)
      # respawn is now off. Node shutdown is handled outside of CRS
      look for pid of background daemon process (kill -0)
      if it exists
      then
          continue looping
      else
          start another one (init.cssd daemon &)
          # this will lead to a reboot -n -f
  end loop
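
Two pieces of the fatal logic above can be sketched in shell: the decision taken when ocssd exits, and the kill -0 liveness test used in the watchdog loop. Both function names are made up for illustration, and on_css_exit only prints the decision as I read it from the script; it does not reboot anything.

```shell
#!/bin/sh
# on_css_exit CSSRUN CSSFATAL   (hypothetical helper)
# Prints "ignore" when cssrun is 'norun' or cssfatal is 'disable',
# otherwise "reboot" (the case where the script runs reboot -n -f).
on_css_exit() {
    if [ "$1" = "norun" ] || [ "$2" = "disable" ]; then
        echo ignore
    else
        echo reboot
    fi
}

# is_alive PID   (hypothetical helper)
# kill -0 sends no signal; it only tests whether the process can be signalled.
is_alive() {
    kill -0 "$1" 2>/dev/null
}
```

For example, on_css_exit norun enable prints ignore, while a cssrun holding the boot time together with cssfatal = enable prints reboot.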

TBC.

Henry

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Nov 23 2005 - 08:50:33 CST
