Re: Checkpoint in RAC

From: K Gopalakrishnan <kaygopal_at_gmail.com>
Date: Mon, 3 Dec 2007 01:08:24 -0600
Message-ID: <3b0f44a10712022308h54fd1b7dl6f3262832ca406e6@mail.gmail.com>

Ahmed,

It is called two-pass recovery. I discussed this in greater detail in my RAC book. If you don't have the book handy, here is the relevant excerpt:

(3) Two Pass recovery in RAC
Two-pass recovery in RAC has some additional steps to perform, since multiple instances (threads) may have failed or crashed. It involves reading and merging all the redo information for a particular block from all the threads. This is called a log merge or thread merge operation.
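To make the thread merge concrete, here is a minimal sketch in Python. It is only an illustration, not Oracle's implementation: the RedoRecord type, its fields and the merge_threads helper are hypothetical names, and each thread's redo is assumed to be already ordered by SCN.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class RedoRecord(NamedTuple):
    """Hypothetical, simplified redo record (not Oracle's on-disk format)."""
    scn: int     # system change number of the change
    thread: int  # redo thread (instance) that generated the change
    dba: int     # data block address the change applies to


def merge_threads(*threads: Iterable[RedoRecord]) -> Iterator[RedoRecord]:
    """Merge per-thread redo streams into one stream ordered by SCN.

    Assumes every input stream is already sorted by SCN.
    """
    return heapq.merge(*threads, key=lambda rec: rec.scn)


# Two threads that both changed block 100.
thread1 = [RedoRecord(10, 1, 100), RedoRecord(40, 1, 200)]
thread2 = [RedoRecord(20, 2, 100), RedoRecord(30, 2, 300)]

merged = list(merge_threads(thread1, thread2))
for rec in merged:
    print(rec)
```

The merged stream interleaves the changes from both threads by SCN, so the changes to block 100 made by thread 1 and thread 2 come out in the order in which they happened.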
One of the bigger challenges in RAC recovery is that a block could have been modified in any of the instances (dead or alive). This was not the case in OPS. Hence, in RAC, getting hold of the "latest" version of a dirty block needs an intelligent and efficient mechanism that identifies that version and processes it for recovery. The introduction of past images (PIs) and block written records (BWRs) makes it possible to significantly reduce recovery time and to recover efficiently from an instance failure or crash.

(4) First Pass
This pass does not perform the actual recovery. It reads and merges the redo threads to build a hash table of the blocks that need recovery, that is, blocks not known to have been written back to the datafiles. This is where the incremental checkpoint SCN is crucial, since its RBA denotes the starting point for recovery. All modified blocks are added to the recovery set. As BWRs are encountered, the file, DBA and SCN of each change vector are processed to limit the number of blocks to actually recover in the next pass. A block need not be recovered if its BWR version is greater than the latest PI present in any of the buffer caches.
Redo threads from all failed instances are read and merged by SCN, beginning at the RBA of the last incremental checkpoint for each thread. When the first change to a data block is encountered in the merged redo stream, a block entry is added to the recovery set data structure. Entries in the recovery set are organized in a hash table.
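The first pass can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the actual SMON logic: the Redo type and first_pass function are hypothetical, the input is assumed to be the already merged, SCN-ordered redo stream, and the comparison against PIs held in surviving buffer caches is left out.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple

BlockKey = Tuple[int, int]  # (file number, block number)


class Redo(NamedTuple):
    """Hypothetical, simplified redo entry."""
    kind: str   # "change" for a change vector, "bwr" for a block written record
    scn: int    # SCN of the change, or the block version written to disk
    file: int
    block: int


def first_pass(merged_redo: Iterable[Redo]) -> Dict[BlockKey, List[int]]:
    """Build the recovery set: a hash table of blocks still needing redo.

    Maps each block to the SCNs of the changes not known to be on disk.
    """
    recovery_set: Dict[BlockKey, List[int]] = {}
    for rec in merged_redo:
        key = (rec.file, rec.block)
        if rec.kind == "change":
            # The first change to a block creates its entry in the hash table.
            recovery_set.setdefault(key, []).append(rec.scn)
        elif rec.kind == "bwr":
            # The block reached disk at version rec.scn, so changes at or
            # below that SCN no longer need to be recovered.
            pending = [scn for scn in recovery_set.get(key, []) if scn > rec.scn]
            if pending:
                recovery_set[key] = pending
            else:
                recovery_set.pop(key, None)
    return recovery_set


redo = [
    Redo("change", 10, 1, 100),
    Redo("change", 20, 1, 200),
    Redo("bwr", 10, 1, 100),     # block (1, 100) written at SCN 10
    Redo("change", 30, 1, 100),  # changed again after the write
    Redo("bwr", 20, 1, 200),     # block (1, 200) written at SCN 20
    Redo("change", 35, 2, 7),
]
recovery_set = first_pass(redo)
print(recovery_set)
```

In this example block (1, 200) drops out of the recovery set entirely because its BWR covers its only change, while block (1, 100) stays in the set only for the change made after it was written.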

(4) Second Pass
In this stage, SMON re-reads the merged redo stream (ordered by SCN) from all threads needing recovery. The redo log entries are again compared against the recovery set built in the first pass, and those that match are applied to the in-memory buffers, as in single-pass recovery. The buffer cache is then flushed, and the checkpoint SCN for each thread is updated upon successful completion. This is also the single-pass thread recovery if only one pass is to be done.
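A rough Python sketch of the second pass, again only as an illustration: the Change type, the second_pass function and the string payloads are hypothetical stand-ins, and the recovery set is assumed to map each block to the SCN of the first change that still has to be applied.

```python
from typing import Dict, Iterable, NamedTuple, Tuple

BlockKey = Tuple[int, int]  # (file number, block number)


class Change(NamedTuple):
    """Hypothetical, simplified change vector."""
    scn: int
    file: int
    block: int
    value: str  # stand-in for the real block change


def second_pass(merged_redo: Iterable[Change],
                recovery_set: Dict[BlockKey, int],
                buffers: Dict[BlockKey, str]) -> int:
    """Apply matching redo to the in-memory buffers; return the count applied."""
    applied = 0
    for rec in merged_redo:
        key = (rec.file, rec.block)
        start_scn = recovery_set.get(key)
        if start_scn is None or rec.scn < start_scn:
            # Block is not in the recovery set, or this change is already on disk.
            continue
        buffers[key] = rec.value
        applied += 1
    return applied


redo = [
    Change(10, 1, 100, "a"),
    Change(20, 1, 200, "b"),
    Change(30, 1, 100, "c"),
    Change(35, 2, 7, "d"),
]
recovery_set = {(1, 100): 30, (2, 7): 35}   # as built by a first pass
buffers = {(1, 100): "a", (2, 7): "old"}    # block contents as read from disk

applied = second_pass(redo, recovery_set, buffers)
print(applied, buffers)
```

Only the changes that match the recovery set are applied, which is what keeps the second pass cheap when the first pass has pruned most blocks.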

On Dec 2, 2007 10:34 PM, Ahmed kdnl <sulkdnl_at_yahoo.com> wrote:
> Thanks Dan for update,
>
> According to my example in the original post, the datafile and control file
> will have latest SCN which is T2 ,so it has to apply the redo from T2 since
> T2 is greater than T1.
> Please put some points on it ?
>
>
>
>
> Dan Norris <dannorris_at_dannorris.com> wrote:
>
> Checkpoints are instance events, not database events, so in the case
> outlined below, instance2 would have to start recovery at the point T1 in
> instance1's redo logs and recover from there. As you may know, instance
> recovery incurs a certain amount of time where the GRD is frozen and
> transactions must wait until it is unfrozen before they can continue.
> Therefore, the amount of time required for recovery will affect the behavior
> of the database. Moral of the story: take care to ensure that checkpoints
> happen often enough to keep that window "short" (as defined by you and your
> application).
>
> Someone please correct me if I've gone off the path somewhere.
>
> Hope that helps.
>
> Dan
>
> ----- Original Message ----
> From: Ahmed kdnl <sulkdnl_at_yahoo.com>
> To: oracle-l_at_freelists.org
> Sent: Wednesday, November 28, 2007 1:07:01 AM
> Subject: Checkpoint in RAC
>
> Hi
>
> Anyone put some points on how checkpoint happening in RAC?
>
> For example instance1 checkpointed at time T1 and instance2 checkpointed at
> time T2
> Now in case of instance1 failure, how does instance2 do the recovery of
> instance1?
> From which checkpoint time will it read and apply redo, T1 or T2?
>
>
> Thanks in advance
> Syed

-- 
Best Regards,
K Gopalakrishnan
Co-Author: Oracle Wait Interface, Oracle Press 2004
http://www.amazon.com/exec/obidos/tg/detail/-/007222729X/

Author: Oracle Database 10g RAC Handbook, Oracle Press 2006
http://www.amazon.com/gp/product/007146509X/
--
http://www.freelists.org/webpage/oracle-l

Received on Mon Dec 03 2007 - 01:08:24 CST