RE: Is a SUSPEND really necessary with EMC SnapView
Ever since someone suggested in the late 1980s that "split backup" mirrors
should in fact work, this has been an example of a test versus design trap.
Even if the split technology seems to work just fine and you have trouble creating a situation in which you can prove that it breaks, there may in fact be a design hole in the timing of things such that it is not guaranteed or designed to work. Most of the relatively early "plexing" technologies did not design in clean breaks. Some even marked dissociated plexes as corrupt even though they were probably okay.
The basic question is usually something like: "Do we definitely flush all writes to the plex being dissociated before we release it when we get a dissociate command?" (You can translate plex and dissociate into your volume manager's nouns and verbs.) Since the feature was that you could release storage to be used for something else, most of the early volume managers put more value on giving up the storage quickly than on making a clean copy of what you were in fact asking to be no longer a copy. Stopping to explicitly flush outstanding writes is not the fastest way to dissociate, so making it bulletproof was not usually part of the dissociation feature.
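To make the flush question concrete, here is a toy sketch in Python. It is not modelled on any real volume manager; the `Plex` class, the `dissociate` function and the `flush_first` flag are all hypothetical names invented for illustration. It only shows why a fast release that skips the flush can leave the released plex one write behind.

```python
from collections import deque


class Plex:
    """Hypothetical mirror side: blocks on disk plus queued, unapplied writes."""

    def __init__(self):
        self.blocks = {}
        self.pending = deque()

    def queue_write(self, block_no, data):
        # The write has been accepted but has not reached this plex yet.
        self.pending.append((block_no, data))

    def flush(self):
        # Apply every outstanding write, oldest first.
        while self.pending:
            block_no, data = self.pending.popleft()
            self.blocks[block_no] = data


def dissociate(plex, flush_first):
    """Release the plex and return what is on it at the moment of release."""
    if flush_first:
        plex.flush()            # the "clean break": slower, but complete
    else:
        plex.pending.clear()    # the fast release: in-flight writes are lost
    return dict(plex.blocks)


def demo(flush_first):
    plex = Plex()
    plex.queue_write(1, "v1")
    plex.flush()
    plex.queue_write(1, "v2")   # still in flight when the dissociate arrives
    return dissociate(plex, flush_first)
```

Calling `demo(True)` returns the plex with the latest write applied, while `demo(False)` returns the older contents. In a test where no write happens to be in flight at the moment of the split, both paths look identical, which is why testing alone cannot settle the question.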
In the intervening years, many of the volume manager technologies, being made aware of the functional requirement, have provided for a clean break, where the intention is exactly what the early adopters of split mirrors wanted.
I'm not in a position to sort out which vendors work exactly which way with which commands, but I want to make sure that you realize you're trying to prove a negative in this case. Working for every case you've tried just isn't good enough. Working for every case you've tried plus the vendor's assurance that it is intended to function as you're using it is probably sufficient. If they tell you that a SUSPEND is required, then do it, even if you can't easily make it fail without the SUSPEND.
mwf
-----Original Message-----
From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Hemant K Chitale
Sent: Sunday, August 22, 2004 10:13 AM
To: ORACLE-L_at_freelists.org
Subject: Is a SUSPEND really necessary with EMC SnapView
There has earlier been discussion [with me asking questions about
SnapShot/SnapCopy implementations and later also responding to questions]
about how an Oracle Hot Backup is done with SnapShot/SnapClone mechanisms.
In my organisation I do have a few SnapClone implementations on Hitachi and
EMC SANs.
I use BEGIN BACKUP and END BACKUP before and after the split but do not use
a SUSPEND.
Recently a colleague of mine tested an EMC SnapView SnapClone of a
production database using the steps:
on primary
BEGIN BACKUP
split
END BACKUP
on secondary
STARTUP MOUNT {OPEN fails with Recovery Required, as expected}
RECOVER DATABASE
OPEN
Run "dbv" on all datafiles
However, later, when we started querying the clone data we found corrupt
indexes.
ANALYZE TABLE VALIDATE STRUCTURE CASCADE failed for a few tables.
That is when I came into the picture. I found an EMC doc on 8i [and also
another doc on 9i the EMC engineer sent me] that specifically states why a
SUSPEND is required. Both EMC engineers at my site categorically stated that
they use BEGIN BACKUP and END BACKUP but not a SUSPEND at other sites. Yet
the EMC docs state that a SUSPEND is required.
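For comparison, the primary-side sequence with and without the SUSPEND window can be sketched as an ordered list of steps. This is a minimal Python sketch under assumptions: the function name `split_backup_steps` and the tablespace names are hypothetical, tablespace-level BEGIN/END BACKUP is assumed, and the `split` entry is only a placeholder for whatever storage-side command performs the clone split, not a real SnapView command.

```python
def split_backup_steps(tablespaces, use_suspend=True):
    """Return the ordered primary-side steps as (kind, command) tuples."""
    steps = []
    for ts in tablespaces:
        steps.append(("sql", f"ALTER TABLESPACE {ts} BEGIN BACKUP"))
    if use_suspend:
        # Quiesce physical I/O to the database files before the split.
        steps.append(("sql", "ALTER SYSTEM SUSPEND"))
    # Placeholder for the storage-side split; not an actual CLI invocation.
    steps.append(("storage", "split"))
    if use_suspend:
        steps.append(("sql", "ALTER SYSTEM RESUME"))
    for ts in tablespaces:
        steps.append(("sql", f"ALTER TABLESPACE {ts} END BACKUP"))
    return steps


if __name__ == "__main__":
    # Hypothetical tablespace names, for illustration only.
    for kind, command in split_backup_steps(["USERS", "INDX"]):
        print(f"{kind:8s}{command}")
```

The only difference between the two variants is the SUSPEND/RESUME pair bracketing the split, so the window during which database I/O is held is as short as the split itself.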
How have your experiences been?
{as for the "corrupt database" I have asked the DBA, SysAdmin and EMC
engineers to schedule another test, still without the SUSPEND as the EMC
engineers swear that it is not required}.
http://www.emc.com/pdf/partnersalliances/oracle/clarFC4700_snapview_oracle8i.pdf
Page 16
"The use of ALTER SYSTEM SUSPEND is often questioned in backup scenarios
where use of different SNAP or mirror-splitting technologies is leveraged to
perform instantaneous, or very rapid, data duplication.

With hot backups, the physical data content of the various Oracle files
continue to change even after a tablespace has been placed into hot backup
mode.

Oracle relies on the ordering sequence of how various OS writes to the files
are organized to ensure that the logical content relationship of the files on
durable media allow a correct recovery to be performed in the event of
unexpected server or storage system failures.

When the Oracle files are distributed over a number of system disk devices,
a common practice in most Oracle deployments to minimize the impact of single
device failure, and to improve general I/O performance, the different devices
have to be duplicated together.

However, when we are starting the SnapView sessions on the different
devices, they are not started atomically. Timing windows may exist as a
result. The set of Oracle files being snapped may appear to have lost the
required I/O order sequencing.

The ALTER SYSTEM SUSPEND command suspends physical I/Os to the various
Oracle database files until ALTER SYSTEM RESUME is executed. With I/O
suspended to the various database files, there will be a temporary quiescence
of OS level I/O to the various Oracle files. During this window, the physical
content of all the Oracle files would be content-consistent. When all the
required SNAP sessions are successfully started within this window,
everything should then be working correctly."
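The timing window described in the quoted passage can be illustrated with a toy model. This is a sketch only, not Oracle's actual write or recovery logic: two hypothetical devices named `redo` and `data` each hold a list of change ids, every change is written to `redo` first and then to `data`, and the two snap sessions start one after the other rather than atomically. The `suspended` flag stands in for the effect of ALTER SYSTEM SUSPEND.

```python
def run(suspend_during_snap):
    """Snap two devices one after the other and return the two images."""
    devices = {"redo": [], "data": []}
    state = {"suspended": False}

    def write(change_id):
        # Ordered writes: the redo record first, then the data block.
        if state["suspended"]:
            return False          # I/O is held until the resume
        devices["redo"].append(change_id)
        devices["data"].append(change_id)
        return True

    write(1)
    write(2)

    state["suspended"] = suspend_during_snap
    snap = {"redo": list(devices["redo"])}   # first snap session starts
    write(3)                                 # I/O arriving inside the window
    snap["data"] = list(devices["data"])     # second snap session starts later
    state["suspended"] = False
    return snap


def order_preserved(snap):
    # Every change visible on the data image must also be on the redo image.
    return set(snap["data"]) <= set(snap["redo"])
```

Here `order_preserved(run(False))` is false, because change 3 landed on the `data` image but missed the `redo` image, while `order_preserved(run(True))` is true. A run in which no write happens to fall between the two snap sessions would pass either way, which matches the observation that the split can appear to work without the SUSPEND.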
Hemant K Chitale
Oracle 9i Database Administrator Certified Professional
http://web.singnet.com.sg/~hkchital
--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
Received on Sun Aug 22 2004 - 10:29:23 CDT