Re: RMAN/Oracle SE/RDS

From: Tim Gorman <tim.evdbt_at_gmail.com>
Date: Thu, 1 Aug 2024 08:39:10 -0700
Message-ID: <a21f8c66-0780-4544-9a50-ed1c7d37ee59_at_gmail.com>



Ethan,

I wrote and supported the extensions to Azure Backup for Oracle databases (HERE <https://learn.microsoft.com/en-us/azure/virtual-machines/workloads/oracle/oracle-database-backup-azure-backup?tabs=azure-portal>), so I feel capable of answering some of the questions here, as well as any specific questions on Azure.

First of all, there are essentially two basic methods for performing database backups...

  1. streaming
  2. storage-level snapshots

RMAN falls squarely into the category of "streaming", where the files comprising an Oracle database backup are copied from one location to another.  Later on in the process, those "backup sets" can be streamed again to another location (e.g. a vault) for redundancy and data protection, or streamed yet again to another geographic location for additional protection and disaster resilience.

Storage-level snapshots, like Azure Backup, are based on the storage being able to designate a point-in-time at which a snapshot is "frozen" while modifications continue to occur.  I'm not referring to the old-style "mirror splits" originating 30+ years ago, but rather copy-on-write or redirect-on-write technology originating and perfected in the past 20 years.  Later on in the process, those media snapshots can be streamed to another location (e.g. a vault) for redundancy and protection, and still later the media snapshots can be streamed to yet another geographical location (e.g. another cloud region) for further redundancy and disaster resilience.
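
To make the "frozen while modifications continue" idea concrete, here is a toy copy-on-write sketch in Python.  It is only an illustration of the general technique; the Volume class and its methods are hypothetical names of my own and do not reflect how Azure Backup or any storage array actually implements snapshots.

```python
class Volume:
    """Toy block device with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data, which keeps changing
        self.snapshots = []          # one dict of preserved blocks per snapshot

    def snapshot(self):
        # Taking a snapshot copies nothing; it only starts tracking old blocks.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting, save the original block for every snapshot
        # that has not yet preserved its own copy of it.
        for preserved in self.snapshots:
            preserved.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id):
        # Frozen view: preserved blocks where they exist, live blocks elsewhere.
        preserved = self.snapshots[snap_id]
        return [preserved.get(i, b) for i, b in enumerate(self.blocks)]


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")                 # the database keeps modifying the volume
print(vol.blocks)                 # ['a', 'B', 'c']
print(vol.read_snapshot(snap))    # ['a', 'b', 'c']
```

The point is that taking the snapshot is nearly instantaneous because no data moves at that moment; only blocks that are later overwritten get preserved.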

It is important to understand where in the technology the streaming occurs, and where (and how) snapshots occur.

It is also important to understand the metrics by which recoverability must be measured...

  1. recovery point objective (RPO)
    • also described as "/point-in-time to which recovery is expected/"
  2. recovery time objective (RTO)
    • also described as "/expectation for the amount of time for return-to-service/"

In general, backups by Oracle RMAN can recover with RPO=0 or RPO near zero, but RTO can be problematic due to the time it takes to stream data from and to the database files.

In general, backups using storage-level snapshots can also recover with RPO=0 or RPO near zero, and RTO can be extremely quick if the local snapshot image can be used.  If the snapshot image must be restored from a local vault, then RTO increases dramatically (because of the streaming restore from the vault).  It should be noted that there are few circumstances where the local snapshot image cannot be used, but they certainly exist.
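
As a rough back-of-the-envelope sketch of why any streaming restore hurts RTO, the arithmetic is just size divided by sustained throughput.  The database size and throughput below are hypothetical figures chosen only for illustration, not measurements from any real system.

```python
def streaming_restore_hours(db_size_gb, throughput_mb_per_s):
    """Hours needed to stream db_size_gb at a sustained throughput."""
    seconds = (db_size_gb * 1024) / throughput_mb_per_s
    return seconds / 3600


# Hypothetical figures, for illustration only.
db_size_gb = 4096            # a 4 TiB database
single_stream_mb_s = 150     # assumed sustained rate of one restore stream

hours = streaming_restore_hours(db_size_gb, single_stream_mb_s)
print(round(hours, 1))       # 7.8
```

That is most of a working day before recovery can even begin, whereas reverting to a local snapshot image does not have to move that data at all.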

One of the advantages of Oracle RMAN is its more advanced features, like corruption checking and file-level or block-level restores; storage-level snapshots are generally an all-or-nothing type of restore.

One of the advantages of storage-level snapshots, especially Azure Backup, is that the entire VM image is backed up, not just the database.  So cloning a copy of the entire running VM with its database(s) is incredibly easy, to the point where some of the "advantages" of Oracle features become moot.  For example, while Azure Backup can only restore the entire database, it is fast and easy to restore the entire database to a new cloned VM, copy what is needed back to the original VM, and then destroy the clone VM when finished with it.  Likewise, a temporary cloned VM can easily be employed to perform BACKUP VALIDATE CHECK LOGICAL operations to find corrupted database blocks, or as test images for various uses.  One nonsense objection I've heard to that is: "but cloning VMs is so expensive", which just isn't true compared to the hours and days spent running a single RMAN restore.  A clone VM doesn't have to use the same VM shape as the original;  it can be much less expensive, and for a few dollars, a clone VM can accomplish more in an hour or two of existence than a DBA will spend in wages through days of figuring things out.

Twenty-five or so years ago, I was one of those promoting Oracle RMAN, trying to get DBAs to leave aside their beloved manual backup scripts.  The fact remains that Oracle RMAN is irrevocably based on the concept of sequentially streaming backups to tape, even though nobody uses tape-based media any longer.  It should be noted that Oracle RMAN must be integrated with a "virtual tape library" (e.g. NetBackup, Networker, etc) using an SBT driver, and that terminology alone should be convincing.  RMAN is the epitome of streaming sequential database backup technology, but it is indeed firmly grounded in extinct technology.  If you think about it, the concept of "incremental backups" central to RMAN is itself an accommodation to the limitations imposed by streaming sequential media, and as everyone knows, "incremental backups" only save time during backup, and add time and complexity during restore.  RMAN does a fine job of disguising that complexity during restore, but it is still present.
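
To illustrate the restore-time cost of incrementals, here is a deliberately simplified Python model of a weekly level-0 plus daily differential level-1 schedule.  The restore_chain function and the schedule are hypothetical and are not RMAN's actual algorithm; the sketch only counts how many backups must be read to reach a given day.

```python
def restore_chain(backups, target_day):
    """Return the backups needed to restore to target_day.

    backups: list of (day, level) tuples sorted by day; level 0 is a full
    backup, level 1 is a differential incremental since the prior backup.
    """
    chain = []
    for day, level in backups:
        if day > target_day:
            break
        if level == 0:
            chain = [(day, level)]    # a newer full backup restarts the chain
        elif chain:
            chain.append((day, level))
    return chain


# Hypothetical schedule: full backup on day 0, incrementals on days 1 to 6.
week = [(0, 0), (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1)]
print(len(restore_chain(week, 0)))    # 1
print(len(restore_chain(week, 6)))    # 7
```

Each incremental made the nightly backup shorter, but a restore on day 6 has to read the full backup plus six incrementals before archived redo is even applied.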

I could continue blathering on, for there is so much to discuss in pros and cons, but I'll stop for now...

Hope this helps?

Thanks!

-Tim

On 8/1/2024 6:46 AM, Ethan Post wrote:
> Managing a customer with SE and single threaded RMAN backup. They take
> a lot of time to run and of course a lot of time to restore. If I was
> going to optimize I would look for another file copy/split mirror or
> even more old school manual scripted solution.
>
> I am fairly certain restores are faster in RDS which leads me to think
> they are not using RMAN. I do believe OCI Database Service uses RMAN
> and wondering if they allow parallel or something when it is SE in the
> cloud with their service.
>
> Not sure what Azure does but would like to know if anyone has any
> helpful links to video overviews of Azure/Oracle backups and
> recovery/cross region.
>
> If the above is all roughly accurate then one could put a plus in the
> "cloud" column in terms of 1) You can run SE and 2) The backup and
> restore process is likely going to be faster and more powerful (cross
> region). At least that is my opinion for the moment with the
> information I have which is limited.
>
> Would like to hear your thoughts here. What is the $ value one should
> factor in when considering on prem cost to hosted cloud if we factor
> in backup/restore capabilities?
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Aug 01 2024 - 17:39:10 CEST
