RE: 1 minutes: best downtime story

From: <Jay.Miller_at_tdameritrade.com>
Date: Tue, 26 Mar 2013 15:59:14 +0000
Message-ID: <0D8F4CAC0F9D3C4AACC63F50FD9957F726491415_at_PRDCTWPEMLMB31.prod-am.ameritrade.com>



Back in 2001 one of our SAs was shutting down an unused server. He powered it down and then started pulling the disks out of the storage array. Unfortunately that storage array also contained disks for our live data warehouse database (which also supported several internal applications).

The database crashed, of course. Since he did it overnight, our warm backup to disk was running at the time, so it was half overwritten. We were able to get the database partially working with the files we had available.

So we went to the previous day's tape backup, only to discover the tape was bad, as was the tape from the day before. Finally, the tape from 3 days earlier was in a readable state, so we were able to do a restore and then start applying the changes. It took about 4 days to get it fully restored.

On the plus side, we finally got approval for a DR box for that database after the incident, and the SA group started checking their tapes.

I also started requesting an environment to test restores. We finally got one last year (11 years later).

Jay Miller
Sr. Oracle Database Administrator

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Nigel Thomas
Sent: Monday, March 18, 2013 10:48 AM
To: oracle-l
Subject: Re: 1 minutes: best downtime story

Another one (thanks Brian Pardy for his story which reminded me of this).

A UK utility company had one of its data centres on a lower ground floor, with a loading bay outside. A delivery truck's brakes failed on the down ramp, and the fire doors - though designed to open outwards only - unsurprisingly proved a poor match against the accumulated momentum of the incoming 20 ton truck.

Regards Nigel

On 14 March 2013 21:07, Jeremy Schneider <jeremy.schneider_at_ardentperf.com> wrote:

> Hey all -
>
> I'm writing a paper about top causes of downtime. As one component
> of research, I'd like to get some input from you!
>
> One minute, two sentences. First sentence: describe what went down.
> Second sentence: describe why. (I have to categorize all of these.)
> Everyone should have at least one downtime story so I'm hoping for a
> lot of feedback!
>
> Answer about any technology - database, operating system, etc.
>
> Thanks!
>
> -Jeremy
>
>
> --
> Jeremy Schneider
> Pythian Consulting Group
> Chicago
>
> +1 312-725-9249
> http://www.pythian.com
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>

Received on Tue Mar 26 2013 - 16:59:14 CET
