Re: Bank Databases
Date: Mon, 25 Jun 2012 07:53:19 -0400
Message-ID: <CAJ7936znqtfTtP_j83VG1L0bNAJsOqoZtcte7_hc1RRcUoDbzQ_at_mail.gmail.com>
Doh - resending as got dinged for overquoting:
Timely enough, the Register is reporting that CA's job scheduler software may be responsible:
http://www.theregister.co.uk/2012/06/25/rbs_natwest_what_went_wrong/
Could certainly mean that Oracle was still involved (or Sybase, or some other database), but the inability to schedule jobs was the root issue.
Matt
>>>
>>> I'm particularly interested as we test our failover every 3 months, and
>>> last time we did so there was a power outage on the standby, which was
>>> running temporarily as primary, which we hadn't anticipated. The startup
>>> script tried to bring up what was currently a primary db as a standby.
>>> I'm trying to automate this and, yuk, without dg broker, which has its
>>> own set of problems, I'm a bit stymied!
>>> I'm not suggesting Nat West hadn't tested their failover, but I imagine
>>> it's difficult due to volumes.
>>> On 25 June 2012 12:08, Matthew Zito <matt_at_crackpotideas.com> wrote:
>>> > Yes, though I doubt it's anything as simple as an "Oracle issue".
>>> > From my experience watching large organizations deal with complex
>>> > crises like this, typically it's a series of cascading failures - so
>>> > perhaps an Oracle database was involved, but many separate pieces had
>>> > to fail in order to get to this point.
>>> >
>>> > For example, I once saw a major global company's firmwide email system
>>> > go down for over a day due to a cascading series of:
>>> > - storage array failure
>>> > - misconfigured hardware
>>> > - engineer typo
>>> > - misunderstood recovery architecture
>>> >
>>> > I'm trying to keep it vague intentionally, but if any one of those
>>> > things hadn't happened, they would have had an hour of downtime on their
>>> > email instead of a 30-hour downtime. I suspect the natwest issue is
>>> > similar, *though* I do expect that we'll get more info in the coming
>>> > days/weeks, so maybe we can get some more details then.
>>> >
>>> > Matt
>>> >
>>> > On Mon, Jun 25, 2012 at 7:01 AM, Howard Latham <howard.latham_at_gmail.com> wrote:
>>> > >
>>> > > So Nat west being unable to process transactions for 5 days due to a
>>> > > change in backup software and fail over could well be an Oracle issue.
>>> > >
>>> > > --
>>> > > Howard A. Latham
-- http://www.freelists.org/webpage/oracle-l
Received on Mon Jun 25 2012 - 06:53:19 CDT