Re: ASM for single-instance 11g db server?

From: Noons <wizofoz2k_at_yahoo.com.au>
Date: Wed, 06 Apr 2011 21:47:47 +1000
Message-ID: <inhjt4$9av$1_at_dont-email.me>



Mladen Gogala wrote, on my timestamp of 6/04/2011 6:21 AM:
> On Tue, 05 Apr 2011 22:14:34 +1000, Noons wrote:
>
>> Agreed 100%. Very much so. In fact, I am in the process of
>> re-allocating all our databases to RAID5 in the new SAN, precisely
>> because given two equal SAN/processor environments I can't prove to
>> myself that RAID5 is inherently slower than RAID10.
>
> Noons, that goes against the traditional lore. Do you have any numbers to
> back such claim up?

That's the problem: "traditional lore", not based on any reasonably recent facts. Which is why I haven't joined BAARF.

Look, without a shadow of a doubt: 10 years ago I'd have thought twice about putting *any* database in RAID5.

Now? With a SAN? I don't even blink. Note that I said "with a SAN"!

All my development dbs are in RAID5. And of the production ones, only one is not, and it's going to RAID5 in the SAN refresh in 3 months' time.

Mostly because:

1- All the arguments about what happens to RAID5 when 2 disks fail simultaneously can always be matched by the equivalent in RAID10: lose both disks of the same mirror pair and the data is just as gone. Hey, if it's a problem with both, exactly what was the point?

2- When was the last time any 2 disks in your (recent crop!) SAN failed simultaneously in the same RAID5 string? I thought so. And isn't that why we go to all the trouble of having a DR site *and* daily online backups *and* archived redo logs? There *is* a limit to how much "tobesuretobesure" we need to follow!

3- Modern SANs use disk failure prediction technology that avoids most if not all ad-hoc MTBF failures. In the 4 years it's been used and abused, our CLARiiON has never had a single failure of a running LUN: the box has always phoned home asking for a replacement disk loooong before the old one actually failed. SANs cost a lot of moolah because they are designed to do that sort of thing from the word go.

4- I don't run OLTP dbs - not now, anyway. Most of my dbs are relatively small and low volume: in the order of 100-500GB in size, with maybe 1TB/day of total I/O. The one exception is the DW, and that one is as far from OLTP as it can get: 3TB per instance (multiple instances for dev/test and uat) and around 8TB/day of total I/O (~= 100MB/s averaged over the whole day). Most of the accesses there are sequential and BIG - an ideal fit for a big thumping RAID5 string of disks, and exceedingly expensive to duplicate with RAID10: an 8+1 RAID5 group gives 8 disks' worth of usable capacity out of 9 spindles, where RAID10 would need 16.

We take Statspack snapshots on every db every 4 hours and have done so for 4 years. I can reconstruct and plot the I/O stats and waits of any of our dbs for any period in that timespan.

I've used that to compare loads between our prod system, where we use RAID10, and our dr/dev environment, where I've set things up with RAID5 in an 8+1 disk stripe.

Then we ran tests.

CPU and memory same, workload same, no difference whatsoever in I/O waits and throughput.
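
If anyone wants to repeat the exercise, something along these lines is roughly how I'd pull the raw numbers out of the PERFSTAT tables. Just a sketch, mind: it assumes the standard STATS$ tables, the python-oracledb driver and a single instance per db, and the event list, connect details and function names are placeholders rather than our actual scripts.

# Sketch: pull cumulative I/O wait times out of Statspack snapshots and turn
# them into per-snapshot-interval deltas, so two environments (prod on RAID10
# vs dr/dev on RAID5) can be compared over the same workload window.
# Assumes the PERFSTAT-owned STATS$ tables, python-oracledb, and a single
# instance/dbid per repository - placeholders, not our production scripts.
import oracledb

IO_EVENTS = ("db file sequential read", "db file scattered read",
             "direct path read", "direct path write", "log file parallel write")

SQL = """
    select sn.snap_id, sn.snap_time, ev.event,
           ev.total_waits, ev.time_waited_micro
      from perfstat.stats$snapshot sn
      join perfstat.stats$system_event ev
        on ev.snap_id = sn.snap_id
       and ev.dbid = sn.dbid
       and ev.instance_number = sn.instance_number
     where sn.snap_time between :t1 and :t2
       and ev.event in ({binds})
     order by ev.event, sn.snap_id
""".format(binds=",".join(":e%d" % i for i in range(len(IO_EVENTS))))

def io_wait_deltas(dsn, user, password, t1, t2):
    """Per-interval (event, snap_time, waits, seconds_waited) rows between t1 and t2."""
    binds = {"t1": t1, "t2": t2}
    binds.update({"e%d" % i: e for i, e in enumerate(IO_EVENTS)})
    rows = []
    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        cur = conn.cursor()
        cur.execute(SQL, binds)
        prev = {}   # event -> (total_waits, time_waited_micro) at the previous snap
        for snap_id, snap_time, event, waits, micro in cur:
            if event in prev:
                d_waits = waits - prev[event][0]
                d_secs = (micro - prev[event][1]) / 1e6
                if d_waits >= 0:     # skip counter resets after instance restarts
                    rows.append((event, snap_time, d_waits, d_secs))
            prev[event] = (waits, micro)
    return rows

Feed the deltas for the prod window and the dr/dev window into whatever plotting tool you fancy and the comparison falls out.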

One thing did come out: rman backups and restores take slightly less time with the RAID5 setup.
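
Same idea for the backup timings - a rough sketch against v$rman_backup_job_details (the 10g+ controlfile view), with the same caveats: python-oracledb assumed and the connect details are placeholders.

# Sketch: elapsed time and effective input throughput for recent completed
# rman backup jobs, so the RAID10 and RAID5 sides can be compared directly.
import oracledb

RMAN_SQL = """
    select start_time, input_type, status, elapsed_seconds,
           input_bytes / nullif(elapsed_seconds, 0) as bytes_per_sec
      from v$rman_backup_job_details
     where start_time > sysdate - :days
       and status = 'COMPLETED'
     order by start_time
"""

def backup_timings(dsn, user, password, days=90):
    # One row per completed rman job: when it ran, how long it took,
    # and the input bytes per second it sustained.
    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        cur = conn.cursor()
        cur.execute(RMAN_SQL, {"days": days})
        return cur.fetchall()

# e.g. (placeholder connect strings):
# prod  = backup_timings("prodhost/PROD", "system", "...")    # the RAID10 side
# drdev = backup_timings("drhost/DRDEV", "system", "...")     # the RAID5 side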

Given that with RAID10 I'm not gaining anything in performance, I'm losing on backup/restore times, and I'm paying twice as much for the disks, my reaction was simply: "er...whaaaat?"

So it's now RAID5 all the way.

Until we have to run an OLTP system, that is! ;)

Received on Wed Apr 06 2011 - 06:47:47 CDT
