Re: Storage array advice anyone?

From: Martic Zoran <zoran_martic_at_yahoo.com>
Date: Thu, 16 Dec 2004 01:44:18 -0800 (PST)
Message-ID: <20041216094418.33436.qmail@web52605.mail.yahoo.com>

Hi Matthew,

What you said is very true.
There are so many dimensions to weigh when deciding what to do or buy:

- money (HW/SW, support, ..)
- SLA (performance)
- compatibility
- knowledge
- ...

> The performance argument simply doesn't stand up as
> an absolute
> anymore.

I have read a lot, and I still use the www.baarf.com arguments against RAID F.
I would like somebody to explain how the following scenario could be better covered by any RAID F technology available on the market.
You mentioned 14-drive RAID 6 and 14-drive RAID 10. That looks more interesting than comparing 5 drives for RAID-5 with 4 drives for RAID-10.

What I am facing (probably because of my lack of knowledge and/or not enough information from storage vendors) is this:
I know that Oracle tends to write to the I/O subsystem in a way that keeps a steady average load on it (DBWR - MTTR, not considering log writing). Suppose we design the system for some workload and calculate or measure a write throughput of 400 8k db blocks per second to the I/O subsystem.
This is the load the I/O subsystem must sustain.

Matthew, or anybody else, can you tell me of any real array/disk/SAN/NAS configuration where RAID F is better than RAID 1 on at least one criterion: performance and/or availability?
You can even replace 400 with any number N.

I know the data I provided may not cover everything. You can also describe the conditions under which RAID 5 would be better, for example if these 400 db block writes are not all small writes.
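
To put rough numbers on the question, here is a minimal sketch using the textbook small-write penalties: 2 back-end I/Os per host write for RAID 1/10, 4 for RAID 5 (read data, read parity, write data, write parity), and 6 for RAID 6. It assumes every write is a small random write with no cache coalescing and no full-stripe writes, which a real array may well not match. The function name and the table are illustrative only.

```python
# Back-end disk I/Os generated by one small host write (textbook values).
# Assumption: no write-cache coalescing and no full-stripe writes.
WRITE_PENALTY = {"RAID-1/10": 2, "RAID-5": 4, "RAID-6": 6}


def backend_write_ios(host_writes_per_sec: int, level: str) -> int:
    """Return the back-end I/Os per second needed to absorb the host writes."""
    return host_writes_per_sec * WRITE_PENALTY[level]


if __name__ == "__main__":
    n = 400  # 8k db block writes per second; replace with any number N
    for level in WRITE_PENALTY:
        print(f"{level}: {backend_write_ios(n, level)} back-end I/Os per second")
```

Under these assumptions the parity levels always need more back-end I/Os for the same N, so any advantage for RAID F would have to come from something the model leaves out (cache, full-stripe writes, more spindles for the same money).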

Also, for simplicity I am focusing much more on write operations than on reads.
If I have left out anything important, let me know.

Matthew, I am looking forward to reading your book.
I was not impressed with the Rampant one: too much copy/paste from everywhere, without real test examples.

Thanks in advance,
Zoran

Matthew Zito <mzito_at_gridapp.com> wrote:

>
> This is the sort of issue that comes up often on
> oracle-l - in a
> nutshell, "Is raid 5 acceptable for database
> workloads". There's a lot
> of great writing that has been done on the subject,
> but the tragedy is
> that a lot of it is very very old, and could stand a
> rewrite. I cover
> some of these topics in my forthcoming storage book
> from o'reilly
> (*plug* *plug*), but my overall opinion is that if
> you can afford the
> utilization penalty for RAID-10, then you should
> take it every time.
>
> However, RAID-F is not the terrible thing it's made
> out to be. The
> things that, in my opinion, have significantly
> changed the landscape
> for parity-protected RAID levels:
>
> -RAID-6 (aka RAID-DP) - basically adding an extra
> parity disk for very
> large RAID-5 sets so that you can survive two disk
> failures (that is, data loss occurs only on the
> third disk that dies).
> -Virtualization/abstraction of storage objects -
> when the LUN you are
> sending I/Os to is comprised of chunks from 50
> different spindles from
> 10 different RAID-5 groups, the performance is
> excellent. Another
> example of this is HSM allowing for infrequently
> used blocks to be
> "paged" out to RAID-5/6 devices with the
> high-performance blocks
> remaining on RAID-10. Yet another example is
> "third-mirror" or BCV or
> Shadow Copy (whatever the vendor term of the week
> is) for a
> point-in-time copy of your database - but that
> addresses the
> recoverability issue, not so much the availability
> issue
> -Predictive failure analysis - basically, most
> drives soft fail and
> throw errors before they hard fail (head crash,
> etc.). More modern
> disk arrays will preemptively bring in a hot spare
> to replace a drive
> that has had more than a certain number of errors.
> Reconstruction
> occurs directly from the dying disk until it is not
> responding properly
> anymore.
> -Hardware/ASIC based parity checksumming -
> performance improvement,
> plain and simple, due to pipelining and
> parallelization of parity
> generation
>
> There's really two arguments that seem to come up
> against RAID-F:
>
> -Performance - RAID-F is slow
> -Availability - RAID-F is inherently less reliable
>
> The performance argument simply doesn't stand up as
> an absolute
> anymore. There's three reasons for that - RAID-5
> implementations have
> gotten better, newer technologies like the ones I
> list above remove a
> lot of the shortcomings of RAID-5, and storage in
> general has gotten
> faster. Many databases that I have seen were very
> carefully tuned for
> the specific array, best practices, logs and indexes
> and datafiles all
> on separate disks, etc. etc and would have been just
> as fast had they
> thrown everything onto one big volume and let the
> array sort it out.
> In fact, I see many organizations creating many
> small storage objects
> for various performance-driven purposes, when they
> were getting carved
> out of the same RAID group, rendering any benefit
> imaginary at best.
> RAID-5 may not always be as fast as RAID-10, but
> often it doesn't need
> to be. Look at it this way - we'd all like to be
> running our databases
> on the biggest iron possible to improve performance,
> but we're forced
> to deal with the servers that are acceptable from a
> budgetary and
> management perspective. The same is true of
> storage.
>
> The availability argument is true, though with the
> above techniques
> things have been again mitigated with time. The key
> issue is, though
> - what is the desired/required availability for the
> database? We would
> all like to have a 24/7 database that's as reliable
> as possible, but we
> all make decisions about where to cut corners for
> availability. Many
> organizations trust their storage arrays to be
> redundant from an
> operating perspective, when most of the truly
> damaging outages I've
> seen in my time working with storage were due to
> array failure having
> nothing to do with any RAID group configuration.
> For example, in most
> fibre channel arrays today, yanking an active disk
> drive has a
> reasonable probability of taking down the entire
> fibre channel loop,
> killing off up to 128 drives at once. Yet very few
> organizations
> mirror across storage arrays online (though many
> mirror them remotely
> for DR).
>
> The general argument FOR RAID-10 seems to be, "It's
> better, and it
> doesn't really cost THAT much more". The fact
> remains, it does cost
> more, and can cost a great deal more than a RAID-F
> configuration,
> depending on group sizing. For example, 140 disks
> in two different
> configurations - 10 14-drive RAID-10 sets and 10
> 14-drive RAID-6 sets
> (two reasonable standard configurations provided by
> EMC Clariion and
> Netapp NearStore). With 73GB drives, RAID-10 nets
> you just a shade
> over 5TB, while RAID-6 gets you 8.7TB.
>
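
A quick check of the capacity figures quoted above, as a sketch only. It assumes RAID-10 keeps half the drives of each set for data and RAID-6 gives up two drives per set to parity, and it ignores hot spares and formatting overhead. The function name is made up for illustration.

```python
def usable_gb(sets: int, drives_per_set: int, drive_gb: int, level: str) -> int:
    """Usable capacity in GB for identical RAID sets (no spares, no overhead)."""
    if level == "RAID-10":
        data_drives = drives_per_set // 2   # half the drives hold mirror copies
    elif level == "RAID-6":
        data_drives = drives_per_set - 2    # two drives' worth of parity per set
    elif level == "RAID-5":
        data_drives = drives_per_set - 1    # one drive's worth of parity per set
    else:
        raise ValueError(f"unknown RAID level: {level}")
    return sets * data_drives * drive_gb


if __name__ == "__main__":
    # 140 disks as 10 sets of 14 drives, 73GB each, as in the quoted example
    for level in ("RAID-10", "RAID-6"):
        print(level, usable_gb(10, 14, 73, level), "GB")
```

With these assumptions the results agree with the quoted "a shade over 5TB" and roughly 8.7TB.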
> The ideal way to look at things is from the business
> perspective- Is
> the improved reliability for RAID-10 important
> enough _for this
> application_ that it is worth the increased cost?
> Vet your vendor
> heavily - if necessary, hire someone impartial to
> come in and explain
> to you exactly what the gotchas are going to be with
> the products
> you'll be buying. Then figure out what your
> exposure is going to be -
> if you run RAID-10, will you be buying another disk
> array in a year?
> If you run RAID-5 and you lose two disks, how long
> will it take to
> recover?
>
> Again, I'm not defending RAID-F as being as good as
> RAID-10. I'm
> simply saying that immediately disregarding RAID-5/F
> as a waste of time
> based on old information and preconceptions is like
> disregarding Linux
> based on the way it was back in 1998. Times change,
> and keeping costs
> down is something that, imho, not enough technology
> people think about.
>
> Thanks much,
> Matt
>
> --
> Matthew Zito
> GridApp Systems
> Email: mzito_at_gridapp.com
> Cell: 646-220-3551
> Phone: 212-358-8211 x 359
> http://www.gridapp.com
>
>
> On Dec 15, 2004, at 8:36 PM, Joel Garry wrote:
> oracle-l_at_freelists.org
> >> On Tue, 14 Dec 2004 10:47:20 +0000,
chris_at_thedunscombes.f2s.com
>>> My experience is that with either RAID 5 or 10
you have to be
>>> unbelievably unlucky to lose data providing
disks are replaced
>>> when
>>> they fail and not left for a few days or even
more. You are
>>> talking
> >>> extremely remote. It might be an idea to get
> someone

=== message truncated ===


--
http://www.freelists.org/webpage/oracle-l

Received on Thu Dec 16 2004 - 03:44:30 CST