
Re: asynch I/O

From: Tim Gorman <tim_at_sagelogix.com>
Date: Sat, 20 Sep 2003 20:19:39 -0800
Message-ID: <F001.005D095E.20030920201939@fatcity.com>


Mladen,

I don't know enough yet, but last Wednesday night when I was pulled into the aforementioned situation, performance was horrible and I/O was high. When the VP-IT said they were on NetApp, I felt I instantly knew what was wrong, but I've learned to keep my mouth shut until I have facts. James Herriot wrote that "veterinary practice gives one ample opportunity to make a complete ass of oneself", and I've found the same to be true in this line of work.

So, later that night, they migrated the 150GB database and all binaries from one filer (an F900?) to the other (an F960?). NetApp ponied up the new filer -- said they didn't want to be blamed.

For the new filer, they made the following configuration changes:

...I'm not quite sure that I got all of that right, but that's the basic gist...

Of course, as always, people treat the volume of I/O coming from an Oracle database as an unchanging monolithic thing, and they always think the best thing is to make the cost per I/O better. That's OK, if you're made of money...

As Anjo advises in his YAPP reports from www.oraperf.com, tuning I/O means tuning the volume of I/O as well as tuning the cost per I/O. The NetApp folks already had a plan to reduce the cost per I/O before I was even called, so I've kept my mouth shut and pursued tuning the volume of I/O.
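
A minimal sketch of that arithmetic in Python may make the distinction concrete. The request counts and per-request latencies below are invented for illustration and are not measurements from this system; the function name is hypothetical too. The only point is that total I/O time is roughly volume times cost per I/O, so either factor can be tuned.

```python
def total_io_seconds(io_requests: int, ms_per_io: float) -> float:
    """Total time spent on I/O: volume of I/O times cost per I/O."""
    return io_requests * ms_per_io / 1000.0


# Hypothetical workload: 1,000,000 reads at 10 ms each.
baseline = total_io_seconds(1_000_000, 10.0)

# Option A: faster storage halves the cost per I/O, same volume.
cheaper_io = total_io_seconds(1_000_000, 5.0)

# Option B: tuning removes 90% of the reads, same cost per I/O.
fewer_ios = total_io_seconds(100_000, 10.0)

print(baseline, cheaper_io, fewer_ios)
```

With these made-up numbers, cutting the volume helps more than halving the cost, but the real ratio depends entirely on the workload.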

Anyway, performance was many-fold better after the changes. My standard query on ALL_INDEXES and ALL_IND_COLUMNS to find indexes belonging to a table took 22 mins on Wednesday night, but only the normal 60-90 seconds the following afternoon. When I/O is sick, the entire system sneezes.

The other side of what they did is snapshots. They take snapshots four times per day and replicate them to another filer to back up to tape. I think they only back up one of those snapshots per day. I would prefer to use RMAN and can't see any way to use it here, but I'm not about to delve into that right now. My job is tuning, not kibitzing on backups.

So, Dick's comments about how WAFL works and how snapshots impact space utilization on the filer raised some questions for me. For example, why take four snapshots per day when you're only backing up one set to tape?
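
To get a feel for why the number of retained snapshots matters for space, here is a toy Python model of generic copy-on-write accounting. It is a sketch under stated assumptions, not a description of NetApp's implementation: the class name, block counts, rewrite rate, and retention settings are all hypothetical. It assumes an overwrite always lands in a new block and that an old block stays allocated for as long as any retained snapshot still references it.

```python
from collections import deque


class CowVolume:
    """Toy copy-on-write volume (hypothetical model, not real WAFL)."""

    def __init__(self, max_snapshots: int):
        self.live = {}             # logical block -> physical block id
        self.snapshots = deque()   # each snapshot is a frozen copy of the map
        self.max_snapshots = max_snapshots
        self.next_id = 0

    def write(self, logical_block: int) -> None:
        # Every write goes to a fresh physical block.
        self.live[logical_block] = self.next_id
        self.next_id += 1

    def snapshot(self) -> None:
        self.snapshots.append(dict(self.live))
        # Expiring the oldest snapshots releases blocks only they referenced.
        while len(self.snapshots) > self.max_snapshots:
            self.snapshots.popleft()

    def blocks_in_use(self) -> int:
        used = set(self.live.values())
        for snap in self.snapshots:
            used.update(snap.values())
        return len(used)


def simulate(max_snapshots: int, days: int = 3, snaps_per_day: int = 4) -> int:
    vol = CowVolume(max_snapshots)
    for block in range(100):       # lay out a 100-block datafile
        vol.write(block)
    for _ in range(days * snaps_per_day):
        for block in range(50):    # rewrite half the file between snapshots
            vol.write(block)
        vol.snapshot()
    return vol.blocks_in_use()


print(simulate(max_snapshots=1), simulate(max_snapshots=4))
```

In this toy run the same 100-block file occupies 100 blocks with one retained snapshot and 250 blocks with four, which is the kind of effect that makes the retention policy worth questioning.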

Anyway, I'll work with these folks some more during next week; they still haven't implemented any of my recommendations (e.g. adding function-based indices, applying tuning patches, purging workflow, changing some custom code, and using histograms). I plan to really mess up NetApp's neat little 16-part picture of this system's I/O by making large chunks of it disappear. But that's OK -- they'll just have to adjust again.

So, administratively, I'm not quite sure what really works yet, but I'm watching and (hopefully) learning...

-Tim

on 9/20/03 2:44 PM, Mladen Gogala at mgogala_at_adelphia.net wrote:

> Can you be a little more specific? What kind of administration would you
> recommend?
> 
> On 2003.09.20 17:14, Tim Gorman wrote:
>> Dick,
>> 
>> With all due respect, I'd like to interject.  Due to the many levels of
>> abstraction imposed by the various RAID schemes, volume managers, dynamic
>> multi-pathing, file-systems, and databases, my eyes tend to cross whenever
>> someone starts talking about the movements of the disk heads, rotational
>> latency, and so forth.  The perception of "contiguousness" in a file-system
>> or database datafile on a modern server in relation to disk surfaces is
>> purely illusory.
>> 
>> It is somewhat akin to the idea that every US dollar bill is backed by a
>> sliver from a gold bar deep in the bowels of Ft Knox -- the facts are much
>> more complex, by design.
>> 
>> Your other comments about WAFL's side-effects are interesting and
>> thought-provoking.  It's been a few years since I've worked on NetApp and
>> just this week I was called in to help improve performance on a large Oracle
>> environment over NetApp.  At this point, I'm glad that I had not blurted out
>> my long-standing misgivings about the product, as it seems that its ability
>> to support higher volumes of I/O from Oracle has improved.  It just requires
>> different methods of administration and configuration.  It's not your
>> grandfather's file-system, that's for sure...
>> 
>> Respectfully,
>> 
>> -Tim
>> 
>> 
>> on 9/19/03 9:34 AM, Goulet, Dick at DGoulet_at_vicr.com wrote:
>> 

>>> Matt,
>>>
>>> Well I'm happy to see that you consider WAFL as "crafty". In my book it does
>>> not have such a nice connotation. Consider the typical disk drive where you
>>> lay out your files as contiguous blocks of space around the disk drive. So
>>> long as the file remains its current size all of the data is gathered
>>> together and easy to read/write. You don't need to constantly slam that head
>>> around to get where you want. With WAFL all of that heads for the hills.
>>> Sure the original file is contiguous, but hit the first update and bingo
>>> that's history. Now the head has to fly around reassembling the file from
>>> blocks scattered all over the place, and what's the one thing about disk
>>> drives that has remained a constant over the years, seek time. Therefore WAFL
>>> file systems will slow over time, yuck. One other nasty item. Remember that
>>> tree you need to update, well until a 'snapshot' (NetApp speak) occurs those
>>> blocks that have been updated several times can't be reused therefore that 1GB
>>> disk file that you originally laid out could easily consume 100GB due to the
>>> updates, inserts, etc... Double YUCK! How is that so you say, remember that
>>> when you tell Oracle to create a datafile it acquires and formats all of the
>>> disk space it needs, say 100MB, but all of it is empty blocks. Now you run a
>>> SQL*Loader command to upload 50MB of data into that file. Well WAFL now needs
>>> 50MB of additional disk space to place all of those 'updated' blocks of data
>>> into, so in reality the data file is now occupying ~150MB of space, but 50MB
>>> of that is "hidden" from view until the snapshot fires. Fun part, your DB
>>> stops running in the middle of the day due to a lack of disk space on your
>>> NetAppliance. Your boss wants to know why your 10GB database has burned up a
>>> 100GB NET App Filer. Of course you as a DBA don't know because the database
>>> hasn't grown any. Add more egg on your face when the snapshot fires & bingo
>>> there is 90GB of free space that 'suddenly' appears. The workaround of course
>>> is to fire snapshots frequently and limit the number retained, but that just
>>> adds workload to the NetApp when I want it servicing the database! As an old
>>> mentor once said, "You can't win for losing!".
>>>
>>> Dick Goulet
>>> Senior Oracle DBA
>>> Oracle Certified 8i DBA
>>>
>>> -----Original Message-----
>>> Sent: Friday, September 19, 2003 11:50 AM
>>> To: Multiple recipients of list ORACLE-L
>>>
>>>
>>>
>>> This is actually platform dependent. For example, if you're using UDP
>>> mounts under Linux, you can only have one request outstanding per mount.
>>> Consequently, multiple mounts can improve performance by allowing parallel
>>> operations.
>>>
>>> A side benefit of Oracle on Netapp is WAFL, which as Dick pointed out,
>>> stands for Write Anywhere File Layout. Basically, an update to a block does
>>> not cause a disk seek and an update - the system simply goes to the first
>>> available raid stripe that's free and writes the block there, then updates
>>> the tree. Besides being rather crafty, it creates a situation where
>>> compound writes to multiple files - like a tablespace update and an index
>>> update - migrate close to each other on disk. I/O patterns "train" the
>>> filesystem structure.
>>>
>>> To actually answer your original question, it will not make a difference on
>>> most platforms that are properly configured. What will make a difference is
>>> your network settings. Are you using Gigabit + jumbo frames?
>>>
>>> Matt
>>> *still pleased with how crafty WAFL is*
>>>
>>> --
>>> Matthew Zito
>>> GridApp Systems
>>> Email: mzito_at_gridapp.com
>>> Cell: 646-220-3551
>>> Phone: 212-358-8211 x 359
>>> http://www.gridapp.com
>>>
>>>> -----Original Message-----
>>>> From: ml-errors_at_fatcity.com [mailto:ml-errors_at_fatcity.com] On
>>>> Behalf Of Tanel Poder
>>>> Sent: Friday, September 19, 2003 3:25 AM
>>>> To: Multiple recipients of list ORACLE-L
>>>> Subject: Re: asynch I/O
>>>> 
>>>> 
>>>> Hi!
>>>> 
>>>> You can spread your datafiles across 1, 2, 3, 4 ... 100
>>>> different directories or mount points, but the performance
>>>> remains the same for all of them as long as all the mount
>>>> points are striped on the same disks.
>>>> 
>>>> If you think of mount points as different sets of disks, e.g.
>>>> when adding a new mount point, you add more disks, then yes,
>>>> IO performance will improve, because of the larger number of disks.
>>>> 
>>>> Tanel.
>>>> 
>>>> 
>>>> ----- Original Message -----
>>>> To: "Multiple recipients of list ORACLE-L" <ORACLE-L_at_fatcity.com>
>>>> Sent: Friday, September 19, 2003 5:09 AM
>>>> 
>>>> 
>>>>> Could you clarify something for me? Are you saying that if I have a variety
>>>>> of 'mounts' on our netapp
>>>>> 
>>>>> say
>>>>> 
>>>>> /mnt1
>>>>> /mnt2
>>>>> 
>>>>> I would not benefit by putting my datafiles on separate ones? I thought that
>>>>> is where my I/O waits are coming from, since we have all of our datafiles in
>>>>> the same directory.
>>>>> 
>>>>> --
>>>>> Please see the official ORACLE-L FAQ: http://www.orafaq.net
>>>>> --
>>>>> Author: Ryan
>>>>>   INET: rgaffuri_at_cox.net
>>>>> 
>>>>> Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
>>>>> San Diego, California        -- Mailing list and web hosting services
>>>>> ---------------------------------------------------------------------
>>>>> To REMOVE yourself from this mailing list, send an E-Mail message
>>>>> to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
>>>>> the message BODY, include a line containing: UNSUB ORACLE-L (or the
>>>>> name of mailing list you want to be removed from).  You may also send
>>>>> the HELP command for other information (like subscribing).
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Please see the official ORACLE-L FAQ: http://www.orafaq.net
>>>> --
>>>> Author: Tanel Poder
>>>>   INET: tanel.poder.003_at_mail.ee
>>>> 
>>>> 
>> 
>> 
> 
> --
> Mladen Gogala
> Oracle DBA

Received on Sat Sep 20 2003 - 23:19:39 CDT
