
RE: asynch I/O

From: Matthew Zito <mzito_at_gridapp.com>
Date: Fri, 19 Sep 2003 09:14:39 -0800
Message-ID: <F001.005D083D.20030919091439@fatcity.com>

Well, the other semi-unique thing about WAFL is the fact that all metadata is stored in files, rather than in custom structures. So, the "tree" I was describing is actually three files - the free inode map, the free space map, and the inode file. The most important one for our purposes here is the inode file - that describes the actual filesystem structure.

So, what happens is that the inode file is modified every time the structure of the filesystem is changed. Practically speaking, this means it is cache-resident all the time. But, since WAFL's architecture is to never update blocks in place, the inode file is never updated on disk - a new one is simply written.

This is how NetApp snapshots work - basically there is a "root inode" that is special and points to the inode file. When you want to make a snapshot, you make a new "root inode" and statically point it at the current inode file. Since that inode file describes the view of the filesystem _at that point in time_, you end up with a read-only virtual filesystem.
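To make the root-inode idea concrete, here is a toy sketch in Python. It is only an illustration of the principle described above, not NetApp's actual on-disk format: the class name `ToyWafl`, its methods, and the file and snapshot names are all made up for the example, and the "disk" is just an append-only list.

```python
class ToyWafl:
    """Toy write-anywhere store: blocks are appended, never overwritten."""

    def __init__(self):
        self.blocks = []              # append-only "disk"
        self.root = self._write({})   # root inode: block number of the current inode file
        self.snapshots = {}           # snapshot name -> block number of a frozen inode file

    def _write(self, data):
        """Write data to a fresh block and return its block number."""
        self.blocks.append(data)
        return len(self.blocks) - 1

    def write_file(self, name, content):
        data_block = self._write(content)
        inode_file = dict(self.blocks[self.root])  # copy; the old inode file is left alone
        inode_file[name] = data_block
        self.root = self._write(inode_file)        # a new inode file is written, root repointed

    def snapshot(self, snap_name):
        # A snapshot is just another root pointer, fixed at the current inode file.
        self.snapshots[snap_name] = self.root

    def read_file(self, name, snap_name=None):
        root = self.root if snap_name is None else self.snapshots[snap_name]
        return self.blocks[self.blocks[root][name]]


fs = ToyWafl()
fs.write_file("users01.dbf", "version 1")
fs.snapshot("hourly.0")
fs.write_file("users01.dbf", "version 2")

live = fs.read_file("users01.dbf")                # sees the newest inode file
frozen = fs.read_file("users01.dbf", "hourly.0")  # sees the inode file as of the snapshot
```

Because nothing is overwritten, taking the snapshot costs one pointer, and the old blocks stay readable for as long as some root still refers to them.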

This same logic is applied to ensuring on-disk consistency. Every few seconds, a new snapshot is created that points at the current inode file. The NetApp continues processing requests, but the on-disk filesystem structure is "fixed" at that snapshot. When the consistency timer expires, the old snapshot is deleted and a new one created that points at the current inode file - so the entire filesystem view on-disk updates atomically to represent what the filer had already been representing in memory.

The battery-backed cache stores all of the transactions between the last consistency point and the present moment (in another unique note, it actually caches the NFS operation itself, not the low-level I/O). This gives it a pool of marked-as-completed writes to work with to help make more intelligent decisions about write layouts.

So, the situation is not so grim as "lose your cache, lose your filesystem" - in a truly tragic scenario with power failure plus cache-battery failure, the worst case is that you would recover your filer to discover that it was at a consistent state from 10 seconds before the power failure (10 seconds is the longest time a filer will go between consistency points).
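The interplay between consistency points, the battery-backed cache, and recovery can be sketched the same way. Again, this is a toy model under my own assumptions, not filer internals: `ToyFiler`, its method names, and the single "write" operation type are invented for illustration, and the real cache logs NFS operations rather than Python tuples.

```python
class ToyFiler:
    """Toy model: in-memory view, on-disk consistency point, NVRAM operation log."""

    def __init__(self):
        self.memory = {}   # live view served to clients
        self.disk = {}     # view as of the last consistency point
        self.nvram = []    # operations logged since the last consistency point

    def write(self, name, content):
        self.nvram.append(("write", name, content))  # log the operation itself
        self.memory[name] = content                  # acknowledged; on-disk view untouched

    def consistency_point(self):
        self.disk = dict(self.memory)  # on-disk view switches to the new state in one step
        self.nvram.clear()             # those operations are now safely on disk

    def recover(self, nvram_survived):
        """Simulate a restart after power loss; memory contents are gone."""
        self.memory = dict(self.disk)  # start from the last consistent view
        if nvram_survived:
            for op, name, content in self.nvram:  # replay the logged operations
                if op == "write":
                    self.memory[name] = content
        else:
            self.nvram.clear()         # battery failed too: the log is lost
        return self.memory


filer = ToyFiler()
filer.write("redo01.log", "txn 1")
filer.consistency_point()
filer.write("redo01.log", "txn 2")     # acknowledged, but only in memory and NVRAM

with_cache = dict(filer.recover(nvram_survived=True))      # replay brings back txn 2
without_cache = dict(filer.recover(nvram_survived=False))  # back to the last consistency point
```

In the second case the filesystem is still consistent; it has simply rolled back to the last consistency point, which is the "10 seconds before the power failure" worst case.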

Thanks,
Matt
*still pleased with netapp's craftiness*

--
Matthew Zito
GridApp Systems
Email: mzito_at_gridapp.com
Cell: 646-220-3551
Phone: 212-358-8211 x 359
http://www.gridapp.com


> -----Original Message-----
> From: ml-errors_at_fatcity.com [mailto:ml-errors_at_fatcity.com] On
> Behalf Of Tanel Poder
> Sent: Friday, September 19, 2003 12:35 PM
> To: Multiple recipients of list ORACLE-L
> Subject: Re: asynch I/O
>
>
> > available raid stripe that's free and writes the block there, then
> > updates the tree. Besides being rather crafty, it creates a situation
> > where
>
> And the tree is living in battery-backed cache?
>
> Tanel.
>
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> --
> Author: Tanel Poder
> INET: tanel.poder.003_at_mail.ee
>
> Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> San Diego, California -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru')
> and in the message BODY, include a line containing: UNSUB
> ORACLE-L (or the name of mailing list you want to be removed
> from). You may also send the HELP command for other
> information (like subscribing).
>
Received on Fri Sep 19 2003 - 12:14:39 CDT
