
"O/S block size" for Windows NT

From: Ben Pomicter <bp2k_at_my-deja.com>
Date: 2000/04/26
Message-ID: <8e7eni$mji$1@nnrp1.deja.com>#1/1

Here's a seemingly simple question often asked: What is the "O/S block size" for Windows NT?

Everyone in this forum, and many at Oracle support for that matter, seems to have their own answer. As for Microsoft's documentation -- the MSDN Library and TechNet -- it *never* uses the term "block size" in any discussion of file systems or datafile I/O, nor does it use the phrases "O/S block size" or "operating system block" in any context.

There are generally two types of answers given:
1) some absolute value -- usually 2048 bytes
2) whatever the cluster size is -- which can be 512, 1024, 2048, etc., up to 64k, and is set (manually or by default) when you format the volume

I'm assuming the following things in this discussion: Oracle8, NT4, NTFS. Also, keep in mind the difference between sectors and clusters.
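
If you want to see both numbers for a given volume, here's a minimal Win32 sketch (assuming you have a C compiler handy; the drive letter is just an example) -- GetDiskFreeSpace reports the sector size and the sectors-per-cluster count:

/* Minimal sketch: report sector size vs. cluster size for a volume.
   "C:\\" is just an example root path. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    DWORD sectorsPerCluster, bytesPerSector, freeClusters, totalClusters;

    if (GetDiskFreeSpace("C:\\", &sectorsPerCluster, &bytesPerSector,
                         &freeClusters, &totalClusters)) {
        printf("sector size:  %lu bytes\n", bytesPerSector);
        printf("cluster size: %lu bytes (%lu sectors per cluster)\n",
               bytesPerSector * sectorsPerCluster, sectorsPerCluster);
    } else {
        printf("GetDiskFreeSpace failed: %lu\n", GetLastError());
    }
    return 0;
}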

I've done some experimenting myself using the init.ora parameters log_checkpoint_interval and max_dump_file_size, which are (or can be) expressed as a number of "operating system blocks." For example, set log_checkpoint_interval=1024 and log_checkpoints_to_alert=true, then later check the alert log and count how many checkpoints there are per log switch. Say your log files are 20M and you're getting 40 checkpoints per log switch; then one "operating system block" is (20M/40)/1024 = 512 bytes. This is the result I keep getting, 512 bytes, and this is on volumes with a 4k cluster size. 512 bytes is the *sector* size of the disk.
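
To make that arithmetic explicit, here's a trivial sketch using the same example numbers (20M logs, 40 checkpoints per switch, log_checkpoint_interval=1024 -- these are just the figures from the example above, not universal values):

/* Back-of-the-envelope calculation from the example above. */
#include <stdio.h>

int main(void)
{
    long log_size_bytes         = 20L * 1024 * 1024; /* 20M redo log            */
    long checkpoints_per_switch = 40;                /* counted in the alert log */
    long interval_in_os_blocks  = 1024;              /* log_checkpoint_interval  */

    long bytes_per_checkpoint = log_size_bytes / checkpoints_per_switch;
    long os_block_size        = bytes_per_checkpoint / interval_in_os_blocks;

    printf("implied \"O/S block\" size: %ld bytes\n", os_block_size); /* 512 */
    return 0;
}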

The answers given generally equate an "O/S block" with the atomic unit of disk I/O used by NT. However, there is no reference in Microsoft's documentation (nor in any of the many other books and articles I've scoured) that NT does I/O in, say, 2048-byte units. Nor is there any reference that a cluster is anything more than a unit of disk space allocation, as opposed to a unit of I/O.

In addition, there is documentation that Oracle (and SQL Server for that matter) actually does datafile I/O on the sector level:
" On UNIX, Oracle opens files using the O_SYNC flag to bypass the
filesystem buffer cache. In the current Win32 API, the equivalent flags are FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH. The goal in both cases is the same; to ensure that data has been posted to disk before assuming that the write has been successful." [source: Oracle Metalink, Doc ID 48681.1, "Server Architecture on UNIX and NT"]
"The file system cache is used, by default, whenever a disk is accessed.
However, applications can request that its files not be cached by using the FILE_FLAG_NO_BUFFERING parameter in its call to open a file. This is called unbuffered I/O. Applications that use unbuffered I/O are typically database applications (such as SQL Server) [and Oracle] that manage their own cache buffers. Unbuffered I/O requests must be issued in multiples of the disk sector size"[source: NT Workstation 4.0 Resource Kit, "Chapter 15 - Detecting Cache Bottlenecks"]
"In SQL Server 7.0, pages [i.e. db blocks] are 8 KB, while Windows NT
does I/O in 512-byte segments." [source: MSDN Library, "Microsoft SQL Server 7.0 Storage Engine"]
There are lots of other references to this "unbuffered I/O" and the fact that it is in units of the disk sector size. Here is what Oracle support has said:
"Oracle uses non-buffered reads and writes on NT. We therefore bypass
the NT file system cache and are thus only concerned with I/O in multiples of the O/S blocksize (512 bytes)...Cluster sizes come into play simply as an allocation unit...As far as Oracle is concerned (and NT for that matter), its datafiles are already created...So if we want to read a 2k Oracle block we pass this to the O/S who simply does a 2k read (4 sectors) from the existing file and passes the results back to Oracle and our own buffer cache. The allocation unit (cluster) has no affect on this."
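
For anyone who wants to see the style of I/O these quotes describe, here's a hypothetical sketch of an unbuffered, write-through open followed by a 2k read (the file name is made up, and the 2k size is just the example from the quote; with FILE_FLAG_NO_BUFFERING the documented requirement is that the read size and buffer address be sector-aligned, which is why the buffer comes from VirtualAlloc):

/* Sketch only: unbuffered, write-through open and a 2k (4-sector) read.
   "datafile.dbf" is a made-up name for illustration. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    DWORD readSize = 2048;  /* 4 sectors of 512 bytes */
    DWORD got = 0;

    /* VirtualAlloc returns page-aligned memory, which satisfies the
       sector-alignment requirement of FILE_FLAG_NO_BUFFERING. */
    void *buf = VirtualAlloc(NULL, readSize, MEM_COMMIT | MEM_RESERVE,
                             PAGE_READWRITE);
    if (buf == NULL) {
        printf("VirtualAlloc failed: %lu\n", GetLastError());
        return 1;
    }

    HANDLE h = CreateFile("datafile.dbf", GENERIC_READ, FILE_SHARE_READ,
                          NULL, OPEN_EXISTING,
                          FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH,
                          NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    /* The read bypasses the file system cache entirely. */
    if (ReadFile(h, buf, readSize, &got, NULL))
        printf("read %lu bytes unbuffered\n", got);
    else
        printf("ReadFile failed: %lu\n", GetLastError());

    CloseHandle(h);
    VirtualFree(buf, 0, MEM_RELEASE);
    return 0;
}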

Note that if 512 bytes is the unit of I/O, then the whole idea that db_block_size should be a multiple of the "O/S block size" is a moot point, since any valid db_block_size (2k, 4k, 8k, etc.) is already a multiple of 512 bytes. (Remember, I'm only talking about NT here. For a JFS-based database on AIX, for example, Oracle does say that db_block_size should be a multiple of the JFS blocksize, 4k, even though AIX also has a physical block size of 512 bytes.)

So, here's what I'm asking: can any of you out there, who feel that the NTFS cluster size, or something else, is the low-level unit of datafile I/O for Oracle on NT, come up with any chapter-and-verse references which substantiate that assertion? I would love to see them for myself, as, I'm sure, would a lot of other people.

--
Ben Pomicter
Database Administrator
Digitas



