Re: Is this a good definition for clustering factor

From: jaromir nemec <jaromir_at_db-nemec.com>
Date: Thu, 17 Feb 2005 07:51:21 +0100
Message-ID: <005401c514bd$17d6ba50$3c02a8c0@JARAWIN>

Hi,

> I see no need to "improve" on these definitions.

I have a small suggestion.

> - If the value is near the number of rows, then ..

As a b*tree index contains only entries for rows with at least one not null index column, I'd relate the upper bound of the clustering factor to the number of rows with at least one not null index column.

In other words, if one have a lot of row with all index columns null in a table, the clustering factor of an b*tree index may look better that it really is.

Jaromir Nemec

http://www.db-nemec.com

Original Message ----- From: "Wolfgang Breitling" <breitliw_at_centrexcc.com> To: <juancarlosreyesp_at_gmail.com> Cc: "oracle-l" <Oracle-L_at_freelists.org> Sent: Thursday, February 17, 2005 2:18 AM Subject: Re: Is this a good definition for clustering factor

Why invent another wheel? What is wrong with Oracle's definition? Oracle 9i (9.2) Database Reference:
Indicates the amount of order of the rows in the table based on the values=
=20

of the index.
- If the value is near the number of blocks, then the table is very well=20 ordered. In this case, the index entries in a single leaf block tend to=20 point to rows in the same data blocks.
- If the value is near the number of rows, then the table is veryrandomly=20 ordered. In this case, it is unlikely that index entries in the same leaf=20 block point to rows in the same data blocks.

and Oracle 9i (9.2) Database Performance Tuning Guide and Reference: The cost of fetching rows using rowids depends on the index clustering=20 factor. Although the clustering factor is a property of the index, the=20 clustering factor actually relates to the spread of similar indexed column=
=20

values within data blocks in the table. A lower clustering factor indicates=
=20

that the individual rows are concentrated within fewer blocks in the table.=
=20

Conversely, a high clustering factor indicates that the individual rows are=
=20

scattered more randomly across blocks in the table. Therefore, a high=20 clustering factor means that it costs more to use a range scan to fetch=20 rows by rowid, because more blocks in the table need to be visited to=20 return the data.

I see no need to =EFmprove" on these definitions. There is even an example= in=20
the Tuning Guide.

Regards

Wolfgang Breitling
Centrex Consulting Corporation
http://www.centrexcc.com=20

--
http://www.freelists.org/webpage/oracle-l



--
http://www.freelists.org/webpage/oracle-l

Received on Thu Feb 17 2005 - 01:54:47 CST