"Multi-Column" Histogram / Histogram for Concatenated Index

From: Hemant K Chitale <hkchital_at_singnet.com.sg>
Date: Thu, 13 Dec 2007 23:35:38 +0800
Message-Id: <200712131535.lBDFZgh1029368@smtp41.singnet.com.sg>

I have a table with an Index on 4 columns. One particular column
(the third in the index)

is highly skewed with one value accounting for 70% of the rows. Queries with the Index columns as predicates are perfectly fine for all values other than that one value.
Even if I gather a histogram on the skewed column, queries for the "bad" value still use the index. The bad value alone accounts for 70% of the rows. In combination with a particular set of values for the other columns, the query fetches about 55% of the rows. I do not want to use the index.

Say :
Table TAB_1
Columns C_1, C_2, C_3, C_4, C_5, C_6, C7 Index on C_2,C_4,C_5,C6

Value "881" in column C_5 accounts for 70% of the rows. A particular combination of values for C_2, C_4, C_5 ("881") and C_6 account for 55% of the rows.

I cannot change the SQL code. It is "generated" by the application. Even if I were to change the code, I find that this SQL query runs in a loop for different combinations of these columns. It is only one combination that is my problem. The combinations of C_2, C_4, C_5, C_6 are driven from three other tables Besides these columns C_1 and C_7 are also linked to the other tables. Some joins are equi-joins, some are ranges. {Yes, this sounds like a fact table and dimension tables, doesn't it ?}

This is 9.2.0.4, 64-bit.
Statistics are gathered nightly. I am allowed to modify the gather statistics job to collect specific histograms.

I find that gathering histogram on C_5 alone doesn't help. {There are 6- distinct values, and I can specify upto 254 buckets}. Queries still use the index

Questions :
1. If I rebuild the index to use C_5 as the leading column would Oracle be using the Histogram to convert queries for C_5='881' into FullTableScans ?
2. Should I gather histograms on all the columns of this index ?

This is not exactly like the "correlated" columns that is used as an example for 11g MultiColumn Statistics ("Extended Statistics") as there is no relationship between the column values. What I am looking for is something like a Histogram on the Index Key
(rather than just "num_rows" and "distinct_keys", I want the density
for different combinations of key values -- a Histogram on the Index !)

Hemant K Chitale
http://web.singnet.com.sg/~hkchital
and
http://hemantscribbles.blogspot.com
and
http://hemantoracledba.blogspot.com

"There is more to life than increasing its speed." Mohandas Gandhi Quotes
: http://www.brainyquote.com/quotes/authors/m/mohandas_gandhi.html

--
http://www.freelists.org/webpage/oracle-l

Received on Thu Dec 13 2007 - 09:35:38 CST