Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> Density calculation. Was: Incorrect cardinality estimate
-----Original Message-----
From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Wolfgang Breitling
>Ergo, in order to get the cardinality estimate down you have to change
>either num_rows or density. I suggest you whack density to
>2.5e-6 or even less. Because you did gather the histogram this won't
>affect any sql which go after popular values. It will affect sql where
>you have a join on the ods_process_date column.
It seems if Oracle is doing a natural thing when calculating density.
If histogram is provided then Oracle tries to remove "noise" from
density calculation.
A small test reveals that Oracle is able to effectively remove very
popular values.
The formula is (kind of) following:
Density=Cardinality_for_Unpopular_Values/Number_of_Rows
For example:
create table test_dens
(id number
, nh number --number column with histogram
, n number --number column w/o histogram
);
---Case 3
insert into test_dens
select rownum, rownum, rownum from all_objects where rownum<=5000;
insert into test_dens
select rownum, 0, 0 from all_objects where rownum<=2500;
insert into test_dens
select rownum, 0.5, 0.5 from all_objects where rownum<=2500;
-- begin sys.dbms_stats.gather_table_stats('LTDLNE', 'TEST_DENS', method_opt=>'FOR COLUMNS NH SIZE 254, N SIZE 1'); end; / select t.table_name, t.column_name, t.num_distinct, t.density, t.num_nulls, t.num_buckets, t.histogram from user_tab_col_statistics t where table_name='TEST_DENS'; Table column num_rows density num_nulls buckets histogram type TEST_DENS NH 5002 0,0001 0 254 HEIGHT BALANCED TEST_DENS N 5002 0,000199920031987205 0 1 NONE Density for column N is exactly 1/5002 Density for column NH is: Density=Cardinality_for_Unpopular_Values/Number_of_Rows = 1/10000 = 0,0001 , where Cardinality_for_Unpopular_Values=1 because we have 5002(total distinct values)-2(popular values)=5000(unpopular distinct values) for 5000 unpopular rows yielding cardinality 5000(unpopular rows)/5000(unpopular distinct values)=1 --- Another test with Cardinality_for_Unpopular_Values=2: insert into test_dens select rownum, mod(rownum,1000), mod(rownum,1000) from all_objects where rownum<=5000; insert into test_dens select rownum, 10001, 10001 from all_objects where rownum<=2500; insert into test_dens select rownum, 10002, 10002 from all_objects where rownum<=2500; And we have: Density_for_NH_columns is 0.0005 = (5000/(1002-2))/10000, where 5000 - unpopular rows, 1002 - total distinct values, 2 - popular rows 10000 - total rows --- The problem is how Oracle cuts this noise. It may use several techniques but all of them are heuristic anyway. A guess. Fyrirvari/Disclaimer http://www.landsbanki.is/disclaimer -- http://www.freelists.org/webpage/oracle-lReceived on Wed Jan 24 2007 - 10:26:20 CST
![]() |
![]() |