Density statistic calculation in case of histogram
Date: Mon, 6 Apr 2009 12:17:30 +0530
Message-ID: <a8fd4d730904052347h1d4a1cdcjb130283a6116006f_at_mail.gmail.com>
Hi,
Please tell me how density is calculated in case of histograms.
I've a table with very skewed data:
scott_at_ORADB10G> select col_skew, count(*) from tab_skew group by col_skew order by 1;
COL_SKEW COUNT(*)
---------- ----------
1 10 2 10 3 10 4 10 5 10 6 10 7 10 8 10 9 10 10 9910
10 rows selected.
scott_at_ORADB10G> exec dbms_stats.delete_table_stats(user,'TAB_SKEW');
PL/SQL procedure successfully completed.
- Creating width-based histograms, #buckets = #distinct values scott_at_ORADB10G> exec dbms_stats.gather_table_stats(user, 'TAB_SKEW', method_opt=>'FOR COLUMNS COL_SKEW size 10');
PL/SQL procedure successfully completed.
scott_at_ORADB10G> select * from dba_tab_col_statistics where table_name='TAB_SKEW' and column_name='COL_SKEW';
OWNER TABLE_NA COLUMN_NAME NUM_DISTINCT LOW_VA HIGH_V DENSITY NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN HISTOGRAM
------ -------- -------------------- ------------ ------ ------ -------------------- ----------- --------- ----------- --- --- -----------
SCOTT TAB_SKEW COL_SKEW 10 C102 C10BFREQUENCY Question: How density is calculated here? Also it's equal to Selectivity of non-popular values.
.00005 0 10 06-APR-09 10000 YES NO 3
scott_at_ORADB10G> exec dbms_stats.gather_table_stats(user, 'TAB_SKEW', method_opt=>'FOR COLUMNS COL_SKEW size 5');
PL/SQL procedure successfully completed.
- Creating Height-based histograms, #buckets < #distinct values scott_at_ORADB10G> select * from dba_tab_col_statistics where table_name='TAB_SKEW' and column_name='COL_SKEW';
OWNER TABLE_NA COLUMN_NAME NUM_DISTINCT LOW_VA HIGH_V DENSITY NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN HISTOGRAM
------ -------- -------------------- ------------ ------ ------ -------------------- ----------- --------- ----------- --- --- -----------
SCOTT TAB_SKEW COL_SKEW 10 C102 C10BHEIGHT BALANCED Question: How density is calculated here and how selectivity will be estimated for popolar as well as non-popular values?
.98209 0 5 06-APR-09 10000 YES NO 3
Any points/advice will be appreciated.
Neeraj.
-- http://www.freelists.org/webpage/oracle-lReceived on Mon Apr 06 2009 - 01:47:30 CDT