Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> Re: dba_tables.num_rows is less than dba_indexes.num_rows
Leng Kaing wrote:
> Wolfgang,
>
> This is all news to me...
>
> 1) I always thought that NUM_ROWS is important. I know that a few weeks
> ago when I noticed the differences in the num_rows in tables vs indexes,
> we manually increased the table's num_rows and the query would revert to
> using an index rather than a full table scan. I have yet to do a test to
> see if what would happen if I set NUM_ROWS to a small number, eg. 5.
num_rows is important, but the CBO only uses the value from *_tables, not the one from *_indexes. As Christian already said, if someone has information to the contrary - with a repeatable testcase - let me know.
>
> 2) I also did "..., estimate_percent=>dbms_stats.auto_sample_size,
> method_opt=>'FOR
> ALL INDEXED COLUMNS', degree=>2, cascade=>true"
>
> Are you saying that FOR ALL INDEX COLUMNS or FOR COLUMNS only gathers
> histograms for the tables, but not the stats? So do I have to do it in 2
> steps - gather stats, then gather histograms?
>
> 3) I've never computed on any of the systems that I've worked on. And it
> certainly cannot be done on the current database as it would take too
> long. I always thought an estimate would be good enough. Now I'm hearing
> that only a compute will do. ARGH!! We cannot afford to do this.
Estimate is usually good enough for tables. In fact amazingly small
estimate percentages (.01 or smaller) produce very good estimates for
num_rows and avg_row_len.
Estimate is often not good enough for index statistics and IMO never
good enough for column histograms.
Maybe one of the reasons you do not have enough time for a
gather_compute is that you gather too many histograms and maybe miss a
few worth having. A gather_table_stats(.. method_opt=>'for all [indexed]
columns size <anything other than 1>') will do a full scan of the table
for every column for which a histogram is collected. For
estimate_percent=n a temp table with a n% sample of the original table
is used instead of the full table.
This testcase shows that even for a very moderatley sized table. I ran
every gather twice to alleviate any caching issues:
SQL> select count(*) from pspnlfield;
COUNT(*)
282100
SQL> exec dbms_stats.delete_table_stats(user,'PSPNLFIELD'); Elapsed: 00:00:00.04
SQL> exec
dbms_stats.gather_table_stats(user,'PSPNLFIELD',estimate_percent=>null,method_opt=>'for
all columns size 254');
Elapsed: 00:05:33.01
SQL> exec dbms_stats.delete_table_stats(user,'PSPNLFIELD'); Elapsed: 00:00:01.02
SQL> exec
dbms_stats.gather_table_stats(user,'PSPNLFIELD',estimate_percent=>null,method_opt=>'for
all columns size 1');
Elapsed: 00:00:56.01
SQL> exec dbms_stats.delete_table_stats(user,'PSPNLFIELD'); Elapsed: 00:00:00.03
SQL> exec
dbms_stats.gather_table_stats(user,'PSPNLFIELD',estimate_percent=>null,method_opt=>'for
all columns size 254');
Elapsed: 00:06:04.06
SQL> exec dbms_stats.delete_table_stats(user,'PSPNLFIELD'); Elapsed: 00:00:01.02
SQL> exec
dbms_stats.gather_table_stats(user,'PSPNLFIELD',estimate_percent=>null,method_opt=>'for
all columns size 1');
Elapsed: 00:01:01.01
SQL> exec dbms_stats.delete_table_stats(user,'PSPNLFIELD'); Elapsed: 00:00:00.01
SQL> exec
dbms_stats.gather_table_stats(user,'PSPNLFIELD',estimate_percent=>10,method_opt=>'for
all columns size 254');
Elapsed: 00:00:32.00
SQL> exec dbms_stats.delete_table_stats(user,'PSPNLFIELD'); Elapsed: 00:00:01.00
SQL> exec
dbms_stats.gather_table_stats(user,'PSPNLFIELD',estimate_percent=>10,method_opt=>'for
all columns size 1');
Elapsed: 00:00:06.08
SQL> exec dbms_stats.delete_table_stats(user,'PSPNLFIELD'); Elapsed: 00:00:00.00
SQL> exec
dbms_stats.gather_table_stats(user,'PSPNLFIELD',estimate_percent=>10,method_opt=>'for
all columns size 254');
Elapsed: 00:00:31.07
SQL> exec dbms_stats.delete_table_stats(user,'PSPNLFIELD'); Elapsed: 00:00:01.01
SQL> exec
dbms_stats.gather_table_stats(user,'PSPNLFIELD',estimate_percent=>10,method_opt=>'for
all columns size 1');
Elapsed: 00:00:06.07
A clarification to the statement that you probably collect too many histograms and at the same time may be missing some: You have method_opt=> 'for all indexed columns [size 75]' a) not all indexed columns really need histograms, only crucial ones with significant skew - that's the too many piece. b) histograms on crucial columns with significant skew, even if not indexed, can lead the optimizer to a better performing plan - that's the too few piece.
If anyone thinks that too many histograms can't hurt (other than the time wasted collecting them), I had an incident where a client changed their statistics gathering from 'all columns size 1' (i.e. no histograms) to 'all indexed columns size skewonly' and the performance of a frequently used sql went from 0.01 seconds to ~ 90 seconds.
-- Regards Wolfgang Breitling Centrex Consulting Corporation www.centrexcc.com -- http://www.freelists.org/webpage/oracle-lReceived on Thu Aug 11 2005 - 10:21:32 CDT
![]() |
![]() |