Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: num_distinct too small with auto sample size on test db but not prod...

Re: num_distinct too small with auto sample size on test db but not prod...

From: DA Morgan <damorgan_at_psoug.org>
Date: Fri, 05 Jan 2007 12:29:25 -0800
Message-ID: <1168028965.342477@bubbleator.drizzle.com>


bdurrettccci_at_yahoo.com wrote:
> I have two large databases, production and test. They are on 10.1.0.4.
> In them is a large partitioned table - it has about 1 billion rows.
> The data in the test and prod versions are not exactly the same, but
> are close. Stats have been gathered on both tables using
> AUTO_SAMPLE_SIZE and FOR ALL COLUMNS SIZE AUTO. On the date column
> which the table is partitioned on I'm getting a very different number
> of distinct values in test and production.
>
> In production the num_distinct is about right(about 1000). It doesn't
> use histograms. In test it is way off (17) and has a 17 bucket
> histogram.
>
> I've done a lot of checking into this and am really puzzled. I saw the
> bugs on Metalink which talk about num_distinct being way off with
> auto_sample_size. I did a trace and got the SQL which Oracle uses to
> get the num_distinct. It is something like this:
>
> select /*+ cursor_sharing_exact use_weak_name_resl dynamic_sampling(0)
> no_monitoring */
> count(distinct "MYCOLUMNNAME")
> from "MYSCHEMA"."TEST" sample (10) t;
>
> where "MYCOLUMNNAME" is the column name and 10 is the percentage.
>
> When I fill in the percentage with the same percentage that was used by
> dbms_stats I get the correct value for num_distinct. It is about 1000
> on both databases. I got sample_size from dba_tab_columns and divided
> it by the number of rows from dba_tables and multiplied by 100 to get
> the sampling percentage for each database. I plugged that percentage
> into the above query and ran it against the appropriate database.
>
> So, I can't understand why the num_distict came out as 17.
>
> Also, the tables are compressed and have recently had columns added to
> them.
>
> Thanks for any input on this issue.
>
> - Bobby

Import the correct statistics.

Jonathan Lewis can likely point you to the why ... and I will personally be watching to learn something myself.

-- 
Daniel A. Morgan
University of Washington
damorgan_at_x.washington.edu
(replace x with u to respond)
Puget Sound Oracle Users Group
www.psoug.org
Received on Fri Jan 05 2007 - 14:29:25 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US