Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> RE: Which one is good in terms of performance
If I read your table example right, then dbms_stats and block sampling =
gave worse results for 50% sampling than it did for 20%, .01%, .000001% =
sampling?
-----Original Message-----
From: oracle-l-bounce_at_freelists.org =
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Tim Gorman
Sent: vendredi, 24. d=E9cembre 2004 07:23
To: 'Oracle-L (E-mail)
Subject: Re: Which one is good in terms of performance
Seema and Wolfgang,
Here is some info I generated recently, to add to the discussion...
My apologies for the length of the post, but feel free to delete if not interested...
Accuracy when calculating table statistics with different sample sizes:
>=20
These are the cumulated averaged results from comparing statistics =
gathered
using both DBMS_STATS and ANALYZE TABLE against the baseline of =
performing a
COMPUTE using DBMS_STATS. For the first round, I concentrated on table
statistics, and ignored column statistics (a.k.a. histograms) by using =
=B3FOR
ALL COLUMNS SIZE 1=B2.
=20
I tested against four major tables in the Oracle E-Business Suite ERP
system, of varying sizes (from small to humongous).
For each of the table statistics gathered for the cost-based optimizer =
(i.e.
the columns NUM_ROWS, BLOCKS, and AVG_ROW_LEN in DBA_TABLES), displayed =
is
the percentage deviation from the baseline of a COMPUTE using =
DBMS_STATS:
=20
Nbr of tables % Deviation % Deviation % Deviation LABEL sampled Nbr of Blks =Nbr of
-------------------------------------------------- ------- ----------- ----------- ----------- DBMS_STATS: COMPUTE, FOR ALL COLUMNS SIZE 1 4 0.00 0.00 0.00 DBMS_STATS: 10.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.02 0.00 DBMS_STATS: 05.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.03 0.00 DBMS_STATS: 01.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.10 0.00 DBMS_STATS: 50.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.22 0.00 DBMS_STATS: 10.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 0.15 0.11 DBMS_STATS: 00.010000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.46 0.00 DBMS_STATS: 20.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.58 0.00 DBMS_STATS: 05.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 0.91 0.11 DBMS_STATS: 00.000001%, FOR ALL COLUMNS SIZE 1 4 0.00 1.12 0.00 DBMS_STATS: 20.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 1.35 0.00 DBMS_STATS: 00.010000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 1.47 0.00 DBMS_STATS: 00.000001% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 1.58 0.00 DBMS_STATS: 50.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 2.03 0.00 DBMS_STATS: 01.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 4.61 0.00 ANALYZE: 05.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.42 7.56 ANALYZE: COMPUTE, FOR ALL COLUMNS SIZE 1 4 0.00 0.00 8.34 ANALYZE: 50.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.00 8.34 ANALYZE: 20.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.62 7.73 ANALYZE: 10.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 1.44 7.15 ANALYZE: 01.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 7.24 8.62 ANALYZE: 00.010000%, FOR ALL COLUMNS SIZE 1 4 0.00 7.24 8.62 ANALYZE: 00.000001%, FOR ALL COLUMNS SIZE 1 4 0.00 7.24 8.62
In contrast, where even the very worst results obtained by the smallest sample of table statistics is just 7.24% off from performing a COMPUTE operation, the results from using smaller samples when calculating index statistics get much worse very quickly as the sampling size decreases.
Here is the results of testing against the nine indexes of the
FND_CONCURRENT_REQUESTS table, which vary widely in terms of density, =
data
characteristics, etc. The results are being sorted by the summation of =
the
percentage columns, so that the "most accurate" shows first and the =
"least
accurate" shows last...
Nbr of % =Dev'n %
indexes % Dev'n % Dev'n % Dev'n Avg = Leaf Avg Data % Dev'n % Dev'n LABEL sampled BLevel Leaf Blks Dist Keys =Blks/Key
------------------------------ ------- ------- --------- --------- =
First of all, there is the huge deviation the calculations of =
LEAF_BLOCKS by
DBMS_STATS and by ANALYZE. Each command (DBMS_STATS or ANALYZE) is
consistent, so one must be wrong and the other right. Unfortunately, in
this case, it is the DBMS_STATS package that is uniformly wrong on this,
because the results produced by the ANALYZE INDEX commands are =
corroborated
by similar results from the ANALYZE INDEX ... VALIDATE STRUCTURE =
command,
the DBMS_SPACE package, and the values stored in the DBA_SEGMENTS view. =
I
searched MetaLink for a bug report related to DBMS_STATS and =
LEAF_BLOCKS,
but could not find one. I intend to retest this against 9.2.0.5 and
10.1.0.3, when I get a chance. I=B9ll also test it against 9.2.0.1...
A few more things to note about the results shown here:
That last point is the most important. Nonetheless, since the ANALYZE =
INDEX
... COMPUTE command is clearly the most accurate, here is the same =
result
data, this time using the results from ANALYZE INDEX .. COMPUTE =
STATISTICS
as the baseline instead of DBMS_STATS:
>=20
Nbr of % =Dev'n %
indexes % Dev'n % Dev'n % Dev'n Avg = Leaf Avg Data % Dev'n % Dev'n LABEL sampled BLevel Leaf Blks Dist Keys =Blks/Key
------------------------------ ------- ------- --------- --------- =
Again, this perspective shows the two types of COMPUTE largely agreeing =
with
each other (except for the LEAF_BLOCKS and the AVG_LEAF_BLOCKS_PER_KEY =
value
derived from it), with all other results showing dramatically less =
accurate
due to decreases in the sample size.
For indexes, clearly computing yields the best results, while a bug in
DBMS_STATS appears to be skewing the LEAF_BLOCK results in v9.2.0.4. =
making
the ANALYZE INDEX command the best alternative for that version.
So, from this data, which is admittedly sparse and limited (needs more
testing on a wider range of versions and applications), I think that one =
can
conclude:
>From this, I don=B9t think that one can conclude that it is better to =
ANALYZE
INDEX ... COMPUTE instead of
DBMS_STATS.GATHER_INDEX_STATS(ESTIMATE_PERCENT=3D>NULL), because I =
don=B9t know
if the anomaly found exists outside of 9.2.0.4...
Hope this helps...
-- http://www.freelists.org/webpage/oracle-lReceived on Mon Dec 27 2004 - 12:47:44 CST
![]() |
![]() |