Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> Re: Which one is good in terms of performance
Seema and Wolfgang,
Here is some info I generated recently, to add to the discussion...
My apologies for the length of the post, but feel free to delete if not interested...
Accuracy when calculating table statistics with different sample sizes:
>
These are the cumulated averaged results from comparing statistics gathered
using both DBMS_STATS and ANALYZE TABLE against the baseline of performing a
COMPUTE using DBMS_STATS. For the first round, I concentrated on table
statistics, and ignored column statistics (a.k.a. histograms) by using ³FOR
ALL COLUMNS SIZE 1².
I tested against four major tables in the Oracle E-Business Suite ERP
system, of varying sizes (from small to humongous).
For each of the table statistics gathered for the cost-based optimizer (i.e. the columns NUM_ROWS, BLOCKS, and AVG_ROW_LEN in DBA_TABLES), displayed is the percentage deviation from the baseline of a COMPUTE using DBMS_STATS:
Nbr of tables % Deviation % Deviation % Deviation LABEL sampled Nbr of Blks Nbr ofRows Avg Row Len
-------------------------------------------------- ------- ----------- ----------- ----------- DBMS_STATS: COMPUTE, FOR ALL COLUMNS SIZE 1 4 0.00 0.00 0.00 DBMS_STATS: 10.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.02 0.00 DBMS_STATS: 05.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.03 0.00 DBMS_STATS: 01.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.10 0.00 DBMS_STATS: 50.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.22 0.00 DBMS_STATS: 10.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 0.15 0.11 DBMS_STATS: 00.010000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.46 0.00 DBMS_STATS: 20.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.58 0.00 DBMS_STATS: 05.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 0.91 0.11 DBMS_STATS: 00.000001%, FOR ALL COLUMNS SIZE 1 4 0.00 1.12 0.00 DBMS_STATS: 20.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 1.35 0.00 DBMS_STATS: 00.010000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 1.47 0.00 DBMS_STATS: 00.000001% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 1.58 0.00 DBMS_STATS: 50.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 2.03 0.00 DBMS_STATS: 01.000000% BLKSAMP, FOR ALL COLUMNS SI 4 0.00 4.61 0.00 ANALYZE: 05.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.42 7.56 ANALYZE: COMPUTE, FOR ALL COLUMNS SIZE 1 4 0.00 0.00 8.34 ANALYZE: 50.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.00 8.34 ANALYZE: 20.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 0.62 7.73 ANALYZE: 10.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 1.44 7.15 ANALYZE: 01.000000%, FOR ALL COLUMNS SIZE 1 4 0.00 7.24 8.62 ANALYZE: 00.010000%, FOR ALL COLUMNS SIZE 1 4 0.00 7.24 8.62 ANALYZE: 00.000001%, FOR ALL COLUMNS SIZE 1 4 0.00 7.24 8.62
The sorting in this report is by the summarization of the three percentage columns, from "most accurate" results to "least accurate" results.
A couple things to note about the results shown here:
Accuracy when calculating index statistics with different sample sizes:
In contrast, where even the very worst results obtained by the smallest sample of table statistics is just 7.24% off from performing a COMPUTE operation, the results from using smaller samples when calculating index statistics get much worse very quickly as the sampling size decreases.
Here is the results of testing against the nine indexes of the FND_CONCURRENT_REQUESTS table, which vary widely in terms of density, data characteristics, etc. The results are being sorted by the summation of the percentage columns, so that the "most accurate" shows first and the "least accurate" shows last...
Nbr of % Dev'n % Dev'n indexes % Dev'n % Dev'n % Dev'n Avg Leaf Avg Data % Dev'n % Dev'n LABEL sampled BLevel Leaf Blks Dist Keys Blks/KeyBlks/Key Clu Fac # Idx Entries
------------------------------ ------- ------- --------- --------- -------- -------- ------- ------------- DBMS_STATS: INDEX COMPUTE 9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 DBMS_STATS: INDEX 05.000000% 9 0.00 0.00 22.13 14.85 16.40 0.44 0.00 DBMS_STATS: INDEX 20.000000% 9 0.00 0.00 25.39 24.29 24.22 0.44 0.00 DBMS_STATS: INDEX 10.000000% 9 0.00 0.00 25.56 16.61 33.89 0.44 0.00 DBMS_STATS: INDEX 00.010000% 9 0.00 0.00 27.45 20.13 37.91 0.44 0.00 DBMS_STATS: INDEX 50.000000% 9 0.00 0.00 26.42 21.53 38.65 0.44 0.00 DBMS_STATS: INDEX 01.000000% 9 0.00 0.00 31.17 27.90 28.18 0.44 0.00 DBMS_STATS: INDEX 00.000001% 9 0.00 0.00 27.81 22.46 40.37 0.44 0.00 ANALYZE: INDEX COMPUTE 9 0.00 595.30 0.00 239.88 0.00 0.00 0.00 ANALYZE: INDEX 50.000000% 9 0.00 596.27 23.55 329.31 13.85 17.29 17.60 ANALYZE: INDEX 01.000000% 9 8.33 530.13 32.28 353.86 23.04 32.87 34.08 ANALYZE: INDEX 00.010000% 9 8.33 530.13 32.28 353.86 23.04 32.87 34.08 ANALYZE: INDEX 00.000001% 9 8.33 530.13 32.28 353.86 23.04 32.87 34.08 ANALYZE: INDEX 20.000000% 9 0.00 596.27 32.97 551.43 58.40 19.22 19.51 ANALYZE: INDEX 05.000000% 9 0.00 596.27 40.77 854.46 59.40 24.67 23.23 ANALYZE: INDEX 10.000000% 9 0.00 596.27 40.86 792.72 141.73 29.25 29.07
First of all, there is the huge deviation the calculations of LEAF_BLOCKS by DBMS_STATS and by ANALYZE. Each command (DBMS_STATS or ANALYZE) is consistent, so one must be wrong and the other right. Unfortunately, in this case, it is the DBMS_STATS package that is uniformly wrong on this, because the results produced by the ANALYZE INDEX commands are corroborated by similar results from the ANALYZE INDEX ... VALIDATE STRUCTURE command, the DBMS_SPACE package, and the values stored in the DBA_SEGMENTS view. I searched MetaLink for a bug report related to DBMS_STATS and LEAF_BLOCKS, but could not find one. I intend to retest this against 9.2.0.5 and 10.1.0.3, when I get a chance. Iıll also test it against 9.2.0.1...
A few more things to note about the results shown here:
That last point is the most important. Nonetheless, since the ANALYZE INDEX
... COMPUTE command is clearly the most accurate, here is the same result
data, this time using the results from ANALYZE INDEX .. COMPUTE STATISTICS
as the baseline instead of DBMS_STATS:
>
Nbr of % Dev'n % Dev'n indexes % Dev'n % Dev'n % Dev'n Avg Leaf Avg Data % Dev'n % Dev'n LABEL sampled BLevel Leaf Blks Dist Keys Blks/KeyBlks/Key Clu Fac # Idx Entries
------------------------------ ------- ------- --------- --------- -------- -------- ------- ------------- ANALYZE: INDEX COMPUTE 9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ANALYZE: INDEX 50.000000% 9 0.00 0.13 23.55 15.18 13.85 17.29 17.60 DBMS_STATS: INDEX COMPUTE 9 0.00 85.04 0.00 41.30 0.00 0.00 0.00 DBMS_STATS: INDEX 05.000000% 9 0.00 85.04 22.13 38.50 16.40 0.44 0.00 DBMS_STATS: INDEX 20.000000% 9 0.00 85.04 25.39 36.70 24.22 0.44 0.00 ANALYZE: INDEX 01.000000% 9 8.33 8.42 32.28 36.11 23.04 32.87 34.08 ANALYZE: INDEX 00.010000% 9 8.33 8.42 32.28 36.11 23.04 32.87 34.08 ANALYZE: INDEX 00.000001% 9 8.33 8.42 32.28 36.11 23.04 32.87 34.08 DBMS_STATS: INDEX 01.000000% 9 0.00 85.04 31.17 36.24 28.18 0.44 0.00 DBMS_STATS: INDEX 10.000000% 9 0.00 85.04 25.56 38.07 33.89 0.44 0.00 ANALYZE: INDEX 20.000000% 9 0.00 0.13 32.97 53.44 58.40 19.22 19.51 DBMS_STATS: INDEX 50.000000% 9 0.00 85.04 26.42 37.22 38.65 0.44 0.00 DBMS_STATS: INDEX 00.010000% 9 0.00 85.04 27.45 37.53 37.91 0.44 0.00 DBMS_STATS: INDEX 00.000001% 9 0.00 85.04 27.81 37.05 40.37 0.44 0.00 ANALYZE: INDEX 05.000000% 9 0.00 0.13 40.77 107.16 59.40 24.67 23.23 ANALYZE: INDEX 10.000000% 9 0.00 0.13 40.86 95.99 141.73 29.25 29.07
Again, this perspective shows the two types of COMPUTE largely agreeing with each other (except for the LEAF_BLOCKS and the AVG_LEAF_BLOCKS_PER_KEY value derived from it), with all other results showing dramatically less accurate due to decreases in the sample size.
For indexes, clearly computing yields the best results, while a bug in DBMS_STATS appears to be skewing the LEAF_BLOCK results in v9.2.0.4. making the ANALYZE INDEX command the best alternative for that version.
So, from this data, which is admittedly sparse and limited (needs more testing on a wider range of versions and applications), I think that one can conclude:
>From this, I donıt think that one can conclude that it is better to ANALYZE
INDEX ... COMPUTE instead of
DBMS_STATS.GATHER_INDEX_STATS(ESTIMATE_PERCENT=>NULL), because I donıt know
if the anomaly found exists outside of 9.2.0.4...
Hope this helps...
-Tim
on 12/23/04 1:26 PM, Wolfgang Breitling at breitliw_at_centrexcc.com wrote:
> first off, whatever you do, DO NOT use 2) dbms_utility
>
> What I do is:
> a) DO NOT rely on Oracle's "staleness" algorithm but decide myself which
> tables need to be analyzed and at what frequency
> b) DO NOT use "for all indexed columns" but decide myself which columns
> require a histogram and with how many buckets. Those columns might very
> well include non-indexed ones and will certainly not include all indexed
> columns. I have an example where the creation of histograms on indexed
> columns led to a batch job taking an estimated 18+ hours (if we had had
> the patience to let it finish instead of killing it after 6 hours)
> instead of the ~90 seconds without the histograms
> c) use the dbms_stats procedures
> d) use estimate_percent=>dbms_stats.auto_sample-size, cascade=>false,
> method_opt=>'for all columns size 1' (i.e. the defaults for the latter
> two parameters) for tables
> e) follow with gather_index_stats with estimate_percent=>null (i.e. full
> compute) for indexes
> f) follow with gather_table_stats with estimate_percent=> (i.e. full
> compute) for individual column histograms.
-- http://www.freelists.org/webpage/oracle-lReceived on Fri Dec 24 2004 - 09:18:18 CST
![]() |
![]() |