Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> Re: Which one is good in terms of performance
Spooky, isn't it? Block sampling really means rolling the dice, I think...
If you'd like the script I used to generate these test results, feel free t=
o
email me offline. They're nothing special, but it might allow others a
jumpstart to test on different data conditions...
on 12/27/04 11:48 AM, Jacques Kilchoer at Jacques.Kilchoer_at_quest.com wrote:
> If I read your table example right, then dbms_stats and block sampling ga=
ve
> worse results for 50% sampling than it did for 20%, .01%, .000001% sampli=
ng?
>=20
>=20
>=20
>=20 >=20
>=20
>=20
> Nbr of
> tables % Deviation %
> Deviation % Deviation
> LABEL sampled Nbr of Blks Nbr=
of
>=20
>>> exposed by performing SQL trace >>>> * Method is: select avg(vsize(col-1)) + avg(vsize(col-2)) + ... + >>>> avg(vsize(col-N)) from table-name ... >> * The method of calculating AVG_ROW_LEN by the ANALYZE command cannot be >> determined using SQL trace
>=20
>=20
>=20
--Received on Mon Dec 27 2004 - 21:48:57 CST
> -------- ------- -------------
> DBMS_STATS: INDEX COMPUTE 9 0.00 0.00 0.00 0.0=
0
> 0.00 0.00 0.00
> DBMS_STATS: INDEX 05.000000% 9 0.00 0.00 22.13 14.8=
5
> 16.40 0.44 0.00
> DBMS_STATS: INDEX 20.000000% 9 0.00 0.00 25.39 24.2=
9
> 24.22 0.44 0.00
> DBMS_STATS: INDEX 10.000000% 9 0.00 0.00 25.56 16.6=
1
> 33.89 0.44 0.00
> DBMS_STATS: INDEX 00.010000% 9 0.00 0.00 27.45 20.1=
3
> 37.91 0.44 0.00
> DBMS_STATS: INDEX 50.000000% 9 0.00 0.00 26.42 21.5=
3
> 38.65 0.44 0.00
> DBMS_STATS: INDEX 01.000000% 9 0.00 0.00 31.17 27.9=
0
> 28.18 0.44 0.00
> DBMS_STATS: INDEX 00.000001% 9 0.00 0.00 27.81 22.4=
6
> 40.37 0.44 0.00
> ANALYZE: INDEX COMPUTE 9 0.00 595.30 0.00 239.8=
8
> 0.00 0.00 0.00
> ANALYZE: INDEX 50.000000% 9 0.00 596.27 23.55 329.3=
1
> 13.85 17.29 17.60
> ANALYZE: INDEX 01.000000% 9 8.33 530.13 32.28 353.8=
6
> 23.04 32.87 34.08
> ANALYZE: INDEX 00.010000% 9 8.33 530.13 32.28 353.8=
6
> 23.04 32.87 34.08
> ANALYZE: INDEX 00.000001% 9 8.33 530.13 32.28 353.8=
6
> 23.04 32.87 34.08
> ANALYZE: INDEX 20.000000% 9 0.00 596.27 32.97 551.4=
3
> 58.40 19.22 19.51
> ANALYZE: INDEX 05.000000% 9 0.00 596.27 40.77 854.4=
6
> 59.40 24.67 23.23
> ANALYZE: INDEX 10.000000% 9 0.00 596.27 40.86 792.7=
2
> 141.73 29.25 29.07
>=20
> First of all, there is the huge deviation the calculations of LEAF_BLOCKS=
by
> DBMS_STATS and by ANALYZE. Each command (DBMS_STATS or ANALYZE) is
> consistent, so one must be wrong and the other right. Unfortunately, in
> this case, it is the DBMS_STATS package that is uniformly wrong on this,
> because the results produced by the ANALYZE INDEX commands are corroborat=
ed
> by similar results from the ANALYZE INDEX ... VALIDATE STRUCTURE command,
> the DBMS_SPACE package, and the values stored in the DBA_SEGMENTS view. =
I
> searched MetaLink for a bug report related to DBMS_STATS and LEAF_BLOCKS,
> but could not find one. I intend to retest this against 9.2.0.5 and
> 10.1.0.3, when I get a chance. I=B9ll also test it against 9.2.0.1...
>=20
> A few more things to note about the results shown here:
>=20
> 1. DBMS_STATS.GATHER_INDEX_STATS and ANALYZE INDEX produce the exact same
> results when COMPUTEing
>> * except the LEAF_BLOCKS anomaly >>> * which also apparently causes an anomaly with AVG_LEAF_BLOCKS_PER_KEY
> 2. the ANALYZE INDEX ... COMPUTE results are corroborated by other
> utilities, so the ANALYZE INDEX command is more accurate in v9.2.0.4 than
> the DBMS_STATS.GATHER_INDEX_STATS command
> 3. The results from both DBMS_STATS and ANALYZE are wildly worse when
> performing an estimated sample, instead of compute
>=20
> That last point is the most important. Nonetheless, since the ANALYZE IN=
DEX
> ... COMPUTE command is clearly the most accurate, here is the same result
> data, this time using the results from ANALYZE INDEX .. COMPUTE STATISTIC=
S
> as the baseline instead of DBMS_STATS:
>>=20
> Nbr of % Dev'n =
%
> Dev'n
> indexes % Dev'n % Dev'n % Dev'n Avg Leaf
> Avg Data % Dev'n % Dev'n
> LABEL sampled BLevel Leaf Blks Dist Keys Blks/Ke=
y
> Blks/Key Clu Fac # Idx Entries
> ------------------------------ ------- ------- --------- --------- ------=
--
> -------- ------- -------------
> ANALYZE: INDEX COMPUTE 9 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00
> ANALYZE: INDEX 50.000000% 9 0.00 0.13 23.55 15.18
> 13.85 17.29 17.60
> DBMS_STATS: INDEX COMPUTE 9 0.00 85.04 0.00 41.30
> 0.00 0.00 0.00
> DBMS_STATS: INDEX 05.000000% 9 0.00 85.04 22.13 38.50
> 16.40 0.44 0.00
> DBMS_STATS: INDEX 20.000000% 9 0.00 85.04 25.39 36.70
> 24.22 0.44 0.00
> ANALYZE: INDEX 01.000000% 9 8.33 8.42 32.28 36.11
> 23.04 32.87 34.08
> ANALYZE: INDEX 00.010000% 9 8.33 8.42 32.28 36.11
> 23.04 32.87 34.08
> ANALYZE: INDEX 00.000001% 9 8.33 8.42 32.28 36.11
> 23.04 32.87 34.08
> DBMS_STATS: INDEX 01.000000% 9 0.00 85.04 31.17 36.24
> 28.18 0.44 0.00
> DBMS_STATS: INDEX 10.000000% 9 0.00 85.04 25.56 38.07
> 33.89 0.44 0.00
> ANALYZE: INDEX 20.000000% 9 0.00 0.13 32.97 53.44
> 58.40 19.22 19.51
> DBMS_STATS: INDEX 50.000000% 9 0.00 85.04 26.42 37.22
> 38.65 0.44 0.00
> DBMS_STATS: INDEX 00.010000% 9 0.00 85.04 27.45 37.53
> 37.91 0.44 0.00
> DBMS_STATS: INDEX 00.000001% 9 0.00 85.04 27.81 37.05
> 40.37 0.44 0.00
> ANALYZE: INDEX 05.000000% 9 0.00 0.13 40.77 107.16
> 59.40 24.67 23.23
> ANALYZE: INDEX 10.000000% 9 0.00 0.13 40.86 95.99
> 141.73 29.25 29.07
>=20
> Again, this perspective shows the two types of COMPUTE largely agreeing w=
ith
> each other (except for the LEAF_BLOCKS and the AVG_LEAF_BLOCKS_PER_KEY va=
lue
> derived from it), with all other results showing dramatically less accura=
te
> due to decreases in the sample size.
>=20
> For indexes, clearly computing yields the best results, while a bug in
> DBMS_STATS appears to be skewing the LEAF_BLOCK results in v9.2.0.4. maki=
ng
> the ANALYZE INDEX command the best alternative for that version.
>=20
> So, from this data, which is admittedly sparse and limited (needs more
> testing on a wider range of versions and applications), I think that one =
can
> conclude:
>=20
> * Perform the gathering of table/column and index statistics separately f=
rom
> one another
>> * Gather statistics on tables and columns in one operation using >> DBMS_STATS.GATHER_TABLE_STATS >>> * leave the CASCADE parameter at the default of FALSE >>> * use a low ESTIMATE_PERCENT and even use BLOCK_SAMPLE=3D>TRUE, if necess= ary >> * Gather statistics on indexes in another operation to either ANALYZE IN= DEX >> or >> DBMS_STATS.GATHER_INDEX_STATS >>> * only do COMPUTE
> * There is a bug in DBMS_STATS.GATHER_INDEX_STATS in 9.2.0.4 regarding th=
e
> calculation of LEAF_BLOCKS that needs to be explored further...
>=20 >> From this, I don=B9t think that one can conclude that it is better to ANAL= YZE
> INDEX ... COMPUTE instead of
> DBMS_STATS.GATHER_INDEX_STATS(ESTIMATE_PERCENT=3D>NULL), because I don=B9t kn=
ow
> if the anomaly found exists outside of 9.2.0.4...
>=20
> Hope this helps...
>=20 >=20 -- http://www.freelists.org/webpage/oracle-l
![]() |
![]() |