Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: (long) 2 Oracle doubts
Hi!
"Paul Brewer" <paul_at_paul.brewers.org.uk> wrote in message
news:3f22dae0_2_at_mk-nntp-1.news.uk.worldonline.com...
[snip]
> And if CBO can't recognise that the table will fit into one read, and that
> the index is not a good one to use(whether or not the table is cached),
I'll
> eat my red hat.
>
> Regards,
> Paul
>
Although your mileage may vary, I didn't have anything better to do with my rainy Sunday here and put together a test which should make you eat your red hat ;)
C:\>sqlplus admin/admin
SQL*Plus: Release 9.2.0.1.0 - Production on P Jul 27 13:16:01 2003
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.1.0 - Production
SQL> create tablespace test datafile 'C:\ORACLE\ORA92\ORCL\TEST01.DBF' size 10m autoextend on segment space management manual;
Tablespace created.
I used old freelist based free space management to avoid any slick tricks Oracle might do with my data..
SQL> create table t (id, name) tablespace test as select obj#, name from sys.obj$ where rownum <= 50;
Table created.
SQL> desc t;
Name Null? Type ----------------------------- -------- -------------------- ID NOT NULL NUMBER NAME NOT NULL VARCHAR2(30)
Both columns are not null from original table’s definition.
SQL> create unique index i on t(id) tablespace test;
Index created.
Note that this is a unique index (which a primary key should normally use..)
SQL> analyze table t compute statistics;
Table analyzed.
(Compute statistics on table analyzes all it’s indexes by default as well)
SQL> select blocks, empty_blocks, num_rows, avg_space, avg_row_len from user_tables where table_name = 'T';
BLOCKS EMPTY_BLOCKS NUM_ROWS AVG_SPACE AVG_ROW_LEN
1 6 50 7237 14
SQL> select blocks from user_extents where segment_name = 'T';
BLOCKS
8
We see that a 8 block extent has been allocated for table T, 1 block of it is in use for data and one remaining block is for segment header.
SQL> select blevel, leaf_blocks, distinct_keys, clustering_factor from user_indexes where index_name = 'I';
BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
0 1 50 1
Index blevel is 0, that means all keys are accessible with only one IO to index root block.
SQL> alter tablespace test offline;
Tablespace altered.
SQL> alter tablespace test online;
Tablespace altered.
SQL> set autotrace on explain stat
SQL> alter session set events '10053 trace name context forever, level 1';
Session altered.
SQL> alter session set events '10046 trace name context forever, level 8';
Session altered.
SQL> select name from t where id = 25;
NAME
I_CDEF1 Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=11)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=1 Card=1 Bytes=
11)
2 1 INDEX (UNIQUE SCAN) OF 'I' (UNIQUE) You see, index is preferred over single block table! The reason here is that optmizer finds out that cost is 2 for table scan, but only 1 for index scan. (_table_scan_cost_plus_one parameter says that optimizer should take segment header block reads into account when computing costs. This parameter defaults to true in 9i, false in 8i). Segment header block reads are necessary for every full table scan (and index fast full scan) because the map of extents belonging to segment is there. The reads can be verified from 10046 trace (but data blocks shouldn’t be in buffer cache in order to see sequential and scattered read waits).
Note that index access cost for *unique* scans is calculated as blevel + 1. That is, for single block index, index traversing cost is 0 and for table block read cost 1 is added. In some cases cost doesn’t always represent number of IOs (and CPU usage), this special issue could be because Oracle knows that unique scans are generally very fast (since they never fetch more than one row) and normally a frequently used index’es root block is in cache anyway. This index cost is 0 as long as table is accesses by index rowid afterwards, if the column queried is actually contained in the index (no table read needed) then index scan cost is assigned a cost of 1, probably because you can’t have a statement with no cost at all.
Statistics
0 recursive calls
0 db block gets
2 consistent gets
2 physical reads
0 redo size
380 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Even though the cost was 1, we got 2 physical reads, one for index root/leaf block and one for actual data block.
Let’s play around with index stats a bit and tell optimizer that index has 3 levels, 1 root, 1 branch and a leaf level:
SQL> exec dbms_stats.set_index_stats(ownname=>'ADMIN', indname=>'I', indlevel=>2);
PL/SQL procedure successfully completed.
SQL> select name from t where id = 25;
NAME
I_CDEF1 Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=11)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1 Bytes=11)
Since optimizer thinks that for getting to this unique row it requires 4 IO’ s (3 levels of index and one data block) it figures out that full table scan with only 2 IO’s is cheaper.
Statistics
0 recursive calls
0 db block gets
4 consistent gets
1 physical reads
0 redo size
380 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
We have only one additional physical read, from trace I saw it was a sequential, single block read of table segment header block. There was no scattered reads probably because Oracle kernel cache layer saw that all table blocks up to segment high water mark are already in cache. If HMW would have been higher, multiblock reads up to it would have occurred.
Let’s make the test more interesting and set costs of both scans equal, by setting index level to 1 (root and leaf levels).
SQL> exec dbms_stats.set_index_stats(ownname=>'ADMIN', indname=>'I', indlevel=>1);
PL/SQL procedure successfully completed.
SQL> select name from t where id = 25;
NAME
I_COL3 Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=11)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
11)
2 1 INDEX (UNIQUE SCAN) OF 'I' (UNIQUE) (Cost=1 Card=50)
In this case, cost was 2 for both table and index scan but index scan was chosen. Seems that when costs are equal, access path which is better in Rule Based Optimizer sense, is chosen (index over table scan, rowid over index scan). That’s why actual execution path may differ from CBO calculations of BEST_CST in it’s trace file
Ok, let’s “increase” number of leaf blocks now:
SQL> exec dbms_stats.set_index_stats(ownname=>'ADMIN', indname=>'I', indlevel=>1, numlblks=>500, clstfct=>500 );
PL/SQL procedure successfully completed.
SQL> select name from t where id = 25;
NAME
PROXY_ROLE_DATA$ Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=10)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
10)
2 1 INDEX (UNIQUE SCAN) OF 'I' (UNIQUE) (Cost=1 Card=50)
Even though for CBO the index looks way bigger than table now, index access is still chosen, because optimizer knows that for unique index it only needs to traverse from root block to one single leaf block to get it’s key.
Let’s see what happens with non-unique indexes:
SQL> drop index i;
Index dropped.
SQL> create index i on t(id);
Index created.
SQL> analyze index i compute statistics;
Index analyzed.
SQL> select name from t where id = 25;
NAME
I_ICOL1 Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=10)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
10)
2 1 INDEX (RANGE SCAN) OF 'I' (NON-UNIQUE) (Cost=1 Card=1)
Index unique scan has changed to index range scan and index access cost has changed to 1, thus with data block fetch summing up to 2. (For unique scan index access cost was 0). Note, that this index has blevel 0, since it’s just built on our 50 rows.
Ok, let’s set index bigger for CBO, two levels (indlevel 1) and number of leaf blocks to 50. Also, clustering factor can’t normally be less than number of leaf blocks, thus has to be adjusted (I actually tried to make it less by using compressed tables and non-compressed indexes but for some reason my experiment didn’t succeed..)
SQL> exec dbms_stats.set_index_stats(ownname=>'ADMIN', indname=>'I', indlevel=>1, numlblks=>50, clstfct=>50);
PL/SQL procedure successfully completed.
SQL> select name from t where id = 25;
NAME
I_ICOL1 Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=10)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
10)
2 1 INDEX (RANGE SCAN) OF 'I' (NON-UNIQUE) (Cost=1 Card=1)
Even though I set index height to 2 levels using dbms_stats, CBO still finds that index scan costs 1 IO and corresponding data block read also 1 IO. This seems strange, because index traversing cost is considered the same for 1-level and 2-level indexes. Again, this is probably an Oracle trick for handling unique and non-unique scans on small indexes better. Here’s a result of my little experiment with index heights and CBO calculated index traversing costs:
Index height Cost Unique Cost Range (non-unique)
0 0 1
1 1 1 <-------
2 2 3
3 3 4
These are index scan costs only. From the table you see, that for To calculate the whole data fetching cost you have to add 1 for *each* data block read for corresponding key. Note the *each* data block read. This is the place where clustering factor comes into play. Lets set clustering factor a little higher now:
SQL> exec dbms_stats.set_index_stats(ownname=>'ADMIN', indname=>'I', indlevel=>1, numlblks=>50, clstfct=>51);
PL/SQL procedure successfully completed.
SQL> select name from t where id = 25;
NAME
PROXY_ROLE_DATA$ Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=10)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1 Bytes=10)
We see, that increasing clustering factor probably made index access cost higher than best cost 2 of full table scan. To see the detailed costs, I just put an index hint into the query:
SQL> select /*+ index(t) */ name from t where id = 25;
NAME
PROXY_ROLE_DATA$ Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=1 Bytes=10)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=3 Card=1 Bytes=
10)
2 1 INDEX (RANGE SCAN) OF 'I' (NON-UNIQUE) (Cost=1 Card=1)
The index traversing cost for range scan has remained 1, but table access cost has grown by 1. When number of distinct keys was 50 and also clustering factor was 50, CBO knew that for each key value no more than 1 data block had to be fetched to get data, thus cost of reading appropriate data from table is: ceil(1/NDV*CLUSTERING_FACTOR). NDV means number of distinct values and 1/NDV gives us DENSITY, in other words FILTER FACTOR. Thus the cost=ceil(1/50*50)=1. But as soon as clustering factor raised above 50, then cost of table access also raised to 2. When clustering factor reaches 101, then table access cost will be 3 etc..
What happens if we increase number of leaf blocks as well?
SQL> exec dbms_stats.set_index_stats(ownname=>'ADMIN', indname=>'I', indlevel=>1, numlblks=>51, clstfct=>51);
PL/SQL procedure successfully completed.
SQL> select /*+ index(t) */ name from t where id = 25;
NAME
PROXY_ROLE_DATA$ Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=10)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=4 Card=1 Bytes=
10)
2 1 INDEX (RANGE SCAN) OF 'I' (NON-UNIQUE) (Cost=2 Card=1)
We see that index range scan cost has risen from 1 to 2 as well! This has again to do with number of distinct keys in index. When we had 50 distinct keys and 50 leaf blocks, it was clear that *on average* there is only one leaf block per key value, thus querying by equality condition on a specific key no more than one index leaf block has to be read. But as long as number of leaf blocks (#LB) actually rose higher than number of distinct keys (#DK in 10053 trace), then ceil(1/#DK*#LB) adds 1 to index traversing and leaf block reading cost. The same rule applies, when #LB reaches 101 in our case, the cost will be added 1 etc..
Note that we are still talking about non-unique RANGE scans. But what happens when we add an unique constraint to ID column which has non-unique index?
SQL> alter table t add constraint c unique(id);
Table altered.
SQL> select name from t where id = 25;
NAME
PROXY_ROLE_DATA$ Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=10)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=2 Card=1 Bytes=
10)
2 1 INDEX (RANGE SCAN) OF 'I' (NON-UNIQUE) (Cost=1 Card=50)
Since optimizer knows that contents of ID column and index are definitely unique it doesn’t have to add cost for additional index leaf block scans because in unique index you only have to visit one leaf block to get to your ID (I actually tried playing around with avg row length and avg column length to make unique index leaf block scan more expensive, but with no luck ...)
Just one more experient with disabling constraints:
SQL> alter table t disable constraint c;
Table altered.
SQL> alter table t enable novalidate constraint c;
Table altered.
SQL> select name from t where id = 25;
NAME
PROXY_ROLE_DATA$ Execution Plan
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=10)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1 Bytes=10)
Oracle now counts full table scan cheaper, because with novalidated constraints it can’t be sure whether all rows in indexed column are unique or not.
SQL> show parameter optimizer
NAME TYPE VALUE ------------------------------------ ----------- --------------------------- optimizer_dynamic_sampling integer 1 optimizer_features_enable string 9.2.0 optimizer_index_caching integer 0 optimizer_index_cost_adj integer 100 optimizer_max_permutations integer 2000 optimizer_mode string CHOOSE
Here are my optmizer* parameters. Optimizer dynamic sampling didn’t have any effect in my test, because at level 1 it will only sample when tables don’t have statistics (and there’s a join condition if I recall correctly).
Conclusion:
Index access is chosen by CBO for small lookup tables.
Since I don’t count myself as an expert on CBO, any additions and corrections are welcome.
As I stated in beginning of my mail, your mileage may vary, but Paul, I’ll be satisfied, when you eat half of your red hat ;)
Cheers,
Tanel. Received on Sun Jul 27 2003 - 13:40:10 CDT