FW: Bitmap index not used when joining tables
Date: Sat, 31 May 2014 15:24:17 -0400
Message-ID: <047401cf7d05$eae246c0$c0a6d440$_at_rsiz.com>
hotel email bounce; trying again
From: Mark W. Farnham [mailto:mwf_at_rsiz.com]
Sent: Saturday, May 31, 2014 3:14 PM
To: 'jonathan_at_jlcomp.demon.co.uk'; 'oracle-l_at_freelists.org'
Subject: RE: Bitmap index not used when joining tables
I do wonder about what your results might be for these two queries if you modified your query slightly as:
select
--+ gather_plan_statistics
s.* from sales_fact s
where s.rowid in (
select
--+ no_merge
s2.rowid from sales_fact s2
where s2.date_id in (
select
--+ no_merge
d.date_id from date_d d
where calendar_date between date'2013-12-31' and date'2014-01-01'
)
);
The idea being that if the only column it needs in the unmerged subquery
(rowid) can be sourced from the index, this should be less total work.
Possibly this is also optimized if the subquery is ordered by rowid, but
that's not the curiosity at this point. Of course even if this works you
would only want to form it up this way when you knew it was to your benefit
(knowledge the CBO can only guess by way of the thumbrule quoted by JL, and
some other thumbrule for a literal range.)
I just typed this in and didn't test it, and working is release dependent, so you and I got lucky if this works via cut and paste. Once debugged I would sort of expect this to use the index in both cases to fetch back the rowid list.
mwf
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org]
On Behalf Of Jonathan Lewis
Sent: Saturday, May 31, 2014 1:11 PM
To: oracle-l_at_freelists.org
Subject: RE: Bitmap index not used when joining tables
It's not surprising for the example you've given.
The basic principle is that the cost of a single table access via a bitmap index is likely to be different from the cost of access via the equivalent btree index because the bitmap index has no information about data clustering (i.e no clustering_factor) so it uses a guess. Sometimes this means the cost of the bitmap will be higher, sometimes lower, sometimes the same as for a btree. In your case the btree clustering factor will be very low (because of the order by in the CTAS) while the guess basically assumes that 20% of the data will be very widely scattered - in your case that probably means 400 blocks (20% of 2000 rows => 400 blocks).
Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle
From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on
behalf of Chris Saxon [chris.saxon_at_gmail.com]
Sent: 31 May 2014 16:21
To: oracle-l_at_freelists.org
Subject: Bitmap index not used when joining tables
Hi,
I've been testing using bitmap indexes on 11.2.0.2 EE. When joining two tables on a column with a BTree index, the index is used in the execution plan. If this index is changed to a bitmap index, Oracle no longer uses the index when executing the query! It assigns a higher cost to using the bitmap index when joining, despite this being a cheaper approach (in terms of consistent gets).
To see this, I created a date dimension table with 515 days (rows) and a sales fact table with 1,000 rows for each day:
create table date_d (
date_id integer not null primary key,
calendar_date date not null unique
);
create table sales_fact (
date_id integer not null
references date_d (date_id),
quantity number not null,
total_value number not null
);
insert into date_d
select rownum, date'2013-01-01'-1+rownum
from dual
connect by level <= sysdate - date'2013-01-01';
insert into sales_fact
with rws as (select * from dual connect by level <= 1000)
select d.date_id, round(dbms_random.value(1, 20)), round(dbms_random.value(10, 100), 2)
from date_d d
cross join rws
order by d.date_id;
begin
dbms_stats.gather_table_stats(user, 'sales_fact');
dbms_stats.gather_table_stats(user, 'date_d');
end;
/
If I create a BTree index on SALES_FACT.DATE_ID, then join from the date dim to the fact table, restricting to two days, Oracle uses the index on the fact table as I would expect (as we're fetching 2,000 of 515,000 rows):
create index safa_date_id on sales_fact (date_id);
set autotrace trace
select s.* from sales_fact s join date_d d
on d.date_id = s.date_id
where calendar_date between date'2013-12-31' and date'2014-01-01';
set autotrace off
Execution Plan
Plan hash value: 2189554905
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)| Time |
---------------------------------------------------------------------------- ------------------ | 0 | SELECT STATEMENT | | 3002 | 69046 | 21
(0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | | | 2 | NESTED LOOPS | | 3002 | 69046 | 21
(0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | 3
(0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | 2
(0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | SAFA_DATE_ID | 1000 | | 3
(0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 1000 | 11000 | 6
(0)| 00:00:01 |
Predicate Information (identified by operation id):
4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))
5 - access("D"."DATE_ID"="S"."DATE_ID")
Statistics
1 recursive calls
0 db block gets
290 consistent gets
6 physical reads
0 redo size
36195 bytes sent via SQL*Net to client
1839 bytes received via SQL*Net from client
135 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2000 rows processed
However, if I drop the normal index and re-create it as a bitmap index the query above changes to a FTS on the SALES_FACT table:
set autotrace off
drop index safa_date_id;
create bitmap index safa_date_id on sales_fact (date_id);
set autotrace trace
select s.* from sales_fact s join date_d d
on d.date_id = s.date_id
where calendar_date between date'2013-12-31' and date'2014-01-01';
Execution Plan
Plan hash value: 525754326
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)| Time |
---------------------------------------------------------------------------- ----------------- | 0 | SELECT STATEMENT | | 3002 | 69046 | 315
(2)| 00:00:04 |
|* 1 | HASH JOIN | | 3002 | 69046 | 315
(2)| 00:00:04 |
| 2 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | 3
(0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | 2
(0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | SALES_FACT | 515K| 5532K| 310
(1)| 00:00:04 |
Predicate Information (identified by operation id):
1 - access("D"."DATE_ID"="S"."DATE_ID")
3 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))
Statistics
1 recursive calls
0 db block gets
1267 consistent gets
0 physical reads
0 redo size
36195 bytes sent via SQL*Net to client
1839 bytes received via SQL*Net from client
135 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2000 rows processed
If we hint the query to use the bitmap index, we can see it has a higher cost. The autotrace stats report significantly fewer consistent gets though:
Execution Plan
Plan hash value: 1520624055
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)| Time |
---------------------------------------------------------------------------- ------------------ | 0 | SELECT STATEMENT | | 3002 | 69046 | 416
(0)| 00:00:05 |
| 1 | NESTED LOOPS | | | | | | | 2 | NESTED LOOPS | | 3002 | 69046 | 416
(0)| 00:00:05 |
| 3 | TABLE ACCESS BY INDEX ROWID| DATE_D | 3 | 36 | 3
(0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | SYS_C0037151 | 3 | | 2
(0)| 00:00:01 |
| 5 | BITMAP CONVERSION TO ROWIDS| | | | | | |* 6 | BITMAP INDEX SINGLE VALUE | SAFA_DATE_ID | | | | |
| 7 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 1000 | 11000 | 416
(0)| 00:00:05 |
Predicate Information (identified by operation id):
4 - access("D"."CALENDAR_DATE">=TO_DATE(' 2013-12-31 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "D"."CALENDAR_DATE"<=TO_DATE(' 2014-01-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))
6 - access("D"."DATE_ID"="S"."DATE_ID")
Statistics
1 recursive calls
0 db block gets
152 consistent gets
1 physical reads
0 redo size
36195 bytes sent via SQL*Net to client
1839 bytes received via SQL*Net from client
135 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2000 rows processed
If you remove the join to the date dim and just use the date ids, Oracle uses the index as expected:
select * from sales_fact
where date_id between 365 and 366;
Execution Plan
Plan hash value: 2749560877
| Id | Operation | Name | Rows | Bytes | Cost
(%CPU)| Time |
---------------------------------------------------------------------------- ----------------- | 0 | SELECT STATEMENT | | 3002 | 33022 | 221
(0)| 00:00:03 |
| 1 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 3002 | 33022 | 221
(0)| 00:00:03 |
| 2 | BITMAP CONVERSION TO ROWIDS| | | | | | |* 3 | BITMAP INDEX RANGE SCAN | SAFA_DATE_ID | | | | | ---------------------------------------------------------------------------- -----------------
Predicate Information (identified by operation id):
3 - access("DATE_ID">=365 AND "DATE_ID"<=366)
Statistics
1 recursive calls
0 db block gets
144 consistent gets
0 physical reads
0 redo size
36195 bytes sent via SQL*Net to client
1839 bytes received via SQL*Net from client
135 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2000 rows processed
Looking at the 10053 trace, I can see this is because the bm index join cost is calculated as higher than the FTS of SALES_FACT:
NL Join
Outer table: Card: 3.00 Cost: 3.00 Resp: 3.00 Degree: 1 Bytes: 12
Access path analysis for SALES_FACT
Inner table: SALES_FACT Alias: S
Access Path: TableScan
NL Join: Cost: 929.82 Resp: 929.82 Degree: 1
Cost_io: 920.00 Cost_cpu: 317629069
Resp_io: 920.00 Resp_cpu: 317629069
- trying bitmap/domain indexes ******
Access Path: index (AllEqJoinGuess)
Index: SAFA_DATE_ID
resc_io: 1.00 resc_cpu: 8171
ix_sel: 0.001942 ix_sel_with_filters: 0.001942
NL Join : Cost: 6.00 Resp: 6.00 Degree: 1
Cost_io: 6.00 Cost_cpu: 47359
Resp_io: 6.00 Resp_cpu: 47359
Bitmap nodes:
Used SAFA_DATE_ID
Cost = 6.001464, sel = 0.001942
Access path: Bitmap index - accepted
Cost: 416.121488 Cost_io: 415.999594 Cost_cpu: 3943626.145192 Sel: 0.001942
Not Believed to be index-only
- finished trying bitmap/domain indexes ******
Best NL cost: 416.12
resc: 416.12 resc_io: 416.00 resc_cpu: 3943626
resp: 416.12 resp_io: 416.00 resc_cpu: 3943626
However, same section of the 10053 trace shows the following when using a BTree index on the SALES_FACT.DATE_ID column:
NL Join
Outer table: Card: 3.00 Cost: 3.00 Resp: 3.00 Degree: 1 Bytes: 12
Access path analysis for SALES_FACT
Inner table: SALES_FACT Alias: S
Access Path: TableScan
NL Join: Cost: 929.82 Resp: 929.82 Degree: 1
Cost_io: 920.00 Cost_cpu: 317629069
Resp_io: 920.00 Resp_cpu: 317629069
Access Path: index (AllEqJoinGuess)
Index: SAFA_DATE_ID
resc_io: 6.00 resc_cpu: 433579
ix_sel: 0.001942 ix_sel_with_filters: 0.001942
NL Join : Cost: 21.04 Resp: 21.04 Degree: 1
Cost_io: 21.00 Cost_cpu: 1323580
Resp_io: 21.00 Resp_cpu: 1323580
Best NL cost: 21.04
resc: 21.04 resc_io: 21.00 resc_cpu: 1323580
resp: 21.04 resp_io: 21.00 resc_cpu: 1323580
Why does this happen? Is this a bug or expected behaviour?
Thanks,
Chris
www.sqlfail.com
-- http://www.freelists.org/webpage/oracle-lReceived on Sat May 31 2014 - 21:24:17 CEST