Hello - I'm trying to use a bloom filter to implement a join. The reason is to get some Exadata optimization working (bloom filtering can be offloaded), but the issue is equally applicable to conventional storage. As you all know, a bloom filter can be used only for a parallel hash join.
Working in SCOTT, if I hint it I get the desired plan for a small data set:orcl> exec dbms_stats.gather_schema_stats(user)
PL/SQL procedure successfully completed.
orcl> set autot on exp
orcl> alter table emp parallel (degree 2);
Table altered.
orcl> alter table dept parallel (degree 2);
Table altered.
orcl> select /*+ use_hash(emp dept) bloom join 2 */ ename,dname from emp natural join dept where sal=1000;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2858815543
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 5 (20)| 00:00:01 | |
| 1 | PX COORDINATOR | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 1 | 26 | 5 (20)| 00:00:01 | Q1,01 | P->S
|* 3 | HASH JOIN | | 1 | 26 | 5 (20)| 00:00:01 | Q1,01 | PCWP
| 4 | PX RECEIVE | | 1 | 13 | 2 (0)| 00:00:01 | Q1,01 | PCWP
| 5 | PX SEND BROADCAST | :TQ10000 | 1 | 13 | 2 (0)| 00:00:01 | Q1,00 | P->P
| 6 | PX BLOCK ITERATOR | | 1 | 13 | 2 (0)| 00:00:01 | Q1,00 | PCWC
|* 7 | TABLE ACCESS STORAGE FULL| EMP | 1 | 13 | 2 (0)| 00:00:01 | Q1,00 | PCWP
| 8 | PX BLOCK ITERATOR | | 4 | 52 | 2 (0)| 00:00:01 | Q1,01 | PCWC
|* 9 | TABLE ACCESS STORAGE FULL | DEPT | 4 | 52 | 2 (0)| 00:00:01 | Q1,01 | PCWP
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
7 - storage("EMP"."SAL"=1000)
filter("EMP"."SAL"=1000)
9 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"DEPT"."DEPTNO"))
filter(SYS_OP_BLOOM_FILTER(:BF0000,"DEPT"."DEPTNO"))
orcl>
the bloom filter comes in at Id 9 as I want and (just as a bonus) it can be offloaded.
But if I tell the optimizer that the tables are bigger, it refuses to construct a bloom filter:orcl> exec dbms_stats.set_table_stats('scott','emp',numrows=>1000000000)
PL/SQL procedure successfully completed.
orcl> exec dbms_stats.set_table_stats('scott','dept',numrows=>1000000)
PL/SQL procedure successfully completed.
orcl> select /*+ use_hash(emp dept) bloom join 2 */ ename,dname from emp natural join dept where sal=1000;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 197865614
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8928G| 211T| 26M(100)| 01:45:08 | |
| 1 | PX COORDINATOR | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 8928G| 211T| 26M(100)| 01:45:08 | Q1,01 | P->S
|* 3 | HASH JOIN | | 8928G| 211T| 26M(100)| 01:45:08 | Q1,01 | PCWP
| 4 | PX RECEIVE | | 1000K| 12M| 7 (72)| 00:00:01 | Q1,01 | PCWP
| 5 | PX SEND BROADCAST | :TQ10000 | 1000K| 12M| 7 (72)| 00:00:01 | Q1,00 | P->P
| 6 | PX BLOCK ITERATOR | | 1000K| 12M| 7 (72)| 00:00:01 | Q1,00 | PCWC
| 7 | TABLE ACCESS STORAGE FULL| DEPT | 1000K| 12M| 7 (72)| 00:00:01 | Q1,00 | PCWP
| 8 | PX BLOCK ITERATOR | | 35M| 442M| 8993 (100)| 00:00:03 | Q1,01 | PCWC
|* 9 | TABLE ACCESS STORAGE FULL | EMP | 35M| 442M| 8993 (100)| 00:00:03 | Q1,01 | PCWP
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
9 - storage("EMP"."SAL"=1000)
filter("EMP"."SAL"=1000)
orcl>
orcl>
Hinting it to force the join order doesn't make any dofference:orcl> select /*+ ordered use_hash(emp dept) bloom join 2 */ ename,dname from emp natural join dept where sal=1000;
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 4099459956
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Di
-------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8928G| 211T| 26M(100)| 01:45:08 | | |
| 1 | PX COORDINATOR | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 8928G| 211T| 26M(100)| 01:45:08 | Q1,01 | P->S | QC (R
|* 3 | HASH JOIN | | 8928G| 211T| 26M(100)| 01:45:08 | Q1,01 | PCWP |
| 4 | PX BLOCK ITERATOR | | 35M| 442M| 8993 (100)| 00:00:03 | Q1,01 | PCWC |
|* 5 | TABLE ACCESS STORAGE FULL | EMP | 35M| 442M| 8993 (100)| 00:00:03 | Q1,01 | PCWP |
| 6 | BUFFER SORT | | | | | | Q1,01 | PCWC |
| 7 | PX RECEIVE | | 1000K| 12M| 7 (72)| 00:00:01 | Q1,01 | PCWP |
| 8 | PX SEND BROADCAST | :TQ10000 | 1000K| 12M| 7 (72)| 00:00:01 | Q1,00 | P->P | BROAD
| 9 | PX BLOCK ITERATOR | | 1000K| 12M| 7 (72)| 00:00:01 | Q1,00 | PCWC |
| 10 | TABLE ACCESS STORAGE FULL| DEPT | 1000K| 12M| 7 (72)| 00:00:01 | Q1,00 | PCWP |
-------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
5 - storage("EMP"."SAL"=1000)
filter("EMP"."SAL"=1000)
orcl>
Can anyone advise on what I am missing? Why the optimizer refuses to construct a bloom filter for the large data set? Or how to force it?
Thank you for any insight.