Home » RDBMS Server » Performance Tuning » Bloom filters (11.2.0.3)
Bloom filters [message #571760] Fri, 30 November 2012 02:35
John Watson
Messages: 8960
Registered: January 2010
Location: Global Village
Senior Member
Hello - I'm trying to use a bloom filter to implement a join. The reason is to get some Exadata optimization working (bloom filtering can be offloaded), but the issue is equally applicable to conventional storage. As you all know, a bloom filter can be used only for a parallel hash join.

Working in SCOTT, if I hint it I get the desired plan for a small data set:
orcl> exec dbms_stats.gather_schema_stats(user)

PL/SQL procedure successfully completed.

orcl> set autot on exp
orcl> alter table emp parallel (degree 2);

Table altered.

orcl> alter table dept parallel (degree 2);

Table altered.

orcl> select /*+ use_hash(emp dept) bloom join 2 */ ename,dname from emp natural join dept  where sal=1000;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 2858815543

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |          |     1 |    26 |     5  (20)| 00:00:01 |        |
|   1 |  PX COORDINATOR                 |          |       |       |            |          |        |
|   2 |   PX SEND QC (RANDOM)           | :TQ10001 |     1 |    26 |     5  (20)| 00:00:01 |  Q1,01 | P->S
|*  3 |    HASH JOIN                    |          |     1 |    26 |     5  (20)| 00:00:01 |  Q1,01 | PCWP
|   4 |     PX RECEIVE                  |          |     1 |    13 |     2   (0)| 00:00:01 |  Q1,01 | PCWP
|   5 |      PX SEND BROADCAST          | :TQ10000 |     1 |    13 |     2   (0)| 00:00:01 |  Q1,00 | P->P
|   6 |       PX BLOCK ITERATOR         |          |     1 |    13 |     2   (0)| 00:00:01 |  Q1,00 | PCWC
|*  7 |        TABLE ACCESS STORAGE FULL| EMP      |     1 |    13 |     2   (0)| 00:00:01 |  Q1,00 | PCWP
|   8 |     PX BLOCK ITERATOR           |          |     4 |    52 |     2   (0)| 00:00:01 |  Q1,01 | PCWC
|*  9 |      TABLE ACCESS STORAGE FULL  | DEPT     |     4 |    52 |     2   (0)| 00:00:01 |  Q1,01 | PCWP
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
   7 - storage("EMP"."SAL"=1000)
       filter("EMP"."SAL"=1000)
   9 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"DEPT"."DEPTNO"))
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"DEPT"."DEPTNO"))

orcl>
the bloom filter comes in at Id 9 as I want and (just as a bonus) it can be offloaded.
But if I tell the optimizer that the tables are bigger, it refuses to construct a bloom filter:
orcl> exec dbms_stats.set_table_stats('scott','emp',numrows=>1000000000)

PL/SQL procedure successfully completed.

orcl> exec dbms_stats.set_table_stats('scott','dept',numrows=>1000000)

PL/SQL procedure successfully completed.

orcl> select /*+ use_hash(emp dept) bloom join 2 */ ename,dname from emp natural join dept  where sal=1000;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 197865614

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |          |  8928G|   211T|    26M(100)| 01:45:08 |        |
|   1 |  PX COORDINATOR                 |          |       |       |            |          |        |
|   2 |   PX SEND QC (RANDOM)           | :TQ10001 |  8928G|   211T|    26M(100)| 01:45:08 |  Q1,01 | P->S
|*  3 |    HASH JOIN                    |          |  8928G|   211T|    26M(100)| 01:45:08 |  Q1,01 | PCWP
|   4 |     PX RECEIVE                  |          |  1000K|    12M|     7  (72)| 00:00:01 |  Q1,01 | PCWP
|   5 |      PX SEND BROADCAST          | :TQ10000 |  1000K|    12M|     7  (72)| 00:00:01 |  Q1,00 | P->P
|   6 |       PX BLOCK ITERATOR         |          |  1000K|    12M|     7  (72)| 00:00:01 |  Q1,00 | PCWC
|   7 |        TABLE ACCESS STORAGE FULL| DEPT     |  1000K|    12M|     7  (72)| 00:00:01 |  Q1,00 | PCWP
|   8 |     PX BLOCK ITERATOR           |          |    35M|   442M|  8993 (100)| 00:00:03 |  Q1,01 | PCWC
|*  9 |      TABLE ACCESS STORAGE FULL  | EMP      |    35M|   442M|  8993 (100)| 00:00:03 |  Q1,01 | PCWP
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
   9 - storage("EMP"."SAL"=1000)
       filter("EMP"."SAL"=1000)

orcl>
orcl>
Hinting it to force the join order doesn't make any dofference:
orcl> select /*+ ordered use_hash(emp dept) bloom join 2 */ ename,dname from emp natural join dept  where sal=1000;

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 4099459956

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Di
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |          |  8928G|   211T|    26M(100)| 01:45:08 |        |      |
|   1 |  PX COORDINATOR                  |          |       |       |            |          |        |      |
|   2 |   PX SEND QC (RANDOM)            | :TQ10001 |  8928G|   211T|    26M(100)| 01:45:08 |  Q1,01 | P->S | QC (R
|*  3 |    HASH JOIN                     |          |  8928G|   211T|    26M(100)| 01:45:08 |  Q1,01 | PCWP |
|   4 |     PX BLOCK ITERATOR            |          |    35M|   442M|  8993 (100)| 00:00:03 |  Q1,01 | PCWC |
|*  5 |      TABLE ACCESS STORAGE FULL   | EMP      |    35M|   442M|  8993 (100)| 00:00:03 |  Q1,01 | PCWP |
|   6 |     BUFFER SORT                  |          |       |       |            |          |  Q1,01 | PCWC |
|   7 |      PX RECEIVE                  |          |  1000K|    12M|     7  (72)| 00:00:01 |  Q1,01 | PCWP |
|   8 |       PX SEND BROADCAST          | :TQ10000 |  1000K|    12M|     7  (72)| 00:00:01 |  Q1,00 | P->P | BROAD
|   9 |        PX BLOCK ITERATOR         |          |  1000K|    12M|     7  (72)| 00:00:01 |  Q1,00 | PCWC |
|  10 |         TABLE ACCESS STORAGE FULL| DEPT     |  1000K|    12M|     7  (72)| 00:00:01 |  Q1,00 | PCWP |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
   5 - storage("EMP"."SAL"=1000)
       filter("EMP"."SAL"=1000)

orcl>
Can anyone advise on what I am missing? Why the optimizer refuses to construct a bloom filter for the large data set? Or how to force it?

Thank you for any insight.
Previous Topic: Database Reorganisation
Next Topic: Checkpoint not complete
Goto Forum:
  


Current Time: Thu Nov 21 16:38:18 CST 2024