Home » RDBMS Server » Performance Tuning » Dunsel Joins (9i, 10g, 11g)
Dunsel Joins [message #358359] |
Mon, 10 November 2008 21:00 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
Once again I have been asked by a friend how to get the optimizer to not do Dunsel Joins. Please recall the definition of Dunsel:
Quote: | Dunsel is a term used by mid-shipmen in the 23rd century to describe a part which serves no useful purpose.
Spock to McCoy about Kirk
Episode 53 - The Ultimate Computer
|
I am finding it more and more common as systems become more generalized in their access points, for designers to be building multi-purpose views. These views typically join many tables together effectively creating a "flat file view" of some data model tree. One problem that surfaces from this is a performance issue. In most cases, a projection of some subset of columns is requested of the view. Thus in most cases there are many joins being done by the database that need not be done. Here is a clear example:
create table dept
(
dept_id number not null primary key
,dept_name varchar2(10) not null unique
)
/
create table proj
(
proj_id number not null primary key
,proj_name varchar2(10) not null unique
)
/
create table emp
(
emp_id number not null primary key
,emp_name varchar2(10) not null unique
,dept_id number not null
,proj_id number not null
,constraint emp_fk1 foreign key (dept_id) references dept
,constraint emp_fk2 foreign key (proj_id) references proj
)
/
Aside for the declaration of primary and foreign keys that define our data model as a tree with a root at EMP, I would point out that the foreign key columns in EMP are mandatory. The combination of NOT NULL and FOREIGN KEY means that for every row in EMP there is always a matching row in DEPT and always a matching row in PROJ.
A natural query to ask of this data model would be:
select dept.dept_name
,proj.proj_name
,emp.emp_name
from dept
,proj
,emp
where emp.dept_id = dept.dept_id
and emp.proj_id = proj.proj_id
/
Of particular note is that something is selected from all three tables. Thus it is necessary to visit all three tables.
However, now consider this query:
select emp.emp_name
from dept
,proj
,emp
where emp.dept_id = dept.dept_id
and emp.proj_id = proj.proj_id
/
This query does not return any data from DEPT nor PROJ. The joins EMP-->DEPT and EMP-->PROJ serve no useful purpose. They are not required for access to more data, nor can they change the number of rows returned by the query. These are DUNSEL Joins. Yet the optimizer does not (at least to my knowledge) understand this because it still does these Dunsel Joins as is witnesed by the query execution plan below:
SQL> set autotrace on
SQL> select emp.emp_name
2 from dept
3 ,proj
4 ,emp
5 where emp.dept_id = dept.dept_id
6 and emp.proj_id = proj.proj_id
7 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 NESTED LOOPS
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'EMP' (TABLE)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_C0022788' (INDEX (UNIQUE))
5 1 INDEX (UNIQUE SCAN) OF 'SYS_C0022784' (INDEX (UNIQUE))
Quote: | So... what can be done for the following situation to tell the optimizer not to do the Dunsel Joins?
Or is the optimizer not there yet?
If not, can anyone offer suggestions for an alternative approach?
|
create or replace view vw_emp
as
select dept.dept_name
,proj.proj_name
,emp.emp_name
from dept
,proj
,emp
where emp.dept_id = dept.dept_id
and emp.proj_id = proj.proj_id
/
select emp_name
from vw_emp
/
Kevin
|
|
|
Re: Dunsel Joins [message #358423 is a reply to message #358359] |
Tue, 11 November 2008 04:41 |
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
|
Senior Member |
|
|
One potential problem I can see would come from constraints enabled with the NOVALIDATE clause.
These would alow you to have data in the b that didn't conform to the constraints urrent;y in force.
So, by trimming the query down and eliminating the dunsel joins, you would end up returning data that the original query wouldn't return
|
|
|
Re: Dunsel Joins [message #358441 is a reply to message #358359] |
Tue, 11 November 2008 06:20 |
cookiemonster
Messages: 13958 Registered: September 2008 Location: Rainy Manchester
|
Senior Member |
|
|
I don't see what you see:
Connected to:
Oracle Database 10g Release 10.2.0.2.0 - 64bit Production
SQL> explain plan for select emp.emp_name
2 from dept
3 ,proj
4 ,emp
5 where emp.dept_id = dept.dept_id
6 and emp.proj_id = proj.proj_id
7 /
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2872589290
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 33 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| EMP | 1 | 33 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
- dynamic sampling used for this statement
12 rows selected.
For me it's ignoring the redundant tables.
I get the same result if I create a view over the tables and query that:
SQL> CREATE VIEW dunsal_test AS
2 SELECT emp.emp_name, dept.dept_name, proj.proj_name
3 from dept
4 ,proj
5 ,emp
6 where emp.dept_id = dept.dept_id
7 and emp.proj_id = proj.proj_id;
View created.
SQL>
SQL> explain plan for select emp_name from dunsal_test;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2872589290
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 33 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| EMP | 1 | 33 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
As far as I was aware in 10g at least (not sure about older versions) dunsal joins are elimated if the join is on a foreign key.
|
|
|
Re: Dunsel Joins [message #358472 is a reply to message #358441] |
Tue, 11 November 2008 08:56 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
Nice work guys. So it seem my buddy's fix will be to simply upgrade. Here is what I see both to you cookiemonster and you JRowbottom.
Dunsel Joins are no longer done in 10g release 2.
When foreign keys are deferred, Dunsel Joins are still done even in 10g release 2.
When foreign keys are created NOVALIDATE, the Dunsel Joins return as well.
In 10g Rlease 1, Dunsel Joins are done.
SQL> set autotrace off
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.1.0.5.0 - 64bi
PL/SQL Release 10.1.0.5.0 - Production
CORE 10.1.0.5.0 Production
TNS for HPUX: Version 10.1.0.5.0 - Production
NLSRTL Version 10.1.0.5.0 - Production
5 rows selected.
SQL> set autotrace on
SQL> select emp.emp_name
2 from dept
3 ,proj
4 ,emp
5 where emp.dept_id = dept.dept_id
6 and emp.proj_id = proj.proj_id
7 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 NESTED LOOPS
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'EMP' (TABLE)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_C0022788' (INDEX (UNIQUE))
5 1 INDEX (UNIQUE SCAN) OF 'SYS_C0022784' (INDEX (UNIQUE))
But in 10g Release 2, Dunsel Joins, with normally enforced constraints, Dunsel Joins are not done.
SQL> set autotrace off
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bi
PL/SQL Release 10.2.0.3.0 - Production
CORE 10.2.0.3.0 Production
TNS for Solaris: Version 10.2.0.3.0 - Production
NLSRTL Version 10.2.0.3.0 - Production
5 rows selected.
SQL> set autotrace on
SQL> select emp.emp_name
2 from dept
3 ,proj
4 ,emp
5 where emp.dept_id = dept.dept_id
6 and emp.proj_id = proj.proj_id
7 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=2 Card=1 Bytes=33)
1 0 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=2 Card=1 Bytes=33)
But if we make the foreign keys deferred, then Dunsel Joins come back.
SQL> drop table emp;
Table dropped.
SQL>
SQL> create table emp
2 (
3 emp_id number not null primary key
4 ,emp_name varchar2(10) not null unique
5 ,dept_id number not null
6 ,proj_id number not null
7 ,constraint emp_fk1 foreign key (dept_id) references dept initially deferred
8 ,constraint emp_fk2 foreign key (proj_id) references proj initially deferred
9 )
10 /
Table created.
SQL>
SQL> select emp.emp_name
2 from dept
3 ,proj
4 ,emp
5 where emp.dept_id = dept.dept_id
6 and emp.proj_id = proj.proj_id
7 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=2 Card=1 Bytes=59)
1 0 NESTED LOOPS (Cost=2 Card=1 Bytes=59)
2 1 NESTED LOOPS (Cost=2 Card=1 Bytes=46)
3 2 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=2 Card=1 Bytes=33)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_C0023387' (INDEX (UNIQUE)) (Cost=0 Card=1 Bytes=13)
5 1 INDEX (UNIQUE SCAN) OF 'SYS_C0023391' (INDEX (UNIQUE)) (Cost=0 Card=1 Bytes=13)
Below we see the result when creating foreign keys NOVALIDATE. The Dunsle Joins return.
SQL> drop table emp;
Table dropped.
SQL> create table emp
2 (
3 emp_id number not null primary key
4 ,emp_name varchar2(10) not null unique
5 ,dept_id number not null
6 ,proj_id number not null
7 ,constraint emp_fk1 foreign key (dept_id) references dept novalidate
8 ,constraint emp_fk2 foreign key (proj_id) references proj novalidate
9 )
10 /
Table created.
SQL> set autotrace on
SQL> select emp.emp_name
2 from dept
3 ,proj
4 ,emp
5 where emp.dept_id = dept.dept_id
6 and emp.proj_id = proj.proj_id
7 /
no rows selected
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=2 Card=1 Bytes=59)
1 0 NESTED LOOPS (Cost=2 Card=1 Bytes=59)
2 1 NESTED LOOPS (Cost=2 Card=1 Bytes=46)
3 2 TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=2 Card=1 Bytes=33)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_C0023419' (INDEX (UNIQUE)) (Cost=0 Card=1 Bytes=13)
5 1 INDEX (UNIQUE SCAN) OF 'SYS_C0023423' (INDEX (UNIQUE)) (Cost=0 Card=1 Bytes=13)
Thanks you guys. Kevin
[Updated on: Tue, 11 November 2008 11:52] Report message to a moderator
|
|
|
Re: Dunsel Joins [message #358560 is a reply to message #358472] |
Tue, 11 November 2008 20:39 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
That's fantastic - well spotted. Looks like they've overhauled the optimiser for 10.2 without telling us. I cannot find reference to that in the New Features guide (although I didn't search on "Dunsel").
I recently noticed another undocumented 10.2 improvement to good old-fashioned Partition Views. Partition Views (which pre-date partitioned tables) are UNION ALL views of tables with identical structure/indexes/constraints. They are still somewhat useful in a data warehouse as recently as 10.1 because Oracle could not PCT Fast Refresh UNION ALL Materialized Views.
The advantage of a genuine Partition View over any other UNION ALL view is that the optimiser can more effectively transform SQL joins to Partition Views.
As of 10.2, it would appear that the definition of a Partition View has been loosened. I created a UNION ALL of table A to a join of tables B and C; Oracle accepted it as a PV! Unthinkable in 10.1 and earlier.
This means that joins to UNION ALL views can avoid many of the performance potholes of the past.
Ross Leishman
|
|
|
Goto Forum:
Current Time: Tue Nov 26 02:42:22 CST 2024
|