join condition in index [message #547455] |
Wed, 14 March 2012 06:13 |
|
orarule
Messages: 11 Registered: March 2012
|
Junior Member |
|
|
Hi All,
First of many posts here!
For a hash join statement, is it beneficial to have the join condition objects in the index as well as the objects in the where clause?
Thanks!
|
|
|
|
Re: join condition in index [message #547471 is a reply to message #547465] |
Wed, 14 March 2012 06:54 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
If you have a SQL performing a Hash Join, then it is not likely to benefit by creating an index.
Usually an index will have no effect, because it is not used. If the index is used, then it is quite possible that it will make the query slower. Hash Joins are used for high data volumes and indexes are often worse than full table scans in these situations.
In the unlikely event that an index DOES improve the performance of such a join, a Nested Loops join with the join-key indexed would probably be faster still.
Ross Leishman
|
|
|
Re: join condition in index [message #547473 is a reply to message #547471] |
Wed, 14 March 2012 07:01 |
John Watson
Messages: 8964 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
I was thinking of this sort of thing:orcl> select /*+ use_hash(emp dept) */ ename,dname from emp natural join dept where empno=1000;
Execution Plan
----------------------------------------------------------
Plan hash value: 373232351
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 26 | 5 (20)| 00:00:01 |
|* 1 | HASH JOIN | | 1 | 26 | 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 13 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS STORAGE FULL | DEPT | 4 | 52 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
3 - access("EMP"."EMPNO"=1000)
|
|
|
Re: join condition in index [message #547476 is a reply to message #547471] |
Wed, 14 March 2012 07:12 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
rleishman wrote on Wed, 14 March 2012 22:54In the unlikely event that an index DOES improve the performance of such a join, a Nested Loops join with the join-key indexed would probably be faster still.
And that query would be better still (over volumes a bit greater than EMP/DEPT) with DEPT.DEPTNO indexed and an Indexed Nested Loops join.
I'm not saying it's impossible for an optimised query to use an index and a hash join (index joins being a good example), but without better information from the OP we are forced to work with probabilities, and the probability is that an index won't improve hash join.
Ross Leishman
|
|
|
|
Re: join condition in index [message #547481 is a reply to message #547478] |
Wed, 14 March 2012 07:19 |
John Watson
Messages: 8964 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
I'm sure Ross is right.
But as you still have not bortered to describe the indexes or the tables or the query, I don't think it is possible to take this further.
|
|
|
|
Re: join condition in index [message #547572 is a reply to message #547484] |
Wed, 14 March 2012 15:39 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Questions\: Answer all of them.
- How many rows in CRMD_ORDER_INDEX?
- How many of those rows match access("T_00"."CLIENT"=:A1 AND "T_00"."PROCESS_TYPE_IX"=:A2 AND "T_00"."OBJECT_TYPE"=:A3)
- How many rows in CRM_JEST
- How many of them match access("T_01"."MANDT"=:A0 AND "T_01"."STAT"=:A6 AND "T_01"."INACT"=:A5)
- How many rows does the join produce if you remove the DISTINCT?
- How many rows does the SQL produce with the DISTINCT?
- What is your current initialization parameter PGA_AGGREGATE_TARGET (use command SHOW PARAMETER PGA_AGGREGATE_TARGET) in SQL*Plus)
Ross Leishman
|
|
|
|
|
|
|
|
Re: join condition in index [message #547589 is a reply to message #547588] |
Wed, 14 March 2012 23:10 |
|
BlackSwan
Messages: 26766 Registered: January 2009 Location: SoCal
|
Senior Member |
|
|
>Used code below, still get the error.
sorry to hear that.
It would not throw error, if you used valid syntax like below
SELECT crmd_order_index.header
FROM crmd_order_index,
crm_jest
WHERE crm_jest.objnr = crmd_order_index.header
AND crmd_order_index.process_type_ix = 'ZCWR'
AND crmd_order_index.object_type = 'BUS2000223'
AND crmd_order_index.stat_open = 'X'
AND crm_jest.stat = 'I1002';
[Updated on: Wed, 14 March 2012 23:22] Report message to a moderator
|
|
|
Re: join condition in index [message #547594 is a reply to message #547589] |
Wed, 14 March 2012 23:25 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Your filter criteria in each of those tables selects 2% in one case and 6% in the other. These are in the "break-even" zone. It might be quicker to perform a full-table scan, but probably not by much. Or it might be quicker to perform the index scan, but probably not by much.
On balance, the index scan - even if it is a bit faster - is damaging the performance of other queries by hogging your Buffer Cache, so I would go with Full Table Scans.
I am also concerned that the join is non-unique. The join is access CRM_JEST.OBJNR = CRMD_ORDER_INDEX.HEADER.
- Is OBJNR unique in CRM_JEST?
- Is HEADER unique in CRMD_ORDER?
Unless ONE of these is true, you could be blowing out the join into millions of rows prior to the DISTINCT. If this is the case, the DISTINCT will end up using a lot of memory and eventually page to TEMP. This creates IO. This takes time.
You need to find out ho many rows are in the join before the DISTINCT.
Ross Leishman
|
|
|
|
Re: join condition in index [message #547599 is a reply to message #547598] |
Wed, 14 March 2012 23:50 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
So the join key is non-unique. Yet you are joining 4M rows to 650K rows and getting 111K back. That means there must be heaps of non-matching rows in both result sets.
Try this
SELECT DISTINCT HEADER
FROM CRMD_ORDER_INDEX
WHERE (HEADER) IN (
SELECT
FROM CRM_JEST
WHERE MANDT = :A0
AND INACT = :A5
AND STAT = :A6
)
AND CLIENT = :A1
AND PROCESS_TYPE_IX = :A2
AND OBJECT_TYPE = :A3
AND STAT_OPEN = :A4
Ross Leishman
|
|
|
|
Re: join condition in index [message #547625 is a reply to message #547623] |
Thu, 15 March 2012 03:38 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
A range-scan just means that an index scan could return more than one row. If that happens to be 2 rows, or 5 rows, it's going to be pretty quick. But mot if it is 1M rows.
The uniqueness applied to the JOIN KEY, not to the columns you were filtering. If the join key was unique in one of those tables, then the join will return no more rows than the number selected from that table. However if the join key is non-unique in BOTH tables, then the number of rows returned could be as many as a CROSS PRODUCT of the number selected from each table. It's probably nowhere near that bad, but it could be bad enough to use up a lot of TEMP space.
In your case, although the join key was not unique, the number of rows in the join-result was bad but not catastrophic.
Did the alternate syntax work? If so, the reason is probably because the IN performs an implicit DISTINCT on the sub-query before joining to the outer-query. This makes the join key unique and prevents the cross-product behavior.
With non-join predicates in an INNER JOIN, it doesn't make any difference whether they are in the JOIN clause or the WHERE clause.
Ross Leishman
|
|
|