Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> difference between hash semi join and merge semi join
hi,
can someone explain to me the difference between a hash semi join and a merge semi join?
the version of oracle i'm running is Oracle9i Enterprise Edition Release 9.2.0.4.0 on SunOS 5.8.
the problem i'm having is that this query ran within minutes on 8.0.6 (same os):
select customer_number, customer_id, customer_seq_number from customer_table_1 where customer_id in
(select distinct customer_id from customer_table_2);
customer_table_1 has about 5 million rows and the primary key on this table is customer_number, customer_id, and customer_seq_number.
customer_table_2 has about 3 million rows and the primary key contains the customer_id columns (the 2nd key value in the index).
when i do an explain plan on the query (in 9.2.0.4.0), a hash semi-join is used and the query runs forever (about 4 hours, before i kill it).
there are about 14 distinct customer_id's and the result set for the query contains about 1 million rows.
when i change the above query to the following:
select customer_number, customer_id, customer_seq_number from customer_table_1 where customer_id in
(select /*+merge_sj*/ customer_id from customer_table_2);
the query runs within 2 minutes (when i do an explain plan, the cost is about 5 times higher than when oracle was using a hash semi-join). so i am just wondering, what exactly is going on here? if anyone can point me to any websites or documentation that clearly explains the difference between hash and merge, that would be great. and what's the deal with the explain plan? why is oracle choose a bad access method (i mean i know the hash has a lower value, but it clearly is not more efficient)?
thanks in advance.
-maurice
samuels_at_seas.upenn.edu Received on Fri Jun 04 2004 - 10:33:54 CDT