Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: difference between hash semi join and merge semi join

Re: difference between hash semi join and merge semi join

From: <samuels_at_red.seas.upenn.edu>
Date: Mon, 7 Jun 2004 14:51:04 +0000 (UTC)
Message-ID: <ca1vco$6h2k$1@netnews.upenn.edu>


Thanks Jonathan, i appreciate your help. -maurice
samuels_at_seas.upenn.edu

Jonathan Lewis (jonathan_at_jlcomp.demon.co.uk) wrote:

: The semi-join operator is a modified join
: used for existence (you IN subquery has
: been turned into an EXISTS subquery).

: Once the subquery has been turned into
: a join, Oracle can choose between a
: hash, merge or NL - it doesn't really
: matter that it is a semi- rather than normal
: join.

: The performance tuning guide and ref
: is quite good on describing the joins
: in the 9.2 version (although I think it
: may have two descriptions of the hash,
: one of which is back to front).

: Merge:
: The two data sets have to be sorted
: on the join columns. The presence of
: a suitable index on the first input means
: it may not need a sort operation. The
: second input always goes through a
: sort, even if it is acquired from a suitable
: index. Oracle then walks the two sorted
: sets in tandem, identifying the overlaps.

: Hash join
: The first data set is distributed into a hash
: table, ideally in memory, based on hashing
: the join columns. For large data sets, the
: hash table will probably overflow to disc.
: The second data set is acquired; and for
: each row the join columns are hashed, so
: that a quick lookup into the hash table can
: place only for possible matches.

: Dependent on resources, data volumes, or
: data distributions, either method could be
: faster.

: The problem with the explain plan is that
: Oracle has produced an arithmetic model
: that doesn't match the actual data sets. It
: happens for a variety of reasons - often
: down to inadequate statistics.

: --
: Regards

: Jonathan Lewis

: http://www.jlcomp.demon.co.uk

: http://www.jlcomp.demon.co.uk/faq/ind_faq.html
: The Co-operative Oracle Users' FAQ

: http://www.jlcomp.demon.co.uk/seminar.html
: Optimising Oracle Seminar - schedule updated May 1st

: <samuels_at_red.seas.upenn.edu> wrote in message
: news:c9q4p2$386j$1_at_netnews.upenn.edu...
: > hi,
: > can someone explain to me the difference between a hash semi join and a
: merge semi join?
: > the version of oracle i'm running is Oracle9i Enterprise Edition Release
: 9.2.0.4.0 on SunOS 5.8.
: > the problem i'm having is that this query ran within minutes on 8.0.6
: (same os):
: > select customer_number, customer_id, customer_seq_number from
: customer_table_1 where customer_id in
: > (select distinct customer_id from customer_table_2);
: >
: > customer_table_1 has about 5 million rows and the primary key on this
: table is customer_number, customer_id, and customer_seq_number.
: > customer_table_2 has about 3 million rows and the primary key contains the
: customer_id columns (the 2nd key value in the index).
: > when i do an explain plan on the query (in 9.2.0.4.0), a hash semi-join is
: used and the query runs forever (about 4 hours, before i kill it).
: > there are about 14 distinct customer_id's and the result set for the query
: contains about 1 million rows.
: > when i change the above query to the following:
: > select customer_number, customer_id, customer_seq_number from
: customer_table_1 where customer_id in
: > (select /*+merge_sj*/ customer_id from customer_table_2);
: >
: > the query runs within 2 minutes (when i do an explain plan, the cost is
: about 5 times higher than when oracle was using a hash semi-join).
: > so i am just wondering, what exactly is going on here?
: > if anyone can point me to any websites or documentation that clearly
: explains the difference between hash and merge, that would be great.
: > and what's the deal with the explain plan? why is oracle choose a bad
: access method (i mean i know the hash has a lower value, but it clearly
: > is not more efficient)?
: >
: > thanks in advance.
: > -maurice
: >
: > samuels_at_seas.upenn.edu
Received on Mon Jun 07 2004 - 09:51:04 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US