Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Usenet -> c.d.o.server -> Re: Join Two Extremely Large Tables
How are you planning to use the 3-4 Billion rows that you generate ? This is probably a critical issue than the mechanism for generating them.
If you hope to do some sort of grouping or ordering then that may dictate the method you HAVE to use on the join.
Why do you think your index would be more than 10 levels deep ? How long is the index key ? At a generous 80 bytes, with an 8K block size, you get 100 entries per node so:
Height = 1 => 100 table rows Height = 2 => 10,000 table rows Height = 3 => 1,000,000 table rows Height = 4 => 100,000,000 table rows Height = 5 => 10,000,000,000 table rows
In real life, your index is more likely to be height 4.
Remember that the ordering of the result set from a hash join is effectively governed by the second table. If you want the 15 B rows for each A row in one shot to process them together, then you can't do a hash join unless you are prepared to do a sort of the full result set afterwards.
-- Regards Jonathan Lewis http://www.jlcomp.demon.co.uk The educated person is not the person who can answer the questions, but the person who can question the answers -- T. Schick Jr One-day tutorials: http://www.jlcomp.demon.co.uk/tutorial.html ____UK_______April 22nd ____USA_(FL)_May 2nd ____Denmark__May 21-23rd ____Sweden___June ____Finland__September ____Norway___September Three-day seminar: see http://www.jlcomp.demon.co.uk/seminar.html ____UK_(Manchester)_May x 2 ____Estonia___June (provisional) ____Australia_June (provisional) ____USA_(CA, TX)_August The Co-operative Oracle Users' FAQ http://www.jlcomp.demon.co.uk/faq/ind_faq.html "Bass Chorng" <bchorng_at_yahoo.com> wrote in message news:bd9a9a76.0304161946.4c0f5cbd_at_posting.google.com...Received on Thu Apr 17 2003 - 02:29:00 CDT
> I have 2 tables A and B. Table A contains 86 million rows, about 4.5
GB
> in size. Table B contains 1.2 billion rows, about 25 GB in size.
>
> For each row in table A, I need to query table B, which is indexed,
> returning in average of 15 rows and process them.
>
> If I use foreach cursor and loop thru table A, the result will be
> probably at least 3-4 billion IOs. Since in reality, my index depth
> is probably far more than 10 levels, I would expect total IOs to be
> near 20 billions.
>
> Basically, this is an enormous index range scan.
>
> The # of IO is not a problem, but the time it takes is. This process
> would take 6-7 hours.
>
> Would if be feasible for me to hash join these 2 giant tables
> and not use index at all ? Is it faster this way ?
> What about my temp segment ? I would either give it a
> hash_area_size of 10 GB (not even sure if Oracle accepts that)
> or prepare a temp segment that size, or my process would
> fail.
>
> Your advice is appreciated.
![]() |
![]() |