Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> RE: help with deleting duplicate records from very large table
Suhen
-
<SPAN
class=157215522-12092001>
<SPAN
class=157215522-12092001>Wouldn't it be more efficient to have something
like:
<SPAN
class=157215522-12092001>
delete
from invaudee i1
where rowid not in
<SPAN
class=157215522-12092001> (select min(rowid)
<SPAN
class=157215522-12092001> from invaudee i2
<SPAN
class=157215522-12092001> where i1.unique_value =
i2.unique_value);
<SPAN
class=157215522-12092001>
You
have to do a full table scan of the table as part of the delete. With the
query I proposed, if you have an index on invaudee then it should be used to do
a range scan for the rowid. If you use the inner select you have it will
do a full table scan for every row in invaudee.
<SPAN
class=157215522-12092001>
I just
tried it on one of my tables, here are my explain plans :
<SPAN
class=157215522-12092001>
for
delete from test_table t1 where rowid not
in (select
min(rowid) from test_table
t2 group by
unique_value_1,unique_value_2)
<SPAN
class=157215522-12092001>
<SPAN
class=157215522-12092001>EXPLAIN_PLAN OPT------------------------------------------------------------------------------------------------------------------------------1.0 DELETE STATEMENT 1 Cost=
n/aCHOOSE 2.1 DELETE TEST_TABLE
TEST_TABLE 4.2 FILTER 5.1 SORT GROUP BY 6.1 TABLE ACCESS
class=157215522-12092001>EXPLAIN_PLAN OPT------------------------------------------------------------------------------------------------------------------------------1.0 DELETE STATEMENT 3 Cost=
n/aCHOOSE 2.1 DELETE TEST_TABLE
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">List,
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I need to delete duplicate records
from a very large table (60 millions records +).
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">There would be about 3 million
duplicate entries.
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">What is the quickest way to do
this?
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">The syntax that I am using is
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">delete<FONT
face=Arial> from
invaudee
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">where<FONT
face=Arial> rowid not in (select min(rowid)
from invaudee
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">group<FONT
face=Arial> by
audit_number);
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">This is taking a long time to run.
I cannot see any entries in v$transaction
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">for the
delete.
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">There is no
indexes on the INVAUDEE table.
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I created an index on the primary
key column but it still takes forever to run.
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I do not have the space to
CTAS.
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Or should I write the duplicates
to an EXCEPTIONS table and perform the delete based on the entries
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">in<FONT
face=Arial> the EXCEPTIONS
table.
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Any help would be greatly
appreciated.
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Suhen
<SPAN
style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
Received on Wed Sep 12 2001 - 18:09:44 CDT
![]() |
![]() |