How does one eliminate duplicates rows from a table?

Body: 

Choose one of the following queries to identify or remove duplicate rows from a table leaving only unique records in the table:

Method 1:

SQL> DELETE FROM table_name A WHERE ROWID > (
  2    SELECT min(rowid) FROM table_name B
  3    WHERE A.key_values = B.key_values);

Delete all rowids that is BIGGER than the SMALLEST rowid value (for a given key).

Method 2:

SQL> create table table_name2 as select distinct * from table_name1;
SQL> drop table table_name1;
SQL> rename table_name2 to table_name1;

This method is usually faster. However, remember to recreate all indexes, constraints, triggers, etc. on the table when done.

Method 3:


(contributed by Dennis Gurnick)

SQL> delete from my_table t1
SQL> where  exists (select 'x' from my_table t2
SQL>                 where t2.key_value1 = t1.key_value1
SQL>                   and t2.key_value2 = t1.key_value2
SQL>                   and t2.rowid      > t1.rowid);

Note: One can eliminate N^2 unnecessary operations by creating an index on the joined fields in the inner loop (no need to loop through the entire table on each pass by a record). This will speed-up the deletion process.

Note 2: If you are comparing NOT-NULL columns, use the NVL function. Remember that NULL is not equal to NULL. This should not be a problem as all key columns should be NOT NULL by definition.

Comments

What about:

DELETE table_name
WHERE  rowid IN
       ( SELECT LEAD(rowid) OVER
                (PARTITION BY key_values ORDER BY NULL)
         FROM   table_name );

delete from <table_name> where rowid not in 
                   ( select min(rowid) 
                     from exp group by column1..,column2,...column3..);

This query removes rows based on the column names specified in the GROUP BY clause. If you specify only one column name it will remove all duplicate records for that column. If you want to delete exact replica's of the same row - use all the column names in the GROUP BY.

delete from <table_name> 
       where rowid not in ( select min(rowid) 
                            from <Table_name> 
                            group by column1..,column2,...column3..)

Another example using the dense_rank() function:

delete from table_name
where rowid in
(select rn from
(select rowid rn, dense_rank() over (partition by col1, col2, ..order by rowid) from table_name ) where rn <> 1 
)

Col1 and col2 are the primary key columns. To provide the correct SQL to use alias name "ln" for f dense_rank in the statement:

delete from <table_name> 
where rowid in (select rn from (Select rowid rn,
                                dense_rank() over (partition by col1,col2.. order by rowid) ln 
                                from <table_name>) 
                 where ln <> 1)

This statement deletes rows from the emp table where duplicate values of last_name appear.

delete from emp e

 where empno in(select empno from emp d 
                 where d.last_name=e.last_name

                minus

                select empno from emp f  
                 where f.last_name=e.last_name
                   and rownum=1)

delete  from emp where ('Delete',sal)  in (
select case when count(*)>1 then 'Delete' else 'No' end DeleteFlag ,sal from emp group by sal having count(*)>1)

It will be helpful especially when deletion is based on the date column:

DELETE FROM Table t1 WHERE COLUMN_NAME < (SELECT MAX(COLUMN_NAME) FROM Table T2 WHERE T1.COL= T2.COL);

DELETE FROM EMP13 WHERE EMPNO=(SELECT empno FROM emp13 WHERE ROWID NOT IN (SELECT MIN(ROWID) FROM emp13 GROUP BY EMPNO));