How does one eliminate duplicates rows from a table?
Submitted by admin on Wed, 2004-08-04 15:06
(contributed by Dennis Gurnick)
Body:
Choose one of the following queries to identify or remove duplicate rows from a table leaving only unique records in the table:
Method 1:
SQL> DELETE FROM table_name A WHERE ROWID > ( 2 SELECT min(rowid) FROM table_name B 3 WHERE A.key_values = B.key_values);
Delete all rowids that is BIGGER than the SMALLEST rowid value (for a given key).
Method 2:
SQL> create table table_name2 as select distinct * from table_name1; SQL> drop table table_name1; SQL> rename table_name2 to table_name1;
This method is usually faster. However, remember to recreate all indexes, constraints, triggers, etc. on the table when done.
Method 3:
(contributed by Dennis Gurnick)
SQL> delete from my_table t1 SQL> where exists (select 'x' from my_table t2 SQL> where t2.key_value1 = t1.key_value1 SQL> and t2.key_value2 = t1.key_value2 SQL> and t2.rowid > t1.rowid);
Note: One can eliminate N^2 unnecessary operations by creating an index on the joined fields in the inner loop (no need to loop through the entire table on each pass by a record). This will speed-up the deletion process.
Note 2: If you are comparing NOT-NULL columns, use the NVL function. Remember that NULL is not equal to NULL. This should not be a problem as all key columns should be NOT NULL by definition.
»
- Log in to post comments
Comments
What about:
Remove duplicate rows
Remove duplicate Rows (modified)
This query removes rows based on the column names specified in the GROUP BY clause. If you specify only one column name it will remove all duplicate records for that column. If you want to delete exact replica's of the same row - use all the column names in the GROUP BY.
Using dense_rank()
Another example using the dense_rank() function:
Using dense_rank()
Col1 and col2 are the primary key columns. To provide the correct SQL to use alias name "ln" for f dense_rank in the statement:
deleting duplicate rows from a table
This statement deletes rows from the emp table where duplicate values of last_name appear.
This Command will delete all duplicate rows
Another Option
It will be helpful especially when deletion is based on the date column:
DELETE FROM Table t1 WHERE COLUMN_NAME < (SELECT MAX(COLUMN_NAME) FROM Table T2 WHERE T1.COL= T2.COL);
Eliminte Duplicate record
DELETE FROM EMP13 WHERE EMPNO=(SELECT empno FROM emp13 WHERE ROWID NOT IN (SELECT MIN(ROWID) FROM emp13 GROUP BY EMPNO));