Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: Removing duplicated rows revisited - I'm feeln' mighty lame

Re: Removing duplicated rows revisited - I'm feeln' mighty lame

From: James <james_lorenzen_at_allianzlife.com>
Date: Thu, 18 Mar 1999 15:40:58 GMT
Message-ID: <7cr6q8$oqd$1@nnrp1.dejanews.com>


In article <36ee69fc.4304658_at_usenet.acw.vcu.edu>,   bgwillia_at_vcu.edu (Boyce G. Williams, Jr.) wrote:
> Hi,
>
> I found a table where there is a lot of duplicated telephone calls in
> it. To be more precise: 34,000 duplicate calls from a table containing

  [snip]
>

Boyce, if you can use PL*SQL the duplicate deletion can be accomplished without creating a separate table.

The following is a piece of code that I've used to remove large duplicate sets. I use a commit every 10,000 rows deleted. If that is not a concern, you could pull the commit logic.

Please note that the last line of the where clause si a not equal, this will leave the desired row on the table. You also need to "set server output on size 1000000" to have the "DBMS_OUTPUT.PUT_LINE" functioning.

DECLARE CURSOR dup_csr IS SELECT person_id , date_called , origin_phone ,  destination_phone , duration_minutes , duration_seconds , MIN(date_posted) AS date_posted FROM call_table GROUP BY person_id,

date_called, origin_phone, destination_phone ,	duration_minutes,
duration_seconds  HAVING Count(*)>1  ;	delete_cnt  NUMBER := 0 ; 
delete_commit NUMBER := 0 ; BEGIN DBMS_OUTPUT.PUT_LINE('Start at ' || TO_CHAR(SYSDATE,'hh24:mi:ss')) ; FOR dup_rec IN dup_csr LOOP DELETE FROM call_table WHERE person_id = duprec.person_id AND date_called = duprec.date_called AND origin_phone = duprec.origin_phone AND destination_phone = duprec.destination_phone AND duration_minutes = duprec.duration_minutes AND duration_seconds = duprec.duration_seconds AND date_posted <> duprec.date_posted; delete_cnt := delete_cnt + SQL%ROWCOUNT ; -- this is the rows deleted delete_commit := delete_commit + SQL%ROWCOUNT ; IF delete_commit >= 10000 THEN COMMIT ; DBMS_OUTPUT.PUT_LINE('Commit taken at ' || TO_CHAR(SYSDATE,'hh24:mi:ss') || ' rows = ' || delete_commit || ' so far ' || delete_cnt || ' rows deleted') ; delete_commit := 0 ; END IF ; END LOOP ; DBMS_OUTPUT.PUT_LINE('End at' || TO_CHAR(SYSDATE,'hh24:mi:ss') || ' ' || delete_cnt || ' rows deleted') ; END ; -----------== Posted via Deja News, The Discussion Network ==---------- http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own Received on Thu Mar 18 1999 - 09:40:58 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US