Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: Removing duplicated rows revisited - I'm feeln' mighty lame

Re: Removing duplicated rows revisited - I'm feeln' mighty lame

From: Mike Burden <michael.burden_at_capgemini.co.uk>
Date: Thu, 18 Mar 1999 08:47:03 +0000
Message-ID: <36F0BD86.EFB309FD@capgemini.co.uk>


Would this work. Saves a group by, sort etc.All done in one step.

DELETE FROM CALL_TABLE
WHERE
 EXISTS

        ( SELECT null
            FROM CALL_TABLE Tmp
             where
              Tmp.person_id = person_id
              And Tmp.date_called = date_called
              And Tmp.destination_phone = destination_phone
              And Tmp.origin_phone = origin_phone
              And Tmp.duration_minutes = duration_minutes
              And Tmp.duration_seconds = duration_seconds
              and Tmp.rowid > rowid) ;


"Boyce G. Williams, Jr." wrote:

> Hi,
>
> I found a table where there is a lot of duplicated telephone calls in
> it. To be more precise: 34,000 duplicate calls from a table containing
> 1.5 million rows. Some were posted the same day as its duplicate,
> others were posted on different days. I wrote a series of queries to
> extract the first posted call of a duplicated set, delete all the
> duplicated calls, then reinsert the first call back in. My concern is
> the actual delete step: it uses a correlated subquery. Previous
> experience with our machine (a RS/6000 running Oracle) showed me it
> has a hard time working with correlated subqueries of any kind. At
> one point I ran a SELECT correlated subquery for six hours without it
> ever completing. When I broke up the query into two, the inner query
> being a VIEW and the outer query being a SELECT based on the VIEW, it
> took two hours and completed. Go figure.
>
> I'm presenting what I'm planning to do in the hopes someone may give
> advice on improving it. I'm mainly concerned about the DELETE step
> since every source I've found on deleting rows from one table based on
> another uses the correlated subquery method. Maybe someone knows a
> better way...
>
> --
> -- create a table where the first call of a duplicated set is
> -- stored based on the earliest posted date
> --
> DROP TABLE TEMP_TABLE
> CREATE TABLE TEMP_TABLE AS
> SELECT DISTINCT
> person_id ,
> date_called ,
> origin_phone ,
> destination_phone ,
> duration_minutes ,
> duration_seconds ,
> MIN(date_posted) AS date_posted
> FROM CALL_TABLE
> GROUP BY person_id ,
> date_called ,
> origin_phone ,
> destination_phone ,
> duration_minutes ,
> duration_seconds
> HAVING Count(*)>1 ;
> --
> -- Delete all duplicated calls from a table
> --
> DELETE FROM CALL_TABLE
> WHERE EXISTS
> ( SELECT *
> FROM CALL_TABLE Tmp
> GROUP BY Tmp.person_id ,
> Tmp.date_called ,
> Tmp.destination_phone ,
> Tmp.origin_phone ,
> Tmp.duration_minutes ,
> Tmp.duration_seconds
> HAVING Count(*)>1
> And Tmp.person_id = person_id
> And Tmp.date_called = date_called
> And Tmp.destination_phone = destination_phone
> And Tmp.origin_phone = origin_phone
> And Tmp.duration_minutes = duration_minutes
> And Tmp.duration_seconds = duration_seconds ) ;
> --
> -- Reinsert the first posted record
> --
> INSERT INTO CALL_TABLE
> ( person_id ,
> date_called ,
> origin_phone ,
> destination_phone ,
> duration_minutes ,
> duration_seconds ,
> date_posted )
> SELECT person_id ,
> date_called ,
> origin_phone ,
> destination_phone ,
> duration_minutes ,
> duration_seconds ,
> date_posted
> FROM TEMP_TABLE ;
> --
> -- end of procedure
> --
>
> Thanks in advance,
> Boyce G. Williams, Jr.
> Boyce G. Williams, Jr.
>
> .--------------------------------------------------------------------.
> | "People should have two virtues: purpose- the courage to envisage |
> | and pursue valued goals uninhibited by the defeat of infantile |
> | fantasies, by guilt and the failing fear punishment; and wisdom- a|
> | detached concern with life itself, in the face of death itself." |
> | Norman F. Dixon|
> '--------------------------------------------------------------------'
Received on Thu Mar 18 1999 - 02:47:03 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US