Re: How to find Duplicate records
You will have to create a kind of key: an id that tells you which records
look similar (their key is the same).
After doing that, you compare various fields of those records with each
other, giving points for each field that matches. The number of points for
a field depends on the importance of that field (e.g. a matching telephone
field is stronger evidence than a matching street). After setting a max and
a min boundary you can see which records are doubles, which aren't, and
which are the maybes. Choosing the initial key and setting the points and
boundaries is a process of trial and error.
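The scoring step described above can be sketched as follows. This is only a minimal illustration: the weights, the two boundaries, the field names and the telephone numbers are all made-up values, not something from the original data, and would have to be tuned by trial and error.

```python
# Hypothetical weights: a matching telephone counts more than a matching street.
WEIGHTS = {"telephone": 5, "surname": 3, "firstname": 2, "street": 1}

# Hypothetical boundaries, to be tuned by trial and error.
MAX_BOUNDARY = 8  # at or above this score: treat the pair as a double
MIN_BOUNDARY = 4  # below this score: treat the pair as distinct


def score(a, b, weights=WEIGHTS):
    """Sum the points of every field that is filled in and equal in both records."""
    total = 0
    for field, points in weights.items():
        va, vb = a.get(field), b.get(field)
        if va and vb and va.strip().lower() == vb.strip().lower():
            total += points
    return total


def classify(points, lo=MIN_BOUNDARY, hi=MAX_BOUNDARY):
    """Map a score to 'double', 'maybe' or 'distinct' using the two boundaries."""
    if points >= hi:
        return "double"
    if points < lo:
        return "distinct"
    return "maybe"


# Two made-up records that share the same key.
r1 = {"surname": "dhooge", "firstname": "tim", "street": "aardeken", "telephone": "555-0100"}
r2 = {"surname": "dhooge", "firstname": "tin", "street": "aardeken", "telephone": "555-0100"}

points = score(r1, r2)
print(points, classify(points))
```

Only pairs that already share a key need to be scored, which keeps the number of comparisons manageable on a large table.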
Example:

For the key we take the first 4 letters of the surname, the first 2 of the
firstname and the first 5 of the street.

rec  surname  firstname  street      house  key
1    dhooge   tim        aardeken    2      dhootiaarde
2    dhooge   tin        aardeken    2      dhootiaarde
3    dhoor    timothy    aardeweg    9      dhootiaarde
4    dhooghe  jan        aardeken           dhoojaaarde
5    jansen   jan        nijverheid  15     jansjanijve
6    janssen  jan        nijverheid  15     jansjanijve

We see that the key is the same for records 1, 2 and 3, and for records
5 and 6.

Conclusion: records 1 and 2 are a double;
record 3 is unique.
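Building the key and grouping on it could look roughly like the sketch below, using the six example records above. The function names `make_key` and `candidate_groups` are made up for illustration. On a table with millions of rows the same grouping would more likely be done inside the database, for instance with a GROUP BY on an equivalent SUBSTR expression.

```python
from collections import defaultdict

# (record number, surname, firstname, street, house) taken from the example above.
RECORDS = [
    (1, "dhooge", "tim", "aardeken", "2"),
    (2, "dhooge", "tin", "aardeken", "2"),
    (3, "dhoor", "timothy", "aardeweg", "9"),
    (4, "dhooghe", "jan", "aardeken", ""),
    (5, "jansen", "jan", "nijverheid", "15"),
    (6, "janssen", "jan", "nijverheid", "15"),
]


def make_key(surname, firstname, street):
    """First 4 letters of the surname, first 2 of the firstname, first 5 of the street."""
    return (surname[:4] + firstname[:2] + street[:5]).lower()


def candidate_groups(records):
    """Return {key: [record numbers]} for every key shared by more than one record."""
    groups = defaultdict(list)
    for number, surname, firstname, street, _house in records:
        groups[make_key(surname, firstname, street)].append(number)
    return {key: numbers for key, numbers in groups.items() if len(numbers) > 1}


print(candidate_groups(RECORDS))
# → {'dhootiaarde': [1, 2, 3], 'jansjanijve': [5, 6]}
```

The groups are only candidates: record 3 lands in the same group as 1 and 2 but is a different person, which is why the field-by-field comparison is still needed afterwards.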
"Shahid Mahmood" <Shahid.Mahmood_at_team.telstra.com> wrote in message
news:2a89f9.0107232026.481337f_at_posting.google.com...
> Hi
>
> I am trying to find out the duplicate records in the table which
> contains over 10 million records. Could you please let me know the
> easiest and quickest way to get it done.
>
> Regards
>
>
> Shahid
Received on Tue Jul 24 2001 - 14:56:26 CDT