Re: Selecting SIMILAR, not the same records (PROBABLE) duplicates
That is a good point though.
What finally solved it for me was the following (with city added as an additional column, for example):
select min(t1.id)
from test t1, test t2
where t2.name like '%' || t1.name || '%'
and soundex(t2.city) = soundex(t1.city)
--and some other conditions
and t2.id != t1.id
group by t1.name, soundex(t1.city)
This way I get the ids of the 'parents' of all the other duplicates.
For the table you gave as an example, this query returns rows 1, 5 and 8, which I'm entirely happy with.
Now that I have the ids of the rows that are the 'origins' of the other duplicates, I can get the others easily.
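A minimal sketch of how that follow-up lookup might look, assuming the same test(id, name, city) table and the same matching rules as the query above. It is an illustration only, not the query actually used in this thread, and the column aliases parent_id and duplicate_id are made up for the example:

```sql
-- Sketch only: pair each 'parent' id with the ids of its probable duplicates.
-- The subquery is the parent-finding query from the post, unchanged.
select p.id as parent_id, d.id as duplicate_id
from test p, test d
where p.id in (
        select min(t1.id)
        from test t1, test t2
        where t2.name like '%' || t1.name || '%'
        and soundex(t2.city) = soundex(t1.city)
        and t2.id != t1.id
        group by t1.name, soundex(t1.city)
      )
-- same similarity conditions, now applied between parent and candidate
and d.name like '%' || p.name || '%'
and soundex(d.city) = soundex(p.city)
and d.id != p.id
order by p.id, d.id
```

Any extra conditions added to the parent query would probably need to be repeated in the outer match as well, otherwise the two steps could disagree about what counts as a duplicate.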
Thanks again to all of you for this discussion. It really helped me :)
One remaining issue is that the query is quite slow when run against a table
with 500k records.
For that I'm going to experiment with the reverse index function that someone
suggested in the other post.
Best regards,
Kroger
Received on Fri Sep 15 2006 - 16:14:34 CDT