Detecting typos
Hi,
I'm working on a project which involves matching customers against an existing database using (amongst other things) address, date of birth and name.
One problem I have to overcome is allowing for typographical errors, e.g. matching "CATHERINE" with "CSTHERINE". So what I thought I'd do is have a little function to compare two strings position by position and return the number of differences:
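CREATE OR REPLACE FUNCTION string_compare
  (string1 IN VARCHAR2,
   string2 IN VARCHAR2) RETURN NUMBER IS
  diffs BINARY_INTEGER := 0;  -- count of mismatched positions
BEGIN
  -- Skip the loop if either string is NULL or empty (LEAST then returns NULL).
  IF least(length(string1),length(string2)) > 0 THEN
    -- Compare only up to the shorter length; extra trailing characters are ignored.
    FOR i IN 1..least(length(string1),length(string2)) LOOP
      IF substr(string1,i,1) <> substr(string2,i,1) THEN
        diffs := diffs + 1;
      END IF;
    END LOOP;
  END IF;
  RETURN diffs;
END;
/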
This works OK for substituted letters (I've got to watch out for comparing MARY and MARK etc., though), but it doesn't handle a comparison where a letter is missing (e.g. "CATHERINE" and "CTHERINE"), because every character after the gap is shifted by one position and counts as a difference. Can anybody think of anything a little bit cleverer?
Also, if anyone could recommend a book on the rationale/logic behind this kind of customer matching/de-duplication, I'd be grateful. Thanks
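For the missing-letter case described above, one commonly used measure is the Levenshtein edit distance, which counts insertions and deletions as well as substitutions. The following is only a minimal sketch of that idea, not a tested solution for this project: the function name edit_distance_sketch and its local names are hypothetical, and it assumes plain single-byte character comparison with NULL treated as an empty string.

```plsql
CREATE OR REPLACE FUNCTION edit_distance_sketch  -- hypothetical name
  (string1 IN VARCHAR2,
   string2 IN VARCHAR2) RETURN NUMBER IS
  TYPE int_row IS TABLE OF PLS_INTEGER INDEX BY BINARY_INTEGER;
  prev_row int_row;  -- distances for the previous prefix of string1
  curr_row int_row;  -- distances for the current prefix of string1
  len1 PLS_INTEGER := NVL(LENGTH(string1), 0);  -- assumption: NULL counts as empty
  len2 PLS_INTEGER := NVL(LENGTH(string2), 0);
  cost PLS_INTEGER;
BEGIN
  -- Row 0: building the first j characters of string2 from nothing takes j inserts.
  FOR j IN 0..len2 LOOP
    prev_row(j) := j;
  END LOOP;
  FOR i IN 1..len1 LOOP
    -- Column 0: reducing the first i characters of string1 to nothing takes i deletes.
    curr_row(0) := i;
    FOR j IN 1..len2 LOOP
      IF SUBSTR(string1, i, 1) = SUBSTR(string2, j, 1) THEN
        cost := 0;
      ELSE
        cost := 1;
      END IF;
      curr_row(j) := LEAST(prev_row(j) + 1,          -- delete a character
                           curr_row(j - 1) + 1,      -- insert a character
                           prev_row(j - 1) + cost);  -- substitute, or match at no cost
    END LOOP;
    prev_row := curr_row;
  END LOOP;
  RETURN prev_row(len2);
END;
/
```

With this sketch, edit_distance_sketch('CATHERINE','CTHERINE') and edit_distance_sketch('CATHERINE','CSTHERINE') both return 1, as does the MARY/MARK pair, so a threshold would still have to be chosen relative to name length. Some Oracle releases also ship a UTL_MATCH package with an EDIT_DISTANCE function; whether it is available depends on the database version in use.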
CE
Received on Wed Dec 15 2004 - 05:28:08 CST