Home » Server Options » Text & interMedia » Problem with Fuzzy Query (10g)
Problem with Fuzzy Query [message #350357] |
Wed, 24 September 2008 16:59 |
formsdev
Messages: 6 Registered: June 2005
|
Junior Member |
|
|
Hi,
I wrote the following query.
my indexes look like this. I have the lexer so that it takes away any . - , etc.
v_indexscript := 'create index fuzzy_idx_addr1 on gp(address1) indextype is ctxsys.context parameters (''WORDLIST Stem_Fuzzy_pref LEXER Fuzzy_Lexer'')';
c:= dbms_sql.open_cursor;
dbms_sql.parse(c,v_indexscript, dbms_sql.native);
n := dbms_sql.execute(c);
dbms_sql.close_cursor(c);
I have similar indexes on all the columns of my where clause.
This is the SQL where I pass everything as parameters including the score and variations.
select col1, col2 ......score(1)
from gp
where CONTAINS(firstname, 'fuzzy('||pFirstname||', '||pScore||','||pVariants||',weight)',1)>0
and CONTAINS(lastname, 'fuzzy('||pLastname||', '||pScore||','||pVariants||', weight)',2)> 0
and (pAddress1 is null or CONTAINS(address1, 'fuzzy('||replace(paddress1,' ','')||', '||pScore||','||pVariants||', weight)',3)>0)
and (pAddress2 is null or CONTAINS(address2, 'fuzzy('||replace(pAddress2,' ','')||', '||pScore||','||pVariants||', weight)',4)>0)
and (pCity is null or CONTAINS(city, 'fuzzy('||replace(pCity,' ','')||', '||pScore||','||pVariants||', weight)',5)>0)
and (pState is null or CONTAINS(state, 'fuzzy('||pState||', '||pScore||','||pVariants||', weight)',6)>0)
and (pZip is null or CONTAINS(zipcode, 'fuzzy('||pZip||', '||pScore||','||pVariants||', weight)',7)>0)
and (pPhone is null or CONTAINS(phone1, 'fuzzy('||replace(pPhone,' ','')||', '||pScore||','||pVariants||', weight)',8)>0)
and (pEmail is null or CONTAINS(email, 'fuzzy('||replace(pEmail,' ','')||', '||pScore||','||pVariants||', weight)',9)>0) ;
** I am using replace because the lexer doesnt seem to take away any spaces.
My table has this data
First Last addr1 addr2 city state zip phone email
SS VV 21.First St Los angeles CA 90001 123-456-7890 ss@yahoo.com
SS VV
SS VV
SS VV
I would like the above query to return all the 4 rows when I pass it SS,VV and 21.First St as pFirstname, pLastname and pAddress1.
Is there a way to do that?
thanks
|
|
|
Re: Problem with Fuzzy Query [message #350674 is a reply to message #350357] |
Thu, 25 September 2008 21:11 |
|
Barbara Boehmer
Messages: 9101 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
Why are you creating your indexes dynamically? This is generally unnecessary and a bad practice.
I am not sure what you mean about your lexer. I don't know if you have specified the period, hyphen, and comma as whitespace or printjoins or punctuation or what. It might help if you provided the code that you used to set the attributes for your lexer. That might be part of your problem.
You should be using a multi_column_datastore with one index and one contains clause using WITHIN. Multiple contains clauses may result in slow performance or errors.
I have provided a partial demo below of what you seem to need. You should be able to modify it to add your other fields.
SCOTT@orcl_11g> -- table:
SCOTT@orcl_11g> CREATE TABLE gp
2 (first VARCHAR2 ( 5),
3 last VARCHAR2 ( 9),
4 addr1 VARCHAR2 (11),
5 addr2 VARCHAR2 ( 5),
6 city VARCHAR2 (11),
7 state VARCHAR2 ( 5),
8 zip VARCHAR2 (10),
9 phone VARCHAR2 (12),
10 email VARCHAR2 (12),
11 searchcols VARCHAR2 ( 1))
12 /
Table created.
SCOTT@orcl_11g> -- data:
SCOTT@orcl_11g> INSERT ALL
2 INTO gp (first, last) VALUES ('SS', 'VV')
3 INTO gp VALUES ('SS', 'VV', '21 First St', NULL, 'Los Angeles', 'CA', '90001', '123-456-7890', 'SS@yahoo.com', NULL)
4 INTO gp (first, last) VALUES ('forms', 'dev')
5 INTO gp (first, last) VALUES ('form', 'developer')
6 INTO gp (first) VALUES ('farms')
7 SELECT * FROM DUAL
8 /
5 rows created.
SCOTT@orcl_11g> -- basic_wordlist, multi_column_datastore AND basic_section_group:
SCOTT@orcl_11g> begin
2 ctx_ddl.create_preference('STEM_FUZZY_PREF', 'BASIC_WORDLIST');
3 ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_MATCH','ENGLISH');
4 ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_SCORE','0');
5 ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_NUMRESULTS','5000');
6 ctx_ddl.set_attribute('STEM_FUZZY_PREF','SUBSTRING_INDEX','TRUE');
7 ctx_ddl.set_attribute('STEM_FUZZY_PREF','STEMMER','ENGLISH');
8 end;
9 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> BEGIN
2 CTX_DDL.CREATE_PREFERENCE ('your_multi', 'MULTI_COLUMN_DATASTORE');
3 CTX_DDL.SET_ATTRIBUTE ('your_multi', 'COLUMNS', 'first, last, addr1');
4 END;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> BEGIN
2 CTX_DDL.CREATE_SECTION_GROUP ('your_sec_group', 'BASIC_SECTION_GROUP');
3 CTX_DDL.ADD_FIELD_SECTION ('your_sec_group', 'first', 'first', TRUE);
4 CTX_DDL.ADD_FIELD_SECTION ('your_sec_group', 'last', 'last', TRUE);
5 CTX_DDL.ADD_FIELD_SECTION ('your_sec_group', 'addr1', 'addr1', TRUE);
6 END;
7 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> -- index:
SCOTT@orcl_11g> CREATE INDEX fuzzy_idx_addr1 ON gp (searchcols)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS
4 ('WORDLIST stem_fuzzy_pref
5 DATASTORE your_multi
6 SECTION GROUP your_sec_group')
7 /
Index created.
SCOTT@orcl_11g> -- procedure:
SCOTT@orcl_11g> CREATE OR REPLACE PROCEDURE your_proc
2 (presult OUT SYS_REFCURSOR,
3 pfirst IN VARCHAR2 DEFAULT NULL,
4 plast IN VARCHAR2 DEFAULT NULL,
5 paddress1 IN VARCHAR2 DEFAULT NULL,
6 paddress2 IN VARCHAR2 DEFAULT NULL)
7 AS
8 BEGIN
9 OPEN presult FOR
10 SELECT gp.*, SCORE (1)
11 FROM gp
12 WHERE CONTAINS
13 (searchcols,
14 RTRIM
15 (NVL2 (pfirst, 'FUZZY (' || REPLACE (pfirst, ' ', ') FUZZY (') || ') WITHIN first,' , NULL) ||
16 NVL2 (plast, 'FUZZY (' || REPLACE (plast, ' ', ') FUZZY (') || ') WITHIN last ,' , NULL) ||
17 NVL2 (paddress1, 'FUZZY (' || REPLACE (paddress1, ' ', ') FUZZY (') || ') WITHIN addr1,' , NULL) ||
18 NVL2 (paddress2, 'FUZZY (' || REPLACE (paddress2, ' ', ') FUZZY (') || ') WITHIN addr2' , NULL),
19 ','),
20 1) > 0
21 ORDER BY SCORE (1) DESC;
22 END your_proc;
23 /
Procedure created.
SCOTT@orcl_11g> SHOW ERRORS
No errors.
SCOTT@orcl_11g> -- execution:
SCOTT@orcl_11g> VARIABLE gresults REFCURSOR
SCOTT@orcl_11g> SET AUTOPRINT ON
SCOTT@orcl_11g> EXEC your_proc (:gresults, 'SS', 'VV', '21.First St')
PL/SQL procedure successfully completed.
FIRST LAST ADDR1 ADDR2 CITY STATE ZIP PHONE EMAIL S SCORE(1)
----- --------- ----------- ----- ----------- ----- ---------- ------------ ------------ - ----------
SS VV 21 First St Los Angeles CA 90001 123-456-7890 SS@yahoo.com 68
SS VV 35
SCOTT@orcl_11g> EXEC your_proc (:gresults, 'forms', 'dev')
PL/SQL procedure successfully completed.
FIRST LAST ADDR1 ADDR2 CITY STATE ZIP PHONE EMAIL S SCORE(1)
----- --------- ----------- ----- ----------- ----- ---------- ------------ ------------ - ----------
forms dev 52
farms 2
form developer 2
SCOTT@orcl_11g>
|
|
|
Goto Forum:
Current Time: Sun Dec 22 23:05:39 CST 2024
|