Special characters issues with CATSEARCH query [message #478314] |
Thu, 07 October 2010 18:13 |
lkngstr82is
Messages: 33 Registered: January 2010 Location: USA
|
Member |
|
|
We have names stored in a table with special character. for eg.
name SUSANB is stored as susan-b, su-sanb. When we run query using CATSEARCH, it doesn't bring all the results
CREATE TABLE bs_orgnl_sbmsn
(nm_tx VARCHAR2 (30))
/
Table created.
INSERT INTO bs_orgnl_sbmsn VALUES's-usanb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES'su-sanb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES('sus-anb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES'susa-nb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES('susan-b');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES('s.usanb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES('su.sanb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES('su.sanb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES('sus.anb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES('susa.nb');
/
1 row created.
INSERT INTO bs_orgnl_sbmsn VALUES ('susan.b');
/
1 row created.
Commit;
CREATE INDEX INDX_BS_ORGNL_SBMSN_SHRT
ON BS_ORGNL_SBMSN(NM_TX)
INDEXTYPE IS CTXSYS.CTXCAT PARALLEL;
select * FROM BS_ORGNL_SBMSN
Where catsearch (NM_TX,'SUSANB', '' ) > 0;
This will only give one result for SUSANB.
|
|
|
Re: Special characters issues with CATSEARCH query [message #478335 is a reply to message #478314] |
Thu, 07 October 2010 22:13 |
|
Barbara Boehmer
Messages: 9101 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
You can create a basic_lexer, set the skipjoins attribute for that lexer to include the hyphen and period, then use that lexer in your parameters during index creation. That will cause the strings on either side of those characters to be joined, ignoring the hyphen or period inbetween. You can check what is tokenized, indexed, and searchable by checking the dr$your_index_name$i domain index table that is created by indexing. Please see the reproduction and solution below.
-- table and data you provided:
SCOTT@orcl_11gR2> CREATE TABLE bs_orgnl_sbmsn
2 (nm_tx VARCHAR2 (30))
3 /
Table created.
SCOTT@orcl_11gR2> INSERT ALL
2 INTO bs_orgnl_sbmsn VALUES ('s-usanb')
3 INTO bs_orgnl_sbmsn VALUES ('su-sanb')
4 INTO bs_orgnl_sbmsn VALUES ('sus-anb')
5 INTO bs_orgnl_sbmsn VALUES ('susa-nb')
6 INTO bs_orgnl_sbmsn VALUES ('susan-b')
7 INTO bs_orgnl_sbmsn VALUES ('s.usanb')
8 INTO bs_orgnl_sbmsn VALUES ('su.sanb')
9 INTO bs_orgnl_sbmsn VALUES ('su.sanb')
10 INTO bs_orgnl_sbmsn VALUES ('sus.anb')
11 INTO bs_orgnl_sbmsn VALUES ('susa.nb')
12 INTO bs_orgnl_sbmsn VALUES ('susan.b')
13 SELECT * FROM DUAL
14 /
11 rows created.
-- reproduction of problem:
SCOTT@orcl_11gR2> CREATE INDEX INDX_BS_ORGNL_SBMSN_SHRT
2 ON BS_ORGNL_SBMSN (NM_TX)
3 INDEXTYPE IS CTXSYS.CTXCAT PARALLEL
4 /
Index created.
SCOTT@orcl_11gR2> SELECT dr$token FROM dr$indx_bs_orgnl_sbmsn_shrt$i
2 /
DR$TOKEN
----------------------------------------------------------------
ANB
ANB
B
B
NB
NB
SANB
SANB
SANB
SU
SU
SU
SUS
SUS
SUSA
SUSA
SUSAN
SUSAN
USANB
USANB
20 rows selected.
SCOTT@orcl_11gR2> select * FROM BS_ORGNL_SBMSN
2 Where catsearch (NM_TX,'SUSANB', '' ) > 0
3 /
no rows selected
-- solution:
SCOTT@orcl_11gR2> BEGIN
2 CTX_DDL.CREATE_PREFERENCE ('your_lexer', 'BASIC_LEXER');
3 CTX_DDL.SET_ATTRIBUTE ('your_lexer', 'SKIPJOINS', '-.');
4 END;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> DROP INDEX indx_bs_orgnl_sbmsn_shrt
2 /
Index dropped.
SCOTT@orcl_11gR2> CREATE INDEX INDX_BS_ORGNL_SBMSN_SHRT
2 ON BS_ORGNL_SBMSN (NM_TX)
3 INDEXTYPE IS CTXSYS.CTXCAT PARALLEL
4 PARAMETERS ('LEXER your_lexer')
5 /
Index created.
SCOTT@orcl_11gR2> SELECT dr$token FROM dr$indx_bs_orgnl_sbmsn_shrt$i
2 /
DR$TOKEN
----------------------------------------------------------------
SUSANB
SUSANB
SUSANB
SUSANB
SUSANB
SUSANB
SUSANB
SUSANB
SUSANB
SUSANB
SUSANB
11 rows selected.
SCOTT@orcl_11gR2> select * FROM BS_ORGNL_SBMSN
2 Where catsearch (NM_TX,'SUSANB', '' ) > 0
3 /
NM_TX
------------------------------
s-usanb
su-sanb
sus-anb
susa-nb
susan-b
s.usanb
su.sanb
su.sanb
sus.anb
susa.nb
susan.b
11 rows selected.
SCOTT@orcl_11gR2>
|
|
|
|
Re: Special characters issues with CATSEARCH query [message #478424 is a reply to message #478335] |
Fri, 08 October 2010 11:53 |
lkngstr82is
Messages: 33 Registered: January 2010 Location: USA
|
Member |
|
|
One more thing: How can I add space (like -,.) to the SKIPJOIN?
If I have record SUSAN B, how can I make CATSEARCH list it?
INSERT INTO bs_tx VALUES ('susan b');
/
1 row created.
commit;
select * FROM BS_TX
Where catsearch (NM_TX,'SUSANB', '' ) > 0
NM_TX
susan-b
s.usanb
su.sanb
su.sanb
sus.anb
susa.nb
susan.b
susan$b
it will not list susan b.
|
|
|
Re: Special characters issues with CATSEARCH query [message #478425 is a reply to message #478424] |
Fri, 08 October 2010 12:15 |
|
Barbara Boehmer
Messages: 9101 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
lkngstr82is wrote on Fri, 08 October 2010 09:53
One more thing: How can I add space (like -,.) to the SKIPJOIN?
If I have record SUSAN B, how can I make CATSEARCH list it?
You can't, nor should you. The space is what is used to separate the tokens/words. If you could make the space a skipjoin, then every row would be all one token/word. You can't solve everything with generic searches. You have to expect reasonable input and searches.
|
|
|