Text search to eliminate "?" using Contains Clause [message #566326] |
Wed, 12 September 2012 14:35 |
rajivn786
Messages: 161 Registered: January 2010
|
Senior Member |
|
|
Hi,
I have a clob field in a table..I need to eliminate text which has a "?" followed by text..
The clob has the following text..(truncated to fit here)
Ex : "? 459.2N Inferior Vena Cava Obstruction
Cont lovenox but suspect IVC clot is exacerbating LE edema" -- Text
Select * from t
where contains(report_text,'near((clot, inferior vena cava), 10)') > 0
and not(contains(report_text,'?clot') > 0)
The above doesn't have "?clot" in the text...
So we should get the above text if we use the above query..
but I couldn't figure it out why the above record is not getting returned..
|
|
|
Re: Text search to eliminate "?" using Contains Clause [message #566329 is a reply to message #566326] |
Wed, 12 September 2012 15:48 |
|
Barbara Boehmer
Messages: 9101 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
The ? is not tokenized and indexed by default. In order to index it, you have to create a lexer that sets it as a printjoin, then use that lexer in your index creation. Also, the ? means fuzzy to Oracle Text. In order to treat ? as regular text, you have to escape it by either placing a \ in front of it or enclosing the entire token between { and }. Please see the demonstration below.
-- table and test data:
SCOTT@orcl_11gR2> create table t (report_text clob)
2 /
Table created.
SCOTT@orcl_11gR2> insert into t values (
2 '? 459.2N Inferior Vena Cava Obstruction
3 Cont lovenox but suspect IVC clot is exacerbating LE edema')
4 /
1 row created.
SCOTT@orcl_11gR2> insert into t values ('?clot')
2 /
1 row created.
SCOTT@orcl_11gR2> insert into t values ('other data')
2 /
1 row created.
SCOTT@orcl_11gR2> select * from t
2 /
REPORT_TEXT
--------------------------------------------------------------------------------
? 459.2N Inferior Vena Cava Obstruction
Cont lovenox but suspect IVC clot is exacerbating LE edema
?clot
other data
3 rows selected.
-- lexer, index, and what is tokenized and indexed:
SCOTT@orcl_11gR2> begin
2 ctx_ddl.create_preference ('test_lex', 'basic_lexer');
3 ctx_ddl.set_attribute ('test_lex', 'printjoins', '?');
4 end;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> create index t_idx on t (report_text)
2 indextype is ctxsys.context
3 parameters ('lexer test_lex')
4 /
Index created.
SCOTT@orcl_11gR2> select token_text from dr$t_idx$i
2 /
TOKEN_TEXT
----------------------------------------------------------------
459.2N
?CLOT
CAVA
CLOT
CONT
DATA
EDEMA
EXACERBATING
INFERIOR
IVC
LE
LOVENOX
OBSTRUCTION
OTHER
SUSPECT
VENA
16 rows selected.
-- sample queries:
SCOTT@orcl_11gR2> Select * from t
2 where contains (report_text, 'near ((clot, inferior vena cava), 10)') > 0
3 /
REPORT_TEXT
--------------------------------------------------------------------------------
? 459.2N Inferior Vena Cava Obstruction
Cont lovenox but suspect IVC clot is exacerbating LE edema
1 row selected.
SCOTT@orcl_11gR2> Select * from t
2 where not contains (report_text, '\?clot') > 0
3 /
REPORT_TEXT
--------------------------------------------------------------------------------
? 459.2N Inferior Vena Cava Obstruction
Cont lovenox but suspect IVC clot is exacerbating LE edema
other data
2 rows selected.
SCOTT@orcl_11gR2> Select * from t
2 where not contains (report_text, '{?clot}') > 0
3 /
REPORT_TEXT
--------------------------------------------------------------------------------
? 459.2N Inferior Vena Cava Obstruction
Cont lovenox but suspect IVC clot is exacerbating LE edema
other data
2 rows selected.
SCOTT@orcl_11gR2> Select * from t
2 where contains (report_text, 'near ((clot, inferior vena cava), 10)') > 0
3 and not contains (report_text, '\?clot') > 0
4 /
REPORT_TEXT
--------------------------------------------------------------------------------
? 459.2N Inferior Vena Cava Obstruction
Cont lovenox but suspect IVC clot is exacerbating LE edema
1 row selected.
SCOTT@orcl_11gR2> Select * from t
2 where contains (report_text, 'near ((clot, inferior vena cava), 10)') > 0
3 and not contains (report_text, '{?clot}') > 0
4 /
REPORT_TEXT
--------------------------------------------------------------------------------
? 459.2N Inferior Vena Cava Obstruction
Cont lovenox but suspect IVC clot is exacerbating LE edema
1 row selected.
|
|
|
Re: Text search to eliminate "?" using Contains Clause [message #566330 is a reply to message #566329] |
Wed, 12 September 2012 15:55 |
rajivn786
Messages: 161 Registered: January 2010
|
Senior Member |
|
|
Barbara,
Thanks for the excellent explanation...I already had an index on report_text with stop list...
create /*+ parallel(t 32) */ index rep_idx on t(report_text) indextype
is ctxsys.context parameters ('stoplist ctxsys.empty_stoplist')
parallel 32;
So how do I add the lexer to the above>>
|
|
|
|
|
|