Home » Server Options » Text & interMedia » CTX_DOC.Snippet not recognized special characters like ',""
CTX_DOC.Snippet not recognized special characters like ',"" [message #382431] |
Thu, 22 January 2009 05:05 |
miharoscabv
Messages: 5 Registered: January 2009
|
Junior Member |
|
|
I discovered an issue on CTX_Doc.Snippet. I have a query like this:
SELECT
CTX_DOC.SNIPPET('FTS_UA_TOPICSEARCH', TO_CHAR(UAS.FILEID), 'view'),
UAS.FILEPATH,
SCORE(1)
FROM UA_TOPICSEARCH UAS
WHERE CONTAINS(UAS.FILEPATH , 'view', 1)>0;
And I want to search in a file which content is:
"but not be able to click the link because the user doesn’t have access to view the data. If the user want to create"
and if I search for word "view" the CTX_DOC.Snippet returns "because the user doesn" because it did not recongnize the character "'" from doesn't.
If I look for a word like "user" it doesn't display anything.
Have anyone encountered this issue? Can you give me a hint about how to handle this?
Thank You!
|
|
|
Re: CTX_DOC.Snippet not recognized special characters like ',"" [message #382486 is a reply to message #382431] |
Thu, 22 January 2009 10:50 |
|
Barbara Boehmer
Messages: 9104 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
I am unable to reproduce the error. It might be due to a difference in versions or parameters used in index creation of character set. It might help if you provide a copy and paste of a complete run of a small test, as I have done below, that reproduces your problem.
SCOTT@orcl_11g> SELECT banner FROM v$version
2 /
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE 11.1.0.6.0 Production
TNS for 32-bit Windows: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production
SCOTT@orcl_11g> CREATE TABLE ua_topicsearch
2 (fileid NUMBER PRIMARY KEY,
3 filepath VARCHAR2 (30))
4 /
Table created.
SCOTT@orcl_11g> INSERT INTO ua_topicsearch VALUES
2 (1, 'c:\oracle11g\test.dat')
3 /
1 row created.
SCOTT@orcl_11g> CREATE INDEX fts_ua_topicsearch ON ua_topicsearch (filepath)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS ('DATASTORE CTXSYS.FILE_DATASTORE')
4 /
Index created.
SCOTT@orcl_11g> SELECT CTX_DOC.SNIPPET
2 ('FTS_UA_TOPICSEARCH',
3 TO_CHAR (UAS.FILEID), 'view'),
4 UAS.FILEPATH,
5 SCORE(1)
6 FROM UA_TOPICSEARCH UAS
7 WHERE CONTAINS (UAS.FILEPATH, 'view', 1) > 0
8 /
CTX_DOC.SNIPPET('FTS_UA_TOPICSEARCH',TO_CHAR(UAS.FILEID),'VIEW')
--------------------------------------------------------------------------------
FILEPATH SCORE(1)
------------------------------ ----------
because the user doesn’t have access to <b>view</b> the data. If the user want t
o create
c:\oracle11g\test.dat 3
SCOTT@orcl_11g> SELECT CTX_DOC.SNIPPET
2 ('FTS_UA_TOPICSEARCH',
3 TO_CHAR (UAS.FILEID), 'user'),
4 UAS.FILEPATH,
5 SCORE(1)
6 FROM UA_TOPICSEARCH UAS
7 WHERE CONTAINS (UAS.FILEPATH, 'user', 1) > 0
8 /
CTX_DOC.SNIPPET('FTS_UA_TOPICSEARCH',TO_CHAR(UAS.FILEID),'USER')
--------------------------------------------------------------------------------
FILEPATH SCORE(1)
------------------------------ ----------
click the link because the <b>user</b> doesn’t have access to view the data. If
the <b>user</b> want to create
c:\oracle11g\test.dat 6
SCOTT@orcl_11g>
|
|
|
Re: CTX_DOC.Snippet not recognized special characters like ',"" [message #382587 is a reply to message #382486] |
Fri, 23 January 2009 02:11 |
miharoscabv
Messages: 5 Registered: January 2009
|
Junior Member |
|
|
Thank you for your quick answer!
Indeed I was not so clear because I did not specify that the file that I used are html and is much bigger that I wrote and the think that make the difference is that the file must be saved using ANSI Encoding.
So,if I have 2 files one saved in a UTF-8 format and another one in as a ANSI format and the same scripts that you wrote:
SQL> CREATE TABLE ua_topicsearch
2 (fileid NUMBER PRIMARY KEY,
3 filepath VARCHAR2 (30))
4 /
Table created
SQL> INSERT INTO ua_topicsearch VALUES (2, 'c:\oracle11g\testUTF.dat');
1 row inserted
SQL> INSERT INTO ua_topicsearch VALUES (3, 'c:\oracle11g\testANSI.dat');
1 row inserted
SQL> CREATE INDEX fts_ua_topicsearch ON ua_topicsearch (filepath)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS ('DATASTORE CTXSYS.FILE_DATASTORE')
4 /
Index created
SQL> SELECT CTX_DOC.SNIPPET
2 ('FTS_UA_TOPICSEARCH',
3 TO_CHAR (UAS.FILEID), 'view'),
4 UAS.FILEPATH,
5 SCORE(1)
6 FROM UA_TOPICSEARCH UAS
7 WHERE CONTAINS (UAS.FILEPATH, 'view', 1) > 0
8 /
CTX_DOC.SNIPPET('FTS_UA_TOPICS FILEPATH SCORE(1)
-------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ----------
because the user doesn’t have access to <b>view</b> the data. If the user want t c:\oracle11g\testUTF.dat 3
because the user doesn c:\oracle11g\testANSI.dat
It can see that for ANSI Encoding the Snippet doesn't display the character.
I must use these type of encoding because I have the same issue with offset characters described in "Problems with CTX_DOC.SNIPPET on HTML documents" topic but I can not use CLOB as column because I must saved in my table also: pdf and mp3 files.
Thank you once again!
|
|
|
|
Re: CTX_DOC.Snippet not recognized special characters like ',"" [message #382880 is a reply to message #382732] |
Mon, 26 January 2009 01:00 |
miharoscabv
Messages: 5 Registered: January 2009
|
Junior Member |
|
|
I am using Oracle 11:
SQL> SELECT banner FROM v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
PL/SQL Release 11.1.0.7.0 - Production
CORE 11.1.0.7.0 Production
TNS for 32-bit Windows: Version 11.1.0.7.0 - Production
NLSRTL Version 11.1.0.7.0 - Production
and the preferences for my index are:
SELECT CTX_REPORT.CREATE_INDEX_SCRIPT ('fts_ua_topicsearch') FROM DUAL;
begin
ctx_ddl.create_preference('"FTS_UA_TOPICSEARCH_DST"','FILE_DATASTORE');
end;
/
begin
ctx_ddl.create_preference('"FTS_UA_TOPICSEARCH_FIL"','AUTO_FILTER');
end;
/
begin
ctx_ddl.create_section_group('"FTS_UA_TOPICSEARCH_SGP"','HTML_SECTION_GROUP');
end;
/
begin
ctx_ddl.create_preference('"FTS_UA_TOPICSEARCH_LEX"','BASIC_LEXER');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_LEX"','MIXED_CASE','YES');
end;
/
begin
ctx_ddl.create_preference('"FTS_UA_TOPICSEARCH_WDL"','BASIC_WORDLIST');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_WDL"','STEMMER','ENGLISH');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_WDL"','FUZZY_MATCH','GENERIC');
end;
/
begin
ctx_ddl.create_stoplist('"FTS_UA_TOPICSEARCH_SPL"','BASIC_STOPLIST');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','Mr');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','Mrs');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','Ms');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','a');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','all');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','almost');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','also');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','although');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','an');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','and');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','any');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','are');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','as');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','at');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','be');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','because');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','been');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','both');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','but');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','by');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','can');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','could');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','d');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','did');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','do');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','does');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','either');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','for');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','from');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','had');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','has');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','have');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','having');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','he');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','her');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','here');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','hers');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','him');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','his');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','how');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','however');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','i');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','if');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','in');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','into');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','is');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','it');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','its');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','just');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','ll');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','me');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','might');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','my');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','no');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','non');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','nor');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','not');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','of');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','on');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','one');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','only');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','onto');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','or');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','our');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','ours');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','s');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','shall');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','she');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','should');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','since');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','so');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','some');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','still');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','such');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','t');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','than');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','that');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','the');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','their');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','them');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','then');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','there');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','therefore');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','these');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','they');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','this');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','those');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','though');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','through');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','thus');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','to');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','too');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','until');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','ve');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','very');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','was');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','we');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','were');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','what');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','when');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','where');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','whether');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','which');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','while');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','who');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','whose');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','why');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','will');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','with');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','would');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','yet');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','you');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','your');
ctx_ddl.add_stopword('"FTS_UA_TOPICSEARCH_SPL"','yours');
end;
/
begin
ctx_ddl.create_preference('"FTS_UA_TOPICSEARCH_STO"','BASIC_STORAGE');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_STO"','I_TABLE_CLAUSE','TABLESPACE watindex');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_STO"','K_TABLE_CLAUSE','TABLESPACE watindex');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_STO"','R_TABLE_CLAUSE','TABLESPACE watindex lob(data) store as (cache)');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_STO"','N_TABLE_CLAUSE','TABLESPACE watindex');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_STO"','I_INDEX_CLAUSE','TABLESPACE watindex compress 2');
ctx_ddl.set_attribute('"FTS_UA_TOPICSEARCH_STO"','P_TABLE_CLAUSE','TABLESPACE watindex');
end;
/
begin
ctx_output.start_log('FTS_UA_TOPICSEARCH_LOG');
end;
/
create index "EVEREST"."FTS_UA_TOPICSEARCH"
on "EVEREST"."UA_TOPICSEARCH"
("FILEPATH")
indextype is ctxsys.context
parameters('
datastore "FTS_UA_TOPICSEARCH_DST"
filter "FTS_UA_TOPICSEARCH_FIL"
section group "FTS_UA_TOPICSEARCH_SGP"
lexer "FTS_UA_TOPICSEARCH_LEX"
wordlist "FTS_UA_TOPICSEARCH_WDL"
stoplist "FTS_UA_TOPICSEARCH_SPL"
storage "FTS_UA_TOPICSEARCH_STO"
sync (every "SYSDATE + 1/1440" memory 12582912)
')
/
begin
ctx_output.end_log;
end;
/
I don't know which of these could affect the snippet result?
|
|
|
|
Goto Forum:
Current Time: Thu Jan 30 13:19:17 CST 2025
|