PDF Document search & Indexing NOT WORKING [message #244775]
Thu, 14 June 2007 01:41
sachin sharma
I am using Oracle 10g Release 2 Express Edition, i.e. Oracle XE.
I have a table structure like
Table = docs
doc_id NUMBER
document BLOB
We use this table to store multilingual PDF, MS-Excel, and MS-Word documents.
I created an Oracle Text index, shown below, so that the documents can be searched by their content.
Index:
EXEC CTX_DDL.CREATE_PREFERENCE('doc_search_lexer', 'WORLD_LEXER');
CREATE INDEX doc_search_binary_idx on docs(document)
INDEXTYPE is CTXSYS.CONTEXT
PARAMETERS ('LEXER doc_search_lexer
STOPLIST CTXSYS.EMPTY_STOPLIST');
I used the following two procedures to synchronize and optimize the index:
EXEC CTX_DDL.SYNC_INDEX('doc_search_binary_idx');
EXEC CTX_DDL.OPTIMIZE_INDEX('doc_search_binary_idx', 'FULL');
The SELECT query that I use is:
SELECT * from docs
WHERE CONTAINS(document, 'search string') > 0;
To test the index, I uploaded two documents with identical content in Hindi (Devanagari script), one in MS-Word format and one in PDF format, ran the synchronize and optimize procedures, and then executed the SELECT. The result set lists only the matching MS-Word document; the PDF document, whose content is exactly the same, is not returned. The PDF document's attributes under "File --> Document Properties --> Font" are OK according to the Oracle Text documentation.
I checked the table "DR$DOC_SEARCH_BINARY_IDX$I": no tokens are being generated for the PDF documents.
How can I index multilingual documents (PDF, MS-Excel, MS-Word)?
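When tokens are missing for some documents, one place to look is the CTX_USER_INDEX_ERRORS view, where Oracle Text records rows it could not filter or index. The query below is only a sketch: it assumes the index belongs to the connected user and reuses the docs table and index name from this post. It maps each logged error back to the doc_id of the affected row.

```sql
-- Errors logged by Oracle Text for this index, newest first.
-- ERR_TEXTKEY holds the ROWID (as text) of the DOCS row that failed.
SELECT e.err_timestamp,
       d.doc_id,
       e.err_text
FROM   ctx_user_index_errors e
       JOIN docs d ON d.rowid = CHARTOROWID(e.err_textkey)
WHERE  e.err_index_name = 'DOC_SEARCH_BINARY_IDX'
ORDER  BY e.err_timestamp DESC;
```

If the PDF rows show up here, the error text should indicate whether the failure happens at the filtering stage, before the lexer ever sees the text.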
Re: PDF Document search & Indexing NOT WORKING [message #570829 is a reply to message #245514]
Fri, 16 November 2012 03:20
acikus
Maybe the user is missing some privileges? Try this:
GRANT RESOURCE, CONNECT, CTXAPP TO myuser;
GRANT EXECUTE ON CTXSYS.CTX_CLS TO myuser;
GRANT EXECUTE ON CTXSYS.CTX_DDL TO myuser;
GRANT EXECUTE ON CTXSYS.CTX_DOC TO myuser;
GRANT EXECUTE ON CTXSYS.CTX_OUTPUT TO myuser;
GRANT EXECUTE ON CTXSYS.CTX_QUERY TO myuser;
GRANT EXECUTE ON CTXSYS.CTX_REPORT TO myuser;
GRANT EXECUTE ON CTXSYS.CTX_THES TO myuser;
GRANT EXECUTE ON CTXSYS.CTX_ULEXER TO myuser;
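To see whether grants like these are already in place, something along these lines could be run while connected as the index owner (myuser above is just a placeholder name). This is a rough check, not a complete audit: it only shows the role and the direct object grants, so privileges obtained through PUBLIC or through other roles will not appear.

```sql
-- Is the CTXAPP role granted to the connected user?
SELECT granted_role
FROM   user_role_privs
WHERE  granted_role = 'CTXAPP';

-- Which CTXSYS packages has the connected user been granted EXECUTE on directly?
SELECT table_name, privilege
FROM   user_tab_privs
WHERE  owner = 'CTXSYS'
AND    privilege = 'EXECUTE'
ORDER  BY table_name;
```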