Multi-lingual index for blob column (Oracle 10g R2 10.2.0.1.0)

Multi-lingual index for blob column [message #513612]

Tue, 28 June 2011 06:56

dkdms2124

Hi All,

I have a test table with three columns (id, name, doc), where the doc column is of type BLOB.

Below are the steps I followed to create the index and then search for documents containing a keyword using the CONTAINS operator.

I am able to find documents in English, but I am not able to find documents in other languages.

Please help me out.

SQL> conn sample/sample
Connected.
SQL> create table test(
  2  id number primary key,
  3  name varchar2(2000),
  4  doc blob);

Table created.

SQL> create sequence sample_seq;

SQL> conn sys/oracle as sysdba
Connected.
SQL> create or replace directory documents as 'C:\sample_work';

Directory created.

SQL> grant read,write on directory documents to sample;

SQL> create or replace procedure load_data ( p_file_name IN test.name%type) AS
  2     v_bfile bfile;
  3     v_blob blob;
  4     begin
  5     insert into test (id,name,doc)
  6     values (sample_seq.nextval,p_file_name,empty_blob())
  7     return doc into v_blob;
  8     v_bfile := bfilename('DOCUMENTS',p_file_name);
  9     dbms_lob.fileopen(v_bfile,dbms_lob.file_readonly);
 10     dbms_lob.loadfromfile(v_blob,v_bfile,dbms_lob.getlength(v_bfile));
 11     dbms_lob.fileclose(v_bfile);
 12     commit;
 13     end;
 14  /

Procedure created.

SQL> EXEC load_data ('Clustering.doc');

PL/SQL procedure successfully completed.

SQL> exec load_data('connectivity.doc');

PL/SQL procedure successfully completed.

SQL> select id from test;

        ID
----------
        22
        23
        24

SQL> begin
  2   ctx_ddl.create_preference('est_lexer', 'WORLD_LEXER');
  3  end;
  4  /

PL/SQL procedure successfully completed.

SQL> create index sample_doc_idx on test (doc) indextype IS ctxsys.context parameters(' LEXER EST_LEXER ');

Index created.
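
To confirm that the index really picked up the WORLD_LEXER preference, the Oracle Text dictionary view CTX_USER_INDEX_OBJECTS can be queried. This is only a minimal sketch that was not part of the session above; it assumes the query is run as the user who owns SAMPLE_DOC_IDX.

```sql
-- Sketch: list the objects (lexer, filter, datastore, ...) attached to the index.
-- Assumption: run as the owner of SAMPLE_DOC_IDX.
SELECT ixo_class,
       ixo_object
FROM   ctx_user_index_objects
WHERE  ixo_index_name = 'SAMPLE_DOC_IDX'
ORDER  BY ixo_class;
```

If the LEXER row does not show WORLD_LEXER, the index is using a different lexer than the EST_LEXER preference was meant to supply.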

SQL> exec load_data('backtrack_oracle_tutorial.pdf');

PL/SQL procedure successfully completed.

SQL> exec load_data('Reading Logs Spanish.pdf');

PL/SQL procedure successfully completed.

SQL> exec load_data('Pan-2.4-fr_FR.pdf');

PL/SQL procedure successfully completed.

SQL> exec load_data('Kitchen-2.4-fr_FR.pdf');

PL/SQL procedure successfully completed.

SQL> exec load_data('dutch.txt');

PL/SQL procedure successfully completed.

SQL> select count(*) from test;

  COUNT(*)
----------
         8
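
Note that the last five files were loaded after the index was created. A CONTEXT index is not maintained automatically on insert unless it was created with a SYNC parameter; new rows are queued until the index is synchronized. The following is a hedged sketch of how the queue could be inspected and processed. It was not run in the session above and assumes the connected user owns the index and can execute CTX_DDL.

```sql
-- Sketch: rows that were inserted but are not yet searchable
-- show up in the pending queue for the index.
SELECT pnd_index_name,
       pnd_rowid,
       pnd_timestamp
FROM   ctx_user_pending
WHERE  pnd_index_name = 'SAMPLE_DOC_IDX';

-- Process the pending rows so that they become visible to CONTAINS.
BEGIN
  ctx_ddl.sync_index('SAMPLE_DOC_IDX');
END;
/
```

After a successful sync the pending query should return no rows for this index.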


SQL> set autotrace on
SQL> ed
Wrote file afiedt.buf

  1  SELECT SCORE(1) score, id, name
  2  FROM   test
  3  WHERE  CONTAINS(doc, 'Tomcat', 1) > 0
  4* ORDER BY SCORE(1) DESC
SQL>
SQL> /

     SCORE         ID NAME
---------- ---------- --------------------------------------------------------------------------------
       100         23 Clustering.doc


Execution Plan
----------------------------------------------------------
Plan hash value: 2693406471

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     3 |  1632 |     1 (100)| 00:00:01 |
|   1 |  SORT ORDER BY               |                |     3 |  1632 |     1 (100)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TEST           |     3 |  1632 |     0   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX              | SAMPLE_DOC_IDX |       |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CTXSYS"."CONTAINS"("DOC",'Tomcat',1)>0)


Statistics
----------------------------------------------------------
         11  recursive calls
          0  db block gets
         19  consistent gets
          0  physical reads
          0  redo size
        532  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

But when I search for the documents in other languages, I get "no rows selected":

SQL> ed
Wrote file afiedt.buf

  1  SELECT SCORE(1) score, id, name
  2  FROM   test
  3  WHERE  CONTAINS(doc, 'Oracle Application Express', 1) > 0
  4* ORDER BY SCORE(1) DESC
SQL> /

no rows selected


Execution Plan
----------------------------------------------------------
Plan hash value: 2693406471

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     1 |   544 |     1 (100)| 00:00:01 |
|   1 |  SORT ORDER BY               |                |     1 |   544 |     1 (100)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| TEST           |     1 |   544 |     0   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX              | SAMPLE_DOC_IDX |       |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CTXSYS"."CONTAINS"("DOC",'Oracle Application Express',1)>0)


Statistics
----------------------------------------------------------
        148  recursive calls
          0  db block gets
        647  consistent gets
          0  physical reads
          0  redo size
        380  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          0  rows processed
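
When a search returns nothing, two checks may help narrow down whether the documents were indexed at all. This is a sketch under assumptions, not output from the session above: it assumes the index owner is connected, and it relies on the token table name DR$SAMPLE_DOC_IDX$I, which Oracle Text derives from the index name.

```sql
-- Sketch 1: documents that failed during indexing (for example, a file
-- the filter could not process) are recorded here with the error text.
SELECT err_timestamp,
       err_textkey,   -- rowid of the document that failed
       err_text
FROM   ctx_user_index_errors
WHERE  err_index_name = 'SAMPLE_DOC_IDX'
ORDER  BY err_timestamp;

-- Sketch 2: check whether the search words exist as tokens in the index.
SELECT token_text,
       token_count
FROM   dr$sample_doc_idx$i
WHERE  UPPER(token_text) IN ('ORACLE', 'APPLICATION', 'EXPRESS');
```

If the second query returns no tokens, the documents containing those words have not been indexed, and the cause lies in loading, filtering, or synchronization rather than in the query itself.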

Please help me understand how I can use WORLD_LEXER for documents in various languages.

Thanks
Deepak