Home » Server Options » Text & interMedia » Oracle Text Index Problem (Oracle 11g)
Oracle Text Index Problem [message #481487] Wed, 03 November 2010 15:03 Go to next message
frnzkfka
Messages: 6
Registered: November 2010
Location: Turkey - Istanbul
Junior Member
Hi all,

Structure --
The problem is that i have created an index on a CLOB column. And the column is in JSon format which has key value pairs. We first parse the JSON to XML and then we create index on that data.
This is the structure we have.

Index part--
We needed a key value search feature, so i used oracle text index.
But index is getting broken in some period. Sometimes when we create some new records the index is not work for them. Even after recreating index, the problem sometimes also continues. But i could not indicate the problem because generally the indexes are works for also new records.

What can be the problem that gets broken the index?

Do you have any opinion for that case?

Thank you.
Re: Oracle Text Index Problem [message #481488 is a reply to message #481487] Wed, 03 November 2010 15:06 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
>Do you have any opinion for that case?
I suggest you submit Service Request to My Oracle Support
Re: Oracle Text Index Problem [message #481492 is a reply to message #481488] Wed, 03 November 2010 15:17 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
I see that you are a new user. Welcome to the OraFAQ forums. Since this post is about Oracle Text it will be moved there. Please read our forum guidelines, so that you know what you need to provide so that we are better able to help you.

Oracle Text indexes have to be synchronized in order for new data to be indexed. The synchronization can be done in various ways, such as on commit or at scheduled intervals. The column that the index is created on must be changed in order to trigger synchronization. Oracle Text indexes should also be periodically optimized and rebuilt to avoid fragmentation. It would help if you can post a simple reproducible test case, including statements to create the table, create the index, including whatever method of optimization you are using, some sample data, a sample search, what results you are getting, and what results you want. When you say that the index is "broken", do you mean that searches are just not returning all of the rows that you expect or something else? It may be that the index has not been synchronized or it may be that the search syntax is incorrect.
Re: Oracle Text Index Problem [message #481501 is a reply to message #481492] Wed, 03 November 2010 15:48 Go to previous messageGo to next message
frnzkfka
Messages: 6
Registered: November 2010
Location: Turkey - Istanbul
Junior Member
First of all thank you, i will be careful for your suggestions and i will review the Oracle Text topic.

As for the problem, i use the ctx_ddl.sync_index method by work it by a job per hour, and it generally works for the updated and new records.

But in some cases, for updated rows and new records the synchronization simply doesnt works. Old records ok but these new ones are being invisible for oracle text searches.

Do i need rebuilding or optimizing periodicly? is that can be the problem?

Re: Oracle Text Index Problem [message #481504 is a reply to message #481501] Wed, 03 November 2010 15:59 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
Ctx_ddl.sync_index should enable you to find the newly inserted or updated records, as long as the column that the index is created on has been updated and there has been enough time for the synchronization to finish. Is it a simple index on one column of one table or do you use a multi_column_datastore to index multiple columns in one table or do you use a user_datastore and procedure to index multiple columns of multiple tables? With multiple columns, the index is still created on only one column and that column must be updated or that row is not synchronized when you run ctx_ddl.sync_index, even if other columns are updated.

The more often that you synchronize the more fragmented your index becomes, which can cause slow searches but should not cause wrong results. Oracle Text indexes should be periodically optimized to eliminate fragmentation and rebuilt. There is a rebuild option for the most thorough optimization.

There are a lot of possibilities. It could be that the rows are not being synchronized because the correct column was not updated or it could that the synchronization has not finished or it could be that your search syntax should not return what you think it should or it could be that there is something else wrong.

It would really help if you could providing a copy and paste of an actual example, otherwise all I can do is suggest various possibilities.

Re: Oracle Text Index Problem [message #481506 is a reply to message #481504] Wed, 03 November 2010 16:39 Go to previous messageGo to next message
frnzkfka
Messages: 6
Registered: November 2010
Location: Turkey - Istanbul
Junior Member

I am not able to post any image or copy of the structure,

I use user_datastore and i am sure that the query that i use is absolutely true. But i think there is an consistency problem with the index by the time.

With more effective description and copy of the system i will be more clear to explain the problem. I will prepare these stuff soon.


Thank you for your interest.
Re: Oracle Text Index Problem [message #481510 is a reply to message #481506] Wed, 03 November 2010 17:50 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
The following simplified example may help. It demonstrates one of the possibilities that I was trying to explain. It has a procedure that concatenates two columns (col1 and col2) together. That procedure is used in a user_datastore, which is used as an index parameter on an index which is created on a different column (dummy). Both columns (col1 and col2) are then updated and the index is synchronized using ctx_ddl.sync_index, but the new values (testd) is not found, which is what you seem to be describing. Then the dummy column is updated, the index is synchronized again using ctx_ddl.sync_index and the value (testd) is then found. This is because when you have an index that searches multiple columns, the index is still created on only one column, and only when that column is updated is that row synchronized by ctx_ddl.sync_index.


SCOTT@orcl_11gR2> create table test_tab
  2    (id     number,
  3  	col1   varchar2 (15),
  4  	col2   varchar2 (15),
  5  	dummy  varchar2 ( 1))
  6  /

Table created.

SCOTT@orcl_11gR2> insert all
  2  into test_tab values (1, 'testa', 'testb', null)
  3  into test_tab values (2, 'testc', 'testa', null)
  4  select * from dual
  5  /

2 rows created.

SCOTT@orcl_11gR2> create or replace procedure test_proc
  2    (p_rowid in	      rowid,
  3  	p_clob	in out nocopy clob)
  4  as
  5  begin
  6    for r in (select * from test_tab) loop
  7  	 p_clob := r.col1 || ' ' || r.col2;
  8    end loop;
  9  end test_proc;
 10  /

Procedure created.

SCOTT@orcl_11gR2> show errors
No errors.
SCOTT@orcl_11gR2> begin
  2    ctx_ddl.create_preference ('test_datastore', 'user_datastore');
  3    ctx_ddl.set_attribute ('test_datastore', 'procedure', 'test_proc');
  4  end;
  5  /

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> create index test_idx on test_tab (dummy)
  2  indextype is ctxsys.context
  3  parameters ('datastore test_datastore')
  4  /

Index created.

SCOTT@orcl_11gR2> select * from test_tab
  2  where  contains (dummy, 'testa') > 0
  3  /

        ID COL1            COL2            D
---------- --------------- --------------- -
         1 testa           testb
         2 testc           testa

2 rows selected.

SCOTT@orcl_11gR2> update test_tab
  2  set    col1 = 'testd'
  3  where  col1 = 'testc'
  4  /

1 row updated.

SCOTT@orcl_11gR2> update test_tab
  2  set    col2 = 'testd'
  3  where  col2 = 'testb'
  4  /

1 row updated.

SCOTT@orcl_11gR2> commit
  2  /

Commit complete.

SCOTT@orcl_11gR2> exec ctx_ddl.sync_index ('test_idx')

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> -- value testd is not found
SCOTT@orcl_11gR2> -- after update of col1 and col2
SCOTT@orcl_11gR2> -- and synchronization:
SCOTT@orcl_11gR2> select * from test_tab
  2  where  contains (dummy, 'testd') > 0
  3  /

no rows selected

SCOTT@orcl_11gR2> update test_tab
  2  set    dummy = dummy
  3  /

2 rows updated.

SCOTT@orcl_11gR2> commit
  2  /

Commit complete.

SCOTT@orcl_11gR2> exec ctx_ddl.sync_index ('test_idx')

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> -- value testd is found
SCOTT@orcl_11gR2> -- after update of dummy
SCOTT@orcl_11gR2> -- and synchronization:
SCOTT@orcl_11gR2> select * from test_tab
  2  where  contains (dummy, 'testd') > 0
  3  /

        ID COL1            COL2            D
---------- --------------- --------------- -
         1 testa           testd
         2 testd           testa

2 rows selected.

SCOTT@orcl_11gR2>

Re: Oracle Text Index Problem [message #481678 is a reply to message #481510] Fri, 05 November 2010 04:21 Go to previous messageGo to next message
frnzkfka
Messages: 6
Registered: November 2010
Location: Turkey - Istanbul
Junior Member
This query below is the system that i described.


--Create Preferences Part

begin
CTX_DDL.CREATE_PREFERENCE ('xmldatastore1', 'USER_DATASTORE');
CTX_DDL.SET_ATTRIBUTE ('xmldatastore1', 'PROCEDURE', 'usertable_parse_proc');
CTX_DDL.CREATE_SECTION_GROUP('xmlpathgroup1', 'PATH_SECTION_GROUP');
CTX_DDL.CREATE_PREFERENCE ('charlexer', 'BASIC_LEXER');
CTX_DDL.SET_ATTRIBUTE ('charlexer', 'BASE_LETTER', 'yes');


--Index Create part--
execute immediate 'CREATE INDEX user_idx ON usertable (user_app_data)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS (''datastore xmldatastore1
section group xmlpathgroup1
filter ctxsys.null_filter
stoplist ctxsys.empty_stoplist
lexer charlexer
'')';

--Sample data in 'user_app_data'
(Note : We parse this data to xml via a parser)
{'age':'23','phoneNum':'13800138000','addresses':'Istanbul'}

--Overall
As i said before the only column we indexed is 'user_app_data' and we parse the data via a parser 'usertable_parse_proc' which we gave as a parameter to 'xmldatastore1'.

The main problem is consistency of search. It works for a while and then after an unknown period, the new records are starting to be invisible to Oracle Text search.






Re: Oracle Text Index Problem [message #481741 is a reply to message #481678] Fri, 05 November 2010 15:10 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
My best guess would be that the domain index tables have been so extremely fragmented due to frequent synchronization without optimization that synchronization is taking forever. Please see the demo below in which after 4 inserts followed by synchronization there are 24 rows in the dr$user_idx$i domain index table and after optimization that is reduced to 9 rows. I suggest that you schedule optimization with rebuild after each synchronization.


SCOTT@orcl_11gR2> CREATE TABLE usertable
  2    (id	       NUMBER,
  3  	user_app_data  CLOB)
  4  /

Table created.

SCOTT@orcl_11gR2> CREATE OR REPLACE PROCEDURE usertable_parse_proc
  2    (p_rowid IN  ROWID,
  3  	p_clob	OUT CLOB)
  4  AS
  5  	v_clob	    CLOB;
  6  	v_pair	    VARCHAR2 (32767);
  7  	v_tag	    VARCHAR2 (32767);
  8  	v_elem	    VARCHAR2 (32767);
  9  BEGIN
 10    SELECT user_app_data
 11    INTO   v_clob
 12    FROM   usertable
 13    WHERE  ROWID = p_rowid;
 14    v_clob := REPLACE (RTRIM (LTRIM (v_clob, '{'), '}'), '"', '') || ',';
 15    WHILE INSTR (v_clob, ',') > 0 LOOP
 16  	 v_pair := SUBSTR (v_clob, 1, INSTR (v_clob, ',') - 1);
 17  	 v_tag	:= SUBSTR (v_pair, 1, INSTR (v_pair, ':') - 1);
 18  	 v_elem := SUBSTR (v_pair, INSTR (v_pair, ':') + 1);
 19  	 p_clob := p_clob || '<' || v_tag || '>' || v_elem || '</' || v_tag || '>';
 20  	 v_clob := SUBSTR (v_clob, INSTR (v_clob, ',') + 1);
 21    END LOOP;
 22  END usertable_parse_proc;
 23  /

Procedure created.

SCOTT@orcl_11gR2> SHOW ERRORS
No errors.
SCOTT@orcl_11gR2> begin
  2    CTX_DDL.CREATE_PREFERENCE ('xmldatastore1', 'USER_DATASTORE');
  3    CTX_DDL.SET_ATTRIBUTE ('xmldatastore1', 'PROCEDURE', 'usertable_parse_proc');
  4    CTX_DDL.CREATE_SECTION_GROUP('xmlpathgroup1', 'PATH_SECTION_GROUP');
  5    CTX_DDL.CREATE_PREFERENCE ('charlexer', 'BASIC_LEXER');
  6    CTX_DDL.SET_ATTRIBUTE ('charlexer', 'BASE_LETTER', 'yes');
  7  end;
  8  /

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> CREATE INDEX user_idx
  2  ON usertable (user_app_data)
  3  INDEXTYPE IS CTXSYS.CONTEXT
  4  PARAMETERS
  5    ('datastore	xmldatastore1
  6  	 section group	xmlpathgroup1
  7  	 filter 	ctxsys.null_filter
  8  	 stoplist	ctxsys.empty_stoplist
  9  	 lexer		charlexer')
 10  /

Index created.

SCOTT@orcl_11gR2> INSERT INTO usertable (id, user_app_data) VALUES
  2    (1, '{"age":"23","phoneNum":"13800138000","addresses":"Istanbul"}')
  3  /

1 row created.

SCOTT@orcl_11gR2> EXEC CTX_DDL.SYNC_INDEX ('user_idx')

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> INSERT INTO usertable (id, user_app_data) VALUES
  2    (2, '{"age":"23","phoneNum":"12345678900","addresses":"Wherever"}')
  3  /

1 row created.

SCOTT@orcl_11gR2> EXEC CTX_DDL.SYNC_INDEX ('user_idx')

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> INSERT INTO usertable (id, user_app_data) VALUES
  2    (3, '{"age":"32","phoneNum":"13800138000","addresses":"Wherever"}')
  3  /

1 row created.

SCOTT@orcl_11gR2> EXEC CTX_DDL.SYNC_INDEX ('user_idx')

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> INSERT INTO usertable (id, user_app_data) VALUES
  2    (4, '{"age":"32","phoneNum":"12345678900","addresses":"Istanbul"}')
  3  /

1 row created.

SCOTT@orcl_11gR2> EXEC CTX_DDL.SYNC_INDEX ('user_idx')

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> COLUMN token_text FORMAT A30
SCOTT@orcl_11gR2> SELECT token_text, token_first, token_last FROM dr$user_idx$i
  2  /

TOKEN_TEXT                     TOKEN_FIRST TOKEN_LAST
------------------------------ ----------- ----------
12345678900                              2          2
12345678900                              4          4
13800138000                              1          1
13800138000                              3          3
23                                       1          1
23                                       2          2
32                                       3          3
32                                       4          4
ISTANBUL                                 1          1
ISTANBUL                                 4          4
WHEREVER                                 2          2
WHEREVER                                 3          3
addresses                                1          1
addresses                                2          2
addresses                                3          3
addresses                                4          4
age                                      1          1
age                                      2          2
age                                      3          3
age                                      4          4
phoneNum                                 1          1
phoneNum                                 2          2
phoneNum                                 3          3
phoneNum                                 4          4

24 rows selected.

SCOTT@orcl_11gR2> SELECT * FROM usertable
  2  WHERE  CONTAINS (user_app_data, 'Istanbul INPATH (addresses)') > 0
  3  /

        ID
----------
USER_APP_DATA
--------------------------------------------------------------------------------
         1
{"age":"23","phoneNum":"13800138000","addresses":"Istanbul"}

         4
{"age":"32","phoneNum":"12345678900","addresses":"Istanbul"}


2 rows selected.

SCOTT@orcl_11gR2> EXEC CTX_DDL.OPTIMIZE_INDEX ('user_idx', 'REBUILD')

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> SELECT token_text, token_first, token_last FROM dr$user_idx$i
  2  /

7TOKEN_TEXT                     TOKEN_FIRST TOKEN_LAST
------------------------------ ----------- ----------
12345678900                              2          4
13800138000                              1          3
23                                       1          2
32                                       3          4
ISTANBUL                                 1          4
WHEREVER                                 2          3
addresses                                1          4
age                                      1          4
phoneNum                                 1          4

9 rows selected.

SCOTT@orcl_11gR2> SELECT * FROM usertable
  2  WHERE  CONTAINS (user_app_data, 'Istanbul INPATH (addresses)') > 0
  3  /

        ID
----------
USER_APP_DATA
--------------------------------------------------------------------------------
         1
{"age":"23","phoneNum":"13800138000","addresses":"Istanbul"}

         4
{"age":"32","phoneNum":"12345678900","addresses":"Istanbul"}


2 rows selected.

SCOTT@orcl_11gR2>

Re: Oracle Text Index Problem [message #481774 is a reply to message #481741] Sat, 06 November 2010 09:32 Go to previous messageGo to next message
frnzkfka
Messages: 6
Registered: November 2010
Location: Turkey - Istanbul
Junior Member
I have already tried to optimization but it did not work. I can give a more clear sample about our system below.

for example i have 3 rows that we can name them as: a, b, c
The index works for them regularly.

Then i add 2 more rows : x, y
The system could not index x, y

i tried the options below one by one:

1. sync the index
2. optimize the index
3. rebuild the index
4. drop the index and all preferences and recreate the index

x and y could not be indexed again. But a, b, c are indexed regularly.

Solution -- Only after i delete these rows and insert again index is worked for these rows.
May these sample that i see in our system can explain the problem more clear?
Re: Oracle Text Index Problem [message #481777 is a reply to message #481774] Sat, 06 November 2010 10:14 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
Are there any triggers on the table that you insert into that would affect the process? Are there any constraints that might have been violated? Do you commit after each insert? Can you confirm that the rows are actually inserted, but just not indexed, by selecting them in some other way?

Re: Oracle Text Index Problem [message #481778 is a reply to message #481777] Sat, 06 November 2010 10:16 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
It would really help to see some sample data or as close as you can simulate and a sample search. Frequently, such problems are due to scoring algorithms that are partially dependent on the quantity of data.

Re: Oracle Text Index Problem [message #481779 is a reply to message #481778] Sat, 06 November 2010 10:18 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
Can you confirm whether the rows are not indexed or just not returned by a search, by selecting from the dr$your_index$i domain index table?
Re: Oracle Text Index Problem [message #481780 is a reply to message #481779] Sat, 06 November 2010 10:19 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
Are there any entries in the ctx_user_index_errors?
Re: Oracle Text Index Problem [message #482574 is a reply to message #481780] Fri, 12 November 2010 14:25 Go to previous message
frnzkfka
Messages: 6
Registered: November 2010
Location: Turkey - Istanbul
Junior Member
I tried all the things you said and checked, dr$your_index$i is ok. there is no constraint or trigger that can effect. But it is certain that when i update the datas which the indexing is not working on, or when i insert again, the index is working for them.

Thank you Barbara for your all comments and suggestions.
Previous Topic: Not wanted HTML tags in snippet searching multiple columns of a table
Next Topic: DRG-50901 error which cannot be reproduced
Goto Forum:
  


Current Time: Sat Nov 23 06:20:18 CST 2024