weird way Oracle text index gets created [message #177437] |
Wed, 14 June 2006 10:34 |
qzheng2005
Messages: 4 Registered: July 2000 Location: VA
|
Junior Member |
|
|
I just don't understand the way oracle text index created.
CREATE TABLE QZ_TEST
(
STNAME VARCHAR2(180 BYTE)
);
insert all
into qz_test (stname) values ('Oak 87 S')
into qz_test (stname) values ('Oak 87 N')
into qz_test (stname) values ('Oak 87 A')
into qz_test (stname) values ('Oak 87 B')
into qz_test (stname) values ('Oak 87 C')
into qz_test (stname) values ('Oak 87 D')
into qz_test (stname) values ('Oak 87 E')
into qz_test (stname) values ('Oak 87 F')
into qz_test (stname) values ('Oak 87 G')
into qz_test (stname) values ('Oak 87 H')
select * from dual;
CREATE INDEX QZ_TEST_INDEX ON QZ_TEST
(STNAME)
INDEXTYPE IS CTXSYS.CONTEXT;
SQL> select * from qz_test where contains(stname, 'Oak 87 S%',1) > 1;
no rows selected
SQL> select * from qz_test where contains(stname, 'Oak 87 N%',1) > 1;
STNAME
------------------------------------
Oak 87 N
SQL> select * from qz_test where contains(stname, 'Oak 87 S',1) >1;
STNAME
------------------------------------------------
Oak 87 S
Oak 87 N
Oak 87 A
Oak 87 B
Oak 87 C
Oak 87 D
Oak 87 E
Oak 87 F
Oak 87 G
Oak 87 H
10 rows selected.
SQL>
SQL> select token_text from DR$QZ_TEST_INDEX$I;
TOKEN_TEXT
----------------------------------------------------------------
87
B
C
E
F
G
H
N
OAK
========================
As you can see "S", "A" "D" not get included in the index
and cause "no row selected" when query on it with prefix and
returns lots of things when you do exact search.
Need your advice.
Thanks,
Qiang
|
|
|
Re: weird way Oracle text index gets created [message #177442 is a reply to message #177437] |
Wed, 14 June 2006 11:59 |
|
Barbara Boehmer
Messages: 9100 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
You have not specified a stoplist, so Oracle uses the deault stoplist which contains the single letters "A", "S", and "D", so Oracle does not tokenize those and ignores them when searching. So, if you search for "Oak 87 S", it ignores the "S" and searches for anything that contains "Oak 87". If you use a stopword with %, like "S%", then it confuses it. It is like asking for something that starts with something that doesn't exist. It would be nice if Oracle also ignored "some_stopword%" in the same manner that it ignores "some_stopword", but it doesn't. You can create your own stoplist or remove words from an existing stoplist or use an empty stoplist. Please see the demonstration below that uses an empty stoplist during index creation.
SCOTT@10gXE> CREATE TABLE QZ_TEST (STNAME VARCHAR2(180 BYTE))
2 /
Table created.
SCOTT@10gXE> insert all
2 into qz_test (stname) values ('Oak 87 S')
3 into qz_test (stname) values ('Oak 87 N')
4 into qz_test (stname) values ('Oak 87 A')
5 into qz_test (stname) values ('Oak 87 B')
6 into qz_test (stname) values ('Oak 87 C')
7 into qz_test (stname) values ('Oak 87 D')
8 into qz_test (stname) values ('Oak 87 E')
9 into qz_test (stname) values ('Oak 87 F')
10 into qz_test (stname) values ('Oak 87 G')
11 into qz_test (stname) values ('Oak 87 H')
12 select * from dual
13 /
10 rows created.
SCOTT@10gXE> CREATE INDEX QZ_TEST_INDEX ON QZ_TEST (STNAME)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS ('STOPLIST CTXSYS.EMPTY_STOPLIST')
4 /
Index created.
SCOTT@10gXE> SELECT token_text FROM dr$qz_test_index$i
2 /
TOKEN_TEXT
----------------------------------------------------------------
87
A
B
C
D
E
F
G
H
N
OAK
S
12 rows selected.
SCOTT@10gXE> COLUMN stname FORMAT A30
SCOTT@10gXE> select stname, score (1) from qz_test
2 where contains (stname, 'Oak 87 S%', 1) > 0
3 /
STNAME SCORE(1)
------------------------------ ----------
Oak 87 S 6
SCOTT@10gXE> select stname, score (1) from qz_test
2 where contains (stname, 'Oak 87 N%', 1) > 0
3 /
STNAME SCORE(1)
------------------------------ ----------
Oak 87 N 6
SCOTT@10gXE> select stname, score (1) from qz_test
2 where contains (stname, 'Oak 87 S', 1) > 0
3 /
STNAME SCORE(1)
------------------------------ ----------
Oak 87 S 6
SCOTT@10gXE>
|
|
|
|