Home » Server Options » Text & interMedia » Oracle Text Index using Soundex (11g)
Oracle Text Index using Soundex [message #501168] Sat, 26 March 2011 14:06 Go to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
Hi,

I'm a bit confused on why my soundex query is not returning what I expect. From what I have read, soundex uses the English phonetic language to return similar sounding searches.

The index I have is a CTXSYS.CTXCAT


Here's a couple of examples:

This works fine:

select distinct search_term from search s
where
catsearch(s.search_term, '<query><textquery grammar="context">!smythe</textquery></query>', '')>0;

RETURNS:
Smith
Smitt

However, if I try "lough" (looking for Rob Lowe):

RETURNS:
Rouse
Lorch
Longo


Can someone explain that to me? How do I configure the query or the index in order for "lough" to be interpreted as "low" or "lowe"??


Thanks in advance.
Re: Oracle Text Index using Soundex [message #501169 is a reply to message #501168] Sat, 26 March 2011 14:19 Go to previous messageGo to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
I have rewritten the query to look like this:

select distinct search_term from search s
where soundex(search_term) = soundex('lough');

I still do not get "lowe"

Here are the results:

"SEARCH_TERM"
"Lewis"
"Laske"
"Lowsky"
"Leckie"
"Luce"
"Lock"
"Lacayo"
"Lashay"
"Lakha"
"Laws"
"Lake"
"Lacy"


What is the difference between writing the query this was as opposed to the way I posted initially?

Why is "lowe" still not showing up in the results?

Re: Oracle Text Index using Soundex [message #501172 is a reply to message #501169] Sat, 26 March 2011 15:11 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
The traditional soundex algorithm is described here:

http://en.wikipedia.org/wiki/Soundex

The soundex of a word consists of the first letter of the word, followed by numbers substituted for consonants except h and w, padded with zeroes, until there are three numbers. So, for "Lowe", the first letter is L and there are no other consonants that are coded, so it is padded with 3 zeroes and the soundex for "Lowe" is L000. For "Lough", the first letter is L, 2 is substituted for g and there are no other consonants that are coded, so it is padded with 2 zeroes, and the soundex for "Lough" is L200. Since the soundex L000 for Lowe is not the same as the soundex L200 for Lough, searching for one does not find the other. The soundex that is used in SQL uses this algorithm. The soundex that is used in Oracle Text context grammar is different, and requires that the two words be much more similar, not just in the first letter and first 3 coded consonants. Please see the demonstration below that shows traditional soundex values and the results of searches using traditional soundex values and the results of searches using context grammar.

-- test table and data:
SCOTT@orcl_11gR2> create table search
  2    (search_term  varchar2 (30))
  3  /

Table created.

SCOTT@orcl_11gR2> insert all
  2  into search values ('Lewis')
  3  into search values ('Laske')
  4  into search values ('Lowsky')
  5  into search values ('Leckie')
  6  into search values ('Luce')
  7  into search values ('Lock')
  8  into search values ('Lacayo')
  9  into search values ('Lashay')
 10  into search values ('Lakha')
 11  into search values ('Laws')
 12  into search values ('Lake')
 13  into search values ('Lacy')
 14  into search values ('Lowe')
 15  into search values ('Lorch')
 16  into search values ('Longo')
 17  select * from dual
 18  /

15 rows created.


-- traditional soundex values:
 
SCOTT@orcl_11gR2> select soundex (search_term),
  2  	    search_term
  3  from   search
  4  order  by 1, 2
  5  /

SOUN SEARCH_TERM
---- ------------------------------
L000 Lowe
L200 Lacayo
L200 Lacy
L200 Lake
L200 Lakha
L200 Lashay
L200 Laske
L200 Laws
L200 Leckie
L200 Lewis
L200 Lock
L200 Lowsky
L200 Luce
L520 Longo
L620 Lorch

15 rows selected.


-- search using traditional soundex algorithm:
SCOTT@orcl_11gR2> select search_term
  2  from   search
  3  where  soundex (search_term) = soundex ('Lough')
  4  order  by 1
  5  /

SEARCH_TERM
------------------------------
Lacayo
Lacy
Lake
Lakha
Lashay
Laske
Laws
Leckie
Lewis
Lock
Lowsky
Luce

12 rows selected.


-- searches using context grammar:
SCOTT@orcl_11gR2> create index context_idx
  2  on search (search_term)
  3  indextype is ctxsys.context
  4  /

Index created.

SCOTT@orcl_11gR2> select search_term
  2  from   search
  3  where  contains (search_term, '!Lough') > 0
  4  /

SEARCH_TERM
------------------------------
Lorch
Longo

2 rows selected.

SCOTT@orcl_11gR2> create index ctxcat_idx
  2  on search (search_term)
  3  indextype is ctxsys.ctxcat
  4  /

Index created.

SCOTT@orcl_11gR2> select search_term
  2  from   search
  3  where  catsearch
  4  	      (search_term,
  5  	       '<query>
  6  		  <textquery grammar=''context''>
  7  		    !Lough
  8  		  </textquery>
  9  		</query>', '') > 0
 10  /

SEARCH_TERM
------------------------------
Lorch
Longo

2 rows selected.

SCOTT@orcl_11gR2>

Re: Oracle Text Index using Soundex [message #501174 is a reply to message #501172] Sat, 26 March 2011 15:18 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
If your searches are primarily on names, there are other algorithms that may be more suitable. For example, jaro_winkler is specifically designed for name comparisons. Please see the demonstration below.

SCOTT@orcl_11gR2> column score format 999
SCOTT@orcl_11gR2> select utl_match.jaro_winkler_similarity
  2  	      (search_term, 'Lough') as score,
  3  	    search_term
  4  from   search
  5  order  by score desc
  6  /

SCORE SEARCH_TERM
----- ------------------------------
   78 Lorch
   78 Longo
   70 Lowe
   70 Lock
   67 Luce
   66 Lowsky
   64 Lakha
   62 Lashay
   53 Laws
   53 Lake
   53 Lacy
   52 Laske
   52 Lewis
   51 Leckie
   51 Lacayo

15 rows selected.

SCOTT@orcl_11gR2>

Re: Oracle Text Index using Soundex [message #501182 is a reply to message #501174] Sat, 26 March 2011 15:51 Go to previous messageGo to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
Hi Barbara,

Thank you so much for your post. Can you explain how the jaro_winkler_similarity compares names? I ran the query against my test DB and 'Lowe' ends up scoring 70 and is pretty far down the list. Is there anyway I can configure Oracle to do better name matching? Or is this something I need to add into a dictionary?
Re: Oracle Text Index using Soundex [message #501185 is a reply to message #501182] Sat, 26 March 2011 15:56 Go to previous messageGo to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
Just to add:

In my results, I have:


78 Doug
78 Hugh

I'm wondering how the scoring works. I would expect "Lowe" to score higher than both of these, but it came back with a score of 70, thus being displayed further down. Any suggestions?
Re: Oracle Text Index using Soundex [message #501186 is a reply to message #501182] Sat, 26 March 2011 16:02 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
The Jaro-Winkler algorithm was originally created for the U.S. census to try to identify duplicate names, so people were not counted twice. It is described here:

http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance

Jaro_Winkler_Similarity takes that and expresses it as a percentage of 100.

There are other algorithms, such as Levenshtein distance, Damereau-Levenshtein, and metaphone. You may want to experiment with them and combine them. All of these methods are going to be slower than a search using a context or ctxcat index. You can use a very general search with a context index to return a small result set, then order that result set using one or more of the other methods. You and your users need to recognize that no search algorithm is going to be perfect and, if they don't find what they are looking for, then they may need to broaden their search. If you are searching for first and last names, you can use things like ACCUM with a context grammar search, to cause results that include both names to score higher, but include results that only contain one, so that if you search for "Rob Lough" you still might find "Rob Lowe".

[Updated on: Sat, 26 March 2011 16:04]

Report message to a moderator

Re: Oracle Text Index using Soundex [message #501187 is a reply to message #501186] Sat, 26 March 2011 16:06 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
With context grammar, you can also use fuzzy, which returns similarly spelled words that don't necessarily sound alike.
Re: Oracle Text Index using Soundex [message #501190 is a reply to message #501187] Sat, 26 March 2011 18:33 Go to previous messageGo to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
Ok...great, that makes sense. I'll look up the 3 different algorithms and play with them a bit.


I do have a fuzzy search as well. Is there a more efficient way of running a fuzzy and a soundex together in one sql statement?

Also, in the catsearch: (s.search_term, '<query><textquery grammar="context">!lough</textquery></query>', '')> 0

what is the 3rd parameter for and the "> 0" represent? Can I use either of these two to tweak my results?
Re: Oracle Text Index using Soundex [message #501191 is a reply to message #501190] Sat, 26 March 2011 20:01 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
To search for soundex or fuzzy:

SCOTT@orcl_11gR2> select search_term
  2  from   search
  3  where  catsearch
  4  	      (search_term,
  5  	       '<query>
  6  		  <textquery grammar=''context''>
  7  		    !Lough OR ?Lough
  8  		  </textquery>
  9  		</query>', '') > 0
 10  /

SEARCH_TERM
------------------------------
Lorch
Longo

2 rows selected.

SCOTT@orcl_11gR2> 


The third parameter of catsearch on a ctxcat index is for a structured query if you have a subindex created on a structured column. It is similar to the new sdata feature with context indexes and contains.

If you were using a context index, contains would return a score and you could use > some number to limit the result set or order by the descending score in an inner query and limit the rows using rownum in an outer query. These things and many others are not available with ctxcat indexes and catsearch. There is also a problem with ctxcat indexes and catsearch, when the optimizer chooses functional invocation, resulting in an error without returning rows. I always recommend context and contains instead of ctxcat and catsearch. With the new features, you can make the context index transactional and use sdata for structured data, so it has most of the advantages of catsearch and a whole lot more capabilities, without resorting to a query template to enable some of the context features.


Re: Oracle Text Index using Soundex [message #501203 is a reply to message #501191] Sun, 27 March 2011 10:54 Go to previous messageGo to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
If I change my index to a context:

1. Can I still use the lexer skipjoins:

begin
ctx_ddl.create_preference('mylex', 'BASIC_LEXER');
ctx_ddl.create_preference('mywordlist', 'BASIC_WORDLIST');
end;
/

begin
ctx_ddl.set_attribute('mylex', 'skipjoins', '`?#\[]{}~!@$%^&*()-_=+|;:,"<>./''');
end;
/

CREATE INDEX SEARCH_TXTIX ON SEARCH (SEARCH_TERM)
INDEXTYPE IS CTXSYS.CTXCAT PARAMETERS ('lexer mylex wordlist mywordlist');
commit;
/


2. Will there be a performance degradation? I've read that ctxcat is for varchar columns whereas context is for clobs.

3. How do I make the context index transactional?

4. Can you give me an example (or point to me where online I can read up on) of sdata for structured data?


Thanks again, you've been a tremendous help!
Re: Oracle Text Index using Soundex [message #501207 is a reply to message #501203] Sun, 27 March 2011 12:25 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
Quote:

1. Can I still use the lexer skipjoins?


Yes, as demonstrated in the example at the bottom of this post. The search finds L'owe as well as Lowe.

Quote:

2. Will there be a performance degradation? I've read that ctxcat is for varchar columns whereas context is for clobs.


I don't think there should be a performance degradation, but you should test and see. If you find there is a degradation, there are probably things that can be done to speed it up.

A ctxcat index cannot be used to index a clob. A context index can be used on either varchar2 or clob and other data types. I used a varchar2 column in the example at the bottom of this post.

Quote:

3. How do I make the context index transactional?


By adding the word transactional to your index parameters, as demonstrated in the example at the bottom of this post.

Quote:

4. Can you give me an example (or point to me where online I can read up on) of sdata for structured data?


I have included usage of an sdata column in the example at the bottom of this post. In the example, I added a "division" column of number datatype, which is structured data. Without sdata, if you wanted to filter your search by such a column, you would have a separate non-text index on that column, and the optimizer would likely choose one index or the other to execute your query. By adding "filter by division" during index creation, it makes that column part of the index, so that any search on the text column and the sdata column only requires one index hit and results in very fast searches. In the example below, it finds only those rows in division 1, by adding the sdata criteria to the contains clause. There are many more options. This is just a very simple example. I have added 75,978 rows of data by selecting from all_objects, in order to provide a realistic example, and demonstrated, using autotrace, how it uses only one index hit.

You can find more information about sdata in the Oracle Text Reference and Oracle Text Application Developer's Gudie of the searchable online documentation, for example:

http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cqoper.htm#CCREF2089


In the example below, I have also combined usage of accum and soundex and fuzzy and jaro_winkler_similarity to show how you can use the first three in a contains clause to return a small result set with one hit on a context index in an inner sub-query, then use the last one in an outer sub-query to order the rows from the inner query.

SCOTT@orcl_11gR2> create table search
  2    (division     number,
  3  	search_term  varchar2 (30))
  4  /

Table created.

SCOTT@orcl_11gR2> insert all
  2  into search values (1, 'Lewis')
  3  into search values (1, 'Laske')
  4  into search values (1, 'Lowsky')
  5  into search values (1, 'Leckie')
  6  into search values (1, 'Luce')
  7  into search values (1, 'Lock')
  8  into search values (1, 'Lacayo')
  9  into search values (1, 'Lashay')
 10  into search values (1, 'Lakha')
 11  into search values (1, 'Laws')
 12  into search values (1, 'Lake')
 13  into search values (1, 'Lacy')
 14  into search values (1, 'Rob Lowe')
 15  into search values (1, 'Rob L''owe')
 16  into search values (1, 'Lorch')
 17  into search values (1, 'Longo')
 18  into search values (1, 'Doug')
 19  into search values (1, 'Hugh')
 20  into search values (2, 'Rob Lowe')
 21  into search values (2, 'Rob L''owe')
 22  into search values (2, 'Lorch')
 23  into search values (2, 'Longo')
 24  select * from dual
 25  /

22 rows created.

SCOTT@orcl_11gR2> insert into search
  2  select object_id, object_name
  3  from   all_objects
  4  /

75978 rows created.

SCOTT@orcl_11gR2> commit
  2  /

Commit complete.

SCOTT@orcl_11gR2> begin
  2    ctx_ddl.create_preference ('mylex', 'BASIC_LEXER');
  3    ctx_ddl.set_attribute
  4  	 ('mylex',
  5  	  'skipjoins',
  6  	  '`?#\[]{}~!@$%^&*()-_=+|;:,"<>./''');
  7    ctx_ddl.create_preference ('mywordlist', 'BASIC_WORDLIST');
  8  end;
  9  /

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> create index context_idx
  2  on search (search_term)
  3  indextype is ctxsys.context
  4  filter by division
  5  parameters
  6    ('lexer	   mylex
  7  	 wordlist  mywordlist
  8  	 transactional')
  9  /

Index created.

SCOTT@orcl_11gR2> set autotrace on explain
SCOTT@orcl_11gR2> select utl_match.jaro_winkler_similarity
  2  	      (search_term, 'Rob Lough') jws,
  3  	    score (1), division, search_term
  4  from   (select score (1), division, search_term
  5  	     from   search
  6  	     where  contains
  7  		      (search_term,
  8  		       '(Rob accum (!Lough OR ?Lough))
  9  			and sdata (division = 1)',
 10  		       1) > 0)
 11  order  by jws desc, score (1) desc
 12  /

       JWS   SCORE(1)   DIVISION SEARCH_TERM
---------- ---------- ---------- ------------------------------
        88          8          1 Rob Lowe
        86          8          1 Rob L'owe
        54          8          1 Longo
        43          8          1 Lorch

4 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2149790340

--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |    40 |  1680 |    13   (8)| 00:00:01 |
|   1 |  SORT ORDER BY               |             |    40 |  1680 |    13   (8)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SEARCH      |    40 |  1680 |    12   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX              | CONTEXT_IDX |       |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CTXSYS"."CONTAINS"("SEARCH_TERM",'(Rob accum (!Lough OR ?Lough))
                             and sdata (division = 1)',1)>0)

Note
-----
   - dynamic sampling used for this statement (level=2)

SCOTT@orcl_11gR2>



Re: Oracle Text Index using Soundex [message #501210 is a reply to message #501207] Sun, 27 March 2011 13:06 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
You might be able to get the same execution plan without a sub-query, as shown below. However, sometimes Oracle Text is a little quirky and it is best to separate the text query from other things. What you want to avoid is having it apply the jaro_winkler_similarity to all of the rows, before it limits the rows using the text index.

SCOTT@orcl_11gR2> select utl_match.jaro_winkler_similarity
  2  	      (search_term, 'Rob Lough') jws,
  3  	    score (1), division, search_term
  4  from   search
  5  where  contains
  6  	      (search_term,
  7  	       '(Rob accum (!Lough OR ?Lough))
  8  		and sdata (division = 1)',
  9  	       1) > 0
 10  order  by jws desc, score (1) desc
 11  /

       JWS   SCORE(1)   DIVISION SEARCH_TERM
---------- ---------- ---------- ------------------------------
        88          8          1 Rob Lowe
        86          8          1 Rob L'owe
        54          8          1 Longo
        43          8          1 Lorch

4 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2149790340

--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |    39 |  1638 |    13   (8)| 00:00:01 |
|   1 |  SORT ORDER BY               |             |    39 |  1638 |    13   (8)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SEARCH      |    39 |  1638 |    12   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX              | CONTEXT_IDX |       |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CTXSYS"."CONTAINS"("SEARCH_TERM",'(Rob accum (!Lough OR ?Lough))
                     and sdata (division = 1)',1)>0)

Note
-----
   - dynamic sampling used for this statement (level=2)

SCOTT@orcl_11gR2>

Re: Oracle Text Index using Soundex [message #501213 is a reply to message #501203] Sun, 27 March 2011 16:09 Go to previous messageGo to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
Barbara:

You stated:

"The soundex that is used in Oracle Text context grammar is different, and requires that the two words be much more similar, not just in the first letter and first 3 coded consonants."

Do you know what the algorithm is or where I can read about it?

Re: Oracle Text Index using Soundex [message #501214 is a reply to message #501213] Sun, 27 March 2011 16:38 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
kymmi13 wrote on Sun, 27 March 2011 14:09
Barbara:

You stated:

"The soundex that is used in Oracle Text context grammar is different, and requires that the two words be much more similar, not just in the first letter and first 3 coded consonants."

Do you know what the algorithm is or where I can read about it?



As far as I know, Oracle keeps that a secret. I have just heard that it uses a combination of things. I have heard speculation that one of those things is meataphone, perhaps based on similar search results.


Re: Oracle Text Index using Soundex [message #502036 is a reply to message #501214] Sun, 03 April 2011 11:49 Go to previous messageGo to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
Hi Barbara,

In one of your examples:

SCOTT@orcl_11gR2> select utl_match.jaro_winkler_similarity
2 (search_term, 'Rob Lough') jws,
3 score (1), division, search_term
4 from search
5 where contains
6 (search_term,
7 '(Rob accum (!Lough OR ?Lough))
8 and sdata (division = 1)',
9 1) > 0
10 order by jws desc, score (1) desc
11 /

JWS SCORE(1) DIVISION SEARCH_TERM
---------- ---------- ---------- ------------------------------
88 8 1 Rob Lowe
86 8 1 Rob L'owe
54 8 1 Longo
43 8 1 Lorch


Can you explain why Longo and Lorch are still be returned when you're using the accum with "Rob"? I was under the impression the result had to have "Rob" in it.
Re: Oracle Text Index using Soundex [message #502037 is a reply to message #502036] Sun, 03 April 2011 11:59 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
Only one term has to match. See the following section of the online documentation for a detailed explanation of how accum works, how it is scored, and weighting options.

http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cqoper.htm#CCREF0302
Re: Oracle Text Index using Soundex [message #502039 is a reply to message #502036] Sun, 03 April 2011 12:17 Go to previous messageGo to next message
kymmi13
Messages: 9
Registered: March 2011
Junior Member
ok. I'm trying to get execution times to compare the different algorithms....is there a sql command that clears the cache results? I'm assuming my queries are being cached after running them once.
Re: Oracle Text Index using Soundex [message #502040 is a reply to message #502039] Sun, 03 April 2011 12:31 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
SCOTT@orcl_11gR2> alter system flush shared_pool;

System altered.

SCOTT@orcl_11gR2>
Re: Oracle Text Index using Soundex [message #502144 is a reply to message #502040] Mon, 04 April 2011 12:19 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9100
Registered: November 2002
Location: California, USA
Senior Member
You might also be interested in the new ndata available in 11.2 demonstrated below. It requires a section with tags. If you do not already have tags in your data, you can easily accomplish this by adding a multi_column_datastore, as in the example below.

SCOTT@orcl_11gR2> create table search
  2    (search_term  varchar2 (30))
  3  /

Table created.

SCOTT@orcl_11gR2> insert all
  2  into search values ('Lewis')
  3  into search values ('Laske')
  4  into search values ('Lowsky')
  5  into search values ('Leckie')
  6  into search values ('Luce')
  7  into search values ('Lock')
  8  into search values ('Lacayo')
  9  into search values ('Lashay')
 10  into search values ('Lakha')
 11  into search values ('Laws')
 12  into search values ('Lake')
 13  into search values ('Lacy')
 14  into search values ('Lowe')
 15  into search values ('L''owe')
 16  into search values ('Lorch')
 17  into search values ('Longo')
 18  into search values ('Doug')
 19  into search values ('Hugh')
 20  into search values ('Rob Lowe')
 21  into search values ('Rob L''owe')
 22  select * from dual
 23  /

20 rows created.

SCOTT@orcl_11gR2> begin
  2    ctx_ddl.create_preference ('mylex', 'BASIC_LEXER');
  3    ctx_ddl.set_attribute
  4  	 ('mylex',
  5  	  'skipjoins',
  6  	  '`?#\[]{}~!@$%^&*()-_=+|;:,"<>./''');
  7    ctx_ddl.create_preference ('mywordlist', 'BASIC_WORDLIST');
  8    ctx_ddl.create_section_group ('for_ndata', 'basic_section_group');
  9    ctx_ddl.add_ndata_section ('for_ndata', 'nd', 'nd');
 10    ctx_ddl.create_preference ('mcds', 'multi_column_datastore');
 11    ctx_ddl.set_attribute ('mcds', 'columns', 'search_term nd');
 12  end;
 13  /

PL/SQL procedure successfully completed.

SCOTT@orcl_11gR2> create index context_idx
  2  on search (search_term)
  3  indextype is ctxsys.context
  4  parameters
  5    ('lexer		mylex
  6  	 wordlist	mywordlist
  7  	 section group	for_ndata
  8  	 datastore	mcds
  9  	 transactional')
 10  /

Index created.

SCOTT@orcl_11gR2> select score (1), search_term
  2  from   search
  3  where  contains
  4  	      (search_term,
  5  	       'ndata (nd, Rob Lough, noorder, proximity)',
  6  	       1) > 0
  7  order  by score (1) desc
  8  /

  SCORE(1) SEARCH_TERM
---------- ------------------------------
        47 Rob L'owe
        47 Rob Lowe

2 rows selected.

SCOTT@orcl_11gR2>

[Updated on: Mon, 04 April 2011 13:17]

Report message to a moderator

Re: Oracle Text Index using Soundex [message #503260 is a reply to message #501203] Wed, 13 April 2011 20:54 Go to previous message
johanelmander457
Messages: 1
Registered: March 2011
Junior Member
Hi

This topic help me a lot in developing my project. I will contribute more when I finished it.
Previous Topic: Near Clause
Next Topic: query with clob
Goto Forum:
  


Current Time: Sat Nov 23 06:18:19 CST 2024