Oracle Text - wildcard expansion too many rows [message #312660] |
Wed, 09 April 2008 10:00 |
SaraC
Messages: 81 Registered: August 2005
|
Member |
|
|
Just typed a big message and lost it all so will try to keep this shorter.
Using Oracle Text 10g, implementing a CONTAINS search. Have created a stoplist using ctx.ddl however it doesnt get any terms added to it since we want to search on all terms including stopwords. so all terms are indexed for searching. ctxsys.context Index has been created, with printjoins specified so special chars are indexed too.
I want to search for all terms in any order.
Trying to run a search using
CONTAINS(term, '(%pain% AND %in% AND %ear%)
gets the message Wildcard query expansion returns too many results error - due to the use of 'in' (ive tried the other terms without in it returns results.
'In' matches a very large number of terms in our dataset >15000 ( there is no max rows defined explicitly as I thought the default was 5000)
The wierd thing is that when I use
CONTAINS(term, '(%pain% AND %the% AND %chest%)
where %the% on its own matches well over 15000 records, a result is returned with no error.
Is this a problem with %in%?
Thanks for any info
Sara
|
|
|
Re: Oracle Text - wildcard expansion too many rows [message #312751 is a reply to message #312660] |
Wed, 09 April 2008 13:03 |
|
Barbara Boehmer
Messages: 9101 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
In Oracle 10g, the default value for wildcard_maxterms is 5,000 and the maximum that you can set it to is 15,000. In Oracle 11g, the default is 20,000 and the maximum is 50,000. So, until you upgrade, all you can do is set it to 15,000 and find a way to narrow your search terms, perhaps settling for adding "in" as a stopword.
|
|
|
Re: Oracle Text - wildcard expansion too many rows [message #312991 is a reply to message #312751] |
Thu, 10 April 2008 05:18 |
SaraC
Messages: 81 Registered: August 2005
|
Member |
|
|
Thanks for your reply Barbara.
I just checked and the database that the Oracle Text query is running on is on 11g now and still with no wildcard_maxterms set, when creating the preferences etc.
Ive checked the tokens created in the index
IN has more than 20000 tokens
CONTAINS(term, '(%in%)', 1) > 0
returns an error - wildcard expansion too many terms
OF has more than 50000 tokens
CONTAINS(term,'(%of%)',1) > 0
returns more than 50,000 records
CONTAINS(term, '(%pain% AND %in% AND %ear%)', 1) > 0
I expect to return only 2or 3 rows but get the expansion error
CONTAINS(term, '(%pain% AND %of% AND %back%)', 1) > 0
Where the %of% has more tokens than the 50,000 max
Including the above statement in my query still manages to return 6 records.
I cant see how the %of% query can return records where part of the search text has to process more than the max records, but the %in% query cant
I presume it could be because '%in%' is part of more words in my dataset that '%of%' is.
Thx
Sara
|
|
|
|
Re: Oracle Text - wildcard expansion too many rows [message #313917 is a reply to message #312660] |
Tue, 15 April 2008 03:07 |
SaraC
Messages: 81 Registered: August 2005
|
Member |
|
|
Thanks Barbara
We have set the wildcard_maxterms to the maximum of 50000 which allows the search to return results.
I realise that the error could still be returned depending on the terms in the database and the search terms, but its less likely now. So its resolved - for now.
I know it would be easier to use the stoplist - but the requirements say I have to search on stopword terms.
Thanks
|
|
|
Re: Oracle Text - wildcard expansion too many rows [message #318249 is a reply to message #312660] |
Tue, 06 May 2008 04:15 |
SaraC
Messages: 81 Registered: August 2005
|
Member |
|
|
Another related question.
After changing the wildcard_maxterms to solve the immediate problem, I was asked these questions relating to upping the wildcard_maxterms attribute from default of 20000 to max of 50000.
Will changing this attribute add any additional stress to the db server?
Will it affect the db configuration in any way? - I thought not since it is an attribute of the index and can be changed at index build time.
Will the change result in any additional SGA usage on the db?
Can anyone point me to somewhere where I can find this sort of information please?
Ive had a look in Oracle Text Developers Guide.pdf but couldn't see anything to answer my queries.
Many thanks
Sara
[Updated on: Tue, 06 May 2008 04:19] Report message to a moderator
|
|
|
|