index performing bad [message #133498]
Fri, 19 August 2005 06:54
sebastianR
Hi,
I want to tune a SELECT query by using a concatenated index on two columns of my table.
One of the columns the index covers is KEY_VAL, a VARCHAR2 with good selectivity: about 640,000 distinct values in 870,000 rows.
The other one (END_DATE) is rather unselective on its own, roughly 15 distinct values in 870,000 rows, but in combination with KEY_VAL it should be selective.
My index looks like this:

CREATE INDEX IDX_1 ON TABLE_OBJ_KEY
("TABLE"."ISDATENULL"("END_DATE"), UPPER("KEY_VAL"))
...

where ISDATENULL is a simple function that tests whether the date is null (a function-based index, FBI, for the date comparison).
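(The function body isn't shown in the thread; a minimal sketch of what it might look like is below. The DETERMINISTIC keyword is required for a function used in a function-based index, and the '1' return value matches the query further down:)

-- Hypothetical sketch only; the real ISDATENULL is not shown in this thread.
CREATE OR REPLACE FUNCTION isdatenull (p_date IN DATE)
  RETURN VARCHAR2 DETERMINISTIC
IS
BEGIN
  RETURN CASE WHEN p_date IS NULL THEN '1' ELSE '0' END;
END isdatenull;
/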
My query:

SELECT DISTINCT COL_1, 3742
FROM TABLE_1 T1
WHERE T1.END_DATE > SYSDATE
--- THE FOLLOWING QUERY IS INDEXED ----
UNION ALL
SELECT DISTINCT COL_1, 3742
FROM TABLE_1 T1
WHERE isdatenull(T1.END_DATE) = '1'
AND UPPER(T1.KEY_VAL) LIKE UPPER('%125')
-- THE CODE BELOW IS NOT IMPORTANT ------
UNION ALL
SELECT DISTINCT T2.KEY_ID, 3742
FROM TABLE_2 T2
WHERE UPPER(T2.PUID) LIKE UPPER('%125')
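(One way to check which plan the indexed branch actually gets — DBMS_XPLAN ships with 9i and later; the statement below is just that middle branch run on its own:)

EXPLAIN PLAN FOR
SELECT DISTINCT COL_1, 3742
FROM TABLE_1 T1
WHERE isdatenull(T1.END_DATE) = '1'
AND UPPER(T1.KEY_VAL) LIKE UPPER('%125');

SELECT * FROM TABLE(dbms_xplan.display);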
Ok, the index is working, and it's faster than a full table scan, but only in the following cases:
KEY_VAL = '%12345%'
KEY_VAL = '%1234'
KEY_VAL = '123%'
So the problem is: unlike the full table scan, which runs in a fairly constant 1.7 seconds, my new index performs depending on what the user enters for KEY_VAL.
It is deadly slow when KEY_VAL = '%12', for example (34 seconds).
I understand why, or at least I have a guess: because nearly every row matches '%12', the index range scan will return nearly the whole table, right? So there are many, many table accesses by index ROWID, right?
Still, I don't fully understand why the index is slower in this case, so I would appreciate an explanation of WHY and HOW this slowness occurs.
The index scan returns many rows (nearly all), and for each row there is a table access by ROWID ... why is this slower than a full table scan?
And why doesn't the CBO do a full table scan in this case?
The cost of the index plan is 50.
The cost of the full table scan (TABLE ACCESS FULL) plan is 588.
How can I control when the full table scan should be used?
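(One way to experiment, sketched against the indexed branch above: a FULL hint forces the full-table-scan plan for a single statement, so the two timings can be compared directly:)

SELECT /*+ FULL(T1) */ DISTINCT COL_1, 3742
FROM TABLE_1 T1
WHERE isdatenull(T1.END_DATE) = '1'
AND UPPER(T1.KEY_VAL) LIKE UPPER('%125');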
Thank you for helping me out,
I appreciate every comment, suggestion, etc.,
Sebastian
Re: index performing bad [message #133512 is a reply to message #133498]
Fri, 19 August 2005 07:26
sebastianR
Ok, I figured out that I hadn't analyzed the table and the index, so the cost calculation was wrong.
I have now done so, but it seems the CBO isn't accurate enough: it switches to the full table scan at KEY_VAL = '1%' but still uses the index at KEY_VAL = '12%', where the time needed is greater than for the full table scan.
Sebastian
Re: index performing bad [message #133530 is a reply to message #133498]
Fri, 19 August 2005 08:28
smartin
Jim hit it on the multiblock-read advantage of the FTS. Although an index fast full scan (FFS) can do multiblock reads as well, that would mean reading the entire index, leaf and branch blocks and all, into memory and treating it as if it were a table.
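(Rough arithmetic to illustrate the point — the numbers here are assumptions, not measurements from this thread: with db_file_multiblock_read_count = 16 and a 10,000-block table, a full scan costs about 10,000 / 16 = 625 multiblock reads, while visiting a few hundred thousand rows through index ROWIDs can mean one single-block get per visit. The parameter can be checked like this:)

SELECT name, value
FROM v$parameter
WHERE name = 'db_file_multiblock_read_count';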
Also, I would think histograms would be very important in your case. How exactly did you gather statistics? I'd suggest something along the lines of

exec dbms_stats.gather_table_stats(user, 'TABLE_1', cascade => true, method_opt => 'for all indexed columns size 250');
Also, I would think there will be a big difference between a query containing

like '%12'

and one containing

like '12%'

Unless I'm misreading the above, it looks like you are using the two interchangeably. While I don't currently have a test to prove they will perform very differently (maybe I'll work on one and find out), I'm pretty sure having the wildcard at the beginning will be very different from having it at the end, meaning they aren't interchangeable performance-wise.
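(A quick illustration of why, assuming for simplicity a plain index on UPPER(KEY_VAL) rather than the concatenated one from the first post:)

SELECT COUNT(*) FROM table_1
WHERE UPPER(key_val) LIKE '12%';  -- known prefix '12': an index range scan can seek straight to it and stop early

SELECT COUNT(*) FROM table_1
WHERE UPPER(key_val) LIKE '%12';  -- no leading prefix: every index entry (or every row) has to be examined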
Re: index performing bad [message #133540 is a reply to message #133530]
Fri, 19 August 2005 09:16
sebastianR
Thank you for the link @JSI2001, I'm studying it right now.
As a matter of fact, I didn't collect any statistics at all except for using ANALYZE on the table and index whose performance I am trying to improve. (I've read now that it is better to use DBMS_STATS, and I am trying to get that to work.) I also understand now why to use histograms, and I am going to use them.
@smartin
You are right, I have already encountered the difference when using indexes with LIKE comparisons, i.e. the difference between '%12' and '12%'.
There is a difference in performance, but I still wonder why my index is used at all in cases like '%12': when I tried indexing a single column (not a concatenated index) and compared it against '%12', that index was not used.
Is this because I use a concatenated index? I've read that indexes don't work at all with LIKE comparisons (with leading wildcards?), but I think that is not true?
I've also read about a workaround for this '%12' case, where it is possible to use reverse() and alter the WHERE clause, but I didn't use it because my concatenated index was used nevertheless, and I have enough other things to work on for now.
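(For completeness, a sketch of that workaround — this is an assumption about how it could look, using a hypothetical helper STR_REVERSE, since the thread doesn't show the actual code: index the reversed value, and the leading-wildcard search becomes a trailing-wildcard search on the reversed string:)

-- Hypothetical helper; must be DETERMINISTIC to be usable in an index.
CREATE OR REPLACE FUNCTION str_reverse (p IN VARCHAR2)
  RETURN VARCHAR2 DETERMINISTIC
IS
  v VARCHAR2(4000);
BEGIN
  FOR i IN 1 .. LENGTH(p) LOOP
    v := SUBSTR(p, i, 1) || v;
  END LOOP;
  RETURN v;
END str_reverse;
/

CREATE INDEX idx_keyval_rev ON table_1 (str_reverse(UPPER(key_val)));

-- KEY_VAL LIKE '%125' becomes a '521%' prefix search on the reversed value:
SELECT COUNT(*)
FROM table_1
WHERE str_reverse(UPPER(key_val)) LIKE '521%';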
So the cost is the factor the CBO uses to determine which plan to execute, right? At the moment I am not satisfied with the CBO choosing the index range scan over the full table scan, so my first approach should be collecting statistics, in particular histograms (because of the skewed density), so that maybe the CBO already picks my empirically faster plan.
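(One way to verify afterwards that a histogram was actually gathered — these are standard data dictionary columns:)

SELECT column_name, num_distinct, num_buckets, density
FROM user_tab_columns
WHERE table_name = 'TABLE_1';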
One question about altering optimizer options, which for now I won't do: how does one estimate the overall effect on execution plans when altering the optimizer options just to tune one specific SQL statement? Is this approach legitimate at all? In my opinion it would be really careless.
Thank you for your suggestions,
Sebastian
Re: index performing bad [message #133543 is a reply to message #133498]
Fri, 19 August 2005 09:25
smartin
If by altering optimizer options you just mean gathering stats, then I think the more information you can provide to the CBO the better.
But if you mean things like optimizer_mode and other parameters, then yes, you should definitely be aware of the global effect these changes would have on your database. I find it best to set these values to what you think (and test in development to prove) would be the "best" settings at the global level for "most" of the work the db will do.
Then you can temporarily alter them at the local level for specific queries, transactions, or applications by using things like the ALL_ROWS or FIRST_ROWS hints, or ALTER SESSION statements.
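(For example — standard syntax, not something quoted from this thread:)

-- Session-level override; only this session is affected:
ALTER SESSION SET optimizer_mode = FIRST_ROWS;

-- Statement-level override via a hint; only this query is affected:
SELECT /*+ ALL_ROWS */ COUNT(*) FROM table_1;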