Single Query suddenly goes slow until restart or change [message #428168] |
Tue, 27 October 2009 06:59 |
ihristov
Messages: 2 Registered: October 2009 Location: Bulgaria
|
Junior Member |
|
|
Hello all,
I have performance issue with one single query. The query is called using dbExpress from Delphi application.
Since a month, almost every week this query starts going VERY slow. Usually it executes in a 0.5-1 second, but when it's slowed down, it takes approx. 4-5 minutes to execute.
The query runs on a single table (~4 000 000 rows) and runs on primary index. The index itself consists of 5 fields - the query uses the first 4 of them. I know it's not a good design but restructuring the table/index is out of question, AFAIK.
Until now, I figured out two ways to speed things up. The first and obvious is to restart the server or shut/start the database. It works. For a week. Then we're back at first square.
The second way is some kind of a mystery to me: If I change the query a little bit, never mind how - for example, adding 'and (1=1)' in the clause, after deploying the application, it starts running fast. Again for a week or so.
I've attached an jpeg showing the explain plans on slow and fast executing queries. They look almost the same.
I'm running out of ideas for now. It's very annoying (and unsafe) to restart the machine on a week basis. Thus any help is greatly appreciated.
Things I've done so far:
- installed last service packs from Oracle and Microsoft.
- tried different versions of dbexpora.dll.
- Regularly gathered and estimated statistics and analyzings.
- Made sure that the server is not under heavy load (antiviruses, firewalls, archive loggers are stopped or uninstalled for a while).
- made sure that there is enough free space on hard disks.
- the disk array itself is in good health - defragmented, HDD tools doesn't find anything suspicious.
Final thoughts: Maybe the sql cache is going overloaded and someway stops executing the query. That's why when I restart the machine or change the query clause insignificantly, it starts running fast. I took a peek on v$sql and v$sqlarea but there is nothing special, and honestly, I'm not very familiar reading the values in this view - I'm just comparing the values returned by this query with the others - they look similar.
Anyway, if someone could help, it would be great.
Regards,
Ivan
Screenshots showing the explain plans for slow and fast query:
-
Attachment: Query.jpg
(Size: 188.69KB, Downloaded 4765 times)
|
|
|
|
|
Re: Single Query suddenly goes slow until restart or change [message #428931 is a reply to message #428177] |
Fri, 30 October 2009 16:42 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
This is actually a fairly common problem. It is caused by a combination of Bind Variables and Skewed Data.
What happens is the first time you run a query, Oracle peeks at the values of the bind variables and optimises a plan for those values. Whilst the SQL remains in the Shared Pool, the same plan will be used.
If the first run was for bind variables that return a "normal" amount of data, then Oracle will choose a plan appropriate for normal conditions and all is well. However, if the bind variables for the first run return an unusually large amount of data, then Oracle will optimise for larger volumes by using full scans and hash joins. This is fine for large volumes, the problem comes when oracle continues to use this plan for bind variable combinations that would return small volumes.
The reason it suddenly goes bad and stays that way is because - when the SQL is not used for a while - it gets aged out of the shared pool. If the next run uses skewed values, the "bad" plan gets used and stays in the shared pool for subsequent runs.
Rather than bouncing the database, you can just flush the shared pool to get the SQL to re-parse. Try this next time it happens, if it works then it means I am probably right.
However, if I am right it would mean that the slow plan and the fast plan would be different. Yours are the same. But looking at the slow session, we see the Long Operations tab, meaning that session has been doing some big full table scans at some stage. Since your plan does not contain any full table scans, it mut have come from an earlier SQL. It is possible that you have not caught the problem SQL here.
Anyway, try flushing the pool next time. If that works, look for a SQL with bind variables, where the bind columns could have a small number of matches or a large number depending on the value supplied.
You can rewrite the SQL in two SQLs - one optimised for long runs. Alternatively you can use Plan Stability (outlines) to guarantee a preferred plan, or you can add hints to the SQL.
Ross Leishman
|
|
|