RE: SQL Performance tuning advice
Date: Wed, 3 Jul 2024 09:22:12 +0000
Message-ID: <LO0P265MB4461CF564F857A109893A474E6DD2_at_LO0P265MB4461.GBRP265.PROD.OUTLOOK.COM>
We have given the user the data they require so this has reduced a little in priority.
> You can use SQL profiles or SQL patches with force matching - I would recommend the latter.
This does seem to be the best approach - thanks.
> Just create a good old CBO trace (10053 trace name context forever, level 1) in both environments and have a look for lines like "SPD: Directive valid: dirid = 13237011096450186301, state = 1, flags = 1, loc = 1 {EC(91171)[1, 2]}".
Except that there are more than 50,000 lines like that in test, and more than 99,000 lines in prod. Bug 29937655 is really not helping here.
> Please be aware that your non-production database has to be exactly the same as applied transformations can influence which SPDs are/can be used and also impact the SPD dynamic sampling results.
I took a copy of a standby using a SAN snapshot, and was still unable to reproduce the issue. I notice we appear to have hit bug 27557237 in production where IOTFRSPEED is set to 4096 where it is more like 334677 in the non-production environments which use the same SAN. I don't think this is the issue though.
Another thing I noticed is that cardinality seems much better in the environments where the query runs fast. This seems to be because the optimizer has added a load of OPT_ESTIMATE hints - Is this statistics feedback? This is not mentioned in the notes of the plan though.
Another thing I noticed is that an index is selected in prod where the query runs slow instead of a table scan in test where it runs fast because:
>> Index Card adjusted from 107533.272786 to 1.000000 due to adaptive dynamic sampling
We have set optimizer_adaptive_statistics = true, but it is also true in the test instances. I am not sure whether that is the whole reason the poor plan is chosen. I am not sure which parts of the trace file to zoom in on. The production trace is 150M, and the test one is 80M, so without this knowledge it is going to take forever to understand it.
When I tried to pare down the SQL to just the three tables in that section of the plan, I get the table scans and hash joins that are fast in non-production. Maybe I have missed a predicate.
I would like to understand why the plan is different in production, as it is rather annoying for developers for their code to run like a rocket in every environment apart from production, but in this case I think the SQL patch is the way to go.
I have also opened another SR about bug 29937655, though I don't expect it will be fixed as we are on 19c.
Thanks for your help!
-- http://www.freelists.org/webpage/oracle-lReceived on Wed Jul 03 2024 - 11:22:12 CEST