Home » RDBMS Server » Performance Tuning » EXPLAIN PLAN CHANGE (After migration plan getting changed.)
EXPLAIN PLAN CHANGE [message #274535] Tue, 16 October 2007 06:17 Go to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
Hi I have a query that joins about 14 tables. The largest table is a partitioned table with total rows of about 6million.

The query is returning results in four and half min (oracle version 9.2.0.5)

We migrated to a higher version (9.2.0.Cool. Now the plan is changed and the same query is taking about 30min.

The new machine is much faster, and of higher spec.

We are now running both the machines in parallel.

All the stats related information in both the environments is same. Still when we generate plan, they are different.

Have you ever faced such a problem? do you know what to look for?

Stats are same.
Partitions are same.
Number of records are same.
Both environments are running in Parallel.
I compared the databases using toad. Everything except db_block_checking are same.

Thanks
Re: EXPLAIN PLAN CHANGE [message #274549 is a reply to message #274535] Tue, 16 October 2007 07:00 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
But patchset is not the same so code is not the same.
Yes it happens all the time, this is why tests are mandatory BEFORE applying patch on production.

Activate a 10053 trace on both system to see why optimizer chooses a different plan.

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274589 is a reply to message #274535] Tue, 16 October 2007 09:04 Go to previous messageGo to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
We know that the plan itself is not correct. It is calculating the cost wrong and picking a wrong plan.

As I have the plan from the old version, I used order hint and forced the same plan, which is costing more than the one generated by new version. Yet the high cost plan is giving results in less than three min.

Point is why optimizer is calculating the cost of the plan that takes more time to execute as less than the one which performs better.
Re: EXPLAIN PLAN CHANGE [message #274593 is a reply to message #274589] Tue, 16 October 2007 09:12 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Analyze trace, this gives you why.

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274635 is a reply to message #274535] Tue, 16 October 2007 11:54 Go to previous messageGo to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
Now I have trace files for both the environments. But strugguling to understand why optimizer calculated the cost and choose a wrong plan.

Michel the point is the plan is not correct. In new version the cost for bad plan is less than good one.

Does the trace gives any clue regrding why this is happening?
Re: EXPLAIN PLAN CHANGE [message #274639 is a reply to message #274535] Tue, 16 October 2007 12:05 Go to previous messageGo to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
Just to give more info: We know that we can get better performance from the query by forcing an order(i checked this using order hint, and got the response in less than three minutes which is better than old version). It's just the optimizer thinking that a plan which is not good costs less and following it. I can't use hints as this query is generated and fired by BO.

OLD ENV(9.2.0.5)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 3.74 3.95 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 268.08 265.17 3699 24913150 0 25
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 271.82 269.13 3699 24913150 0 25

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
SQL*Net message from client 2 342.01 342.02
db file scattered read 9 0.05 0.24
db file sequential read 16 0.01 0.05
latch free 76 0.02 0.38
direct path write 1147 0.35 4.85
direct path read 1147 0.02 0.34
SQL*Net more data to client 2 0.00 0.00
********************************************************************************

New env (9.2.0.Cool
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 4.00 3.92 0 0 0 0
Execute 1 0.01 0.00 0 0 0 0
Fetch 1 1765.70 1725.22 622 116156317 0 25
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 1769.71 1729.15 622 116156317 0 25

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 2 0.00 0.00
SQL*Net message from client 2 859.06 859.07
db file scattered read 27 0.01 0.10
db file sequential read 340 0.00 0.03
latch free 5 0.01 0.06
SQL*Net more data to client 2 0.00 0.00
********************************************************************************
Re: EXPLAIN PLAN CHANGE [message #274648 is a reply to message #274639] Tue, 16 October 2007 12:29 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
I said a 10053 trace not a 10046.
A 10053 gives you all steps that optimizer explored.

Please read carefully what we post.

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274703 is a reply to message #274648] Tue, 16 October 2007 22:04 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
There are 87178291200 different possible join orders of a 14-table join. The CBO does not evaluate all of them. To my knowlege the choice of which plans to evaluate is somewhat chaotic; the slightest change in inputs can result in completely different subset of plans being evaluated.

The chances of the plan for such a SQL remaining stable after a reboot is suspect. The chances of stability between databases is remote indeed.

This will get worse still when you upgrade to 10g. Dynamic sampling means that the plan can be affected from run-to-run by the physical distribution of data in your table. ie. Same data + different storage = different plan.

The ONLY way to ensure plan stability - espacially between databases - is to use the ORDERED hint, plus USE_* hints to specify join methods, plus INDEX*/FULL hints to specify access methods. Alternatively, use OUTLINES (which do exactly the same thing).

Ross Leishman
Re: EXPLAIN PLAN CHANGE [message #274749 is a reply to message #274703] Wed, 17 October 2007 01:20 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Is stability the way?
Each version comes with new access paths and new improvements (and new bugs).

You want stability, use RBO and nested loops.
Otherwise, read the new features, follow the new guidelines, let the optimizer chooses its path and fix (hint) if it is not correct. Yes, this is painful but this the way for performances.
If performances is not your issue, go back to RBO.

By the way, hints (ORDERED, INDEX and so on) is not a guarantee of stability. Internal query transformations are applied BEFORE hints, so you could have some surprises with ORDERED hint for instance if Oracle internally changes the order of table in FROM clause.

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274802 is a reply to message #274749] Wed, 17 October 2007 03:16 Go to previous messageGo to next message
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
I wasn't advocating stability, just describing how you would go about achieving it.

Generally speaking, I agree with you - get good stats and let CBO do the rest.

But 14-table joins are a different story. 87178291200 join orders is just the start, then you have up to 4782969 different join-type combinations and 16384 access method combinations for a total of 6,831,653,917,205,672,755,200 different plans. Obviously a great many of these are silly plans that don't deserve evaluation, but it's terribly easy for a good plan to be missed.

With SQLs this big, the chances of an optimal plan NOT being amongst those evaluated is quite reasonable.

In these situations I would usually suggest a simple LEADING or ORDERED hint. In most cases the driving table is not a choice that is affected by version upgrades, and its enough to lead CBO to the optimal plan. However if you have a mission-critical application or a crippling support contract, then plan stability is a pretty good thing.

Ross Leishman
Re: EXPLAIN PLAN CHANGE [message #274812 is a reply to message #274535] Wed, 17 October 2007 04:39 Go to previous messageGo to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
As I told before, I am trying to get the optimizer get the plan that would take less time because these queries are fired from BO. If this is a query is used in a sql, I would have gone for a hint(which I already demonstrated).

The point here is why the optimizer is calculating the cost of the non-optimum plan to be less than the optimum one, and also differently between 9.2.0.5 and 9.2.0.8?

Michel, I have now got the 10053 trace files. I used tkprof from 9.2.0.1. Will this make the formatting any different? I still did not find any useful information what the optimizer considered before choosing this plan.

Thanks,
Aruna
Re: EXPLAIN PLAN CHANGE [message #274813 is a reply to message #274812] Wed, 17 October 2007 04:50 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
There is no format tool for 10053 trace you have to read it yourself.
You MUST have useful information in the traces, it can't be otherwise. These files give all the steps the optimizer did, all the paths it considered, all the statistics it used, all the choices it made...

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274842 is a reply to message #274813] Wed, 17 October 2007 06:51 Go to previous messageGo to next message
mson77
Messages: 208
Registered: August 2007
Location: Brazil
Senior Member
Hello arunprasad_bh,

As you said before that you have access to both oracle environments. And here we are talking about paths chosen by the optimizer considering that both have the same stats information.

Could you run and paste here the result of the command below for each oracle environment?
show parameter optimizer;
Regards,



mson77
Re: EXPLAIN PLAN CHANGE [message #274846 is a reply to message #274842] Wed, 17 October 2007 07:03 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
This is just a small part of many reasons.
And this is include in the 10053 trace.
OP has just to analyze the differences between the 2 files.

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274866 is a reply to message #274535] Wed, 17 October 2007 08:32 Go to previous messageGo to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production

SQL> show parameter optimizer

NAME TYPE VALUE
------------------------------------ ----------- ------------------
optimizer_dynamic_sampling integer 1
optimizer_features_enable string 9.2.0
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
optimizer_max_permutations integer 2000
optimizer_mode string CHOOSE

Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

SQL> SHOW PARAMETER OPTIMIZER

NAME TYPE VALUE
------------------------------------ ----------- -------------------------
optimizer_dynamic_sampling integer 1
optimizer_features_enable string 9.2.0
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
optimizer_max_permutations integer 2000
optimizer_mode string CHOOSE
SQL>

Michel I got the 10053 files from both the environments. The 9.2.0.8 is extensive. But 9.2.0.5 has not produced equivalent file. It does not have all the stats provided by higher version.
Re: EXPLAIN PLAN CHANGE [message #274870 is a reply to message #274866] Wed, 17 October 2007 08:45 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Which level do you use?
Use level 1. Level 2 does not give statistics.

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274872 is a reply to message #274535] Wed, 17 October 2007 08:50 Go to previous messageGo to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
Level 1 only. Ran the same set of commands in both the environments. But 9.2.0.5 is not giving the detailed steps.
Re: EXPLAIN PLAN CHANGE [message #274875 is a reply to message #274535] Wed, 17 October 2007 09:07 Go to previous messageGo to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
It looks like these files are ordered differently. But do have the details in both files.

What exactly is to look for?
Re: EXPLAIN PLAN CHANGE [message #274894 is a reply to message #274875] Wed, 17 October 2007 10:55 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
What are the differences.
Why optimizer chooses a different path.
You have to follow step by step the optimizer in both cases.
Long and painful work but it is the only to know the reason.

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274905 is a reply to message #274535] Wed, 17 October 2007 11:18 Go to previous messageGo to next message
arunprasad_bh
Messages: 32
Registered: June 2007
Member
There are more than 300 different combinations considered. There are also differences in the way the join orders are numbered.

Has someone did this before? I would attempt to do this the hardway, just to know the cause. But knowing that some one did this exercise before, would be of great help.

In the first glance, I can say that new version has not got the optimum join order. I have compared join order using #n numbering of the tables.

Re: EXPLAIN PLAN CHANGE [message #274907 is a reply to message #274905] Wed, 17 October 2007 11:20 Go to previous messageGo to next message
Michel Cadot
Messages: 68716
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
I did it and J. Lewis wrote a whole book (and is currently finishing the second one) on optimizer, I think he spent many night to analyze 10053 (and even other traces on sort, hash and parallel query).

Regards
Michel
Re: EXPLAIN PLAN CHANGE [message #274944 is a reply to message #274812] Wed, 17 October 2007 16:47 Go to previous message
flyboy
Messages: 1903
Registered: November 2006
Senior Member
arunprasad_bh wrote on Wed, 17 October 2007 11:39

As I told before, I am trying to get the optimizer get the plan that would take less time because these queries are fired from BO. If this is a query is used in a sql, I would have gone for a hint(which I already demonstrated).

First, apologize my terminology, which may not be accurate, as I am not BO developer. But it is possible to force hint into BO generated query.
The trick is in creation of user defined (derived) column in the universe containing this hint followed by anything (preferrably NULL, eg. /*+ ORDERED */ NULL). Then put this column as the first one in the report query.
It is quite tricky, but the generated query contains hint in the right place (right after SELECT). Also I do not know whether it is possible to assure order of tables in the FROM clause, so you may prefer LEADING hint.
Previous Topic: Stats Gathering on Whole Schema
Next Topic: Interpretation of Alert Log Errors
Goto Forum:
  


Current Time: Sat Nov 23 05:17:33 CST 2024