Help with explain plans [message #180995] |
Thu, 06 July 2006 04:16 |
charliebankes
Messages: 11 Registered: June 2006
|
Junior Member |
|
|
Running Oracle 9i r2.
Doing a simple query against 4 tables - CS_REP, TP_REP, MXOD_TP_REP, and TPBD_REP- CS_REP and TP_REP are the larget with about 10 mill rows, the other two have about 1 mill rows in each.
All tables have had stats gathered using dbms_stats.
When run with RULE hint and when stats have been dropped for CS_REP table performance is good. BUT when stats are re-applied for CS_REP then performance degrades considerably from under 1 second to 19 minutes!
Explain plans and query below.
I'm currently regenerating stats for CS_REP for all columns with skewonly paramenter but in the meantime has anyone got any ideas?
SELECT
A.M_CONTRACT, A.M_TP_PFOLIO, DECODE(trim(A.M_CNT_TYPO),'',A.M_CMP_TYPO,A.M_CNT_TYPO)
as TYPO, A.M_CNT_TYPO, A.M_TP_DTETRN, A.M_TP_DTESYS, B.M_F_TYPE, B.M_F_CURRENCY, B.M_F_AMOUNT
, B.M_F_VALUE, B.M_F_OBSCOM, C.M_OMR_PC
FROM TP_REP A
, CS_REP B
, MXOD_TP_REP C
, TPBD_REP D
WHERE ( B.M_REF_DATA=571
AND A.M_NB = B.M_NB
AND A.M_REF_DATA = B.M_REF_DATA
AND A.M_TP_PFOLIO = B.M_TP_PFOLIO
AND A.M_TPID = C.M_TPID
AND A.M_TPID = D.M_TPID
AND (B.M_F_OBSCOM <> 'Y'
OR A.M_CMP_TYPO = 'FXNDF') )
AND (B.M_F_VALUE <= '23-JUN-2006')
AND a.m_ref_data = c.m_ref_data
AND c.m_tpid = d.m_tpid
AND a.m_ref_data = d.m_ref_data
AND c.m_ref_data = d.m_ref_data
FULL STATS - slow
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=CHOOSE 1 127896
TABLE ACCESS BY INDEX ROWID CS_REP 1 58 1
NESTED LOOPS 1 172 127896
HASH JOIN 1 114 127895
HASH JOIN 208 K 4 M 95791
VIEW ABN_ODYSSEY_PROD.index$_join$_004 1 M 8 M 5084
HASH JOIN 208 K 4 M 95791
INDEX RANGE SCAN TPBD_REF_DATA 1 M 8 M 344
INDEX FAST FULL SCAN TPBD_TPID 1 M 8 M 344
TABLE ACCESS BY INDEX ROWID MXOD_TP_REP 1 M 14 M 89947
INDEX RANGE SCAN MXOD_TP_REF_DATA 1 M 4420
TABLE ACCESS BY INDEX ROWI TP_REP 1 M 100 M 30179
INDEX RANGE SCAN TP_REF_DATA 1 5078
INDEX RANGE SCAN CS_NB 14 2
RULE - quick
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=HINT: RULE
TABLE ACCESS BY INDEX ROWID MXOD_TP_REP
NESTED LOOPS
NESTED LOOPS
NESTED LOOPS
TABLE ACCESS BY INDEX ROWID CS_REP
INDEX RANGE SCAN CS_REF_DATA
TABLE ACCESS BY INDEX ROWID TP_REP
INDEX RANGE SCAN TP_PFOLIO
AND-EQUAL
INDEX RANGE SCAN TPBD_TPID
INDEX RANGE SCAN TPBD_REF_DATA
INDEX RANGE SCAN MXOD_TP_TPID
NO STATS ON CS_REP - quick
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
SELECT STATEMENT Optimizer Mode=CHOOSE 1 6148
TABLE ACCESS BY INDEX ROWID TPBD_REP 1 9 1
NESTED LOOPS 1 199 6148
NESTED LOOPS 15 2 K 6143
NESTED LOOPS 17 2 K 6138
TABLE ACCESS BY INDEX ROWID CS_REP 15 K 1 M 17
INDEX RANGE SCAN CS_REF_DATA 1 K 5
TABLE ACCESS BY INDEX ROWID TP_REP 1 90 1
INDEX RANGE SCAN TP_PFOLIO 1 3
TABLE ACCESS BY INDEX ROWID MXOD_TP_REP 1 15 1
INDEX RANGE SCAN MXOD_TP_TPID 1 2
INDEX RANGE SCAN TPBD_TPID 1 2
[Updated on: Thu, 06 July 2006 06:23] by Moderator Report message to a moderator
|
|
|
|
|
|
|
|
Re: Help with explain plans [message #181118 is a reply to message #181105] |
Thu, 06 July 2006 22:14 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Both RULE and no stats plans perform Nested Loops joins, which means that the first row will be returned very quickly - hence your 1 second response. How long does it take to return every row though?
I'll assume it's less than 19 minutes using either of these plans. The problem is that the RULE based optimizer thinks that B.M_REF_DATA=571 will be very restrictive (actually, it "thinks" no such thing. The rule states that if there is an indexed constant predicate then that table is used as the driving table). The CBO thinks that it won't be so hot, and chooses to join CS_REP in last. This means that it joins all of the other tables for every value of M_REF_DATA, and only filters the non-571 rows at the end.
If that predicate really is restrictive (and it seems to be) then you need to encourage the CBO to drive off that table.
- Are you sure you have histograms calculated for CS_REP.M_REF_DATA? Even if you do, it may not necessarily help.
- Since the CBO does not seem to be aware of how few rows have a value of 571, try adding a CARDINALITY hint: /*+ CARDINALITY(B,10)*/
- If either of these help to get the query driving off CS_REP, but still using HASH joins on the other tables, you will need to use a FIRST_ROWS hint, or a collection of INDEX and USE_NL hints.
Ross Leishman
|
|
|
|
Re: Help with explain plans [message #181178 is a reply to message #181118] |
Fri, 07 July 2006 03:44 |
charliebankes
Messages: 11 Registered: June 2006
|
Junior Member |
|
|
New observation:
Table cs_rep has 10 million rows.
Column m_ref_data has 3 distinct values 571,572,573 all evenly spread.
An index exists on m_ref_data and m_f_value (in that order).
There are approx. 14300 distinct values in the index.
When I set the num_distnct in m_ref_data to 3 and the density to 1/3 the query runs slow. when the num_distnct in m_ref_data is set to 10million / 3 and density set to 3 / 10million the query runs fine.
I always thought that the num_distnct was the actual number of distinct values in this case 3.
Any thoughts?
[Updated on: Fri, 07 July 2006 04:16] Report message to a moderator
|
|
|
Re: Help with explain plans [message #181407 is a reply to message #181178] |
Sun, 09 July 2006 22:11 |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
If there are 3M rows with value 571, then I very much doubt that all of them are being returned in the 1 second originally quoted. If you select all 3M rows using the RULE plan, you should find it takes a lot longer that the 19 minutes required by the CHOOSE mode.
My CARDINALITY suggestion was based on the assumption that 571 was a restrictive skewed value. If the SQL doesn't return 3M rows, then it must be because one of the joins is not finding a matching row. If this is case, there may be a better alternative tuning method.
You never mentioned an ORDER BY. If it is taking a long time with the ORDER BY included, it is because - as I mentioned above - there are a lot of rows returned (possibly 3M). In order to sort the results, Oracle has to find all of the matching rows first. So, rather than taking 1sec to return the first unsorted row, it takes 20 minutes to find all 3M rows, sort them, and show the first sorted row.
There are no simple alternatives for sorting the results of such a statement. The bigger the result set, the longer it will take. If it is going to be a problem, you may consider researching Materialized Views.
Ross Leishman.
|
|
|