Killer SQL and PGA
Date: Mon, 30 Jan 2012 15:13:11 -0800
Message-ID: <2500EFA954A6424E9B188BFDCD37EBF101035CD18773_at_CORPMAIL.molina.mhc>
We had a problem with our PGA growing larger than physical memory, dragging the system down until we were forced to restart the database. We recently upgraded from 9i to 11gR2 on Solaris with 16Gb physical memory hosting an OLTP application. 4Gb is used for SGA and 400Mb for PGA_aggregate_target. AMM and ASMM have not been enabled. Workarea_size_policy is set to AUTO. This is a simple database. No RAC, no shared servers, no parallel processing.
One of our analysts launched what appeared to be a simple ad hoc query: select * from deniedhist where icn in ('1', '2', '3', ... , 'x'). Unfortunately, the analyst had over 400 ICNs listed. DENIEDHIST is a UNION ALL view representing a faux-partitioned array of 129 tables, each with its own unique index on ICN. The execution plan (my comments included) shows that it is searching each underlying table 'x' number of times, using the unique index each time. This is the plan from a test version of the SQL with only 5 ICNs. The rest of the plan shows the iteration through the other 128 underlying tables (other views of the plan show the unique table and index names). 1422 SELECT STATEMENT REMOTE CHOOSE
1421 VIEW POS.DENIEDHIST 1420 UNION-ALL 11 CONCATENATION -- This is table 1 of 129 in the view 2 TABLE ACCESS BY INDEX ROWID POS.DENIED_HISTORY 1 INDEX UNIQUE SCAN UNIQUE POS.DENIED_HISTORY_ICN 4 TABLE ACCESS BY INDEX ROWID POS.DENIED_HISTORY 3 INDEX UNIQUE SCAN UNIQUE POS.DENIED_HISTORY_ICN 6 TABLE ACCESS BY INDEX ROWID POS.DENIED_HISTORY 5 INDEX UNIQUE SCAN UNIQUE POS.DENIED_HISTORY_ICN 8 TABLE ACCESS BY INDEX ROWID POS.DENIED_HISTORY 7 INDEX UNIQUE SCAN UNIQUE POS.DENIED_HISTORY_ICN 10 TABLE ACCESS BY INDEX ROWID POS.DENIED_HISTORY 9 INDEX UNIQUE SCAN UNIQUE POS.DENIED_HISTORY_ICN 22 CONCATENATION -- This is table 2 of 129 in the view ...
I estimated that the original version probably had over 120,000 steps in its execution plan. It ran for nearly an hour until memory reached a critical point, almost 20 Gb and paging like mad, that required us to abort the database and restart. I can run the full original query on a standby machine which is physically identical but is still running 9i. It finishes in 5 minutes and the PGA only grows to 500 Mb. The execution plans, at least for the small test version of the query, are the same on both machines.
I understand the problems with large 'IN' lists. There was a conversation on that topic here at the end of October . Educating the analyst and redesigning the query are secondary concerns. I also realize, after reading a bunch of Tom Kyte posts, that pga_aggregate_target will not constrain the growth of the PGA.
My real question is why the 11g memory management, without AMM or ASMM, would allow the PGA to grow so large. In 15 years of operations, there must have been similar bad queries against the database. This happened a day after the 11g upgrade. Any suggestions?
Bob Laverty
Molina Medicaid Solutions
IMPORTANT NOTICE TO RECIPIENT: This email is meant only for the intended recipient of the transmission. In addition, this email may be a communication that is privileged by law. If you received this email in error, any review, use, disclosure, distribution, or copying of this email is strictly prohibited. Please notify us immediately of the error by return email, and please delete this email from your system. Thank you for your cooperation.
-- http://www.freelists.org/webpage/oracle-lReceived on Mon Jan 30 2012 - 17:13:11 CST