Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: Spotting the real cause of a Production slowdown (10g)

RE: Spotting the real cause of a Production slowdown (10g)

From: Schultz, Charles <sac_at_uillinois.edu>
Date: Thu, 20 Apr 2006 14:30:08 -0500
Message-ID: <565F609E6D736D439837F1A1A797F34171D24F@ADMINMAIL1.ui.uillinois.edu>


Thanks to all for the many replies on this - it gave us some really good ideas. Unfortunately, we are in the middle of such a problem now, and I highly doubt that the literal sql statements are causing our latch waits and high cpu usage right now. Additionally, the shared_pool is not changing that much (relative to past events, anywhere from 2.5gb to 3gb, sga_target = 10g). CPU is being consumed by many active processes, all doing "normal" things that usually do not consume this much CPU. At least, that is what we believe at this point in time.  

Paul, thanks for the hash_plan_value query - it definitely helps disprove that we have a truckload of literal statements (in other words, I do not think literal statements is our problem, at least not in large numbers). Niall, I am still seeing a number of statements that are not sampled into v$active_session_history, but those that are are showing a lot of library cache waits.


From: oracle-l-bounce_at_freelists.org
[mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Schultz, Charles Sent: Thursday, April 20, 2006 8:15 AM
To: oracle-l_at_freelists.org
Subject: Spotting the real cause of a Production slowdown (10g)

We have observed several critical slowdowns of our Production system since upgrading to 10g. The symptoms include a rapid rise in library cache latch waits (with associated, but less notable, rises in shared pool and library cache pin waits), CPU pegged at 100% and CPU load in triple-digits, and massive shared pool resizing evidenced by v$sga_resize_ops (1.7gb added in 15 seconds). Obviously, trying to run queries against the normal v$ views is rather slow (only adds to the latching problem). We sent a hanganalyze and systemstate dumps to Oracle Support, but after a couple months of trying to figure things out, we have word that the initial Support Engineer may have misdiagnosed our problem.

Based on what little I know and researched (reading a number of notes here, Metalinks, google, oraFAQ, etc), a large number of literal sql statements might be able to cause such symptoms. It is just hard for me to imagine the number of literal statements that would consume 1.7gb of shared pool in 15 seconds. Perhaps it is not quantity, but quality as well, each statement consuming a large chunk of real estate. The only problem with this conjecture is that I do not find evidence of this in v$sql (looking at number of executions, number of literals, sharable_mem).

But the puzzle is even more elusive. EM reports extremely (abnormally) large number of executions for seemingly random sql statements. In one case, we had 3 statements that were each executed 35 million times in one hour (2 of those statements had bind variables, the other was a "simple" select). I can never catch these large numbers in v$sql, so I have to assume they age out by the time we start to look. I have asked the application folks to see if these queries are related, and why the execution count is a couple orders of magnitude greater than normal. Still waiting for resolution at that end. It seems the general consensus that these larger numbers are symptomatic as well, that they only start to escalate when the system is severely slow and the middle-tier starts to timeout. NOTE: that is the working hypothesis, we are trying to prove it.

So, my question is two-fold. Do you experts out there have advice as to how we can reactively figure out what already happened? And how to proactively avoid this from happening again? Because of the automatic AWR snapshots only happening each hour, it is hard for us to get more granular when looking at a historical view. Perhaps we should temporarily increase that?

Any and all advice is gladly welcome. I am hoping to learn some of the tricks and glean from your experience.

Thanks in advance,

PS - after reading about Kyle Hailey and Roger Sanders work, anyone have access to a DMA for dummies? =)

charles schultz
oracle dba
aits - adsd
university of illinois

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Apr 20 2006 - 14:30:08 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US