Home » RDBMS Server » Performance Tuning » ASH data to identify CPU 100%
ASH data to identify CPU 100% [message #650098] Fri, 15 April 2016 04:01 Go to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
Hi All,

I am trying to query the ASH data and check if that gives an indication of CPU getting heavily loaded. There are other ways for it but I am told to not refer to the folks who have access on db server and query the load on the DB server through top and sar commands and measure CPU but instead to check at db level and identify the issue. My thinking is that if cpu is getting all used up - like 98% CPU or 100% CPU -then it will show up in the ASH views as the top wait event for this wait event: resmgr:cpu quantum.


Based on that I wrote this query:

SELECT rn
  FROM (SELECT event, total_wait_time, ROWNUM rn
          FROM (  SELECT NVL (a.event, 'ON CPU') AS event,
                         COUNT (*) AS total_wait_time
                    FROM gv$active_session_history a
                   WHERE a.sample_time > SYSDATE - 15 / (24 * 60)   -- 15 mins
                GROUP BY a.event
                ORDER BY total_wait_time DESC))
 WHERE event = 'resmgr:cpu quantum';


Here if the RN is 1 it means CPU is under pressure for last 15 minutes and everything is waiting on CPU and that is unusual and requires to be looked into. (I will be wrapping this into a shell script that emails for this alert).


My question is that is this reasoning valid or is there a different way to get such data from the ASH view (gv$active_session_history) for the issue of CPU getting all used up.

(We had an outage recently when a bad query caused the entire application to come down as it used all cpu and a single execution took more than one hour and there were several such query coming from the application.)


Thanks,
OrauserN
Re: ASH data to identify CPU 100% [message #650099 is a reply to message #650098] Fri, 15 April 2016 04:21 Go to previous messageGo to next message
John Watson
Messages: 8960
Registered: January 2010
Location: Global Village
Senior Member
The 'resmgr:cpu quantum' wait event is not relevant, it is to do with the Resource Manager, not CPU usage. Just generate an AWR report, that will show you the CPU usage.
Re: ASH data to identify CPU 100% [message #650100 is a reply to message #650099] Fri, 15 April 2016 04:25 Go to previous messageGo to next message
orausern
Messages: 826
Registered: December 2005
Senior Member
Thanks for sharing your views John.

The issue is that when CPU usage hits like 99% or so then whole app hangs. At that time resource manager kicks in in order to allocate cpu whenever it can to the processes that are waiting for cpu and that's when this wait event is seen in the ASH. -that is my understanding. Hope I am right.

[Updated on: Fri, 15 April 2016 04:25]

Report message to a moderator

Re: ASH data to identify CPU 100% [message #650101 is a reply to message #650100] Fri, 15 April 2016 04:31 Go to previous messageGo to next message
John Watson
Messages: 8960
Registered: January 2010
Location: Global Village
Senior Member
I don't know how you have configured your Resource Manager Plan. I do know that you seem to be trying to find complicated ways to do simple things. For example, I notice this,
Quote:
everything is waiting on CPU and that is unusual and requires to be looked into. (I will be wrapping this into a shell script that emails for this alert).

when you could just configure an alert threshold with dbms_server_alert.
Re: ASH data to identify CPU 100% [message #650109 is a reply to message #650101] Fri, 15 April 2016 05:46 Go to previous message
orausern
Messages: 826
Registered: December 2005
Senior Member
Thank you John! I will drop my clumsy idea and explore on your inputs.

I see the following in Morgan's library(http://www.morganslibrary.org/reference/pkgs/dbms_serv_alert.html)...this code would insert a row in dba_alerts_history and that would be queried by the shell script to send mail right? ( I am Dev. dba and don't have access to SYS or don't have access to OEM and must operate with DBA privileges only (i.e. of SYSTEM user)).

DECLARE
   vWarnOp    NUMBER (10);
   vWarnVal   VARCHAR2 (100);
   vCritOp    NUMBER (10);
   vCritVal   VARCHAR2 (100);
   vObsvPer   NUMBER (5);
   vConOcur   NUMBER (5);
-- set thresholds
BEGIN
   DBMS_SERVER_ALERT.set_threshold (DBMS_SERVER_ALERT.session_cpu_sec,
                                    DBMS_SERVER_ALERT.operator_ge,
                                    200,
                                    DBMS_SERVER_ALERT.operator_ge,
                                    400,
                                    1,
                                    1,
                                    NULL,
                                    DBMS_SERVER_ALERT.object_type_system,
                                    NULL);

   DBMS_SERVER_ALERT.set_threshold (DBMS_SERVER_ALERT.session_cpu_txn,
                                    DBMS_SERVER_ALERT.operator_ge,
                                    25,
                                    DBMS_SERVER_ALERT.operator_ge,
                                    40,
                                    1,
                                    1,
                                    NULL,
                                    DBMS_SERVER_ALERT.object_type_system,
                                    NULL);
END;
/

[Updated on: Fri, 15 April 2016 05:51]

Report message to a moderator

Previous Topic: Taking more time to Drop interval Partitions
Next Topic: What Is The Correct Formula For Buffer Cache Hit Ratio
Goto Forum:
  


Current Time: Tue Nov 26 21:23:28 CST 2024