RE: Capacity Planner from OEM VS Statspack

From: MacGregor, Ian A. <ian_at_SLAC.Stanford.EDU>
Date: Mon, 02 Feb 2004 15:32:39 -0800
Message-id: <26E3EC48949D134C94A1574B2C89466113A6E3@exchange2.slac.stanford.edu>

The job is question was purely hypothetical. I find your question about level-8 traces interesting. I included a level-8 job profile for an actual job. I think 10046 traces should be run and profiles created for every recurring job on the system; although I have not done so. I like to have baselines; I want to know trends. I want to know if a job's performance drop has been a steady decline or precipitous. One cannot trust the user here. He might claim that the drop as sudden when it was not. I try to capture the explain plans as well. Statspack does not do this well. I also capture outside of statspack information on the growth of objects. Finally I also run 10046 traces.

There are indeed gotchas. The CPU/IO report for each hour may hide problems due to its gross granularity. I don't read nor even produce these reports each day. I don't try to tune for the sake of tuning. I do however at times find folks who are incredibly patient, at least more patient than I with unsatisfactory performance. Also one cannot automatically assume the job in question is causing its own delays. Scope is a big problem with statspack.

As I said I do the traces as well. I had one user running a program which read and wrote to a file. The person called, I ran the trace and discovered a very large amount of "sql*plus waiting on client". After being so informed he checked the size of the file and found it to be just user 2 GB.

My method is to collect everything to build the most complete picture of why a jobs performance has deteriorated.

Ian

-----Original Message-----
From: Bobak, Mark [mailto:Mark.Bobak_at_il.proquest.com] Sent: Monday, February 02, 2004 1:13 PM
To: oracle-l_at_freelists.org
Subject: RE: Capacity Planner from OEM VS Statspack

Ian,

Consider that most systems will have:

jobs that run frequently
jobs that run infrequently, or only once and never again

These jobs will either perform poorly or perform acceptably well.

Jobs will have a relative level of criticality to the system, as defined = by your business rules and user community.

Unless a job is "critical", and executes "frequently" and is performing = "poorly" (all in quotes, because it's up to you to decide how to determine that), I would contend that it's = not worth chasing.

If you agree w/ that, then there should be an opportunity to get a 10046 = level-8 trace soon, since the job executes "frequently". If you don't agree w/ the above, then I'd like to = understand why.

-Mark

-----Original Message-----
From: MacGregor, Ian A. [mailto:ian_at_SLAC.Stanford.EDU] Sent: Monday, February 02, 2004 3:58 PM
To: 'oracle-l_at_freelists.org'
Subject: RE: Capacity Planner from OEM VS Statspack

What do you do when someone calls to say, "My job which usually takes 20 = minutes to run took over 2 hours last night?" Do you inquire about the = composition of the job and immediately run a 10046 trace on it? The = information from that trace may not represent what is happened the = prevous night because the conditions such as the load on the database = are different. However with statspack information I might see that the = number of direct path reads and writes went up significantly during that = period from their norm, and the waits also increased. Remember the = information is collected every 10 minutes I can now get the plan = information as well. Is there a hash join. Time to check the = statistics. =20

Perhaps the problem is not being caused by the program the person is = calling about, but by another. Statspack can be helpful here as well. = If I cannot figure it out I can still try tracing the program or set a = login trigger to start the trace when the job runs that night. Of = course Ideally I should have already collected a job profile ...

Something like =20

Call                                 Duration         Calls      =
Duration/Call -------------------------------------------------------------------------=
-----
direct path write                   95.28s  81.1%      5707      0.02s
direct path read                    21.47s  18.3%      7632      0.00s
SQL*Net message from client          0.59s   0.5%         4      0.15s
db file scattered read               0.17s   0.1%       652      0.00s
SQL*Net message to client            0.00s   0.0%         4      0.00s
db file sequential read              0.00s   0.0%         1      0.00s

Total cpu time: 30.5 seconds

And be ready to compare it with what the new trace delivers.

Finally there are some jobs which manipulate data which are impossible = to repeat because the data has changed. What do you run the 10046 trace = on then?

Statspack is certainly no 10046 trace, but it is not useless.

Ian MacGregor
Stanford Linear Accelerator Center
ian_at_SLAC.Stanford.edu

=20

-----Original Message-----
From: Andy Rivenes [mailto:arivenes_at_llnl.gov]=20 Sent: Monday, February 02, 2004 10:51 AM To: oracle-l_at_freelists.org
Subject: RE: Capacity Planner from OEM VS Statspack

However, what you're really collecting is "workload" information. It's = only=20 useful, as Cary stated, for capacity planning, or what you stated, for=20 trends. You're not going to solve a "performance" problem with this = level=20 of information.

Andy Rivenes
arivenes_at_llnl.gov

At 10:00 AM 2/2/2004 -0800, MacGregor, Ian A. wrote:
>I have to disagree here. Sometimes one needs to take a top-down=20
>approach to tuning. Measuring things such as CPU usage and I/O counts
>can be=20 useful. For instance here's a report for a very lightly
>loaded system=20 based on statspack data.
>
>
>DATABASE BEGIN_TIME END_TIME Physical Reads=20
>Physical Writes CPU SECONDS ELAPSED SECONDS
>---------- -------------------- -------------------- --------------
>--------------- ----------- ---------------
>ORAP 12-JAN-2004:00:00:03=20
>12-JAN-2004:01:00:05 14725 7697 224.005=20
>3602
>ORAP 12-JAN-2004:01:00:05=20
>12-JAN-2004:02:00:02 6271 2125 5.03=20
>3597
>ORAP 12-JAN-2004:02:00:02=20
>12-JAN-2004:03:00:04 66066 1043 6.105=20
>3602
>ORAP 12-JAN-2004:03:00:04=20
>12-JAN-2004:04:00:02 1496 1125 3.68=20
>3598
>ORAP 12-JAN-2004:04:00:02=20
>12-JAN-2004:05:00:04 1716 1462 3.995=20
>3602
>ORAP 12-JAN-2004:05:00:04=20
>12-JAN-2004:06:00:01 961 1721 3.71=20
>3597
>ORAP 12-JAN-2004:06:00:01=20
>12-JAN-2004:07:00:03 3779 1032 3.985=20
>3602
>ORAP 12-JAN-2004:07:00:03=20
>12-JAN-2004:08:00:06 16436 2026 5.84=20
>3603
>ORAP 12-JAN-2004:08:00:06=20
>12-JAN-2004:09:00:03 231051 2634 14.43=20
>3597
>ORAP 12-JAN-2004:09:00:03=20
>12-JAN-2004:10:00:05 137762 1245 15.605=20
>3602
>ORAP 12-JAN-2004:10:00:05=20
>12-JAN-2004:11:00:03 183870 1163 15.21=20
>3598
>ORAP 12-JAN-2004:11:00:03=20
>12-JAN-2004:12:00:05 143757 1166 12.975=20
>3602
>ORAP 12-JAN-2004:12:00:05=20
>12-JAN-2004:13:00:02 20349 1088 4.705=20
>3597
>ORAP 12-JAN-2004:13:00:02=20
>12-JAN-2004:14:00:04 195781 1827 13.355=20
>3602
>ORAP 12-JAN-2004:14:00:04=20
>12-JAN-2004:15:00:02 26901 1538 9.53=20
>3598
>ORAP 12-JAN-2004:15:00:02=20
>12-JAN-2004:16:00:04 43434 1039 8.205=20
>3602
>ORAP 12-JAN-2004:16:00:04=20
>12-JAN-2004:17:00:01 61892 1747 8.485=20
>3597
>ORAP 12-JAN-2004:17:00:01=20
>12-JAN-2004:18:00:04 36268 819 5.99=20
>3603
>ORAP 12-JAN-2004:18:00:04=20
>12-JAN-2004:19:00:01 87842 1302 8.4=20
>3597
>ORAP 12-JAN-2004:19:00:01=20
>12-JAN-2004:20:00:03 1256 653 4.625=20
>3602
>ORAP 12-JAN-2004:20:00:03=20
>12-JAN-2004:21:00:01 197415 1086 99.765=20
>3598
>ORAP 12-JAN-2004:21:00:01=20
>12-JAN-2004:22:00:03 556 1386 3.235=20
>3602
>ORAP 12-JAN-2004:22:00:03=20
>12-JAN-2004:23:00:06 1074 697 3.22=20
>3603
>ORAP 12-JAN-2004:23:00:06=20
>13-JAN-2004:00:00:03 14856 927 4.495=20
>3597
>ORAP 13-JAN-2004:00:00:03=20
>
>
>
>If I have a report which deviates from this, then I can start looking
>more closely. The statspack information is gathered every ten =

>minutes. I can then propely trace the statements which are
>candidate=20 problems. I can also get more information. If the CPU
>count goes up =

is=20
>it due to an increase of logical I/Os. How much of the physical I/O is
>=

>direct and to what tablespaces etc.
>
>There are many problems with statspack and the virtual views upon which
>=

>it
>is based. Counters resetting are going negative, the frequency of
>the=20 updates etc. I have made elaborate systems, chucked them out
>of=20 frustration, changed tactics and tried again.
>
>At one time I was totally against such systems because they were not
>perfect, actually far from it. But I came to realize that though=20
>statspack is inadequate, it is not useless.
>
>Ian MacGregor
>Stanford Linear Accelerator Center
>ian_at_SLAC.Stanford.edu

Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put = 'unsubscribe' in the subject line.

--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.
--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.
--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request_at_freelists.org
put 'unsubscribe' in the subject line.
--
Archives are at http://www.freelists.org/archives/oracle-l/
FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Received on Mon Feb 02 2004 - 17:32:39 CST