Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Mailing Lists -> Oracle-L -> RE: Houston, do I have a problem?
This is an interesting report. I think the responses to it are even more
interesting. One response admits confusion (which I think is a
completely fair reaction). Another zeroes in knowingly on some specific
details. If everyone had time to respond, I would expect a rash of
differing opinions about what you should do first to fix this system...
This kind of game is a fundamental part of using system-wide performance
data. (The various ratio problems are just as relevant for system-wide
data collected from the "wait interface" as they are from
v$sys<anything-else>.)
Don't lose hope if you look at Kirti's note and wonder, "so what's the point?" You cannot see everything that's wrong with a system from a report like this. I think in fact that you can know only two things from a v$system_event report:
Sure, knowing these two things is worth something, but it leaves lots of good questions unanswered (*essential* questions, actually):
So... Is the HDS disk array a problem? Probably. But, it's possible--*likely*, actually--that an analyst could fix all the problems shown here and still have really slow applications. Why? Because several essential-but-slow programs on this system might not spend significant amounts of their response time waiting on any of the top 10 events in this list. We see it pretty often: people fix their system's "worst performance problems" and then find out that their work really didn't make a noticeable end-user impact. (I'm confident that Kirti won't end up in this trap, but that's because I trust him to exercise intuition and experience far beyond the scope of what can be learned from his v$system_event data.)
The "wait interface" is an important tool, because it "finally" (well, since over ten years ago) allows us to see where a program spends its time. But to use that tool to see how a whole system has spent its time since instance startup has the same limitations as any other method that relies upon system-wide aggregated data.
So, what should you look at to avoid performance improvement project ambiguities? Session-level data. Which session? As I mentioned last week, I believe the analyst should focus first upon sessions whose performance improvement would most significantly improve the business. That, in my opinion, is The Big Secret.
Cary Millsap
Hotsos Enterprises, Ltd.
http://www.hotsos.com
Upcoming events:
- Hotsos Clinic, Oct 1-3 San Francisco, Oct 15-17 Dallas, Dec 9-11
Honolulu
- 2003 Hotsos Symposium on OracleR System Performance, Feb 9-12 Dallas
- Next event: NCOAUG Training Day, Aug 16 Chicago
-----Original Message-----
Kirti
Sent: Sunday, August 11, 2002 5:43 PM
To: Multiple recipients of list ORACLE-L
This is not a joke.....!!!
This is from a business critical production database that I was asked to 'review' past Friday.
The report is from v$system_event taken at 10:30am, Aug 9, 2002. The server (and database) was bounced on Aug 4, 2002 at 9:20am.
This was the 1st time I was logging into this database.
SQL> /
EVENT TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT ----------------------------------- ----------- -------------- ----------- ------------ control file parallel write 143933 0 4080356626 28349.0001 db file scattered read 12540695 0 1.2254E+10 977.107332 buffer busy waits 10740450 36 8193235928 762.839167 SQL*Net message from client 180769027 0 9.9561E+10 550.761199 db file sequential read 298968127 0 1.1839E+11 395.99129 enqueue 13500 6435 2036785 150.872963 SQL*Net more data from client 52227948 0 4093231165 78.3724294 free buffer waits 16 4 795 49.6875 log file switch completion 804 43 16263 20.2276119 log buffer space 977 0 5409 5.53633572 control file single write 17 0 51 3 db file parallel write 1749695 0 2935317 1.67761638 db file parallel read 8149 0 13484 1.65468156 log file single write 1024 0 701
.684570313
latch free 2007034 1616763 1054137
.525221297
log file sync 1366242 560 526049
.385033545
SQL*Net message from dblink 1514480 0 451351
.298023744
log file sequential read 405415 0 82877
.204425095
SQL*Net break/reset to dblink 10 0 2
.2
log file parallel write 2025192 7 293332
.144841576
SQL*Net break/reset to client 28113 0 3221
.114573329
db file single write 320 0 36
.1125
SQL*Net more data from dblink 447044 0 11375
.025444923
SQL*Net more data to client 11770996 0 75680
.006429362
control file sequential read 554851 0 3261
.005877254
SQL*Net more data to dblink 1076 0 5
.00464684
buffer deadlock 1045 1029 1
.000956938
SQL*Net message to dblink 1514485 0 456
.000301092
SQL*Net message to client 180769119 0 48736
.000269604
29 rows selected.
SQL>
Here is the environment:
1)all the file systems for the database, including dump directories are
in a
single disk volume group, 2) all redo logs and control files are spread
among all the other database files, 3) Hitachi array is in use with
nothing
but RAID-5 for all files (redo as well), 4) the real hard drives within
the
array are either shared with other databases on the same server or with
other servers, 5) redo logs are of 100MB size and switch 20+ times/hour
when
some of the batch processes run in the evening, 6) no changes are
allowed to
any SQL code, Pro*COBOL code that use 'COPYBOOKs' (Remember those?) to
interact with tables at single row level (no array processing) using
routines with bunch of parameters (call insert... call update... call
delete...), 7) the array has 32GB of NV cache and that's the max it can
have
(the DB is 180GB, there are 3 other similar ones from just this server).
Now the 'icing on the cake':
The server has 3 other critical databases. All 4 running in archive log
mode. All share the *same* archive log destination. And all databases
are
expected to have same amount of batch processing. The archive log
destination is 8GB in size on the 2nd VG. The DB in question, generated
1.8GB to 2+GB of logs in less than an hour during batch processing. At
times
our automated archived log siphoning process encounters some bottlenecks
from our single IBM/Tivoli TSM Server where the logs are deposited
before
those are purged from archive log destination...
I was also informed that I will not have much chance to bring about any
changes in the environment described above. Because, I was told, ...it
is
the corporate decision to use RAID-5 with HDS array and it is 'the most
cost
effective way to address our storage needs'.... and a single VG per
database
helps UNIX support to implement HACMP with much ease... and we can not
meet
our published deadlines if we made any changes and spent time in testing
those unscheduled changes...... yadi yadi yada....
Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051 San Diego, California -- Public Internet access / Mailing Lists --------------------------------------------------------------------To REMOVE yourself from this mailing list, send an E-Mail message to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).
-- Please see the official ORACLE-L FAQ: http://www.orafaq.com -- Author: Cary Millsap INET: cary.millsap_at_hotsos.com Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051 San Diego, California -- Public Internet access / Mailing Lists -------------------------------------------------------------------- To REMOVE yourself from this mailing list, send an E-Mail message to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).Received on Mon Aug 12 2002 - 01:38:18 CDT
![]() |
![]() |