Re: Deadlock ITL Waits
Date: Wed, 20 Jul 2011 22:20:29 -0700
Message-ID: <CADsdiQiUzTcZ6fMnNRYngUzpmQx46pL1wcOA5ZvtsqfUHMyQjg_at_mail.gmail.com>
Yeah, the massive log waits could exacerbate the ITL waits as things will be slow commiting their changes.
The db file sequential reads are huge as well. Looks like you have a problem on your storage.
Check the log write times ( I blogged some of this at http://dboptimizer.com/2011/07/20/wait-event-and-wait-class-metrics-vs-vsystem_event/)
Latencies in the past minute
col name for a25
select m.intsize_csec, n.name , m.time_waited, m.wait_count, 10*m.time_waited/nullif(m.wait_count,0) avgms from v$eventmetric m, v$event_name n
where m.event_id=n.event_id
and n.name in (
'log file sync', 'log file parallel write');
Latencies averaged over each hour
select
btime, (time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0) avg_msfrom (
select
to_char(s.BEGIN_INTERVAL_TIME,'DD-MON-YY HH24:MI') btime, total_waits count_end, time_waited_micro/1000 time_ms_end, Lag (e.time_waited_micro/1000) OVER( PARTITION BY e.event_name ORDER BY s.snap_id) time_ms_beg, Lag (e.total_waits) OVER( PARTITION BY e.event_name ORDER BY s.snap_id) count_beg from DBA_HIST_SYSTEM_EVENT e, DBA_HIST_SNAPSHOT s where s.snap_id=e.snap_id and e.event_name in ( 'log file sync', 'log file parallel write')
order by begin_interval_time
)
order by btime
/
If the log file parallel writes are bad then probably something wrong on the storage system.
- Kyle Hailey http://dboptimizer.com
On Wed, Jul 20, 2011 at 3:37 PM, Stalin <stalinsk_at_gmail.com> wrote:
> We have been seeing lots of deadlock errors lately in load testing
> environments and they all have been due to enq: TX - allocate ITL entry. In
> reviewing the statspack report for the periods of deadlock, i see that, log
> file sync wait being the top consumer with a terrible wait time. That makes
> to me think the deadlock, is just a symptom of high log file sync wait
> times. Below is the snippet from statspack and looking at these numbers,
> especially CPU not being heavily loaded, wondering if this could be a case
> of storage issue. Sys Admins are checking the storage layer but thought
> would check here get any opinions/feedback.
>
> Top 5 Timed Events Avg
> %Total
> ~~~~~~~~~~~~~~~~~~ wait
> Call
> Event Waits Time (s) (ms)
> Time
> ----------------------------------------- ------------ ----------- ------
> ------
> log file sync 1,400,773 4,357,902 3111
> 91.4
> db file sequential read 457,568 334,834 732
> 7.0
> db file parallel write 565,843 27,573 49
> .6
> read by other session 16,168 7,395 457
> .2
> enq: TX - allocate ITL entry 575 6,854 11919
> .1
> -------------------------------------------------------------
> Host CPU (CPUs: 64 Cores: 8 Sockets: 1)
> ~~~~~~~~ Load Average
> Begin End User System Idle WIO
> WCPU
> ------- ------- ------- ------- ------- -------
> --------
> 3.13 7.04 2.26 3.30 94.44 0.00
> 7.81
>
> Statistic Total per Second per
> Trans
> --------------------------------- ------------------ --------------
> ------------
> redo synch time 435,852,302 120,969.3
> 309.7
> redo synch writes 1,400,807 388.8
> 1.0
> redo wastage 5,128,804 1,423.5
> 3.6
> redo write time 357,414 99.2
> 0.3
> redo writes 9,935 2.8
> 0.0
> user commits 1,400,619 388.7
> 1.0
>
>
> Environment : 11gr2 EE (11.2.0.1), Sol 10 Sparc
>
> Thanks,
> Stalin
>
>
-- http://www.freelists.org/webpage/oracle-lReceived on Thu Jul 21 2011 - 00:20:29 CDT