Re: Performance issues with SSD on DELL servers
Date: Mon, 29 Aug 2016 00:06:48 -0500
Message-ID: <CAHvOz+yixgRt9HY8uyEvWsfkYOMDkGQDWJHvN48_t+WsQrCG3w_at_mail.gmail.com>
Hello Gogala/Amir,
Thanks for your response. I don't see any problem in the wait events:
23 rows selected.
EVENT COUNT(*) AVG(B.WAIT_TIME)
---------------------------------------- ---------- ----------------
VKTM Logical Idle Wait 1 0
VKRM Idle 1 0
smon timer 1 0
JOX Jit Process Sleep 1 0
AQPC idle 1 0
direct path read temp 1 0
lreg timer 1 0
Streams AQ: emn coordinator idle wait 1 0
Streams AQ: qmn coordinator idle wait 1 0
wait for unread message on broadcast channel 1 0
heartbeat redo informer 1 0
pmon timer 1 0
Streams AQ: qmn slave idle wait 2 0
LGWR worker group idle 2 0
DIAG idle wait 2 0
jobq slave wait 2 0
Space Manager: slave idle wait 3 0
EMON slave idle wait 5 0
Streams AQ: waiting for time management or cleanup tasks 5 0
rdbms ipc message 16 0
pipe get 21 0
Streams AQ: waiting for messages in the queue 29 0
SQL*Net message from client 187 0
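
The summary above comes from grouping the current session waits by event; roughly something like this (a sketch, with v$session_wait aliased as b to match the column headings):

    SELECT b.event, COUNT(*), AVG(b.wait_time)
      FROM v$session_wait b
     GROUP BY b.event
     ORDER BY COUNT(*);
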
I believe there is something wrong with the network/storage. Below is the
OSWatcher netstat output:
Linux OSWbb v7.3.3
zzz ***Mon Aug 29 00:00:21 EDT 2016
Kernel Interface table
Iface MTU RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 8803635 0 412 0 7038079 0 0 0 BMRU
eth1 9000 45572028 0 263 0 48337634 0 0 0 BMRU
lo 65536 7919428 0 0 0 7919428 0 0 0 LRU
Ip:
60955288 total packets received
0 forwarded
0 incoming packets discarded
60943998 incoming packets delivered
63275787 requests sent out
20 dropped because of missing route
Icmp:
11152 ICMP messages received
18 input ICMP message failed.
ICMP input histogram:
destination unreachable: 3
echo requests: 283
echo replies: 10836
timestamp request: 30
11414 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 283
echo request: 10836
echo replies: 265
timestamp replies: 30
IcmpMsg:
InType0: 10836
InType3: 3
InType8: 283
InType13: 30
OutType0: 265
OutType3: 283
OutType8: 10836
OutType14: 30
Tcp:
127044 active connections openings
181467 passive connection openings
32349 failed connection attempts
65022 connection resets received
61 connections established
60277974 segments received
73515227 segments send out
17155 segments retransmited
0 bad segments received.
95570 resets sent
Udp:
35018 packets received
283 packets to unknown port received.
0 packet receive errors
33852 packets sent
0 receive buffer errors
0 send buffer errors
UdpLite:
TcpExt:
4363 invalid SYN cookies received
37 resets received for embryonic SYN_RECV sockets
8 packets pruned from receive queue because of socket buffer overrun
35651 TCP sockets finished time wait in fast timer
1150363 delayed acks sent
183 delayed acks further delayed because of locked socket
Quick ack mode was activated 1432 times
5616360 packets directly queued to recvmsg prequeue.
7172648 bytes directly in process context from backlog
1689077758 bytes directly received in process context from prequeue
36837349 packet headers predicted
5488329 packets header predicted and directly queued to user
2385028 acknowledgments not containing data payload received
29084328 predicted acknowledgments
3287 times recovered from packet loss by selective acknowledgements
Detected reordering 1 times using FACK
Detected reordering 16 times using SACK
Detected reordering 26 times using time stamp
42 congestion windows fully recovered without slow start
613 congestion windows partially recovered using Hoe heuristic
129 congestion windows recovered without slow start by DSACK
2936 congestion windows recovered without slow start after partial ack
TCPLostRetransmit: 74
29 timeouts after SACK recovery
10846 fast retransmits
1248 forward retransmits
88 retransmits in slow start
4487 other TCP timeouts
32 SACK retransmits failed
52 packets collapsed in receive queue due to low socket buffer
4333 DSACKs sent for old packets
315 DSACKs sent for out of order packets
1502 DSACKs received
31840 connections reset due to unexpected data
32158 connections reset due to early user close
47 connections aborted due to timeout
TCPDSACKIgnoredNoUndo: 22
TCPSpuriousRTOs: 5
TCPSackShifted: 36711
TCPSackMerged: 53967
TCPSackShiftFallback: 47357
TCPDeferAcceptDrop: 48947
TCPRcvCoalesce: 416660
TCPOFOQueue: 95889
TCPOFOMerge: 274
TCPChallengeACK: 40
IpExt:
InMcastPkts: 1484
OutMcastPkts: 46
InBcastPkts: 619585
InOctets: 136465338184
OutOctets: 140429464662
InMcastOctets: 396864
OutMcastOctets: 5366
InBcastOctets: 57012147
zzz ***Mon Aug 29 00:01:21 EDT 2016
Shall I share the AWR report here? Please advise.
Thanks,
Shankar
On Sun, Aug 28, 2016 at 8:39 PM, Mladen Gogala <gogala.mladen_at_gmail.com>
wrote:
> On 08/28/2016 07:18 PM, Hameed, Amir wrote:
>
> Are you using dNFS?
>
>
> Probably not, since he has mentioned /etc/fstab and not /etc/oranfstab. He
> also hasn't mentioned anything that would lead me to the conclusion that
> there is a problem with the filer.
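> With dNFS, the mounts would be described in /etc/oranfstab rather than
> /etc/fstab, with entries roughly like this sketch (the server name, path and
> export/mount values below are just placeholders):
>
>     server: netapp1
>     path: 192.0.2.20
>     export: /vol/oradata  mount: /u02/oradata
>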
> Regards
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] *On Behalf Of *Mladen Gogala
> *Sent:* Sunday, August 28, 2016 5:20 PM
> *To:* oracle-l_at_freelists.org
> *Subject:* Re: Performance issues with SSD on DELL servers
>
>
>
> On 08/28/2016 04:19 PM, Apps DBA wrote:
>
> Hi,
>
> We are facing a weird performance issue on our Dell 990 series OVM/OEL 7
> servers. The storage is NetApp with SSDs mounted as NFS drives on the database
> and application nodes. We followed the Oracle standards in setting up
> /etc/fstab appropriately, but the performance sometimes runs with a WOW
> factor and some days goes crazy. We are running R1225 with a 12c database.
> Does anyone know the approach to solve this? Please advise.
>
>
>
> Thanks in advance!
>
> A
>
>
>
> Well, this is a classic performance tuning problem. First, you need to
> figure out where the time is spent. Oracle's trace files and
> V$SYSTEM_EVENT, V$SESSION_EVENT and V$SESSION_WAIT will tell you that. If
> you have the appropriate licenses, you can also use
> V$ACTIVE_SESSION_HISTORY. You will need to figure out where the time was
> spent when the performance "went crazy".
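> A first cut can be as simple as the sketch below (only a sketch; adjust the
> idle-event filter and the row limit as needed) on 12c:
>
>     SELECT event, total_waits, time_waited_micro / 1e6 AS seconds_waited
>       FROM v$system_event
>      WHERE wait_class <> 'Idle'
>      ORDER BY time_waited_micro DESC
>      FETCH FIRST 10 ROWS ONLY;
>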
> From what I read, you have a reason to believe that the culprit is NetApp.
> NetApp has very good monitoring tools, so you should check what the OnTap
> tools will give you. The reasons may be all over the place and you haven't
> given enough information to make any reasonable assumption.
> Essentially, all tuning problems are the same and need to be resolved in
> two steps:
>
> 1. Figure out where the time is spent
> 2. Figure out how to spend less time
>
> Oracle has an excellent tuning methodology, with many tools, but it will
> not be able to dive into the inner workings of a NetApp filer. For that, you
> will have to use NetApp tools.
>
> Regards
>
>
>
> --
>
> Mladen Gogala
>
> Oracle DBA
>
> Tel: (347) 321-1217
>
>
>
> --
> Mladen Gogala
> Oracle DBA
> Tel: (347) 321-1217
>
>
--
http://www.freelists.org/webpage/oracle-l

Received on Mon Aug 29 2016 - 07:06:48 CEST