RE: Possible reasons for load average high and CPU 90% idle on RHEL 7.6

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Mon, 23 Aug 2021 14:10:43 -0400
Message-ID: <054201d7984a$30ba0c20$922e2460$_at_rsiz.com>



Another thing I have seen: if a pool of client servers autospawns another set when zero are immediately available (and especially if someone thought growing the pool size each time the pool is exhausted concurrently was a good idea), then momentarily hitting the available limit can generate a big bump of hilarity. Starting connections is one of the more expensive things in Oracle.

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Mikhail Velikikh
Sent: Monday, August 23, 2021 7:48 AM
To: aryan.goti_at_gmail.com
Cc: ORACLE-L
Subject: Re: Possible reasons for load average high and CPU 90% idle on RHEL 7.6  

Hi,  

Linux load averages include processes in the uninterruptible sleep (D) state (disk I/O wait usually):

https://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html  

You can check historical sar -d data for avgqu-sz (the average I/O request queue length). It can get ridiculously high if you have lots of processes waiting for I/O.
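To see directly whether the load is made up of tasks in uninterruptible sleep rather than tasks on the CPU, you can count task states from /proc while the spike is happening. The following is only a minimal Python sketch, not something from the original thread: the function names are made up for illustration, and it assumes a Linux /proc layout where the state letter is the first field after the parenthesised command name in each stat file.

```python
import glob
from collections import Counter


def parse_state(stat_line):
    """Return the one-letter task state from a /proc/.../stat line.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or ')' characters, so split after the LAST ')'.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(stat_lines):
    """Count task states (R, S, D, ...) across an iterable of stat lines."""
    return Counter(parse_state(line) for line in stat_lines)


def read_stat_lines(pattern="/proc/[0-9]*/task/[0-9]*/stat"):
    """Yield one stat line per thread, skipping tasks that exit mid-scan."""
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                data = f.read()
        except OSError:
            continue  # task exited between glob() and open()/read()
        if data:
            yield data


def snapshot():
    """Return (1-minute load average, running task count, D-state task count)."""
    states = count_states(read_stat_lines())
    with open("/proc/loadavg") as f:
        load1 = float(f.read().split()[0])
    return load1, states["R"], states["D"]
```

Calling print(snapshot()) every few seconds during an incident gives a rough picture: a 1-minute load in the thousands with a small R count and a large D count points at tasks blocked in the kernel (usually on I/O) rather than at CPU demand. The scan is per thread because the load average counts tasks, not just process group leaders.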

On Mon, 23 Aug 2021 at 12:29, Goti <aryan.goti_at_gmail.com> wrote:

Hi Listers,

We had a situation where the load average spiked to more than 2000 and stayed that way for about 45 minutes. During this entire time the CPU was 95% idle. One thing we observed was a sudden increase in the number of connections, from 1800 to around 3700. We had 90 GB of free memory throughout the load average spike.

The server has 48 CPUs and 512 GB of RAM, running RHEL 7.6.

Is this something expected?    

Thanks,

Goti

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Aug 23 2021 - 20:10:43 CEST
