Re: How do you detect memory issues ?

From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Thu, 6 Dec 2018 22:14:56 -0500
Message-ID: <a990abdd-4a6f-0c7d-71f0-80b4ed9ac088_at_gmail.com>



Another very decent monitoring tool is Nigel's monitor, also known as "nmon". Those who have worked on AIX are probably well acquainted with that tool. It also works on both Red Hat and Ubuntu. Here is a sample:

This utility probably works on Amazon Linux, too.

Regards

On 12/6/18 2:36 PM, kyle Hailey wrote:
>
> Thanks Mladen
> "sar -B" works on Amazon Linux
> and it still amazes me how non-obvious monitoring memory pressure is
> to this day
>
>
>
> On Wed, Dec 5, 2018 at 10:28 PM Mladen Gogala <gogala.mladen_at_gmail.com> wrote:
>
> Hi Kyle,
>
> You are talking about vmstat. I prefer sar. Here is the output of
> sar -B 3 3:
>
> mgogala_at_umajor:~$ sar -B 3 3
> Linux 4.15.0-42-generic (umajor)     12/06/2018     _x86_64_    (8 CPU)
>
> 01:10:34 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
> 01:10:37 AM  23049.33   3421.33     16.00      0.00     81.33      0.00      0.00      0.00      0.00
> 01:10:40 AM  19186.67      1.33    102.67      0.00    116.00      0.00      0.00      0.00      0.00
> 01:10:43 AM     14.67   5064.00  32142.67      0.00  25249.00      0.00      0.00      0.00      0.00
> Average:     14083.56   2828.89  10753.78      0.00   8482.11
>
>
>
> The important stats are majflt/s, which means that pages had to
> be read from disk, and pgsteal/s, which denotes the number of
> modified pages backed up and reclaimed as "free". In this context
> "free" doesn't mean empty; the page being free means that the page
> has a valid backup. Page stealing and paging out (pgpgout/s)
> definitely mean that there is a memory problem. On
> Red Hat systems, sar is available in the sysstat package. Another
> good indication that something is wrong is a large proportion of
> kernel mode CPU time, as shown by top. Also, "top" is a good
> indicator because it shows the swap usage. If the swap usage keeps
> growing, there is trouble with memory.
> Regards
>
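
As a rough illustration of the reading described above, here is a minimal Python sketch that parses "sar -B" text output and picks out the intervals where majflt/s or pgsteal/s is non-zero. The function names and the zero thresholds are hypothetical choices for illustration, not something from sar or from this thread, and the sample text is the output quoted above with the wrapped lines joined.

```python
SAMPLE = """\
Linux 4.15.0-42-generic (umajor)     12/06/2018     _x86_64_    (8 CPU)

01:10:34 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
01:10:37 AM  23049.33   3421.33     16.00      0.00     81.33      0.00      0.00      0.00      0.00
01:10:40 AM  19186.67      1.33    102.67      0.00    116.00      0.00      0.00      0.00      0.00
01:10:43 AM     14.67   5064.00  32142.67      0.00  25249.00      0.00      0.00      0.00      0.00
"""


def parse_sar_b(text):
    """Parse `sar -B` text output into one dict per sampling interval."""
    columns, lead, rows = None, 0, []
    for line in text.splitlines():
        tokens = line.split()
        if "pgpgin/s" in tokens:
            # Header line: everything from pgpgin/s onward is a column name.
            # `lead` is the number of timestamp tokens in front of the data.
            lead = tokens.index("pgpgin/s")
            columns = tokens[lead:]
            continue
        if columns is None or not tokens or tokens[0] == "Average:":
            continue  # banner line, blank line, or summary line
        values = tokens[lead:]
        if len(values) != len(columns):
            continue  # incomplete or unrelated line
        try:
            rows.append(dict(zip(columns, map(float, values))))
        except ValueError:
            continue  # non-numeric line, e.g. a repeated header
    return rows


def pressure_samples(rows, majflt_limit=0.0, pgsteal_limit=0.0):
    """Return the intervals whose majflt/s or pgsteal/s exceeds the limits.

    The default limits of 0.0 are an assumption made for this sketch.
    """
    return [
        r for r in rows
        if r.get("majflt/s", 0.0) > majflt_limit
        or r.get("pgsteal/s", 0.0) > pgsteal_limit
    ]


rows = parse_sar_b(SAMPLE)
print(len(rows), "intervals parsed,", len(pressure_samples(rows)), "flagged")
```

On the sample above no interval is flagged, since majflt/s and pgsteal/s stay at 0.00 throughout. Sensible limits for a real system would have to be chosen from its own baseline.
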
> On 12/5/18 7:44 PM, kyle Hailey wrote:
>> One of those questions that seems like it should have been nailed
>> down 20 years ago, but it still seems to lack a clear answer.
>>
>> How do you detect memory issues ?
>>
>> I always used "po" or "page outs". Now on Amazon Linux I
>> don't see "po" but there is "bo" (blocks written out). In the past,
>> at least on OSF & Ultrix, page outs were a sign of needed memory
>> that was written out to disk and when I needed that memory it
>> would take a big performance hit to read it in. Thus "po" was a
>> good canary in the coal mine. Any consistent values over, say,
>> 10 were a sign.
>>
>> Some people use "scan rate" but I never found that as easy to
>> interpret as page outs. Again, what values would you use?
>>
>> Some suggest using freeable memory as a yardstick, where freeable
>> is "free" + "cached" or MemFree + Cached + Inactive. Even in
>> this case, what would you use for values to alert on?
>>
>> I've always ignored swap stats as if you are swapping it is too late.
>>
>> What do you use to detect memory issues ?
>>
>> Kyle
>
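
For the "freeable memory" yardstick mentioned in the original question, a minimal Python sketch might look like the following. It assumes the usual "Name:   value kB" layout of /proc/meminfo on Linux. The helper names are hypothetical, and the numbers in the sample text are made up for illustration, not measurements from any system in this thread.

```python
# Illustrative values only, laid out like /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:       16000000 kB
MemFree:         1000000 kB
Cached:          4000000 kB
Inactive:        2000000 kB
"""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> kB."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def freeable_kb(info, fields=("MemFree", "Cached", "Inactive")):
    """Sum the chosen fields; the default is MemFree + Cached + Inactive.

    Pass fields=("MemFree", "Cached") for the "free" + "cached" variant.
    """
    return sum(info.get(name, 0) for name in fields)


def freeable_fraction(info, fields=("MemFree", "Cached", "Inactive")):
    """Freeable memory as a fraction of MemTotal."""
    return freeable_kb(info, fields) / info["MemTotal"]


info = parse_meminfo(SAMPLE_MEMINFO)
print(freeable_kb(info), "kB freeable,",
      round(100 * freeable_fraction(info), 1), "% of MemTotal")
```

On a live Linux system the same functions could be fed the contents of /proc/meminfo, for example with open("/proc/meminfo").read(). What fraction should trigger an alert is exactly the open question in this thread, so the sketch deliberately sets no threshold.
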
> --
> Mladen Gogala
> Database Consultant
> Tel: (347) 321-1217
>

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217


--
http://www.freelists.org/webpage/oracle-l
Received on Fri Dec 07 2018 - 04:14:56 CET
