Re: OOM killer terminating database on AWS EC2
Date: Mon, 13 Jan 2020 17:03:49 -0500
Message-ID: <2fd0e827-3a8c-648d-b977-0816f326d32c_at_fjandrade.com>
Hi Sandy,
In AWS you can use SES to send the emails, and CloudWatch to monitor at the process level.
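Until CloudWatch is set up, a small local script can give a rough process-level view. The following is only a minimal sketch, assuming a Linux kernel that exposes MemAvailable in /proc/meminfo and VmRSS in /proc/<pid>/status; the function names (parse_meminfo, available_percent, process_rss, top_rss, report) are made up for illustration and are not part of any AWS or Oracle tool.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}.

    Values are in kB, except the HugePages_* counters, which are page counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_percent(info):
    """Percent of RAM the kernel estimates is available without swapping."""
    return 100.0 * info["MemAvailable"] / info["MemTotal"]


def process_rss(status_text):
    """Return (name, rss_kb) from /proc/<pid>/status text.

    Kernel threads have no VmRSS line, so rss_kb stays 0 for them.
    """
    name, rss = "?", 0
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss = int(line.split()[1])
    return name, rss


def top_rss(proc_root="/proc", limit=5):
    """Return the `limit` largest processes as (rss_kb, pid, name) tuples."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss = process_rss(fh.read())
        except OSError:
            continue  # process exited or is not readable; skip it
        rows.append((rss, int(entry), name))
    return sorted(rows, reverse=True)[:limit]


def report():
    """Print overall availability and the largest resident processes."""
    with open("/proc/meminfo") as fh:
        info = parse_meminfo(fh.read())
    print("available: %.1f%% of %d kB" % (available_percent(info), info["MemTotal"]))
    for rss, pid, name in top_rss():
        print("%10d kB  pid %-7d %s" % (rss, pid, name))


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    report()
```

Run from cron and appended to a log, this leaves a trail to look at after the next OOM kill. Note that VmRSS counts shared pages in every process that touches them, so the per-process numbers should not be summed.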
FJA
On 1/13/2020 3:44 PM, Mark J. Bobak wrote:
> Hi Sandy,
>
> I know it's (almost certainly) happening *way* above your level, but
> dropping Oracle support on *any* database, let alone a production
> database, is foolishness, and certainly *not* a cost savings, not in
> the long run.....
>
> I run Oracle on EC2, w/ mail enabled, and so far, have never run into
> an OOM situation. The system has to be *really* low on memory for the
> kernel's OOM killer to wake up and start killing stuff. When it does,
> Oracle is a big target, because it (almost certainly) is (and should
> be) the big memory consumer on your (EC2) instance.
>
> Some questions:
> 1.) What instance type(s) are you running? Do you have instance
> store volumes configured for swap? Do you have swap configured at
> all? What is the level of swap usage you are seeing?
> 2.) How is your Oracle memory usage configured? Do you have
> hugepages configured? (Please say yes....)
> 3.) What do the outputs of 'free -h' and 'top' tell you? How about
> 'vmstat'? 'sar -B'?
>
> -Mark
>
>
> On Mon, Jan 13, 2020 at 2:33 PM Sandra Becker <sbecker6925_at_gmail.com> wrote:
>
> Server: AWS EC2
> RHEL: 7.6
> Oracle: 12.1.0.2
>
> We have a database on an AWS EC2 server that the OOM killer has
> terminated twice in the last 5 days; both times it was the
> ora_dbw0_dwprod process. On 1/8 postfix was enabled to allow us
> to email the DBA team through an AWS relay server when a backup
> failed. We stopped running daily backups and cronjobs that did a
> quick check for expired accounts. We've left postfix enabled for
> sending emails. We are searching for answers but have none yet as
> to why this is happening. We also no longer have Oracle support
> available to us (management saving money again).
>
> Questions:
>
> 1. Could postfix be related to the memory issues even though we
> haven't sent any emails since the first crash 5 days ago?
> 2. How can we monitor the memory usage of an EC2 instance?
> 3. How do you disable the OOM killer in EC2 should we decide to
> go that route? (we have it disabled on our on-prem servers)
> The docs I've found so far have not been helpful.
>
> I appreciate any help you can give us or pointing us in the right
> direction.
>
> Thank you,
> --
> Sandy B.
>
--
http://www.freelists.org/webpage/oracle-l

Received on Mon Jan 13 2020 - 23:03:49 CET