RE: DMON killing RSM0?
Date: Wed, 17 Jun 2020 19:57:57 +0000
Message-ID: <CH2PR02MB66641AA040D6EBB10C8DFB1CD49A0_at_CH2PR02MB6664.namprd02.prod.outlook.com>
ASH shows that the RSM processes created and eventually killed during this time range all show waits on "kfk: async disk IO".
Regards,
Dave
From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Mladen Gogala
Sent: Saturday, June 13, 2020 9:51 AM
To: oracle-l_at_freelists.org
Subject: Re: DMON killing RSM0?
Hi Dave,
These errors are network timeout errors. RSM processes monitor the standby status. Oracle connects to the primary port, usually 1521, and then the connection is handed off to one of the dynamic ports. Firewall settings sometimes cut these ports off, or at least some of them. The default setting with an Oracle installation is something like:
net.ipv4.ip_local_port_range = 9000 65500
Your firewall may be configured to allow dynamic ports only between 32000 and 55000. The result is a situation in which Linux attempts to hand off the primary connection to a dynamic port that is blocked by the firewall. Each killed remote status monitor (RSM) will produce its own trace. Please check the trace, and if you see something like "timeout on the port 55831" then you know that there is some configuration you need to do. Here is a decent article about the dynamic (local) ports:
https://blog.fpmurphy.com/2015/02/ip-dynamic-port-range.html
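To make the mismatch concrete, here is a minimal Python sketch that compares the local port range with a firewall range and lists the ports that would be blocked. The 32000 to 55000 firewall range is only the illustrative figure from this message, and the function names are hypothetical helpers, not part of any Oracle or Linux tool.

```python
def parse_port_range(sysctl_line):
    """Parse a line such as 'net.ipv4.ip_local_port_range = 9000 65500'."""
    _, _, value = sysctl_line.partition("=")
    low, high = (int(part) for part in value.split())
    return low, high


def uncovered_ranges(local, allowed):
    """Return the parts of the local port range that fall outside the allowed range."""
    (lo, hi), (a_lo, a_hi) = local, allowed
    gaps = []
    if lo < a_lo:
        # Ports below the firewall range (capped at the top of the local range).
        gaps.append((lo, min(hi, a_lo - 1)))
    if hi > a_hi:
        # Ports above the firewall range (starting no lower than the local range).
        gaps.append((max(lo, a_hi + 1), hi))
    return gaps


local = parse_port_range("net.ipv4.ip_local_port_range = 9000 65500")
firewall = (32000, 55000)  # assumed firewall range, taken from the example in this message
blocked = uncovered_ranges(local, firewall)
print(blocked)
```

With these example numbers, a port such as 55831 falls in the upper blocked range, which is the kind of port a timeout message in the RSM trace would point to.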
Fortunately, you don't have to deal with the logical standby. Now, that would be fun for the whole family. In addition to the archive delivery and the status monitoring, there is also a redo apply process.
Regards
On 6/12/20 5:39 PM, Herring, Dave (Redacted sender HerringD for DMARC) wrote:
I have a situation where it looks like the DMON process is killing off RSM0 processes every night around the same time and I don't have a good explanation as to why. This is on a 4-node Exadata env running 18c with 6 dbs, all using DG (the standby is also a 4-node Exadata env).
Every night between 20:12 and 21:35 we get a series of ORA-16665 errors from all databases, errors found in the broker's logfile. Checking each db's alert log I see messages like the following:
Process RSM0, PID = 51310, will be killed
Process termination requested for pid 51310 [source = rdbms], [info = 2] [request issued by pid: 76161, uid: 110]
SPID 76161 is DMON, which means every night DMON kills off RSM0 processes around the same time. This is done for all databases.
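For anyone wanting to tally these kills across several alert logs, a rough Python sketch for pulling the killed PID and the requesting PID out of such a message follows. The pattern is based only on the single message quoted above, so treat the exact wording and spacing as an assumption; other versions may format it differently, and `parse_kill_request` is a hypothetical helper name.

```python
import re

# Pattern derived from the one alert log message quoted in this thread (an assumption).
KILL_RE = re.compile(
    r"Process termination requested for pid (?P<target>\d+)"
    r".*?\[request issued by pid: (?P<requester>\d+), uid: (?P<uid>\d+)\]",
    re.DOTALL,  # tolerate the message being wrapped across lines
)


def parse_kill_request(text):
    """Return target pid, requester pid and uid as ints, or None if no match."""
    match = KILL_RE.search(text)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


sample = (
    "Process RSM0, PID = 51310, will be killed\n"
    "Process termination requested for pid 51310 [source = rdbms], "
    "[info = 2] [request issued by pid: 76161, uid: 110]"
)
print(parse_kill_request(sample))
```

The requester PID can then be matched against the OS process list to confirm which background process, DMON in this case, issued the kill.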
Is there a DG broker setting that says to wipe out all DGB resource processes and restart them?
Regards,
Dave
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Jun 17 2020 - 21:57:57 CEST