RE: Oracle home headscratcher
Date: Wed, 19 Feb 2020 19:09:11 +0000
Message-ID: <DM6PR11MB4137A2458FFDD59DD0B45E00D9100_at_DM6PR11MB4137.namprd11.prod.outlook.com>
Thanks Chris, Rajeev, J, William and Iggy. It turns out the sticky bit had in fact somehow been flipped on one of the files. We copied a fresh home over and are in good shape. How it got flipped is something we'll continue to pursue.
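One way to hunt for a flipped mode bit like this is to diff the permission strings of a known-good home against the suspect one. What follows is only a minimal sketch: it assumes both homes share the same file layout and that no file names contain whitespace, and the function names are hypothetical.

```shell
#!/bin/sh
# Print "<mode> <relative path>" for every regular file and directory
# under the directory given as $1, sorted by path.
list_modes() {
    ( cd "$1" && find . \( -type f -o -type d \) -exec ls -ld {} + ) |
        awk '{ print $1, $NF }' | sort -k2
}

# Print diff-style lines for entries whose mode string differs between
# a reference home ($1) and a suspect home ($2).
# Returns diff's exit status: 0 if identical, 1 if differences were found.
compare_modes() {
    ref=$(mktemp) && sus=$(mktemp) || return 1
    list_modes "$1" > "$ref"
    list_modes "$2" > "$sus"
    diff "$ref" "$sus"
    rc=$?
    rm -f "$ref" "$sus"
    return $rc
}
```

It would be called as compare_modes GOOD_HOME BAD_HOME, with the two home paths filled in for the host in question. In the ls mode string a setuid or setgid bit shows up as "s" in place of "x", and the sticky bit as "t" or "T" in the last position, so a flipped bit should stand out in the diff.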
Thanks again! - Chris
From: Iggy Fernandez <iggy_fernandez_at_hotmail.com>
Sent: Wednesday, February 19, 2020 12:21 PM
To: William Beldman <wbeldma_at_uwo.ca>; oracle-l_at_freelists.org; Newman, Christopher <cjnewman_at_uillinois.edu>
Subject: Re: Oracle home headscratcher
truss would have diagnosed the issue. sqlplus is a frontend so you would either have to run truss directly against the child oracle process or use "truss -f sqlplus ..." to trace child processes. -c produces a summary.
-c
Counts traced system calls, faults, and signals rather than displaying the trace line-by-line. A summary report is produced after the traced command terminates or when truss is interrupted. If -f is also specified, the counts include all traced system calls, faults, and signals for child processes.
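The two invocations described above could be wrapped roughly as follows. This is a sketch rather than a tested recipe: the function names and the output path are hypothetical, and it assumes Solaris truss, where -f follows child processes, -c prints a summary, -d adds a timestamp to each line, and -o writes the trace to a file.

```shell
#!/bin/sh
# Full trace of a command and its children, with timestamps, written to a file.
# $1 = output file, remaining arguments = the command to trace.
trace_full() {
    command -v truss >/dev/null 2>&1 || { echo "truss not found" >&2; return 127; }
    out=$1
    shift
    truss -f -d -o "$out" "$@"
}

# Per-syscall count summary covering the command and its children.
trace_summary() {
    command -v truss >/dev/null 2>&1 || { echo "truss not found" >&2; return 127; }
    truss -f -c "$@"
}
```

For example, trace_full /tmp/sqlplus.truss sqlplus / as sysdba would leave the line-by-line trace in /tmp/sqlplus.truss (a hypothetical path), where long gaps between timestamps should point at the call that is stalling.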
From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on behalf of Newman, Christopher <cjnewman_at_uillinois.edu>
Sent: Tuesday, February 18, 2020 6:40 PM
To: William Beldman <wbeldma_at_uwo.ca>; oracle-l_at_freelists.org <oracle-l_at_freelists.org>
Subject: RE: Oracle home headscratcher
Yes, that didn't turn up much. Unfortunately we've rebooted the server (thankfully DEV) and the problem has gone away.
What we did notice is that the shutdown scripts, which include sqlplus calls to shut down each database, worked fine. Those scripts were called by root, of course, so now we're thinking it's something to do with the oracle user and either a permission or a resource issue.
From: William Beldman <wbeldma_at_uwo.ca>
Sent: Tuesday, February 18, 2020 8:17 PM
To: Newman, Christopher <cjnewman_at_uillinois.edu>; oracle-l_at_freelists.org
Subject: RE: Oracle home headscratcher
Can you run truss against sqlplus/tnsping/etc. to figure out what it's doing over the course of those 10 minutes?
From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Newman, Christopher
Sent: February 18, 2020 6:38 PM
To: oracle-l_at_freelists.org
Subject: Oracle home headscratcher
Hi All,
We've got multiple Oracle homes on a Solaris 11.4 server (T8 SPARC). We are having issues with a single home (12.2.0.1), while others are fine (19.5, a different 12.2.0.1 home). We haven't seen this problem on any other hosts, and no known modifications to the environment happened prior to the behavior we're seeing.
sqlplus appears to hang but does eventually connect (by "eventually" I mean 10+ minutes, and this is a local connection).
This behavior extends to tnsping (it times out; we traced it but didn't get much), but running opatch, for example, is not affected.
Standby databases on the system fall behind.
External connections to the databases are not impacted; only running the binaries locally from the problematic home exhibits the symptoms.
Our only clue on the host is very high utilization of our /u01 mount point, but so far our Unix crew hasn't been able to isolate which process is driving the IO.
Yesterday, on a whim, we switched the problematic Oracle home's permissions to 755 (from 700), and things "magically" worked and IO plummeted instantly.
Today we switched back to 700 to see if we could break things again; we did. However, in this second case, chmod'ing the problematic home back to 755 had zero effect and the hanging behavior persists.
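Since toggling the home between 700 and 755 changed the behavior at least once, it may help to list the permissions of every directory on the way down to a binary, run as the affected user. This is a minimal sketch, and the function name is hypothetical:

```shell
#!/bin/sh
# Print `ls -ld` for a path and for each of its parent directories up to /,
# so a missing search (x) bit on any component is easy to spot.
walk_path() {
    p=$1
    while :; do
        ls -ld "$p" || return 1
        case $p in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
}
```

Running walk_path "$ORACLE_HOME/bin/sqlplus" as the oracle user and again as root would show whether any component of the path is readable by one and not the other.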
Any thoughts on what to look at next? Again, the problem is isolated to just this single home.
Thanks- Chris
--
http://www.freelists.org/webpage/oracle-l
Received on Wed Feb 19 2020 - 20:09:11 CET