Re: Root cause for ORA-00471
From: Mladen Gogala <gogala.mladen_at_gmail.com>
Date: Mon, 8 Jul 2024 08:47:22 -0400
Message-ID: <ef33547f-de1d-40f1-b96e-9a0135f065b9_at_gmail.com>
On 7/8/24 8:02 AM, Eriovaldo Andrietta wrote:
Hi,
A 12c instance was terminated by PMON with the message: PMON(ospid:3996): terminating the instance due to error 471.
Around the time it happened I saw the event "LGWR all worker groups" for all redo groups, but this event also occurred in other periods. There are redo groups. I also found that at this moment (a little before) there were a lot of sessions ON CPU. At the moment
My questions are:
- Where can I get the root cause for this error?
- Is there a way (queries) to check whether 3 groups of 200 MB each are enough for this environment? I don't see the interval of the checkpoint.
- Can the Standard Edition use threads for the redo log groups?
- Is there more than one possible cause for the error ORA-00471?
- Is there any query that I can use to determine the total % of CPU used at this moment or interval? (I don't have access to Enterprise Manager, which shows this data easily.)
I suspect that at this moment there were a lot of sessions on CPU, the DB writer had all groups busy, and PMON terminated the instance based on some configuration. I am looking for evidence.
Best Regards
Eriovaldo
Do you have the trace file? ORA-00471 always produces a trace. If you can publish it, we may find something out from it. As a general rule, PMON going down is an Oracle bug. If this happens often, you could also try using strace on the PMON process to see what it was doing. Another useful utility is ltrace. You can also try systemtap, but that's a bit complex. There is a good book on systemtap: https://sourceware.org/systemtap/SystemTap_Beginners_Guide.pdf
It's a free download; you can either read it online or download
it to your computer.
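
For the strace part, here is a minimal Python sketch of how one might locate PMON and put together the command line. It assumes a Linux host where the background process is named ora_pmon_<SID>; the helper names, the SID "ORCL" and the output path are made up for illustration, and the pid 3996 is just the ospid from the alert message. The script only builds and prints the command, it does not attach to anything.

```python
def find_pmon_pids(ps_output):
    """Return {sid: pid} for each ora_pmon_<SID> line in `ps -eo pid,args` output."""
    pids = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header line and anything that does not start with a numeric pid.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = parts
        if args.startswith("ora_pmon_"):
            sid = args.split()[0][len("ora_pmon_"):]
            pids[sid] = int(pid)
    return pids


def build_strace_command(pid, out_file):
    """Build an strace command that attaches to an already running process."""
    # -f: follow child processes, -tt: wall-clock timestamps with microseconds,
    # -T: time spent in each system call, -p: attach to pid, -o: write to a file.
    return ["strace", "-f", "-tt", "-T", "-p", str(pid), "-o", out_file]


# Hypothetical sample of `ps -eo pid,args` output; the SID is an assumption.
sample_ps = """  PID COMMAND
 3996 ora_pmon_ORCL
 4001 ora_lgwr_ORCL
"""

for sid, pid in find_pmon_pids(sample_ps).items():
    # Hypothetical output location; pick a filesystem with enough free space.
    print(" ".join(build_strace_command(pid, f"/tmp/pmon_{sid}.strace")))
```

To use it for real, feed it the actual output of ps -eo pid,args and run the printed command as root or as the oracle software owner. The output file can grow quickly, so keep an eye on it.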