RE: Root cause for ORA-00471

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Wed, 10 Jul 2024 12:30:41 -0400
Message-ID: <397201dad2e6$82219180$8664b480$_at_rsiz.com>

Increasing the number of online redo log groups is quite a reasonable thing to do, especially if you have reached an awkward size for apply on rotation standbys. More online log groups mitigates most of the problems with increased rate of switching. This has been true ever since Oracle switched from copying the fixed full size of online redo logs to copying only to the end of the currently used space plus to an even convenient block size. (Thanks Bill.)

Quite a long while ago (circa 1990), I began recommending a starting point of 10 to mitigate bursty transaction rates together with optimizing the size being identical across all groups, AND, if there is a known hourly bursty load pattern, switch through them artificially so that when each burst begins you have 10 or more redo log groups decoupled from archiving and checkpoint latency.

Media is much faster now. But storage is cheap and let me know if you can think of a real downside besides the cost of storage to allocating: a) plenty and b) more than the Oracle minimum of 3 groups.

Usually it is not even more storage, unless dividing the existing 3 group total size into 10 groups generates unreasonably quick load driven switches. If you do reduce the size per group, remember to cycle them out, so that all remaining groups are the same size. It may be useful to have the numbers in the group names match the rotation order as well, which can be a little dance with switching and creation order, but it is not rocket science. It is not technically required, but if you’re fixing something in an urgent situation it is one less variable to confuse the human mind.

Network speeds between campuses, as Mladen mentioned, are definitely a pacing resource for remote standby recovery systems. Increasing the number of groups at whatever is the reasonable size starts the transmission sooner whilst increasing the ability to absorb bursts of load. (Of course if your load rate exceeds your transmission rate long enough you wrap into the same problem, and it is pretty fancy dancing and engineering to detect the approach of that eventuality and dynamically temporarily create additional online redo log groups exceeding the desired total online redo log size until you can reduce them.)

Good luck,

mwf

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Eriovaldo Andrietta Sent: Tuesday, July 09, 2024 8:56 AM
To: gogala.mladen_at_gmail.com
Cc: oracle-l_at_freelists.org
Subject: Re: Root cause for ORA-00471

Hi all,

I think that looking for the disk throughput metric, is a good way to measure how much data can be transferred to and from a storage device in a given amount of time.

Regards

Eriovaldo

Em seg., 8 de jul. de 2024 às 20:43, Mladen Gogala <gogala.mladen_at_gmail.com> escreveu:

On 7/8/24 9:30 AM, Chris Taylor wrote:

For the redo log question (if 3 groups of 200MB each are enough) you can check the log switch frequency. Oracle recommends trying to hit 4-5 redo log switches per hour. (ref: Can Not Allocate Log Doc ID 1265962.1)

So if your redo log switches are around 15 every hour (for example), they're not large enough.

You can check your log switches with this script:

Script To Find Redolog Switch History And Find Archivelog Size For Each Instances In RAC (Doc ID 2373477.1)

Depending on your hardware age and storage, don't be afraid to have large redo logs. On a very busy medical system I used to work on, we had 2G redo logs to get the redo log switches down to 4-5 per hour.

HTH, Chris

There is a problem with large redo logs: standby databases. If you have a remote standby, then your archives go over the WAN. And that might be problematic with a slow network. One of my previous employers used to have the primary in NYC and the standby in Louisville, KY. The WAN was slower than 1Gb/sec, so 2GB redo logs were out of the question. I had to live with approximately 10 - 15 switches per hour.

--

Mladen Gogala
Database Consultant
https://dbwhisperer.wordpress.com

http://www.freelists.org/webpage/oracle-l

--
http://www.freelists.org/webpage/oracle-l

Received on Wed Jul 10 2024 - 18:30:41 CEST

This message: [ Message body ]
Next message: Alan Sterger: "Missing Grant on Package Body"
Previous message: Eriovaldo Andrietta: "Re: Root cause for ORA-00471"
In reply to: Eriovaldo Andrietta: "Re: Root cause for ORA-00471"
Next in thread: Jonathan Lewis: "Re: Root cause for ORA-00471"
Reply: Jonathan Lewis: "Re: Root cause for ORA-00471"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message