Re: Archive Log Size

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Tue, 16 Mar 2021 20:34:05 +0000
Message-ID: <CAGtsp8=WaTg6BwuxNVeB7NVqCqmGbDNG1CNBBUae48hX+Nux-A_at_mail.gmail.com>



There's always:

set log_archive_trace = 1

But I'd try that on a UAT type of system first so that you could see the impact of a reasonable workload.
It should only be a few lines of trace (in the lgwr, arcN, etc. files) on each interesting event - but that's still something to be cautious about. you can do:

alter system set log_archive_trace = 1 scope = memory;

to avoid getting it into an spfile

Regards
Jonathan Lewis

On Tue, 16 Mar 2021 at 18:43, Mark W. Farnham <mwf_at_rsiz.com> wrote:

> I may have been shipped beta versions with some tracing turned on that
> showed the switch reason. I can’t find any notes on how to turn it on via
> trace from the time frame when I was checking into that, which was shorting
> after I convinced them that it would be best to NOT copy the entire file
> for the short manual switches used to do clearly document and minimize the
> set of redo logs required as part of an old fashioned “hot” backup. That
> would have been sometime between 1988 and 1994. (Early V6 copied the entire
> file regardless of the end of active redo marker, thinking that matching
> the exact size [and binary compare if you really wanted to be sure] would
> be the simplest way to know the file had been copied correctly to the
> backup. That was true, but not convenient for very large online logs with
> operational manual switches.)
>
>
>
> I didn’t hunt for the trace flags that may or may not turn on that
> information in current releases. I would think that would be in the most
> basic trace with nothing turned on, but it could have been chucked away as
> something customers would ever need to know.
>
>
>
> Sorry I don’t have any better recollection, and my archives from COAT went
> into the dustbin soon after it was sold to Bain (2006).
>
>
>
> I’d check with support to see if there is a simple trace setting that
> would have that information. I would be shocked if they don’t know why they
> are switching when they switch.
>
>
>
> mwf
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:
> oracle-l-bounce_at_freelists.org] *On Behalf Of *Henry Poras
> *Sent:* Tuesday, March 16, 2021 9:29 AM
> *To:* Mark W. Farnham
> *Cc:* Jonathan Lewis; ORACLE-L
> *Subject:* Re: Archive Log Size
>
>
>
> Thanks for the ideas. I hadn't checked all of those trace files.
> Unfortunately, I didn't see anything useful. Maybe I'll cycle back and see
> if I missed something. I'm also still trying to think of a way to
> distinguish between local cause and standby cause.
>
>
>
> Henry
>
>
>
>
>
> On Fri, Mar 12, 2021 at 1:38 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:
>
> I am rusty digging in on these details and they have almost certainly
> changed a few things, but if I recall correctly then once upon a time the
> reason for a switch appeared in one or more of the alert logs from the set
> alert.log, the lgwr log, the ckpt log, and the arch log.
>
>
>
> Now I always tried to keep quite a few log groups in the rotation (on the
> order of ten) because there are several things that need to be complete for
> a log that has been switched out of before it can be used again.
>
>
>
> Finally mean time to recover of course cares whether the redo data has
> made it to the standby, but it is the accumulated unapplied redo maximum
> that seems likely to be applicable within that time that still might drive
> a checkpoint and switch. I am also rusty on those mechanics and they have
> probably also changed since last I dug into the precise blow by blow order
> of mandatory events to ensure reliability in all edge cases.
>
>
>
> So, are there any hints for the switch reasons in the logs? IF not, can
> ARCH, CKPT, and LGWR be traced at a higher level to cause them to puke
> switch reasons?
>
>
>
> mwf
>
>
>
> *From:* oracle-l-bounce_at_freelists.org [mailto:
> oracle-l-bounce_at_freelists.org] *On Behalf Of *Henry Poras
> *Sent:* Friday, March 12, 2021 1:04 PM
> *To:* Jonathan Lewis
> *Cc:* ORACLE-L
> *Subject:* Re: Archive Log Size
>
>
>
> Wow. I just reread my last response and it reads just like what it was,
> free form thoughts. Let me try and clarify what I meant.
>
>
>
> Initially, Jonathan's supposition that somehow the MTTR setting gets the
> standby to force these short log switches seems reasonable. However, upon
> further thought, I have some questions. First, I typically associate
> fast_start_mttr_target with checkpointing, and that isn't a standby thing.
> Second, the Standby gets its data from LGWR (the creator for dest_id=2), so
> the data should be in the SRL without a log switch
>
>
>
> This also got me wondering if LNS (LGWR) falling behind in writing from
> the log buffer to the standby SRL might play a role. So I looked at
> x$logbuf_readhist, but the HITRATE on all 3 nodes was 98%, so it looks like
> no backlog there.
>
>
>
> 1 select indx, inst_id, bufsize, rdmemblks, rddiskblks, hitrate, bufinfo
> 2* from x$logbuf_readhist
> primary:sys_at_ohcops1> /
>
> INDX INST_ID BUFSIZE RDMEMBLKS
> RDDISKBLKS HITRATE BUFINFO
> ---------- ---------- -------------------- ------------------------
> ------------------------ ---------- --------------------
> 0 1 107812K 53747612
> 1869757119 2
> 1 1 129374K 361691660
> 1561813071 18
> 2 1 150937K 727333789
> 1196170942 37
> 3 1 172499K 1303643599
> 619861132 67
> 4 1 194062K 1834188083
> 89316648 95 TARGET-90
> 5 1 215624K 1896707348
> 26797383 98 CURRENT
> 6 1 237186K 1897284975
> 26219756 98
> 7 1 258749K 1897284975
> 26219756 98
> 8 1 280311K 1897284975
> 26219756 98
> 9 1 301874K 1897284975
> 26219756 98
> 10 1 323436K 1897284975
> 26219756 98
> 11 1 344998K 1897284975
> 26219756 98
> 12 1 366561K 1897284975
> 26219756 98
> 13 1 388123K 1897284975
> 26219756 98
> 14 1 409686K 1897284975
> 26219756 98
> 15 1 431248K 1897284975
> 26219756 98
>
> 16 rows selected.
>
>
>
> I'm back to wondering if there is a way to distinguish between a local vs.
> standby effect here.
>
>
>
> Henry
>
>
>
>
>
> On Thu, Mar 11, 2021 at 10:43 AM Henry Poras <henry.poras_at_gmail.com>
> wrote:
>
> Your supposition makes sense.I associate fast_start_mttr with
> checkpointing, which isn't happening on the standby. Just reading directly
> from the SRL. However, on the other hand, it would still be strange for
> this to be the cause of the small archive. We also know from earlier that
> the Standby (log_archive_dest_2) is getting its data from LGWR so the log
> switches shouldn't be necessary to obtain the same mttr on the standby. I
> also tend to get confused by some of the interrelations between
> fsat_start_mttr_target, archive_lag_target, ...
>
>
>
> If it is mttr related, I would guess that doubling my
> fast_start_mttr_target should double the size of my archive logs (seeing
> that the current archive sizing is pretty consistent). Not sure if I'll be
> able to do that, but it's worth a try. I think it would be pretty harmless.
>
>
>
> Henry
>
>
>
>
>
> On Thu, Mar 11, 2021 at 10:07 AM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
> I wonder if the MTTR gets reflected by something in the Standby.
>
> If you're mean time to recover has to be 15 seconds then maybe Oracle does
> something to ensure that the standby could recover inside the MTTR as well
> - and archiving log files prematurely might be a mechanism to allow that to
> happen.
>
>
>
> Regards
>
> Jonathan Lewis
>
>
>
>
>
>
>
> On Wed, 10 Mar 2021 at 15:29, Henry Poras <henry.poras_at_gmail.com> wrote:
>
> fast_start_mttr_target integer 15
>
>
>
> *gv$instance_recovery*
>
> INST_ID : 1
> RECOVERY_ESTIMATED_IOS : 18594
> ACTUAL_REDO_BLKS : 369677
> TARGET_REDO_BLKS : 1002837
> LOG_FILE_SIZE_REDO_BLKS : 3821985
> LOG_CHKPT_TIMEOUT_REDO_BLKS : 1002837
> LOG_CHKPT_INTERVAL_REDO_BLKS :
> FAST_START_IO_TARGET_REDO_BLKS :
> TARGET_MTTR : 15
> ESTIMATED_MTTR : 12
> CKPT_BLOCK_WRITES : 91402225
> OPTIMAL_LOGFILE_SIZE : 457
> ESTD_CLUSTER_AVAILABLE_TIME : 3
> WRITES_MTTR : 211343278
> WRITES_LOGFILE_SIZE : 0
> WRITES_LOG_CHECKPOINT_SETTINGS : 0
> WRITES_OTHER_SETTINGS : 0
> WRITES_AUTOTUNE : 24832295
> WRITES_FULL_THREAD_CKPT : 97904
>
> -----------------
>
> INST_ID : 2
> RECOVERY_ESTIMATED_IOS : 14247
> ACTUAL_REDO_BLKS : 268501
> TARGET_REDO_BLKS : 1047425
> LOG_FILE_SIZE_REDO_BLKS : 3821985
> LOG_CHKPT_TIMEOUT_REDO_BLKS : 1047425
> LOG_CHKPT_INTERVAL_REDO_BLKS :
> FAST_START_IO_TARGET_REDO_BLKS :
> TARGET_MTTR : 15
> ESTIMATED_MTTR : 7
> CKPT_BLOCK_WRITES : 8211142
> OPTIMAL_LOGFILE_SIZE : 596
> ESTD_CLUSTER_AVAILABLE_TIME : 3
> WRITES_MTTR : 6818539
> WRITES_LOGFILE_SIZE : 0
> WRITES_LOG_CHECKPOINT_SETTINGS : 0
> WRITES_OTHER_SETTINGS : 0
> WRITES_AUTOTUNE : 37153280
> WRITES_FULL_THREAD_CKPT : 134238
>
> -----------------
>
> INST_ID : 3
> RECOVERY_ESTIMATED_IOS : 7800
> ACTUAL_REDO_BLKS : 262743
> TARGET_REDO_BLKS : 1035890
> LOG_FILE_SIZE_REDO_BLKS : 3821985
> LOG_CHKPT_TIMEOUT_REDO_BLKS : 1035890
> LOG_CHKPT_INTERVAL_REDO_BLKS :
> FAST_START_IO_TARGET_REDO_BLKS :
> TARGET_MTTR : 15
> ESTIMATED_MTTR : 14
> CKPT_BLOCK_WRITES : 6467159
> OPTIMAL_LOGFILE_SIZE : 285
> ESTD_CLUSTER_AVAILABLE_TIME : 8
> WRITES_MTTR : 18266031
> WRITES_LOGFILE_SIZE : 0
> WRITES_LOG_CHECKPOINT_SETTINGS : 0
> WRITES_OTHER_SETTINGS : 0
> WRITES_AUTOTUNE : 16983159
> WRITES_FULL_THREAD_CKPT : 81827
>
> -----------------
>
> That's a table I almost never look at. I need to go back and review it a
> bit.
>
> Henry
>
>
>
> On Wed, Mar 10, 2021 at 4:59 AM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>
>
> I don't think fast_start_mttr_target has been mentioned explicitly yet,
> nor (g)v$instance_recovery.
>
> Any information there.
>
>
>
> Regards
>
> Jonathan Lewis
>
>
>
>
>
> On Tue, 9 Mar 2021 at 22:34, Henry Poras <henry.poras_at_gmail.com> wrote:
>
> Jonathan,
>
> According to the docs, "A 0 value disables the time-based thread advance
> feature" so I am going under the assumption that 900 means 900. Of course,
> sometimes docs lie.
>
>
>
> Tim, what you suggest makes sense, but it doesn't match up with what I am
> seeing in this environment. There is some data in the first few entries in
> this thread. I can summarize over a larger time frame if you like.
> Basically, each thread is switching every couple of minutes or less, and
> the archive logs are consistently 40-43M (redo logs are 256M).
>
>
>
> Henry
>
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Mar 16 2021 - 21:34:05 CET

Original text of this message