RE: Redo for the Undo

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Tue, 19 Dec 2023 19:33:49 -0500
Message-ID: <02eb01da32dc$339ad530$9ad07f90$_at_rsiz.com>



Lots of good ways to reduce the redo generated have been listed in this thread. I applaud reducing the redo required whenever practical, since it tends to correlate well with less total physical work by the machine to accomplish the same task, less total redo to process should recovery be required, and a smaller concurrency footprint for the operation. That said,  

checkpoint incomplete waits mean specifically that you have wrapped around the groups that you have and you are waiting for the release of a redo log group.  

So without changing anything, if the transaction is bursty (which by definition a large number of rows being deleted is), and if there is sufficient interregnum for eventual catchup, then it may be possible to accommodate the burst of activity by having more (and possibly larger) redo log groups.  

Quite often in situations under pressure, increasing the buffer before a redo log group needs to be re-used can temporarily (or permanently) cure the wait. Ideally that allows non-pressure-cooker engineering toward an execution of the operation somewhat closer to optimal, with sufficient testing. Of course, avoiding this wait has from time to time resulted in loss of support for making the improvement, leaving things that should be improved as they are. They're broken and should be fixed, but adding buffer means the wheels still turn.  
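A minimal sketch of what "more (and possibly larger) redo log groups" looks like in practice; the group numbers, file paths, and sizes below are illustrative, not from the thread:

```sql
-- Illustrative only: add two redo log groups, sized larger than the
-- existing ones, so LGWR has more headroom before it must wrap around
-- and wait on an incomplete checkpoint. Adjust numbers/paths/sizes
-- to your environment.
ALTER DATABASE ADD LOGFILE GROUP 5 ('/u01/oradata/ORCL/redo05.log') SIZE 2G;
ALTER DATABASE ADD LOGFILE GROUP 6 ('/u01/oradata/ORCL/redo06.log') SIZE 2G;
```

Check V$LOG / V$LOGFILE afterwards to confirm the groups are in use and that log switches are no longer wrapping onto a group with an incomplete checkpoint.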

Tanel Poder has a fine article on this. As I wrote, fixing things is better, and JL has suggested a possible way where there is not really much work AND Oracle’s well tested software is on the hook, not the crafting of “copy keep” and then swap syntax.  

Good luck,  

mwf  

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Jonathan Lewis
Sent: Monday, December 18, 2023 10:13 AM
To: Oracle L
Subject: Re: Redo for the Undo  

Lothar,  

Thanks for the answers.

The thought behind my questions was about the viability of using the option for:  

alter table X move partition Y
    including rows where {predicate}
    online
    update indexes
;  

Moving a partition to "forget" the rows you want to delete may produce a lot less undo (virtually none), and if the remaining rows are far fewer than the number of "deleted" rows the redo for row creation and index recreation may be much smaller than the original redo.  

Given your example of the delete statement using bind variables you may have to use the "partition for" syntax.  
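For reference, the "partition for" form might look something like this; the table name, key value, and date predicate are placeholders, not taken from the thread:

```sql
-- Hypothetical sketch: when the partition name isn't known at coding time
-- (e.g. the key arrives as a bind variable), PARTITION FOR (<value>) lets
-- Oracle resolve the partition from a key value. Note that bind variables
-- are not allowed in DDL, so the value must be substituted into the text.
ALTER TABLE table1
  MOVE PARTITION FOR (42)                               -- 42 = list-partition key value
  INCLUDING ROWS WHERE transaction_date >= DATE '2023-12-01'  -- rows to KEEP
  ONLINE
  UPDATE INDEXES;
```

The predicate names the rows to keep; everything else in the partition is "forgotten" by the move, which is what avoids the per-row undo of a delete.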

Regards

Jonathan Lewis        

On Mon, 18 Dec 2023 at 14:43, Lothar Flatz <l.flatz_at_bluewin.ch> wrote:

Hi Jonathan,

welcome advice, thanks.

Which version of Oracle? 19.11

Is there only one partition key value per partition ? yes, list partitioned , one value per partition

Is the index you've described unique or the primary key index? it is the PK Index

    (I'm always suspicious of any index on a partitioned table that starts with the partition key column(s) - it's often a waste of space) agree

Are there any global indexes and, in particular, is the index you've described global or local? All Indexes are local

How many rows are in the partition you're deleting from, and how many rows are left after the deletion is complete? Difficult. I can't see the state before the delete any more. It seems to me not all records get deleted.

Regards

Lothar
Am 18.12.2023 um 10:49 schrieb Jonathan Lewis:  

Lothar,  

Just in case you find something of interest in one of my older articles, here's a link to a little catalog on "Massive Deletes": Massive Delete | Oracle Scratchpad (wordpress.com) <https://jonathanlewis.wordpress.com/2018/06/08/massive-delete/>  

The volume of undo comes from the need to store the entire (inline) portion of the deleted row in the undo segment, and things that can make the volume worse are:

  1. the number of indexes that have to be updated at the same time
  2. extra undo relating to any migrated or chained rows
  3. whether Oracle can create one undo record per BLOCK or whether it has to create one record per row
  4. whether the index maintenance can be done as a delayed sorted array delete (block level) or in "real-time" row at a time
  5. the version of Oracle

It is possible to "optimize" deletes - but it's not a trivial exercise to predict the best strategy without knowing the data and indexing very well.  

A tablescan or index-fast-full-scan delete will be doing row at a time deletes with real-time index maintenance; and that could lead to a lot of extra random I/O and (for very large objects with lots of indexes) lots of index leaf blocks being written and re-read multiple times (and the IFFS scan could result in lots of table blocks also being written and re-read). (So that index_ffs() hint looks as if it may be a bad idea).  

An index range scan can delay all the index maintenance and then do it as efficiently as possible, but that requires a large accumulation and sort of key values. If the checkpointing is the big threat, though, this may be a strategy that moves the stress away from the log writer. An alternative would be to add more redo log file space.  

Which version of Oracle?

Is there only one partition key value per partition ?

Is the index you've described unique or the primary key index?

    (I'm always suspicious of any index on a partitioned table that starts with the partition key column(s) - it's often a waste of space)

Are there any global indexes and, in particular, is the index you've described global or local?

How many rows are in the partition you're deleting from, and how many rows are left after the deletion is complete?  

Regards

Jonathan Lewis    

On Mon, 18 Dec 2023 at 07:57, Lothar Flatz <l.flatz_at_bluewin.ch> wrote:

Hi Jonathan,

that makes sense, Thank you.
What we see are "log file switch (checkpoint incomplete) ". I first wanted to understand the basics before jumping in a conclusion. The "log file switch (checkpoint incomplete)" are always accompanied by big deletes (> 100 million rows). The deletes are a necessary but not sufficient criteria. Since we see big deletes without "log file switch (checkpoint incomplete)" waits. "log file switch (checkpoint incomplete) " means that DBWR can not keep up with LGWR. I wondered how that happens. My guess was that it is connected with undo, since this is a lot more data than redo. Indeed there is a 95% correlation between "Undo blocks written by DBWR" and the size of the delete. I understanding is further, that DBWR has to write blocks, whilst LGWR writes change vectors. In addition it might happen that the DBWR has to write the same block multiple times, if rows are deleted from that block, but not simultaneously. The delete statement might be some indication. The where condition contains the transaction date in a range comparison (delete /*+ Index_ffs(t) */ table1 t where partition_key = :b1 and transaction date < :b2). The only useable index looks like this (partition_key, col1, col2, transaction_date). Col1 is highly selective, col2 has 3 distinct values. The software vendors does not want to create an other index on that table because of one (big) delete per day. I can go along the that, but

  • index_ffs does not guarantee any sort sequence; as a consequence the same data block might be visited multiple times during the delete whilst rows are being deleted from within it
  • a FTS is slower than the index scan, but rows of the same block are deleted in short sequence, which might result in less work for the DBWR
  • how about sorting the index rowids as an intermediate step?
  • an index range scan might allow batching up the deletes (not sure about that)

Can you (or anybody) comment on the points above?
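The "sort the index rowids as an intermediate step" idea (point 3 above) could be sketched in PL/SQL roughly as follows; table, column, and bind names follow the statement quoted earlier, but the approach itself is hypothetical and untested here:

```sql
-- Hypothetical sketch: collect the target rowids via the index, sort them,
-- then delete in rowid order so rows in the same block are removed together,
-- which may reduce repeated DBWR writes of the same block.
-- For 100M+ rows this would need batching (BULK COLLECT ... LIMIT) to
-- bound PGA memory; omitted here for brevity.
DECLARE
  TYPE rid_tab IS TABLE OF ROWID;
  l_rids rid_tab;
BEGIN
  SELECT r BULK COLLECT INTO l_rids
  FROM (
    SELECT /*+ index_ffs(t) */ t.rowid AS r
    FROM   table1 t
    WHERE  t.partition_key    = :b1
    AND    t.transaction_date < :b2
    ORDER BY t.rowid                  -- the intermediate sort step
  );

  FORALL i IN 1 .. l_rids.COUNT
    DELETE FROM table1 WHERE rowid = l_rids(i);

  COMMIT;
END;
/
```

Whether the extra sort pays for itself depends on how clustered the doomed rows already are; it trades PGA and sort work for (hopefully) fewer repeat visits to the same data blocks.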

This is an OLTP environment and deletes vary a lot in size. I think the usual strategy of insert and exchange partition will not work.
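For completeness, the "insert and exchange partition" pattern dismissed here (the "copy keep and then swap" mwf mentions elsewhere in the thread) would look roughly like this; all names are placeholders and the steps are simplified - in real life indexes, constraints, and statistics have to be matched on both sides:

```sql
-- Illustrative "copy the rows to keep, then swap segments" pattern.
-- Step 1: copy only the survivors into a staging table.
CREATE TABLE table1_keep AS
  SELECT * FROM table1 PARTITION (p42)
  WHERE  transaction_date >= DATE '2023-12-01';   -- rows to KEEP

-- Step 2: swap the staging table's segment in for the partition's.
-- The old partition data ends up in table1_keep, which can be dropped.
ALTER TABLE table1
  EXCHANGE PARTITION p42 WITH TABLE table1_keep
  WITHOUT VALIDATION;

DROP TABLE table1_keep PURGE;
```

The appeal is the same as the filtered move: the kept rows are written once with minimal undo, instead of logging undo for every deleted row.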

Thanks

Lothar

Am 16.12.2023 um 00:05 schrieb Jonathan Lewis:

Possibly the difficulty in figuring it out comes from the following:

  "since roll forward will generate the related undo anew"  

Rolling forward simply reads and applies a set of redo change vectors, ensuring that either all or none of the redo change vectors in a single redo change record are applied. (Oracle may skip a load of redo change records at the start of the first (archived) redo log file being used, but once it has started applying records a failed record will cause the roll forward to terminate.)  

Applying a redo change vector to a table block does NOT result in any undo being generated. The undo for that table block change is written into the correct block in the undo tablespace because the redo record holding the redo change vector for the table will also hold a redo change vector for the undo record originally generated by the change to the table.  

Regards

Jonathan Lewis      

On Fri, 15 Dec 2023 at 08:04, Lothar Flatz <l.flatz_at_bluewin.ch> wrote:

Hi Mladen,

thanks. Got that. I am having a hard time figuring out why this would be needed. I guess after a database crash, when you restore a backup, the before images prior to the restore point might be missing. Therefore roll forward will reinstate them in order to be able to cope with flashback. Other than that I see no reason for it, since roll forward would generate the related undo anew.

Or, probably more generally speaking, restore should reconstruct the database state in its whole, not leaving out parts (like undo).

Thanks

Lothar

Am 13.12.2023 um 15:34 schrieb Mladen Gogala:
> On 12/13/23 07:53, Lothar Flatz wrote:

>> Hi,
>>
>> can a conventional delete produce redo for undo? And why? What I 
>> don't get: if the redo is applied, undo would be generated anyway, 
>> would it not?
>>
>> But maybe my idea of roll forward is a bit simplistic. Not really my 
>> area of expertise.
>>
>> Thanks
>>
>> Lothar
>>
>> -- 
>> http://www.freelists.org/webpage/oracle-l
>>
>>

> Hi Lothar,
>

> Delete changes blocks, UNDO blocks among other things. REDO mechanism
> protects the UNDO tablespace as well and generally writes any block
> change to the UNDO tablespace.
>

> Regards
>

> --
> Mladen Gogala
> Database Consultant
> Tel: (347) 321-1217
> https://dbwhisperer.wordpress.com
Received on Wed Dec 20 2023 - 01:33:49 CET
