RE: Original design approach to Oracle REDO Logs

From: Clay Jackson <"Clay>
Date: Thu, 17 Jun 2021 23:15:48 +0000
Message-ID: <CO1PR19MB4984A22E8CFE4A2EDA78951A9B0E9_at_CO1PR19MB4984.namprd19.prod.outlook.com>



JL wrote – “….if the row were not locked then it would be possible to create cases that produced wrong results.” And MWF wrote “Making sure the bucket has no leaks was definitely priority number one”

To quote Scotty, from Star Trek – “The more they tinker with the plumbing, the easier it is to stop up the drain”.

And not that I’m calling Bill Bridge a grandmother or old; but there’s another aphorism that goes:

“Don’t try to teach grandma to suck eggs”

(That said, this is an enlightening and entertaining discussion)

Clay Jackson

From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Jonathan Lewis Sent: Thursday, June 17, 2021 3:13 PM
To: ORACLE-L <oracle-l_at_freelists.org> Subject: Re: Original design approach to Oracle REDO Logs

CAUTION: This email originated from outside of the organization. Do not follow guidance, click links, or open attachments unless you recognize the sender and know the content is safe.

I've just re-run the test of the "no-change" update against 19.11.0.0 and the code has been changed to generate an 11.19 (array update) entry that shows "no columns changed" for every row in the array. I'll have to do a little more testing to see what it does with some rows in a block really changing and some being updated without change.

Regards
Jonathan Lewis

On Thu, 17 Jun 2021 at 22:52, Jonathan Lewis <jlewisoracle_at_gmail.com<mailto:jlewisoracle_at_gmail.com>> wrote: Oracle actually introduced a "no change" optimisation for redo in 12.2

If an update doesn't change the values in the row then Oracle doesn't go through the normal generation of undo and redo for that row. HOWEVER - and this is where the "don't change the block" idea fails - it locks the row, which does generate undo and redo. And if you're unlucky (and if you're updating every row in the block in a small way you probably will be) the "optimisation" will generate more undo and redo rather than less because real updates can generate one undo and redo vector per block but the lock row vectors are (up to 19.3) single row vectors.

I wrote about this a couple of years ago: https://jonathanlewis.wordpress.com/2019/09/08/quiz-night-34/<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fjonathanlewis.wordpress.com%2F2019%2F09%2F08%2Fquiz-night-34%2F&data=04%7C01%7Cclay.jackson%40quest.com%7Cd076a1ab508d444a7b3a08d931dd2795%7C91c369b51c9e439c989c1867ec606603%7C0%7C0%7C637595648183185895%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=ujV0Q70WlhlJVcAPBUjDQCHd2nI1QWh%2B%2FHxFaE1FxFM%3D&reserved=0>

The "lock row" is necessary for backwards compatibility (if nothing else) - but I think it would be hard to argue that it shouldn't happen because if the row were not locked then it would be possible to create cases that produced wrong results.

Regards
Jonathan Lewis

On Thu, 17 Jun 2021 at 21:12, Mark W. Farnham <mwf_at_rsiz.com<mailto:mwf_at_rsiz.com>> wrote: While I agreed with what JL wrote, IF we knew how fast processors are now, then an update all rows query might compare the block images on the fly and not write the blocks where all the rows in a block were updated to the same value so that the actual block image resulting from the update didn’t need to be written.

Off the top of my head I *think* that would work and be efficient, but I am not engaged enough to ponder possible edge cases. But I *think* you have all the information you need in hand at commit time to flag blocks to *not* bother writing. If the buffer had to flush along the way before the commit, they would already be in the redo stream, but with the massive memory now available a private redo thread might be able to handle it both correctly and efficiently.

I really like Oracle’s redo model though. Someone once put it to me thusly: Everything Bill Bridge starts ends up in checkmate. (Meaning Bill wins.) Making sure the bucket has no leaks was definitely priority number one.

From: oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org> [mailto:oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org>] On Behalf Of Jonathan Lewis Sent: Thursday, June 17, 2021 10:16 AM
To: ORACLE-L (oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org>) Subject: Re: Original design approach to Oracle REDO Logs

Mike,

I don't think you can ask about hypothetical strategies for the redo log before stating what you think the purpose of the database should be.

If you expect the database to be the reference of last resort for the correctness and (immediate) consistency of the data then you might like a different strategy from someone who thinks the database is some sort of record of evolution of the data that allows the current version of the data to be re-imaged by a client.

Regards
Jonathan Lewis

On Thu, 17 Jun 2021 at 13:55, Michael D O'Shea/Woodward Informatics Ltd <woodwardinformatics_at_strychnine.co.uk<mailto:woodwardinformatics_at_strychnine.co.uk>> wrote: Chaps, request for opinions/discussion/feedback .....

Question: If Oracle were written today, would the same strategy behind the "redo log" be adopted.

An example .... for a table of 10,000,000 rows, and a contrived piece of DML

update someTable
  set someColumn1 = 1.234

that "updated" all the rows
 but where someColumn1 was only updated to 1.234 for 6 rows as the remaining 10,000,000 - 6 rows were already 1.234

Should (assuming just DML and also just the basic data types, number, varchar2, date, ... ) the redo log

  • record the data "change" for all 10,000,000 "updates"
  • record the real data change for just the 6 real updates
  • record solely the SQL used to perform the update for some playback purpose
  • send a message to some message broker such as Solace, Tibco, .... allowing subscribers to process the data change that way (might be replication/backup, some sort of refresh or push notification to other applications, email dispatch and so on)
  • some other approach else

There is a considerable movement to event streaming technology such as Kafka to (indirectly) drive data change events to downstream and dependent systems (in Oracle assumably by polling the redo log file, or maybe some LogMiner interface .. I don't know the detail) in databases that include Oracle, MongoDB (referred to as Change Streams), and many more.

My "ask" focuses more on "just the database" interoperability with the remainder of what is often a large tech stack, and not the original design decision around redo logs for data recovery.

Mike

http://www.strychnine.co.uk<https://nam12.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.strychnine.co.uk%2F&data=04%7C01%7Cclay.jackson%40quest.com%7Cd076a1ab508d444a7b3a08d931dd2795%7C91c369b51c9e439c989c1867ec606603%7C0%7C0%7C637595648183195855%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=6MayhIpzaM%2FMhQzxQhBDBqFaovdDGT%2FAAABagbQhRic%3D&reserved=0>

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Jun 18 2021 - 01:15:48 CEST

Original text of this message