Re: Data movement logic in data guard

From: Hemant K Chitale <hemantkchitale_at_gmail.com>
Date: Thu, 31 Dec 2020 19:09:25 +0800
Message-ID: <CAMNBsZtGRXeq-34-HFLMKNXqT-JjLQGd2B1BKto7pzXJBiaxUg_at_mail.gmail.com>



Don't you find the apply lag value for INST_ID=1 very strange? Why is it so many days?

Hemant K Chitale

On Wed, Dec 30, 2020 at 9:04 PM Lok P <loknath.73_at_gmail.com> wrote:

> Thank you.
>
> Data from v$dataguard_stats on the standby side looks as below. Not sure
> why that difference is so big.
>
> The requirement is to move data from a few transaction tables
> incrementally to the target (say a remote DB) at regular intervals (say
> once every ~15 minutes) as and when the rows reach the standby database,
> without missing or duplicating any row in the target. For this we want to
> leverage the standby database, because the primary has some resource
> constraints.
>
> And for this we are taking date_created of all the transaction tables as
> the reference column; it is populated with SYSDATE when the row is
> created/inserted in the primary database table. Before initiating a data
> move from the standby at, say, exactly 10:01:01 AM, we want to ensure that
> all data up to that point in time has reached the standby with nothing
> left in flight. So before each data-move procedure runs, we will trigger a
> pre-validation to ensure all data up to 10:01:01 AM has reached the
> standby and nothing is left behind. This is to safely ensure the
> target/remote database is consistent, with no chance of a missing record,
> because the criteria for the data fetch will be SELECT ... FROM
> transaction_table WHERE date_created <= 10:01:01 AM, and the next
> pick/fetch after ~15 minutes will be from 10:01:02 AM onwards. If some
> data with date_created of 10:01:01 AM arrives at the standby a little
> late, it won't get picked up and will be missed.
>
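> As a rough sketch, the incremental fetch we have in mind looks like the
> following (transaction_table and the bind variable names are only
> illustrative):
>
> SELECT *
>   FROM transaction_table                  -- illustrative name
>  WHERE date_created >  :prev_cutoff       -- cut-off used by the previous run
>    AND date_created <= :curr_cutoff;      -- e.g. 10:01:01 AM for this run
>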
> So, to implement a foolproof pre-load validation, I was looking for a date
> column which I can compare against, to ensure the data movement from the
> standby to the remote DB is safe up to that point in time.
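>
> A minimal sketch of the kind of pre-validation gate we could run on the
> standby instead, assuming we gate on v$dataguard_stats (as suggested
> below) rather than on a date column. A zero-lag check is probably too
> strict in practice, so treat this only as an illustration:
>
> DECLARE
>   v_apply_lag     VARCHAR2(64);
>   v_transport_lag VARCHAR2(64);
> BEGIN
>   SELECT MAX(CASE WHEN name = 'apply lag'     THEN value END),
>          MAX(CASE WHEN name = 'transport lag' THEN value END)
>     INTO v_apply_lag, v_transport_lag
>     FROM v$dataguard_stats
>    WHERE name IN ('apply lag', 'transport lag');
>
>   -- only proceed with the extract when both lags are reported as zero
>   IF NVL(v_apply_lag, 'x') <> '+00 00:00:00'
>      OR NVL(v_transport_lag, 'x') <> '+00 00:00:00' THEN
>     RAISE_APPLICATION_ERROR(-20001, 'Standby not caught up (apply='
>       || v_apply_lag || ', transport=' || v_transport_lag
>       || '); postponing the extract');
>   END IF;
> END;
> /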
>
> INST_ID  NAME           VALUE          UNIT                          TIME_COMPUTED        DATUM_TIME
> -------  -------------  -------------  ----------------------------  -------------------  -------------------
>       1  apply lag      +171 16:34:19  day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>       2  apply lag      +00 00:00:59   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>       3  apply lag                     day(2) to second(0) interval  12/30/2020 07:06:09
>       4  apply lag                     day(2) to second(0) interval  12/30/2020 07:06:09
>       1  transport lag  +00 00:00:00   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>       2  transport lag  +00 00:00:00   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>       3  transport lag  +00 00:00:00   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>       4  transport lag  +00 00:00:00   day(2) to second(0) interval  12/30/2020 07:06:09  12/30/2020 07:06:09
>
>
>
>
> On Wed, Dec 30, 2020 at 4:09 PM <niall.litchfield_at_gmail.com> wrote:
>
>> I must confess to not having followed all of this but it does sound a
>> little bit like you are trying to reinvent some wheels here.
>>
>> The standard (in my view anyway) way to see how far "behind" a Data Guard
>> standby is, is to query v$dataguard_stats like the below:
>>
>> SQL> select name,value,time_computed
>> 2 from v$dataguard_stats
>> 3 where name in ('apply lag','transport lag')
>> 4 order by name;
>>
>> NAME                             VALUE                          TIME_COMPUTED
>> -------------------------------- ------------------------------ ------------------------------
>> apply lag                        +00 00:00:00                   12/30/2020 10:26:20
>> transport lag                    +00 00:00:00                   12/30/2020 10:26:20
>>
>> Similarly, for your "how do I ensure that a low activity database
>> transports redo regularly?" question earlier, there's a parameter
>> ARCHIVE_LAG_TARGET for that.
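>>
>> For example (900 seconds is just an illustrative value matching your ~15
>> minute window, and SCOPE=BOTH assumes an spfile is in use):
>>
>> -- on the primary: force a log switch at least every 15 minutes
>> ALTER SYSTEM SET archive_lag_target = 900 SCOPE=BOTH;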
>>
>> As for the data movement from your standby to a (remote?) new table or
>> tables in a new database, if this is a regular thing this sounds like a
>> tailor-made use case for MERGE logic, especially if there is a possibility
>> that source data will be updated/deleted after initial load and you also
>> need to replicate this.
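>>
>> A hedged sketch of what that MERGE could look like. All table, column and
>> DB-link names below are illustrative; it also assumes the window predicate
>> actually captures the changed rows (e.g. via a last-modified style
>> column), and rows deleted at the source would still need separate
>> handling:
>>
>> MERGE INTO target_txn t
>> USING (SELECT txn_id, col1, col2, date_created
>>          FROM transaction_table@standby_link
>>         WHERE date_created >  :prev_cutoff
>>           AND date_created <= :curr_cutoff) s
>>    ON (t.txn_id = s.txn_id)
>>  WHEN MATCHED THEN
>>    UPDATE SET t.col1 = s.col1,
>>               t.col2 = s.col2
>>  WHEN NOT MATCHED THEN
>>    INSERT (txn_id, col1, col2, date_created)
>>    VALUES (s.txn_id, s.col1, s.col2, s.date_created);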
>>
>>
>> On Tue, Dec 29, 2020 at 6:01 PM Lok P <loknath.73_at_gmail.com> wrote:
>>
>>> While we were planning the strategy we saw one issue. To check/confirm
>>> whether the data has reached the standby/DR up to a certain point in time
>>> (say the current time, i.e. SYSDATE), we were initially thinking of
>>> relying on the LAST_TIME column of gv$standby_log (read into
>>> last_replication_dt below): only if it is > SYSDATE will we assume that
>>> all data has been replicated to the DR up to that point, and only then
>>> would it be safe to pick/move data from the DR/standby up to that
>>> point/SYSDATE. If last_replication_dt < SYSDATE, the job keeps sleeping
>>> for a few seconds and does not trigger the data move.
>>>
>>> But the issue with the above strategy is that, if changes on the primary
>>> are infrequent (say a dev instance where hardly anything changes),
>>> last_replication_dt will never get close or equal to SYSDATE, and the job
>>> will keep looping in that SLEEP state. Is there a better strategy or
>>> logic to handle this? (One bounded-wait variant is sketched after the
>>> block below.)
>>>
>>> DECLARE
>>>   current_dt          DATE;
>>>   last_replication_dt DATE;
>>> BEGIN
>>>   SELECT SYSDATE, (SELECT MAX(last_time) FROM gv$standby_log)
>>>     INTO current_dt, last_replication_dt
>>>     FROM DUAL;
>>>
>>>   WHILE (current_dt > last_replication_dt)
>>>   LOOP
>>>     DBMS_LOCK.SLEEP(30);
>>>     SELECT MAX(last_time)
>>>       INTO last_replication_dt
>>>       FROM gv$standby_log;
>>>   END LOOP;
>>> END;
>>> /
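>>>
>>> One variant we are considering, with a bounded wait so that a quiet
>>> primary does not leave the job sleeping forever (the 1-minute tolerance
>>> and 15-minute cap are only illustrative; the timeout just stops the job
>>> from hanging, it does not by itself prove the data is complete):
>>>
>>> DECLARE
>>>   last_replication_dt DATE;
>>>   v_waited            PLS_INTEGER := 0;
>>> BEGIN
>>>   LOOP
>>>     SELECT MAX(last_time) INTO last_replication_dt FROM gv$standby_log;
>>>     EXIT WHEN last_replication_dt >= SYSDATE - 1/1440  -- within the last minute
>>>            OR v_waited >= 900;                         -- or give up after ~15 minutes
>>>     DBMS_LOCK.SLEEP(30);
>>>     v_waited := v_waited + 30;
>>>   END LOOP;
>>> END;
>>> /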
>>>
>>> On Sun, Dec 20, 2020 at 1:09 AM Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Thank You very much for the detailed explanation.
>>>>
>>>> My apologies, but I don't have much knowledge of the backup/recovery
>>>> area. We actually fell into this wrong-ordering trap when GoldenGate
>>>> replication was used in one of our systems with multiple parallel
>>>> replication threads, and we ended up missing data in the target system,
>>>> so I wanted to check whether that can happen in a Data Guard setup as
>>>> well, since we are planning to use the Data Guard physical standby / DR
>>>> database as our data source.
>>>>
>>>> But if I get your point correctly, I think you are pointing to the fact
>>>> that the archived redo logs are applied on the DR database in the same
>>>> order as the logical transactions happened on the primary side. Which
>>>> means my initial concern, that two different inserts could be applied in
>>>> a different order on the DR side compared to the primary, is unfounded.
>>>> And that ensures the data movement will never fall into this trap due to
>>>> wrong ordering.
>>>>
>>>> My other doubts were:
>>>> 1) Is it good to rely on the LAST_TIME column of v$standby_log (as in
>>>> the logic I posted below) for checking the lag periodically, looping
>>>> until the lag reduces to zero at a certain point in time, and only then
>>>> executing the data fetch/load query to pick data from the transaction
>>>> tables? This block/loop would be called at the start of each data-load
>>>> procedure call.
>>>>
>>>> 2) For insert-only tables it's easy to take MAX(date_created), store it
>>>> in a reference table, and then use that as the reference for the
>>>> incremental data load to the target. But how do we achieve this
>>>> data-movement logic accurately if UPDATEs/DELETEs also happen on the
>>>> base transaction table?
>>>>
>>>> DECLARE
>>>>   current_dt          DATE;
>>>>   last_replication_dt DATE;
>>>> BEGIN
>>>>   SELECT SYSDATE, (SELECT MAX(last_time) FROM gv$standby_log)
>>>>     INTO current_dt, last_replication_dt
>>>>     FROM DUAL;
>>>>
>>>>   WHILE (current_dt > last_replication_dt)
>>>>   LOOP
>>>>     DBMS_LOCK.SLEEP(30);
>>>>     SELECT MAX(last_time)
>>>>       INTO last_replication_dt
>>>>       FROM gv$standby_log;
>>>>   END LOOP;
>>>> END;
>>>> /
>>>>
>>>>
>>>> On Sat, Dec 19, 2020 at 8:56 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:
>>>>
>>>>> dataguard is fundamentally a recovery operation.
>>>>>
>>>>>
>>>>>
>>>>> dataguard with the “secondary open for read” is NOT logical
>>>>> replication, it is physical replication. (Two or more separate Dataguard
>>>>> destinations with different less than everything sets of tablespaces are
>>>>> beyond the scope of this email.) Receipt of each transaction COMMIT or
>>>>> ROLLBACK and the “secondary” processing completion of same is when
>>>>> transaction results are viewable in the “open for read” dataguard secondary.
>>>>>
>>>>>
>>>>>
>>>>> If they are ever out of order, that’s a major league bug in recovery
>>>>> (not just dataguard.)
>>>>>
>>>>>
>>>>>
>>>>> IF memory serves there is guidance documentation on using the open for
>>>>> read secondary dataguard as a source of information. Dataguard for
>>>>> switchover and failover existed for YEARS before the open active dataguard
>>>>> for read, so the documentation is probably skewed a bit toward that. I
>>>>> suggest first completely understanding just the notion for being complete
>>>>> to a point in time with respect to commit points before moving on to the
>>>>> exact functionality of active open for read. Then the context documents
>>>>> about active open for read might make sense.
>>>>>
>>>>>
>>>>>
>>>>> Myself, I prefer the scheduled suspend recovery on business event,
>>>>> cold clone rename, open renamed clone frozen (allowing for aggregation for
>>>>> that freeze point), and resume recovery to actively open dataguard, which
>>>>> relegates real time queries to the primary but which allows dedicated
>>>>> configuration of a completely relationally integral database instance
>>>>> configured for read-only advantages like aggregation and result cache and
>>>>> minimized query traffic on undo for your selected cold clone rename
>>>>> versions. Often quarterly, monthly, weekly, and daily for a week are good
>>>>> choices, depending on your operational needs. For finance and manufacturing
>>>>> many of my clients found monthly and daily for a week were useful, but that
>>>>> was before the recovery product was built, literally using continuous
>>>>> recovery as described for the remote physical backup. Now don’t be confused
>>>>> by that. That has come to be called “roll your own dataguard” but it
>>>>> preceded Oracle’s product by a long time. We called it the “standby
>>>>> recovery database.” If you search the IOUW and OAUG archives I think you
>>>>> can find my detailed papers.
>>>>>
>>>>>
>>>>>
>>>>> Several folks nearly simultaneously and independently “invented” this
>>>>> after Sequent presented a sales information demo of recovering a six month
>>>>> old vintage backup to current just from redo logs. That was early V6, so
>>>>> probably 1989 or so. Johnny Chan and Brien Christiansen of CISCO are the
>>>>> first folks who I worked with that routinely did the cold clone rename open
>>>>> as a reporting database frozen in time to relieve the overloaded primary
>>>>> systems of queries where “yesterday” and “end of last fiscal month” were
>>>>> actually as good as (or better than) current.
>>>>>
>>>>>
>>>>>
>>>>> Good luck. Whether the redo logs are being shipped close to real time
>>>>> for the log writer or from the archived redo log after it is switched out,
>>>>> they are processed in order on the destination. If you are thinking of this
>>>>> in the context of replication of logical transactions rather than
>>>>> physically ordered commits and rollbacks, that way lies madness. Deltas are
>>>>> being applied to datablocks.
>>>>>
>>>>>
>>>>>
>>>>> (There does exist a temporary logical phase for upgrades, but that is
>>>>> also beyond the scope of this email.)
>>>>>
>>>>>
>>>>>
>>>>> mwf
>>>>>
>>>>>
>>>>>
>>>>> *From:* oracle-l-bounce_at_freelists.org [mailto:
>>>>> oracle-l-bounce_at_freelists.org] *On Behalf Of *Lok P
>>>>> *Sent:* Saturday, December 19, 2020 6:42 AM
>>>>> *To:* Clay Jackson (cjackson)
>>>>> *Cc:* Oracle L
>>>>> *Subject:* Re: Data movement logic in data guard
>>>>>
>>>>>
>>>>>
>>>>> Thanks for the guidance here.
>>>>>
>>>>> I will try to see how I can test the scenario to ensure the
>>>>> replication order will be exactly the same in DR as in primary, even while
>>>>> catching up a large lag.
>>>>>
>>>>>
>>>>>
>>>>> Two questions related to the same:
>>>>>
>>>>>
>>>>>
>>>>> 1) We don't want to keep hitting those transaction tables with data-read
>>>>> queries if the lag is very high; we should wait for the lag to catch up.
>>>>> For that reason, is it okay to refer to v$standby_log and keep looping
>>>>> with a sleep, something like below?
>>>>>
>>>>>
>>>>>
>>>>> DECLARE
>>>>>   current_dt          DATE;
>>>>>   last_replication_dt DATE;
>>>>> BEGIN
>>>>>   SELECT SYSDATE, (SELECT MAX(last_time) FROM gv$standby_log)
>>>>>     INTO current_dt, last_replication_dt
>>>>>     FROM DUAL;
>>>>>
>>>>>   WHILE (current_dt > last_replication_dt)
>>>>>   LOOP
>>>>>     DBMS_LOCK.SLEEP(30);
>>>>>     SELECT MAX(last_time)
>>>>>       INTO last_replication_dt
>>>>>       FROM gv$standby_log;
>>>>>   END LOOP;
>>>>> END;
>>>>> /
>>>>>
>>>>>
>>>>>
>>>>> 2) For insert-only tables it's easy to take MAX(date_created), store it
>>>>> in a reference table, and then use that to do the incremental data load
>>>>> to the target. But how do we achieve this data movement/replication if
>>>>> UPDATEs/DELETEs also happen on the base transaction table? Is there any
>>>>> recommended way to do that?
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Dec 19, 2020 at 1:04 AM Clay Jackson (cjackson) <
>>>>> Clay.Jackson_at_quest.com> wrote:
>>>>>
>>>>> While I admit I haven’t specifically tested this case, which I would
>>>>> recommend you do, I think you’re “overthinking” this.
>>>>>
>>>>>
>>>>>
>>>>> The whole point behind DataGuard (or, at least one of the major “use
>>>>> cases”) is to create a “consistent” COPY of your primary database, so that
>>>>> in the event of a failure (loss of access), users can be moved to the
>>>>> secondary database without loss of data (or consistency). The mechanism
>>>>> by which this happens is through physical copying of the redo log(s), or
>>>>> records within the logs from the primary to the secondary, and then
>>>>> “applying” those logs (or records) to the secondary, IN THE SAME order as
>>>>> they were “applied” on the primary. So, while the actual writing of the
>>>>> transactions to the database on the secondary will ALWAYS happen at a point
>>>>> in time AFTER the transaction was written to the primary, I don't think
>>>>> the scenario you outlined (records being written "out of order") is
>>>>> possible.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Clay Jackson
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
>>>>> Behalf Of *Lok P
>>>>> *Sent:* Friday, December 18, 2020 11:06 AM
>>>>> *To:* Oracle L <oracle-l_at_freelists.org>
>>>>> *Subject:* Re: Data movement logic in data guard
>>>>>
>>>>>
>>>>> Moving data from one system to another looks to be very common, but I
>>>>> have never come across building such logic manually in code. And I don't
>>>>> have much idea about how the archived log apply happens at the standby.
>>>>> But it seems like, to maintain constraints, it has to be in exactly the
>>>>> same order as the data load happened on the primary; otherwise things
>>>>> would break.
>>>>>
>>>>>
>>>>>
>>>>> Can somebody guide me here on whether the logic we are going to rely on
>>>>> to decide the reference date/time for our data movement can ever fail?
>>>>>
>>>>>
>>>>>
>>>>> On Fri, 18 Dec 2020, 4:27 pm Lok P, <loknath.73_at_gmail.com> wrote:
>>>>>
>>>>> Actually, in a GoldenGate setup with multiple parallel replication
>>>>> threads, we have encountered scenarios where two transactions generated
>>>>> on the source can reach the target in a different order, causing the
>>>>> data pickup from the target to miss some rows.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Dec 18, 2020 at 9:30 AM Lok P <loknath.73_at_gmail.com> wrote:
>>>>>
>>>>> It's version 11.2.0.4 of Oracle on Exadata. We have a requirement to
>>>>> move the data to another system, and for that we want to utilize the DR
>>>>> database (which is a physical standby with Data Guard configured) rather
>>>>> than the primary, as we do not want to affect the key applications
>>>>> running on the primary. We are almost saturating the DB resources on the
>>>>> primary during peak hours.
>>>>>
>>>>> For copying/moving data without misses at a ~15-minute interval, we are
>>>>> relying on the "date_created" column of the transaction table as the
>>>>> reference column. So, in case we have some additional lag from primary
>>>>> to DR, is it possible that records are created on the primary and the DR
>>>>> in the sequence below, such that Row-3 is created on the DR before
>>>>> Row-2? In that case we may miss that row when we take MAX(date_created)
>>>>> from our transaction table to move the data.
>>>>>
>>>>> In such a scenario, how should we make our logic foolproof when picking
>>>>> MAX(date_created) on the source, so that we won't miss any data? Or
>>>>> should we somehow utilize the LAST_TIME column of v$standby_log to make
>>>>> our logic foolproof?
>>>>>
>>>>> Also, I am wondering if there is any way we can handle UPDATEs/DELETEs
>>>>> of the records in the source?
>>>>>
>>>>> On Primary:-
>>>>>
>>>>> Row-1 created at 18th DEC 10:00AM with date_created as 18th DEC 10:00AM
>>>>>
>>>>> Row-2 created at 18th DEC 10:05 AM with date_created as 18th DEC 10:05
>>>>> AM
>>>>>
>>>>> Row-3 created at 18th DEC 10:06 AM with date_created as 18th DEC 10:06
>>>>> AM
>>>>>
>>>>> On DR when we have lag of say ~5minutes:-
>>>>>
>>>>> Row-1 created at 18th DEC 10:05AM with date_created as 18th DEC 10:00AM
>>>>>
>>>>> Row-3 created at 18th DEC 10:11:01 AM with date_created as 18th DEC
>>>>> 10:06AM
>>>>>
>>>>> Row-2 created at 18th DEC 10:11:02 AM with date_created as 18th DEC
>>>>> 10:05AM
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>
>> --
>> Niall Litchfield
>> Oracle DBA
>> http://www.orawin.info
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Dec 31 2020 - 12:09:25 CET
