Re: High Buffer busy waits

From: Timur Akhmadeev <timur.akhmadeev_at_gmail.com>
Date: Sat, 29 Jul 2023 08:52:13 +0300
Message-ID: <CACGsLCL+A7HPEAvN4zVGKXdosunK5vnscQ-eXUDV3aN6ncOd6A_at_mail.gmail.com>



The script calculates “redo to add” as a product of the average redo rate per second with the seconds waited for incomplete checkpoint. In your case there’re 10+ sessions waiting for checkpoint to complete and the value is over estimated by 10x+ so it should be around 4G redo to add. I prefer to have smaller number of groups and would try 3x8G setup.

Check the bug Priit mentioned as it may explain sudden change in behavior.

On Sat, 29 Jul 2023 at 01:31, yudhi s <learnerdatabase99_at_gmail.com> wrote:

> Resending as the initial reply got bounced back.......
>
> Thank You So much.
>
> I verified in v$log, we have 4 groups and each having size is ~4GB. And no
> changes have been done in this recently. And not seeing any change in
> read/write IO response time for the storage cell servers which could be
> suspected here.
>
> Also Jonathan when you mentioned below*,* can you please give some more
> pointers* , "how can i get the block addresses for the bmbs show up in
> the bbws" , *considering the fact that we have the AWR data with us, as
> the issues occurred in the past*?*
>
> *" The way in which Oracle associates processes with data blocks does
> change with the size of the extent, and I have wondered whether
> the algorithm goes a little further and allocates more level 1 bitmap
> blocks as the number of inserting processes grows. You could examine
> the block addresses of the bmbs that show up in bbws to see how many
> different blocks are involved in a short period of time, this might give us
> some clues about how the feature works in extreme cases"*
>
> Additionally I think if the smaller value of initrans would be a factor
> here, then we would have seen "enq: TX - allocate ITL entry" as the event.
> Correct me if I'm wrong here.
>
> When I run the script which "Timur Akhmadeev" has pointed to, to see the
> amount of redo log needed , I am seeing some of the specific times of the
> day when the demand is showing very high i.e. 20-40GB . Wondering if it's
> really pointing towards a smaller REDO or something else? and yes it's a 2
> Node RAC system but the application is configured to run on one dedicated
> service/instance only.
>
> I have added the full redo log trends for 3days, with each snap for
> the whole day in the below location.
>
> https://gist.github.com/databasetech0073/bb4f8c4a7251d513775eebe2ead7bcda
>
> Below is a sample redo output from Timur's script, where it shows ~20-40GB
> Redo requirement.
> TIM CPU_SEC PHY_READS PHY_WRITES CR_SERVED CURRENT_SERVED REDO_MB
> PROCESSES AVG_DF_SEQ AVG_DF_SCAT REDO_DIFF_TO_MD_PCT AVG_LFPW AVG_LOG_SYNC
> LOG_CKPT_SEC REDO_NEEDED
> 20230722 2000 6333.12 18370113 2806698 2126 674 12038.07 1171 323.9741
> 0.119394 0.261889 2070.364 27539.44
> 20230723 0530 12983.13 14971155 3009666 1566 390 16915.52 1166
> 495.7551 0.182841 0.217913 1135.729 21251.59
> 20230723 2000 5865.12 17578971 2544062 2529 1197 10661.8 1125 275.5025
> 0.108198 0.180787 3583.654 42172.39
> 20230723 2045 6125.02 9263717 2993726 16254 14899 14932.66 1103 425.92
> 0.111488 0.446633 3245.135 53604.53
> 20230723 2145 4593.99 5385775 1848283 19228 12376 8389.169 1158
> 195.4619 0.094307 0.177062 1998.818 18549.14
>
>
> On Fri, Jul 28, 2023 at 11:11 AM Priit Piipuu <priit.piipuu_at_gmail.com>
> wrote:
>
>> Heh! I guessed it's RAC just by looking at the wait chains. There have
>> been some bugs in 19c about dbwr not picking up writes, for example Bug
>> 33973908.
>>
>> On Thu, 27 Jul 2023, 00:48 yudhi s, <learnerdatabase99_at_gmail.com> wrote:
>>
>>> Missed to mention, it's version 19C of oracle and on a exadata box.
>>>
>>> On Thu, 27 Jul, 2023, 2:46 am yudhi s, <learnerdatabase99_at_gmail.com>
>>> wrote:
>>>
>>>> Hello Listers,
>>>>
>>>> We suddenly see higher response time for our incoming transactions
>>>> during certain time periods. and then tried running tanels DASH_WAIT_CHAIN
>>>> getting below wait chains. It's pointing to high 'Buffer busy waits' and
>>>> the top sqls it's pointing to is a single row INSERT query.
>>>>
>>>> And as ist a single row insert , so the plan_hash_value is showing as
>>>> '0' and also the Avg number of execution of the Insert query remains same.
>>>> So wondering what can be the cause of the issue? Can it be associated with
>>>> something related to infrastructure as the final blocker showing as "DB
>>>> writer"?
>>>> or
>>>> As the current_obj# in ASH is pointing to table only and it has default
>>>> settings i.e. ini_trans - 1 and max_trans- 255 and we are using ASSM.
>>>> Should we consider increasing the value of ini_trans here?
>>>>
>>>>
>>>>
>>>> https://github.com/tanelpoder/tpt-oracle/blob/master/ash/dash_wait_chains.sql
>>>>
>>>>
>>>> %This SECONDS AAS DISTINCT_SIDS WAIT_CHAIN FIRST_SEEN LAST_SEEN
>>>> 38% 5710 15.9 19 -> USER1:(PROGMGR) buffer busy waits [1st level bmb]
>>>> -> USER1:(PROGMGR) log file switch (checkpoint incomplete) -> [idle
>>>> blocker 1,2395,34375 (oracle_at_XXXXXXXXXXX (DBW0))] 7/25/2023 5:46 7/25/2023
>>>> 5:46
>>>> 20% 3050 8.5 166 -> USER1:(PROGMGR) log file switch (checkpoint
>>>> incomplete) -> [idle blocker 1,2395,34375 (oracle_at_XXXXXXXXXXX (DBW0))] 7/25/2023
>>>> 5:46 7/25/2023 5:46
>>>> 7% 990 2.8 47 -> USER1:(PROGCAP) ON CPU 7/25/2023 5:46 7/25/2023 5:51
>>>> 5% 770 2.1 40 -> USER1:(PROGCAP) log file switch (checkpoint
>>>> incomplete) -> [idle blocker 1,2395,34375 (oracle_at_XXXXXXXXXXX (DBW0))] 7/25/2023
>>>> 5:46 7/25/2023 5:46
>>>> 5% 700 1.9 16 -> USER1:(PROGMGR) buffer busy waits [data block] ->
>>>> USER1:(PROGMGR) log file switch (checkpoint incomplete) -> [idle blocker
>>>> 1,2395,34375 (oracle_at_XXXXXXXXXXX (DBW0))] 7/25/2023 5:46 7/25/2023 5:46
>>>> 3% 390 1.1 5 -> USER1:(PROGCAP1) buffer busy waits [data block] ->
>>>> USER1:(PROGCAP1) log file switch (checkpoint incomplete) -> [idle blocker
>>>> 1,2395,34375 (oracle_at_XXXXXXXXXXX (DBW0))] 7/25/2023 5:46 7/25/2023 5:46
>>>> 2% 330 0.9 5 -> RPT_USER:(JDBC Thin Client) ON CPU 7/25/2023 5:46 7/25/2023
>>>> 5:51
>>>> 2% 290 0.8 6 -> ADMIN:(JDBC Thin Client) ON CPU 7/25/2023 5:46 7/25/2023
>>>> 5:51
>>>> 2% 270 0.8 22 -> USER1:(PROGMGR) ON CPU 7/25/2023 5:46 7/25/2023 5:51
>>>>
>>> --
Regards
Timur Akhmadeev

--
http://www.freelists.org/webpage/oracle-l
Received on Sat Jul 29 2023 - 07:52:13 CEST

Original text of this message