Re: High Administrative and Network waits

From: Lok P <loknath.73_at_gmail.com>
Date: Fri, 5 Nov 2021 02:26:47 +0530
Message-ID: <CAKna9VYnYORQSRt3+J6FBXGQa1ODA5nz8KfiyPctoixXPmfrAQ_at_mail.gmail.com>



Thank you.
Yes, from the history we have in v$rman_backup_job_details, the RMAN DB INCR backup run time has grown from ~1hr to ~7hrs+, and the application job runs during this same window, so it matches what you said. The increase in backup run time also seems to have started immediately after we moved tablespace encryption from AES128 to AES256. So now I am trying to understand how, logically, the new encryption algorithm can cause the RMAN backup to struggle on the event 'BACKUP:MML write backup piece', which in turn increases the network waits and then impacts the application jobs. Can you please help me understand this part?
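
A minimal sketch of one way to pull that run-time trend, assuming SELECT access to v$rman_backup_job_details. The 30-day window, the column selection, and the column aliases are illustrative assumptions, not something taken from this thread:

```sql
-- Sketch only: run-time trend of incremental database backups.
-- The 30-day window is an arbitrary assumption; widen or narrow it
-- to cover the date of the AES128 -> AES256 change.
SELECT session_key,
       input_type,
       status,
       TO_CHAR(start_time, 'YYYY-MM-DD HH24:MI') AS started,
       TO_CHAR(end_time,   'YYYY-MM-DD HH24:MI') AS ended,
       ROUND(elapsed_seconds / 3600, 2)          AS elapsed_hours,
       input_bytes_display,
       output_bytes_display,
       output_bytes_per_sec_display
FROM   v$rman_backup_job_details
WHERE  input_type = 'DB INCR'
AND    start_time > SYSDATE - 30
ORDER  BY start_time;
```

Comparing elapsed_hours and output_bytes_per_sec_display before and after the encryption change should show whether the slowdown lines up with that date, and whether the data volume or the throughput changed.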

On Fri, Nov 5, 2021 at 1:55 AM Pap <oracle.developer35_at_gmail.com> wrote:

> As you highlighted, the time for the event "BACKUP:MML write backup piece"
> increased, along with a few network waits, after your move to AES256
> encryption, and you suspect that may be the cause of the issue. I believe
> your backup is not running the whole day, so to confirm your theory, you
> may check whether the backup timings have increased recently and whether
> your application job is running in that same backup window.
>
> On Thu, Nov 4, 2021 at 9:14 AM Lok P <loknath.73_at_gmail.com> wrote:
>
>> I have not yet checked these. But yes, this database is hosted on an
>> Exadata X3, high capacity full rac machine. I will try to get these
>> outputs from the DBA team.
>>
>> On Thu, Nov 4, 2021 at 5:06 AM Mladen Gogala <gogala.mladen_at_gmail.com>
>> wrote:
>>
>>> Have you tried netstat -s|egrep -i
>>> "(fail|error|warn|drop|retrans|collis)"? How about netstat -i? Do
>>> you see any errors on the interface? What kind of network is that? 10Gb?
>>> 1Gb?
>>>
>>> On 11/3/21 17:08, Lok P wrote:
>>> > It's getting bounced back so resending.
>>> >
>>> > It's Oracle version 19c. We suddenly started getting complaints from
>>> > the team about the slowness of a few of the application processes.
>>> > Those that were running in 2-2.5hrs are now running for ~6hrs. This
>>> > process calls many quick queries from Java, and we verified that most
>>> > of them have no change in plan but show a slight increase in elapsed
>>> > time (which is mainly CPU time). But when looking into the
>>> > database we see a spike in Administrative waits (mainly BACKUP:MML
>>> > write backup piece) and network wait events (mainly Data Guard Network
>>> > buffer stall reap, SQL*Net more data from client, followed by SQL*Net
>>> > more data to client). We also recently moved from AES128 to
>>> > AES256 tablespace encryption. I am not sure whether that is playing
>>> > any role here, causing these administrative and network waits, which
>>> > in turn impact the application queries/processes.
>>> >
>>> > However, I tried capturing the stats in production from v$sesstat for
>>> > two of the processes/sessions while they had already been running for
>>> > ~3hrs since the logon_time noted in v$session. But I am not able to
>>> > tell whether that points to anything suspicious about the
>>> > encryption being the cause. I don't have results from v$sesstat from
>>> > a good period, though. Can you guide me here on how these can be
>>> > logically related?
>>> >
>>> > Regards
>>> > Lok
>>>
>>> --
>>> Mladen Gogala
>>> Database Consultant
>>> Tel: (347) 321-1217
>>> https://dbwhisperer.wordpress.com
>>>
>>> --
>>> http://www.freelists.org/webpage/oracle-l
>>>
>>>
>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu Nov 04 2021 - 21:56:47 CET