RE: intermittent long "log file sync" waits
Date: Wed, 29 Jan 2020 14:59:48 +0000
Message-ID: <38080_1580310012_5E319DFC_38080_13400_1_141268bcc0d94744975e4e0ef8b6d14f_at_vontobel.com>
Yes, it was strange to see rman backups in all your reports. We should have paid more attention to it. But there were also periods when the backup was running without impacting log file sync.
Locks on CF are typically held during log switches. You can correlate log switch times with the log file sync problem to answer this. I think Jonathan has already suggested that.
>not sure why that wouldn't show up on ASH though.
That’s a big question – I’ve never seen that before. pSnapper would provide the information what the lg* were doing.
From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Chris Stephens
Sent: Mittwoch, 29. Januar 2020 15:49
To: Stefan Koehler <contact_at_soocs.de>
Cc: oracle-l <Oracle-L_at_freelists.org>
Subject: Re: intermittent long "log file sync" waits
holy shnickees psnapper is awesome. i discovered quite a few backup related processes that were stuck w/ pstack prior to pulling down and using psnapper. after killing those sessions, the remaining rman processes showed progress w/ psnapper output like:
psnapper]# ./psn -p 874 -g comm,state,syscall
Linux Process Snapper v0.14 by Tanel Poder [https://tp.dev/psnapper] Sampling /proc/syscall, stat for 5 seconds... finished.
- Active Threads ===========================================================
samples | avg_threads | comm | state | syscall ------------------------------------------------------------------------------ 43 | 0.43 | (oracle_*_lsst) | Disk (Uninterruptible) | pread64 31 | 0.31 | (oracle_*_lsst) | Disk (Uninterruptible) | lstat 7 | 0.07 | (oracle_*_lsst) | Running (ON CPU) | pread64 6 | 0.06 | (oracle_*_lsst) | Running (ON CPU) | fstatfs 3 | 0.03 | (oracle_*_lsst) | Disk (Uninterruptible) | [running] 3 | 0.03 | (oracle_*_lsst) | Disk (Uninterruptible) | open 3 | 0.03 | (oracle_*_lsst) | Running (ON CPU) | [running] 1 | 0.01 | (oracle_*_lsst) | Disk (Uninterruptible) | fstatfs 1 | 0.01 | (oracle_*_lsst) | Running (ON CPU) | open
it turns out we (i) copied over backup scripts and cron entries to another cluster i built up a few weeks ago. i thought i successfully replaced all references to database experiencing issues but that was not the case. I killed all processes related to backups scripts kicked off from this new RAC connecting to problem RAC and now ASH output is looking much better.
i will continue to monitor for unreasonable "log file sync" waits but i suspect this was the cause of the issue. possibly locks on CF? not sure why that wouldn't show up on ASH though.
On Wed, Jan 29, 2020 at 2:13 AM Stefan Koehler <contact_at_soocs.de<mailto:contact_at_soocs.de>> wrote:
Hello Chris,
as I already mentioned - please use pSnapper ( https://github.com/tanelpoder/psnapper ) by Tanel Poder.
pSnapper samples /proc file system and does not suspend the process like pstack (as it is just a wrapper around gdb which uses ptrace() under the hood). Suspending a process can screw it up :)
In addition pSnapper provides more useful information (e.g. state, C function that runs in kernel space for a specific syscall, etc.).
Best Regards
Stefan Koehler
Independent Oracle performance consultant and researcher
Website: http://www.soocs.de
Twitter: _at_OracleSK
> Chris Stephens <cstephens16_at_gmail.com<mailto:cstephens16@gmail.com>> hat am 28. Januar 2020 um 23:49 geschrieben:
>
> was just getting ready to sign off and noticed the archivelog backup scheduled to run every hour seems to be stuck and has been for at least 10 mins:
>
> [oracle_at_lsst-oradb04 ~]$ ps -ef | grep oracle_backup.sh
> oracle 13369 31167 0 16:43 pts/0 00:00:00 grep --color=auto oracle_backup.sh
> oracle 14649 14645 0 16:00 ? 00:00:00 /bin/bash /home/oracle/scripts/rman/oracle_backup.sh -d lsst2db -s archivelog.rman -c lsst2db_rman -e cs2018_at_ncsa.illinois.edu<mailto:cs2018_at_ncsa.illinois.edu> -r iddsdba_rman
> [oracle_at_lsst-oradb04 ~]$ pstack 15649
> Process 15649 not found.
> [oracle_at_lsst-oradb04 ~]$ pstack 14649
> #0 0x00007f9a6537e41c in waitpid () from /lib64/libc.so.6
> #1 0x0000000000440b84 in waitchld.isra.10 ()
> #2 0x0000000000441e3c in wait_for ()
> #3 0x0000000000433b0e in execute_command_internal ()
> #4 0x0000000000433d2e in execute_command ()
> #5 0x000000000041e365 in reader_loop ()
> #6 0x000000000041c9ce in main ()
> [oracle_at_lsst-oradb04 ~]$ pstack 14649
> #0 0x00007f9a6537e41c in waitpid () from /lib64/libc.so.6
> #1 0x0000000000440b84 in waitchld.isra.10 ()
> #2 0x0000000000441e3c in wait_for ()
> #3 0x0000000000433b0e in execute_command_internal ()
> #4 0x0000000000433d2e in execute_command ()
> #5 0x000000000041e365 in reader_loop ()
> #6 0x000000000041c9ce in main ()
> [oracle_at_lsst-oradb04 ~]$ pstack 14649
> #0 0x00007f9a6537e41c in waitpid () from /lib64/libc.so.6
> #1 0x0000000000440b84 in waitchld.isra.10 ()
> #2 0x0000000000441e3c in wait_for ()
> #3 0x0000000000433b0e in execute_command_internal ()
> #4 0x0000000000433d2e in execute_command ()
> #5 0x000000000041e365 in reader_loop ()
> #6 0x000000000041c9ce in main ()
> [oracle_at_lsst-oradb04 ~]$ pstack 14649
> #0 0x00007f9a6537e41c in waitpid () from /lib64/libc.so.6
> #1 0x0000000000440b84 in waitchld.isra.10 ()
> #2 0x0000000000441e3c in wait_for ()
> #3 0x0000000000433b0e in execute_command_internal ()
> #4 0x0000000000433d2e in execute_command ()
> #5 0x000000000041e365 in reader_loop ()
> #6 0x000000000041c9ce in main ()
> [oracle_at_lsst-oradb04 ~]$
>
>
> On Tue, Jan 28, 2020 at 4:30 PM Chris Stephens < cstephens16_at_gmail.com<mailto:cstephens16_at_gmail.com>> wrote:
> > ok. i've had the following running since lunchtime and will let it run through the night in hopes that issue occurs again:
> >
> > _at_snapper ash,stats,trace 10 999999 lgwr
> >
> > I will run the following script as well:
> >
> > #! /bin/bash
> >
> > while [ 1 ]
> > do
> > echo `date` >> /tmp/pstack.26552
> > pstack 26552 >> /tmp/pstack.26552
> > echo `date` >> /tmp/pstack.26556
> > pstack 26556 >> /tmp/pstack.26556
> > echo `date` >> /tmp/pstack.26560
> > pstack 26560 >> /tmp/pstack.26560
> > sleep 5
> > done
> >
> > based off:
> >
> > [oracle_at_lsst-oradb05 bin]$ ps -ef | egrep "lg.*lsst2db2"
> > oracle 26552 1 0 Jan17 ? 00:06:14 ora_lgwr_lsst2db2
> > oracle 26556 1 0 Jan17 ? 00:00:13 ora_lg00_lsst2db2
> > oracle 26560 1 0 Jan17 ? 00:00:12 ora_lg01_lsst2db2
> >
> > All sessions for this workload are connecting to service w/ tracing enabled so we'll have trace data as well.
> >
> > I will (hopefully) have updates in the morning.
Please consider the environment before printing this e-mail. Bitte denken Sie an die Umwelt, bevor Sie dieses E-Mail drucken.
<html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> <style type="text/css">p { font-family: Arial;font-size:9pt }</style> </head> <body> <p> <br>Important Notice</br> <br />
This message is intended only for the individual named. It may contain confidential or privileged information. If you are not the named addressee you should in particular not disseminate, distribute, modify or copy this e-mail. Please notify the sender immediately by e-mail, if you have received this message by mistake and delete it from your system.<br /> Without prejudice to any contractual agreements between you and us which shall prevail in any case, we take it as your authorization to correspond with you by e-mail if you send us messages by e-mail. However, we reserve the right not to execute orders and instructions transmitted by e-mail at any time and without further explanation.<br /> E-mail transmission may not be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also processing of incoming e-mails cannot be guaranteed. All liability of Vontobel Holding Ltd. and any of its affiliates (hereinafter collectively referred to as "Vontobel Group") for any damages resulting from e-mail use is excluded. You are advised that urgent and time sensitive messages should not be sent by e-mail and if verification is required please request a printed version.</br> Please note that all e-mail communications to and from the Vontobel Group are subject to electronic storage and review by Vontobel Group. Unless stated to the contrary and without prejudice to any contractual agreements between you and Vontobel Group which shall prevail in any case, e-mail-communication is for informational purposes only and is not intended as an offer or solicitation for the purchase or sale of any financial instrument or as an official confirmation of any transaction.<br /> The legal basis for the processing of your personal data is the legitimate interest to develop a commercial relationship with you, as well as your consent to forward you commercial communications. You can exercise, at any time and under the terms established under current regulation, your rights. If you prefer not to receive any further communications, please contact your client relationship manager if you are a client of Vontobel Group or notify the sender. Please note for an exact reference to the affected group entity the corporate e-mail signature. For further information about data privacy at Vontobel Group please consult <a href="https://www.vontobel.com">www.vontobel.com</a>.<br />
</p> </body> </html>
-- http://www.freelists.org/webpage/oracle-lReceived on Wed Jan 29 2020 - 15:59:48 CET