Re: intermittent long "log file sync" waits

From: Stefan Koehler <contact_at_soocs.de>
Date: Wed, 29 Jan 2020 09:13:36 +0100 (CET)
Message-ID: <578860919.117416.1580285616352_at_ox.hosteurope.de>


Hello Chris,
as I already mentioned - please use pSnapper ( https://github.com/tanelpoder/psnapper ) by Tanel Poder.

pSnapper samples /proc file system and does not suspend the process like pstack (as it is just a wrapper around gdb which uses ptrace() under the hood). Suspending a process can screw it up :)

In addition pSnapper provides more useful information (e.g. state, C function that runs in kernel space for a specific syscall, etc.).

Best Regards
Stefan Koehler

Independent Oracle performance consultant and researcher Website: http://www.soocs.de
Twitter: _at_OracleSK

> Chris Stephens <cstephens16_at_gmail.com> hat am 28. Januar 2020 um 23:49 geschrieben:
>
> was just getting ready to sign off and noticed the archivelog backup scheduled to run every hour seems to be stuck and has been for at least 10 mins:
>  
> [oracle_at_lsst-oradb04 ~]$ ps -ef | grep oracle_backup.sh
> oracle   13369 31167  0 16:43 pts/0    00:00:00 grep --color=auto oracle_backup.sh
> oracle   14649 14645  0 16:00 ?        00:00:00 /bin/bash /home/oracle/scripts/rman/oracle_backup.sh -d lsst2db -s archivelog.rman -c lsst2db_rman -e cs2018_at_ncsa.illinois.edu -r iddsdba_rman
> [oracle_at_lsst-oradb04 ~]$ pstack 15649
> Process 15649 not found.
> [oracle_at_lsst-oradb04 ~]$ pstack 14649
> #0  0x00007f9a6537e41c in waitpid () from /lib64/libc.so.6
> #1  0x0000000000440b84 in waitchld.isra.10 ()
> #2  0x0000000000441e3c in wait_for ()
> #3  0x0000000000433b0e in execute_command_internal ()
> #4  0x0000000000433d2e in execute_command ()
> #5  0x000000000041e365 in reader_loop ()
> #6  0x000000000041c9ce in main ()
> [oracle_at_lsst-oradb04 ~]$ pstack 14649
> #0  0x00007f9a6537e41c in waitpid () from /lib64/libc.so.6
> #1  0x0000000000440b84 in waitchld.isra.10 ()
> #2  0x0000000000441e3c in wait_for ()
> #3  0x0000000000433b0e in execute_command_internal ()
> #4  0x0000000000433d2e in execute_command ()
> #5  0x000000000041e365 in reader_loop ()
> #6  0x000000000041c9ce in main ()
> [oracle_at_lsst-oradb04 ~]$ pstack 14649
> #0  0x00007f9a6537e41c in waitpid () from /lib64/libc.so.6
> #1  0x0000000000440b84 in waitchld.isra.10 ()
> #2  0x0000000000441e3c in wait_for ()
> #3  0x0000000000433b0e in execute_command_internal ()
> #4  0x0000000000433d2e in execute_command ()
> #5  0x000000000041e365 in reader_loop ()
> #6  0x000000000041c9ce in main ()
> [oracle_at_lsst-oradb04 ~]$ pstack 14649
> #0  0x00007f9a6537e41c in waitpid () from /lib64/libc.so.6
> #1  0x0000000000440b84 in waitchld.isra.10 ()
> #2  0x0000000000441e3c in wait_for ()
> #3  0x0000000000433b0e in execute_command_internal ()
> #4  0x0000000000433d2e in execute_command ()
> #5  0x000000000041e365 in reader_loop ()
> #6  0x000000000041c9ce in main ()
> [oracle_at_lsst-oradb04 ~]$
>
>
> On Tue, Jan 28, 2020 at 4:30 PM Chris Stephens < cstephens16_at_gmail.com> wrote:
> > ok. i've had the following running since lunchtime and will let it run through the night in hopes that issue occurs again:
> >  
> > _at_snapper ash,stats,trace 10 999999 lgwr
> >  
> > I will run the following script as well:
> >  
> > #! /bin/bash
> >
> > while [ 1 ]
> > do
> >     echo `date` >> /tmp/pstack.26552
> >     pstack 26552 >> /tmp/pstack.26552
> >     echo `date` >> /tmp/pstack.26556
> >     pstack 26556 >> /tmp/pstack.26556
> >     echo `date` >> /tmp/pstack.26560
> >     pstack 26560 >> /tmp/pstack.26560
> >     sleep 5
> > done
> >  
> > based off:
> >  
> > [oracle_at_lsst-oradb05 bin]$ ps -ef | egrep "lg.*lsst2db2"
> > oracle   26552     1  0 Jan17 ?        00:06:14 ora_lgwr_lsst2db2
> > oracle   26556     1  0 Jan17 ?        00:00:13 ora_lg00_lsst2db2
> > oracle   26560     1  0 Jan17 ?        00:00:12 ora_lg01_lsst2db2
> >  
> > All sessions for this workload are connecting to service w/ tracing enabled so we'll have trace data as well.
> >  
> > I will (hopefully) have updates in the morning.

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Jan 29 2020 - 09:13:36 CET

Original text of this message