intermittent long "log file sync" waits

From: Chris Stephens <cstephens16_at_gmail.com>
Date: Tue, 28 Jan 2020 10:09:24 -0600
Message-ID: <CAEFL0szEBH0P-hqb39Pmjw-pdJ60OZ+RUuFoqCXsC7+=iPEFzQ_at_mail.gmail.com>



3-node Oracle 19.3 RAC
Centos 7

We have a SQLAlchemy/Python based application workload that is running the exact same steps with widely varying response times which appear to be related to varying "log file sync" wait times.

Here is a profile of a "fast" run:

CALL-NAME                        DURATION       %   CALLS      MEAN
MIN        MAX
------------------------------  ---------  ------  ------  --------

-------- ---------
SQL*Net message from client 53.197782 91.8% 10,092 0.005271 0.000177 28.568493 EXEC 3.759177 6.5% 9,816 0.000383 0.000000 0.239592 row cache lock 0.233153 0.4% 541 0.000431 0.000113 0.000941 PARSE 0.140399 0.2% 4,867 0.000029 0.000000 0.006620 DLM cross inst call completion 0.137330 0.2% 956 0.000144 0.000004 0.000505 library cache lock 0.100171 0.2% 215 0.000466 0.000151 0.002133 library cache pin 0.079729 0.1% 216 0.000369 0.000056 0.000710 FETCH 0.058253 0.1% 1,062 0.000055 0.000000 0.004148 log file sync 0.048217 0.1% 149 0.000324 0.000259 0.000505 CLOSE 0.045416 0.1% 4,929 0.000009 0.000000 0.000073 20 others 0.135624 0.2% 11,854 0.000011
 0.000000 0.001700
------------------------------  ---------  ------  ------  --------

-------- ---------
TOTAL (30) 57.935251 100.0% 44,697 0.001296
 0.000000 28.568493

Here is a profile of a "slow" run:

CALL-NAME                         DURATION       %   CALLS      MEAN
MIN         MAX
------------------------------  ----------  ------  ------  --------

-------- ----------
SQL*Net message from client 131.186118 61.0% 10,092 0.012999 0.000212 106.789360 log file sync 79.291166 36.8% 150 0.528608 0.000264 2.986575 EXEC 3.728402 1.7% 9,816 0.000380 0.000000 0.221403 row cache lock 0.248868 0.1% 542 0.000459 0.000111 0.001036 PARSE 0.164267 0.1% 4,867 0.000034 0.000000 0.004652 DLM cross inst call completion 0.146981 0.1% 957 0.000154 0.000005 0.001188 library cache lock 0.104354 0.0% 218 0.000479 0.000160 0.000728 library cache pin 0.082504 0.0% 202 0.000408 0.000157 0.000672 FETCH 0.056687 0.0% 1,062 0.000053 0.000000 0.003969 CLOSE 0.043590 0.0% 4,929 0.000009 0.000000 0.000180 20 others 0.142044 0.1% 11,866 0.000012
 0.000000 0.001792
------------------------------  ----------  ------  ------  --------

-------- ----------
TOTAL (30) 215.194981 100.0% 44,701 0.004814
 0.000000 106.789360

looking at even histogram for that event:

SQL> _at_evh "log file sync"

       EVH_EVENT    EVH_WAIT_TIME_MILLI    WAIT_COUNT    EVH_EST_TIME
                LAST_UPDATE_TIME
________________ ______________________ _____________ _______________
______________________________________
log file sync                < 1               200051         100.026
27-JAN-20 11.39.57.344734 PM -06:00
log file sync                < 2                  165           0.248
28-JAN-20 12.18.10.429089 AM -06:00
log file sync                < 4                  150            0.45
27-JAN-20 11.18.31.158102 PM -06:00
log file sync                < 8                  199           1.194
27-JAN-20 11.19.14.209947 PM -06:00
log file sync               < 16                  253           3.036
28-JAN-20 08.03.17.851328 AM -06:00
log file sync               < 32                  472          11.328
27-JAN-20 11.20.22.746033 PM -06:00
log file sync               < 64                  728          34.944
28-JAN-20 01.13.37.364541 AM -06:00
log file sync              < 128                  691          66.336
27-JAN-20 11.31.37.400504 PM -06:00
log file sync              < 256                  414          79.488
28-JAN-20 12.18.10.423987 AM -06:00
log file sync              < 512                  405          155.52
28-JAN-20 03.27.50.540383 AM -06:00
log file sync             < 1024                  459         352.512
27-JAN-20 11.35.14.378363 PM -06:00
log file sync             < 2048                  482         740.352
28-JAN-20 01.18.20.556248 AM -06:00
log file sync             < 4096                  576        1769.472
27-JAN-20 11.21.05.084998 PM -06:00
log file sync             < 8192                   89         546.816
27-JAN-20 11.57.36.436460 AM -06:00
log file sync            < 16384                   60          737.28
25-JAN-20 07.48.31.460408 AM -06:00
log file sync            < 32768                   39         958.464
27-JAN-20 11.59.09.869286 AM -06:00
log file sync            < 65536                   27        1327.104
25-JAN-20 09.49.13.856563 AM -06:00 The weird thing is that I don't see corresponding log I/O waits (awaits) in iostat output.

I have a ticket open w/ oracle but does anyone have any suggestions to discover root cause and/or solution?

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jan 28 2020 - 17:09:24 CET

Original text of this message