Re: Oracle 12.1.0.2, RHEL 7 and XFS issue
Date: Thu, 30 Jul 2015 22:29:13 +0200 (CEST)
Message-ID: <1709291619.128220.1438288153314.JavaMail.open-xchange_at_app03.ox.hosteurope.de>
Hi Uwe,
> Do you know of any issues with XFS on Linux 7 with direct I/O?
I am not aware of anything special about XFS and OEL 7 in this case. FYI, I recently ran some benchmarks with XFS on OEL 6.6 (with the help of SLOB) at a client site and got ASM-raw-device-like performance.
> Do you have any suggestions how to further track down the issue? E.g., how could I prove there's something wrong with the O_DIRECT calls?
- You can drill into the I/O histogram at microsecond granularity with Oracle 12.1.0.2. Luca Canali has already published a script for this ( http://db-blog.web.cern.ch/blog/luca-canali/2015-06-event-histogram-metric-and-oracle-12c ).
- I can't see the I/O reference values from your old 11.1 environment. What were the response times there?
- Did you use the same number of LUNs (keyword: disk queues) in both environments?
- You currently don't know where the "time" is spent, so you have to drill down the I/O stack (System Call Interface -> Virtual File System -> Block Layer -> SCSI Layer -> Device Driver). You can use blktrace to dig into the block layer, compare the response times and check whether the time difference originates above, in, or below the block layer. Frits Hoogland has written a blog post about this ( https://fritshoogland.wordpress.com/2014/11/28/physical-io-on-linux/ ). If the time is lost in the VFS layer, the XFS stats provide further information ( https://www.kernel.org/doc/Documentation/filesystems/xfs.txt - /proc/fs/xfs/stat ).
- Compare the measured response times with the storage capabilities, e.g. what response time should be expected for an 8 KB request (assuming an 8 KB block size) from the storage perspective.
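Regarding your question about proving something is wrong with the O_DIRECT calls: one direct way is to read the open flags of a datafile file descriptor from /proc. Here is a minimal sketch; the PID, fd number and example flag values are placeholders/hypothetical, and the O_DIRECT value of 040000 (octal) applies to x86 Linux (other architectures differ):

```shell
#!/bin/bash
# Sketch: test whether a file descriptor of an Oracle server process was
# really opened with O_DIRECT, based on the "flags:" line in
# /proc/<pid>/fdinfo/<fd>. O_DIRECT = 040000 octal on x86 Linux.

check_odirect() {
  local flags_octal=$1               # the "flags:" value from fdinfo (octal)
  local o_direct=$(( 8#40000 ))      # 16384 decimal
  if (( 8#$flags_octal & o_direct )); then
    echo "O_DIRECT"
  else
    echo "buffered"
  fi
}

# Real usage: find the datafile fd with "ls -l /proc/<pid>/fd", then:
#   awk '/^flags:/ {print $2}' /proc/<pid>/fdinfo/<fd>
# and feed the result to check_odirect. Hypothetical example values:
check_odirect 02140002    # flag word that includes the O_DIRECT bit
check_odirect 02100002    # flag word without the O_DIRECT bit
```

If the datafile fds show the O_DIRECT bit with filesystemio_options=SETALL, the open() side is fine and the time must be lost further down the stack.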
Hope this helps as a start.
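As a crude sanity check, you can also collapse the histogram excerpts below into a single average-latency figure per run. This sketch uses bucket midpoints I picked myself (an assumption, not statspack's method), and the <=1s bucket midpoint of 516 ms dominates the result whenever that bucket is non-empty, so treat the numbers as rough comparisons only:

```shell
#!/bin/bash
# Rough sketch: turn a statspack wait-event histogram row into one
# average-latency estimate via bucket midpoints (crude approximation).
est_avg_ms() {
  # Arguments: the seven percentages for <1ms <2ms <4ms <8ms <16ms <32ms <=1s
  echo "$@" | awk '{
    split("0.5 1.5 3 6 12 24 516", mid, " ")   # assumed midpoints in ms
    sum = 0
    for (i = 1; i <= 7; i++) sum += ($i / 100) * mid[i]
    printf "%.1f\n", sum
  }'
}

# "db file sequential read" percentages from the two runs quoted below:
est_avg_ms 83.8 .7 .8 3.9 6.1 2.8 1.9      # filesystemio_options=NONE
est_avg_ms 47.1 1.6 3.7 19.0 21.9 4.5 2.3  # filesystemio_options=SETALL
```

Even this rough estimate shows the SETALL run paying several extra milliseconds per read on average, which is what your batch runtimes suggest.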
P.S.: You mentioned a "new VM" - what kind of VM do you use? There could be nasty side effects as well, if not configured / tested out correctly.
Best Regards
Stefan Koehler
Freelance Oracle performance consultant and researcher
Homepage: http://www.soocs.de
Twitter: _at_OracleSK
> Uwe Küchler <uwe_at_kuechler.org> hat am 30. Juli 2015 um 21:14 geschrieben:
>
>
> Dear fellows of the Oracle,
>
> From Red Hat and Oracle Linux 7 onwards, XFS is the default file system of
> the OS.
> At a customer site, XFS was already the preferred file system, so the
> customer chose to stick to it for a new VM with OL7 and Oracle 12.1.0.2.
>
> But while testing the migrated database against the old one, most of the
> batch jobs slowed down to at least twice the run time of the 11.1
> environment.
>
> Both statspack reports showed clearly that "db file sequential read" was
> by far the main wait event.
> Top SQL and their explain plans did not differ between the environments.
>
> Research took a while, but to get to the point: It boiled down to the I/O
> response times, as shown in the wait event histogram excerpts below:
>
> With
> - 24 GiB RAM
> - 5 GiB sga_target
> - Buffer Cache: 4,592M
> - "filesystemio_options=NONE":
> Event                      Total Waits  <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s
> -------------------------  -----------  ----  ----  ----  ----  ----  ----  ----  ----
> db file scattered read             836  89.7   1.6    .8   1.8   3.1   1.9   1.1
> db file sequential read           121K  83.8    .7    .8   3.9   6.1   2.8   1.9    .0
>
> 80-90% of those waits < 1ms?
> This can most certainly be attributed to file system caching (no Flash
> Cache, SSD or other smart stuff in place here).
>
> - 24 GiB RAM
> - 8 GiB sga_target
> - Buffer Cache: 5,856M
> - With "filesystemio_options=SETALL":
> Event                      Total Waits  <1ms  <2ms  <4ms  <8ms <16ms <32ms  <=1s   >1s
> -------------------------  -----------  ----  ----  ----  ----  ----  ----  ----  ----
> db file scattered read             63K  42.6   1.9   3.2  19.9  26.0   4.0   2.4
> db file sequential read           208K  47.1   1.6   3.7  19.0  21.9   4.5   2.3    .0
>
> In other batch job runs, the share of waits < 1 ms was even lower (around
> 30%).
> As you can see, I made the SGA / the buffer cache bigger in the 12c
> environment, to allow for more buffering within the SGA.
>
> Of course, I checked MOS for any known issues with direct I/O on XFS in
> this combination, but haven't found anything so far. Just the usual
> recommendations to avoid double buffering, and the confirmation that
> XFS is capable of doing direct I/O.
>
> And now for my
> QUESTION(s):
> ============
> Do you know of any issues with XFS on Linux 7 with direct I/O?
> Do you have any suggestions how to further track down the issue? E.g., how
> could I prove there's something wrong with the O_DIRECT calls?
>
> Thanks for your time.
> Uwe
>
>
> P.S.: On (a hundred-and-) second thought I could try to enlarge the buffer
> cache even more, as there's enough RAM left. At least for the tests.
>
> P.P.S.: In case Kevin Closson reads this: I am eagerly awaiting your
> upcoming blog article on XFS!
>
> ---
> http://oraculix.com
-- http://www.freelists.org/webpage/oracle-l
Received on Thu Jul 30 2015 - 22:29:13 CEST