Re: To Swap, or not to Swap

From: Timur Akhmadeev <timur.akhmadeev_at_gmail.com>
Date: Mon, 3 Apr 2023 10:54:55 +0300
Message-ID: <CACGsLCKY8xCjx6UHG-SXLEsPW48iMBsQtcPjbBOfuXLmfmjbrg_at_mail.gmail.com>



Just an example for zero swap from Netflix: https://www.brendangregg.com/Slides/AWSreInvent2017_performance_tuning_EC2.pdf

Usage: - Swappiness is set to zero to disable swapping and favor ditching
> the file system page cache first to free memory. (This tunable doesn’t make
> much difference, as swap devices are usually absent.)
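To see which of the two situations in that quote applies on a given Linux host (swappiness at zero, or no swap device at all), a minimal sketch could read the standard procfs files. The helper names `parse_swaps` and `read_swappiness` are made up for illustration and are not from the slides.

```python
from pathlib import Path

SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")
SWAPS_PATH = Path("/proc/swaps")


def parse_swaps(text: str) -> list[str]:
    """Return the swap device or file names found in /proc/swaps-style text."""
    lines = text.strip().splitlines()
    # The first line is the column header (Filename, Type, Size, Used, Priority).
    return [line.split()[0] for line in lines[1:] if line.strip()]


def read_swappiness(path: Path = SWAPPINESS_PATH) -> int:
    """Return the current vm.swappiness value as an integer."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    # Only touch procfs when it is actually there (i.e. on Linux).
    if SWAPPINESS_PATH.exists() and SWAPS_PATH.exists():
        devices = parse_swaps(SWAPS_PATH.read_text())
        print("vm.swappiness:", read_swappiness())
        print("swap devices:", devices if devices else "none configured")
```

If the device list comes back empty, the swappiness setting matters little, which is the point the parenthetical in the quote makes.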

On Mon, Apr 3, 2023 at 2:13 AM Jared Still <jkstill_at_gmail.com> wrote:

> So, I would like to devise some testing for this, with and without swap.
>
> Suggestions for metrics to track?
>
> There are certain things I would like to track, mostly from an app
> perspective.
>
> But also I would like to see how responsive the system is under severe
> memory pressure, both with and without swap.
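As one possible starting point for OS-side metrics, the sketch below parses swap and fault counters from `/proc/vmstat` and the pressure-stall figures from `/proc/pressure/memory`. The choice of counters and the names `parse_vmstat`, `parse_psi` and `delta` are assumptions for illustration, not something proposed in the thread. The `oom_kill` counter and the pressure file exist only on reasonably recent kernels.

```python
from pathlib import Path

# Counters that seem relevant to a swap / no-swap comparison (an assumption).
VMSTAT_KEYS = ("pswpin", "pswpout", "pgmajfault", "oom_kill")


def parse_vmstat(text: str, keys=VMSTAT_KEYS) -> dict[str, int]:
    """Pick the selected cumulative counters out of /proc/vmstat-style text."""
    out = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in keys:
            out[parts[0]] = int(parts[1])
    return out


def parse_psi(text: str) -> dict[str, dict[str, float]]:
    """Parse /proc/pressure/memory-style text into {'some': {...}, 'full': {...}}."""
    out = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *fields = line.split()
        out[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return out


def delta(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Difference between two counter snapshots, for keys present in both."""
    return {k: after[k] - before[k] for k in after if k in before}


if __name__ == "__main__":
    vmstat, psi = Path("/proc/vmstat"), Path("/proc/pressure/memory")
    if vmstat.exists():
        print(parse_vmstat(vmstat.read_text()))
    if psi.exists():
        print(parse_psi(psi.read_text()))
```

Because the vmstat counters are cumulative, taking a snapshot every few seconds and feeding consecutive snapshots to `delta` gives per-interval rates that can be lined up against application response times.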
>
>
>
> On Sat, Apr 1, 2023 at 06:29 Frits Hoogland <frits.hoogland_at_gmail.com>
> wrote:
>
>> I too keep on coming across systems with no swap. Our YugabyteDB systems
>> are set up with no swap.
>> And my first reaction was identical to most others: what?! No swap?!
>>
>> Since then, my position on swap versus no swap has shifted much more toward
>> seeing the benefit of no swap, and away from being fiercely against it.
>> Of course the only right answer to swap or not is: it depends.
>>
>> The way I see it, is that swap is like a soft pillow. If you are running
>> into memory shortage, swap will soften the landing, and make the system
>> increasingly slower, then bring it to a standstill, and then still kill the
>> system. And therefore the question to ask is: do we want to get into a
>> situation of unpredictable slowness before the OOM kill? The latter is how
>> I look at it now: there have been countless hours spent on trying to make
>> sense of swap, trying to understand and tune swap and swapping, whilst
>> there always is, and must be an actual problem that caused swap. So
>> removing it removes that discussion and gets you more straight into facing
>> the problem.
>>
>> There are two things that I see additionally:
>> - you might argue that swap will only mildly be used (…in your
>> case). I would argue that if you cannot control the server to only take the
>> actual memory, and it swaps, how the hell can you control it to just mildly
>> swap?
>> - many servers perform some swapping whilst memory pressure is never
>> seen. One common reason for that is that buffered IO is treated with equal
>> priority as memory allocations. That means that if you start performing
>> lots of IOs using buffered calls, the OS might, and will, start paging out
>> existing memory allocations that have not recently been touched, such as
>> bootstrap code for an application, because the buffered IO has gotten higher
>> priority. One extremely common case where this happens is with most
>> common backups. (I hope this will be an “aha” moment for lots of people
>> asking why their database server starts to allocate some swap, whilst it
>> never did get overallocated.)
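To check which processes actually had pages swapped out (for instance after a backup that used buffered IO), one option is to read the `VmSwap` line from each `/proc/<pid>/status` file. This is a minimal sketch, and `parse_status` and `top_swappers` are hypothetical helper names. Kernel threads have no `VmSwap` line and are skipped.

```python
import re
from pathlib import Path

_VMSWAP = re.compile(r"^VmSwap:\s+(\d+)\s+kB", re.MULTILINE)
_NAME = re.compile(r"^Name:\s+(.+)$", re.MULTILINE)


def parse_status(status_text: str):
    """Return (process name, swapped-out kB) from /proc/<pid>/status text.

    The swap value is None when the text has no VmSwap line.
    """
    name = _NAME.search(status_text)
    swap = _VMSWAP.search(status_text)
    return (
        name.group(1).strip() if name else "?",
        int(swap.group(1)) if swap else None,
    )


def top_swappers(proc_root: str = "/proc", limit: int = 10):
    """Return up to `limit` (swap_kb, pid, name) tuples, largest swap first."""
    rows = []
    for status in Path(proc_root).glob("[0-9]*/status"):
        try:
            name, swap_kb = parse_status(status.read_text())
        except OSError:
            # The process exited, or we are not allowed to read it.
            continue
        if swap_kb:
            rows.append((swap_kb, int(status.parent.name), name))
    return sorted(rows, reverse=True)[:limit]


if __name__ == "__main__":
    for swap_kb, pid, name in top_swappers():
        print(f"{swap_kb:>10} kB  pid {pid:<8} {name}")
```

Running this before and after a backup window would show whether rarely touched pages of long-running processes were pushed out even though memory was never overallocated.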
>>
>>
>> *Frits Hoogland*
>>
>>
>>
>>
>> On 1 Apr 2023, at 15:10, Mark W. Farnham <mwf_at_rsiz.com> wrote:
>>
>> “Are there no longer any scenarios where the swapfile allows the system
>> to recover, without failing or hanging?”
>>
>> First, good ask Jared, excellent analysis Tim, from my viewpoint.
>>
>> I would slightly alter Tim’s question:
>>
>> “For the goal of the server in question, are there any scenarios where a
>> swapfile allows the system to recover without failing or hanging?”
>>
>> For a server with a primary goal of providing the support of one or more
>> instances of Oracle which are allocated within the bounds of the server, I
>> can imagine some “clients” of the database services being allowed to run
>> directly on the database server to eliminate all the latencies that occur
>> between servers.
>> With a very fast swapfile AND a decently implemented sniping monitor, I
>> further imagine the database services continue to deliver within the
>> planned service quality while the rogue client is paused or killed with
>> data and logs for analysis. (Hint, if the rogue client is holding a system
>> lock or an application lock that needs to be shared, I’m thinking pausing
>> is not an option.)
>>
>> There are certainly other scenarios where fail as fast as possible and
>> recover is better for the goal than even trying to recover.
>>
>> So I think Clay is also right that “it depends,” begging the question of
>> what is the best solution for someone supporting a fleet of generically
>> configured servers. Frankly, it would never have occurred to me to **NOT**
>> have swap on the popular OS copied from UNIX, mostly because I don’t know
>> whether lack of a certain amount of swap still tosses warnings that freak
>> out customers (or actually fail the install) when installing my favorite
>> RDBMS.
>>
>> Seymour Cray, of course, used to say things about only implementing
>> virtual memory if you want things to be slower than they need to be.
>>
>> For database services there is probably room for an OS built for a direct
>> addressing cpu/memory complex. In that case programs would only start if
>> real space declared to be needed is available and addresses are resolved to
>> real addresses at program load time. I’m not even sure whether modern chip
>> technology could be faster with direct addressing than with virtual
>> addressing.
>>
>> I suppose quantum computing has problems if you try to use virtual memory
>> and/or swapping….
>>
>> mwf
>>
>>
>>
>> *From:* oracle-l-bounce_at_freelists.org [mailto:
>> oracle-l-bounce_at_freelists.org] *On Behalf Of *Tim Gorman
>> *Sent:* Thursday, March 30, 2023 8:25 PM
>> *To:* jkstill_at_gmail.com; Oracle-L Freelists
>> *Subject:* Re: To Swap, or not to Swap
>>
>>
>> Jared,
>>
>> You've made a good point with your testing. In essence, *fail fast*.
>> If it is just *fail fast* versus *fail slow*, then of course we all
>> choose to *fail fast* and then recover.
>>
>> The only question that comes to my mind is whether the presence of a
>> swapfile always means slow failure.
>>
>> Are there no longer any scenarios where the swapfile allows the system to
>> recover, without failing or hanging?
>>
>> For example, in Azure, VMs can use remote storage (a.k.a. OsDisk) for the
>> swapfile, or VMs can locate the swapfile on optional direct-attached SSD
>> storage that is considered "temporary" or ephemeral, because when the VM is
>> stopped and deallocated, the direct-attached storage has to be erased,
>> because another VM may be allocated to it in future. It is not quality of
>> storage that makes it "ephemeral", just the use-case. Anyway, the OsDisk
>> has I/O latency averaging 0.70 ms for both reads and writes, but the
>> so-called "ephemeral" disk provides less than 0.05 ms I/O latency, which is
>> about 14x faster.
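The 14x figure follows directly from the two latencies quoted above. A small worked check, using microseconds to keep the arithmetic exact (the constant and function names are illustrative only):

```python
# Latencies as quoted in the message, converted to microseconds.
OSDISK_LATENCY_US = 700     # 0.70 ms average on the remote OsDisk
EPHEMERAL_LATENCY_US = 50   # "less than 0.05 ms", so this is an upper bound


def speedup(slow_us: int, fast_us: int) -> float:
    """How many times lower the fast latency is than the slow one."""
    return slow_us / fast_us


if __name__ == "__main__":
    ratio = speedup(OSDISK_LATENCY_US, EPHEMERAL_LATENCY_US)
    print(f"ephemeral disk is at least {ratio:.0f}x faster")
```

Since the ephemeral figure is stated as "less than 0.05 ms", 14x is a lower bound on the difference rather than an exact ratio.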
>>
>> Clearly the performance of the storage on which the swapfile resides is
>> going to make a difference in its usefulness. If your testing involved
>> slow storage, then I can see where the machine would take 7-8 mins to
>> fail. I'm not trying to denigrate the resources you used, but I'm trying
>> to ask whether a swapfile on fast storage could perhaps be more
>> helpful, even in extreme situations.
>>
>> In other words, shouldn't we ensure that a swapfile is fast, as well as
>> big enough? Wouldn't more performant storage allow the swapfile to recover
>> the situation?
>>
>> Thanks so much for the thought exercise!
>>
>> -Tim
>>
>> On 3/30/2023 10:46 AM, Jared Still wrote:
>>
>> I was recently asked this same question by a colleague.
>>
>> He had been asked by a client, with a fairly well regarded sysadmin team.
>>
>> They wanted to eliminate swap: here's why.
>>
>> If a process is consuming memory at a prodigious rate, then the OOM (out
>> of memory) killer is going to catch up to it and kill it eventually.
>>
>> Their position was that with a swap partition, this process was prolonged
>> far too long.
>>
>> Without swap, the process gets killed relatively quickly.
>>
>> With swap, it can take many minutes. The CPU spends so much time managing
>> memory on swap (remember, we are at an OOM condition), which is slow, that
>> the time to kill the process is prolonged to many minutes.
>>
>> At first my position was "what, no swap! we can't do that!"
>>
>> But, I decided to test it a bit.
>>
>> A small physical server, 4 cores and 32G of RAM, is running Oracle 19.3.
>>
>> A swingbench test is running, 10 sessions per core.
>>
>> When I cause an OOM condition with the 16G swap partition enabled, it
>> took the system between 7.5-8 minutes to kill the process.
>>
>> (For the client, the amount of time was 20+ minutes.)
>>
>> And during that time, it was impossible to logon to the server. The CPU
>> was too busy thrashing around in the swap partition.
>>
>> The next step of course is to disable the swap.
>>
>> Same OOM condition caused. Time to resolution is now 7 seconds.
>>
>> There is no swap to manage as if it were RAM.
>>
>> That is quite a big difference.
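To put a number on that difference, the timings reported for this test (7.5 to 8 minutes with swap, 7 seconds without) can be turned into a ratio. The names below are illustrative only.

```python
# Time until the runaway process was killed, as reported for the test server.
WITH_SWAP_SECONDS = (7.5 * 60, 8 * 60)   # 7.5 to 8 minutes, 16G swap enabled
WITHOUT_SWAP_SECONDS = 7                 # swap disabled


def slowdown_range(with_swap=WITH_SWAP_SECONDS, without_swap=WITHOUT_SWAP_SECONDS):
    """Return (low, high): how many times longer the kill took with swap."""
    return tuple(seconds / without_swap for seconds in with_swap)


if __name__ == "__main__":
    low, high = slowdown_range()
    print(f"with swap the kill took roughly {low:.0f}x to {high:.0f}x longer")
```

On these figures, the kill took somewhere between about 64 and 69 times longer with swap enabled.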
>>
>> Of course I wondered 'what about paging in memory for new processes?', as
>> that often uses a page in swap.
>>
>> Without swap, it just takes place in memory.
>>
>> Swap is also a landing place for some pages used to initialize processes,
>> as they can only be used once.
>>
>> This is a minimal amount, and can just be left in memory.
>>
>> If one really wants to conserve, there is a thing called ZRAM (compressed
>> memory) where those pages can be parked, instead of swap.
>>
>> So, does anyone see any other need for a swap partition?
>>
>> It seems to have outlived its usefulness.
>>
>> Jared Still
>> Certifiable Oracle DBA and Part Time Perl Evangelist
>> Principal Consultant at Pythian
>> Oracle ACE Alumni
>> Pythian Blog http://www.pythian.com/blog/author/still/
>> Github: https://github.com/jkstill
>>
>> Personality: http://www.personalitypage.com/INTJ.html
>>
>>
>>
>> On Thu, Mar 30, 2023 at 9:24 AM Jared Still <jkstill_at_gmail.com> wrote:
>>
>> That is the question.
>>
>> I am curious about current thoughts on having or not having a swap
>> partition on Linux based Oracle servers.
>>
>> Let's assume typical production standard servers with a reasonable amount
>> of RAM, say 256G or more.
>>
>> I have some thoughts on this myself, but would like to see others'
>> thoughts on this.
>>
>>
>> Jared Still
>>
>>
>> --
> Jared Still
>
>
>

-- 
Regards
Timur Akhmadeev

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Apr 03 2023 - 09:54:55 CEST
