Re: To Swap, or not to Swap
Date: Thu, 30 Mar 2023 22:56:18 -0700
Message-ID: <CAORjz=Mj7=cE_reHFGvDDrYUrwmKKWcKRp9=Pn50+=u9QNx1xA_at_mail.gmail.com>
I would respectfully mention that a 16G swap partition is unlikely to be the salvation of a server with 1T of RAM. :)
On Thu, Mar 30, 2023 at 19:35 Clay Jackson (cjackson) <Clay.Jackson_at_quest.com> wrote:
> Yes, thanks for the exercise!
>
>
>
> Tim - very good points, as always. Based on all of this, I would assert
> that the answer (to the question “To Swap, or not to Swap”) is really, “It
> depends”.
>
>
>
> Tim makes the point that “fail fast and recover” is better than “fail slow
> and recover”. I can definitely agree with that.
>
>
>
> But what if we had an option, “Gracefully degrade, notify and recover”?
> I would think degradation, for a short period of time, is better than a
> failure. It seems that Tim’s fast storage, along with monitoring of swap
> usage, would allow for that.
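On the "monitoring of swap usage" half of that idea, here is a minimal sketch, assuming a Linux host that exposes /proc/meminfo with the usual SwapTotal and SwapFree fields. The function names and the 25% warning threshold are hypothetical choices for illustration, not anything from this thread; a real setup would hand the result to whatever alerting is already in place.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most values are in KiB; the unit suffix is ignored.
    """
    values: dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_used_fraction(meminfo: dict[str, int]) -> float:
    """Fraction of swap in use; 0.0 when no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - meminfo.get("SwapFree", 0)) / total


def check_swap(text: str, warn_at: float = 0.25) -> str:
    """Return a one-line status; warn_at is a hypothetical threshold."""
    used = swap_used_fraction(parse_meminfo(text))
    level = "WARN" if used >= warn_at else "OK"
    return f"{level}: swap {used:.0%} used"


if __name__ == "__main__":
    meminfo_path = Path("/proc/meminfo")
    if meminfo_path.exists():
        print(check_swap(meminfo_path.read_text()))
```

Run from cron or a monitoring agent, something like this would give the "notify" step a chance to fire while the system is still merely degraded.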
>
>
>
> My father (an Episcopal priest; Anglican, for those not in the US) used to
> say, “In the end, all ethics are situational”. If I can paraphrase: “In
> the end, the ‘correct’ solution to a (potential) failure always depends on
> the situation”.
>
>
> Clay Jackson
>
>
>
>
>
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> *On
> Behalf Of *Tim Gorman
> *Sent:* Thursday, March 30, 2023 5:25 PM
> *To:* jkstill_at_gmail.com; Oracle-L Freelists <oracle-l_at_freelists.org>
> *Subject:* Re: To Swap, or not to Swap
>
>
>
> Jared,
>
>
>
> You've made a good point with your testing. In essence, *fail fast*. If
> it is just *fail fast* versus *fail slow*, then of course we all choose
> to *fail fast* and then recover.
>
> The only question that comes to my mind is whether the presence of a
> swapfile always means slow failure.
>
> Are there no longer any scenarios where the swapfile allows the system to
> recover, without failing or hanging?
>
> For example, in Azure, VMs can place the swapfile on remote storage (a.k.a.
> OsDisk) or on optional direct-attached SSD storage. The latter is
> considered "temporary" or ephemeral because, when the VM is stopped and
> deallocated, the direct-attached storage has to be erased, since another VM
> may be allocated to it in future. It is not the quality of the storage that
> makes it "ephemeral", just the use-case. Anyway, the OsDisk has I/O latency
> averaging 0.70 ms for both reads and writes, but the so-called "ephemeral"
> disk provides less than 0.05 ms I/O latency, which is about 14x faster.
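To put those two latency figures side by side, here is a rough back-of-the-envelope sketch. It assumes 4 KiB pages and a deliberately pessimistic model in which pages are read back one at a time with no read-ahead or parallel I/O; both assumptions are mine, not Tim's, and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # bytes; a common Linux default, assumed here


def latency_ratio(slow_ms: float, fast_ms: float) -> float:
    """How many times lower the fast latency is than the slow one."""
    return slow_ms / fast_ms


def serial_page_in_seconds(gib: float, latency_ms: float,
                           page_size: int = PAGE_SIZE) -> float:
    """Seconds to read `gib` GiB back from swap, one page per I/O.

    Worst-case model: every page costs one full I/O latency.
    """
    pages = gib * 1024**3 / page_size
    return pages * latency_ms / 1000.0


if __name__ == "__main__":
    print(f"ratio: {latency_ratio(0.70, 0.05):.1f}x")
    for name, ms in (("OsDisk", 0.70), ("ephemeral", 0.05)):
        print(f"{name}: {serial_page_in_seconds(1, ms):.1f} s per GiB")
```

Under that model, paging 1 GiB back in costs roughly 184 seconds at 0.70 ms versus roughly 13 seconds at 0.05 ms. Real systems batch and overlap I/O, so treat these as illustrative upper bounds, not measurements.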
>
> Clearly the performance of the storage on which the swapfile resides is
> going to make a difference in its usefulness. If your testing involved
> slow storage, then I can see where the machine would take 7-8 mins to
> fail. I'm not trying to denigrate the resources you used; I'm asking
> whether a swapfile on fast storage could be more helpful, even in extreme
> situations.
>
> In other words, shouldn't we ensure that a swapfile is fast, as well as
> big enough? Wouldn't more performant storage allow the swapfile to recover
> the situation?
>
> Thanks so much for the thought exercise!
>
> -Tim
>
> On 3/30/2023 10:46 AM, Jared Still wrote:
>
> A colleague recently asked me this same question.
>
>
>
> He had been asked by a client with a fairly well-regarded sysadmin team.
>
>
>
> They wanted to eliminate swap: here's why.
>
>
>
> If a process is consuming memory at a prodigious rate, then the OOM (out
> of memory) killer is going to catch up to it and kill it eventually.
>
>
>
> Their position was that with a swap partition, this process was prolonged
> far too long.
>
>
>
> Without swap, the process gets killed relatively quickly.
>
>
>
> With swap, it can take many minutes. The CPU spends so much time managing
> memory on swap (remember, we are at an OOM condition), which is slow, that
> the time to kill the process is prolonged to many minutes.
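One way to see this kind of thrashing before the box becomes unreachable is the kernel's pressure stall information (PSI). The following is a minimal sketch, assuming a kernel with PSI enabled so that /proc/pressure/memory exists; the 20% limit on the "full" avg10 figure is a hypothetical threshold, and the function names are mine.

```python
from pathlib import Path


def parse_pressure(text: str) -> dict[str, dict[str, float]]:
    """Parse PSI text such as:

        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
        full avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result: dict[str, dict[str, float]] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, pairs = parts[0], parts[1:]
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


def is_thrashing(text: str, full_avg10_limit: float = 20.0) -> bool:
    """True when all non-idle tasks were stalled on memory for at least
    full_avg10_limit percent of the last 10 seconds (hypothetical limit)."""
    full = parse_pressure(text).get("full", {})
    return full.get("avg10", 0.0) >= full_avg10_limit


if __name__ == "__main__":
    psi_path = Path("/proc/pressure/memory")
    if psi_path.exists():
        print("thrashing" if is_thrashing(psi_path.read_text()) else "ok")
```

Whether a check like this can still run and report once the system is deep into swap is a separate question; it is most useful as an early warning.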
>
>
>
> At first my position was "what, no swap! we can't do that!"
>
>
>
> But, I decided to test it a bit.
>
>
>
> A small physical server, 4 cores and 32G of RAM, is running Oracle 19.3.
>
>
>
> A swingbench test is running, 10 sessions per core.
>
>
>
> When I caused an OOM condition with the 16G swap partition enabled, it took
> the system between 7.5 and 8 minutes to kill the process.
>
>
>
> (For the client, the amount of time was 20+ minutes.)
>
>
>
> And during that time, it was impossible to log on to the server. The CPU
> was too busy thrashing around in the swap partition.
>
>
>
> The next step of course is to disable the swap.
>
>
>
> Same OOM condition caused. Time to resolution is now 7 seconds.
>
>
>
> There is no swap to manage as if it were RAM.
>
>
>
> That is quite a big difference.
>
>
>
> Of course I wondered 'what about paging in memory for new processes?', as
> that often uses a page in swap.
>
>
>
> Without swap, it just takes place in memory.
>
>
>
> Swap is also a landing place for some pages used to initialize processes,
> as they can only be used once.
>
>
>
> This is a minimal amount, and can just be left in memory.
>
>
>
> If one really wants to conserve, there is a thing called ZRAM (compressed
> memory) where those pages can be parked, instead of swap.
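For anyone trying zram, here is a minimal sketch of checking how well pages are compressing. It assumes a zram device named zram0 that exposes /sys/block/zram0/mm_stat, whose first two whitespace-separated fields are the original and compressed data sizes in bytes; the device name and the function name are assumptions for illustration.

```python
from pathlib import Path
from typing import Optional


def zram_compression_ratio(mm_stat_text: str) -> Optional[float]:
    """Return original/compressed size from mm_stat text, or None.

    Only the first two fields (orig_data_size, compr_data_size) are
    used; None is returned when the device holds no data yet.
    """
    fields = mm_stat_text.split()
    if len(fields) < 2:
        return None
    orig_bytes, compr_bytes = int(fields[0]), int(fields[1])
    if compr_bytes == 0:
        return None
    return orig_bytes / compr_bytes


if __name__ == "__main__":
    # "zram0" is an assumed device name; adjust for the actual host.
    mm_stat = Path("/sys/block/zram0/mm_stat")
    if mm_stat.exists():
        ratio = zram_compression_ratio(mm_stat.read_text())
        print("no data yet" if ratio is None else f"ratio {ratio:.2f}:1")
```

A ratio well above 1 suggests the parked pages are cheap to keep in compressed memory; a ratio near 1 suggests zram is not buying much for that workload.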
>
>
>
> So, does anyone see any other need for a swap partition?
>
>
>
> It seems to have outlived its usefulness.
>
>
>
> Jared Still
> Certifiable Oracle DBA and Part Time Perl Evangelist
>
> Principal Consultant at Pythian
>
> Oracle ACE Alumni
>
> Pythian Blog http://www.pythian.com/blog/author/still/
>
> Github: https://github.com/jkstill
>
> Personality: http://www.personalitypage.com/INTJ.html
>
>
>
>
>
> On Thu, Mar 30, 2023 at 9:24 AM Jared Still <jkstill_at_gmail.com> wrote:
>
> That is the question.
>
>
>
> I am curious about current thoughts on having or not having a swap
> partition on Linux based Oracle servers.
>
>
>
> Let's assume typical production standard servers with a reasonable amount
> of RAM, say 256G or more.
>
>
>
> I have some thoughts on this myself, but would like to see others'
> thoughts on this.
>
>
>
>
>
> Jared Still
>
>
>
--
Jared Still
Certifiable Oracle DBA and Part Time Perl Evangelist
Principal Consultant at Pythian
Oracle ACE Alumni
Pythian Blog http://www.pythian.com/blog/author/still/
Github: https://github.com/jkstill
Personality: http://www.personalitypage.com/INTJ.html
--
http://www.freelists.org/webpage/oracle-l

Received on Fri Mar 31 2023 - 07:56:18 CEST