Re: ideal CPU/Memory relation

From: Tanel Poder <tanel_at_tanelpoder.com>
Date: Fri, 19 Aug 2022 16:10:53 -0400
Message-ID: <CAMHX9JJpVYJz2v4zk7tZWi+9+4DQJRbYkiiEkKUQ6cBW9Qw93g_at_mail.gmail.com>



There's another thing to think about - especially when you want the best memory access performance & throughput (and are not just optimizing for having the max amount of RAM possible).

Computers are networks. Modern CPUs are also networks. One core cannot consume the max memory bandwidth; you need multiple cores for that.
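For illustration, here's a minimal sketch of a STREAM-style "triad" loop with OpenMP (not a calibrated benchmark - the array size and compile flags below are just my assumptions). Run it with different OMP_NUM_THREADS values and the GB/s figure should keep climbing until you have enough cores busy to saturate the memory channels:

    /*
     * triad.c - rough memory bandwidth sketch, not a real benchmark.
     * Build: gcc -O2 -fopenmp triad.c -o triad
     * Run:   OMP_NUM_THREADS=1 ./triad   (then 2, 4, 8, ...)
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define N (64 * 1024 * 1024)   /* 64M doubles per array = 512 MB each */

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        if (!a || !b || !c) return 1;

        /* first-touch init in parallel, so pages get spread over the NUMA nodes */
        #pragma omp parallel for
        for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.0; }

        double t0 = omp_get_wtime();
        #pragma omp parallel for
        for (long i = 0; i < N; i++)
            c[i] = a[i] + 3.0 * b[i];          /* triad: 2 reads + 1 write per element */
        double t1 = omp_get_wtime();

        double bytes = 3.0 * N * sizeof(double);
        printf("%d thread(s): %.1f GB/s\n",
               omp_get_max_threads(), bytes / (t1 - t0) / 1e9);

        free(a); free(b); free(c);
        return 0;
    }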

Cores within a single processor can be connected to the memory controllers / RAM and to each other via ring networks (like some Xeons) or via a central "I/O" hub (like AMD Zen).

On some CPUs it's the CPU cores themselves that have the memory channels (no central I/O hub). With cheaper CPUs (where half of the cores are disabled), the number of memory channels (or the "CPU I/O" bandwidth) may be cut in half too -> https://www.servethehome.com/amd-epyc-7002-rome-cpus-with-half-memory-bandwidth/ .
This happens to be the case with the 16-core AMD Threadripper Pro that I have in my Lenovo Workstation
<https://tanelpoder.com/posts/11m-iops-with-10-ssds-on-amd-threadripper-pro-workstation/> .

The older Xeons have 6 memory channels each; the newest have 8, I think.

AMD EPYC / Threadripper Pro have 8 memory channels.

So, you should populate all 8 DIMM slots with memory (not fewer) if you want performance. Or, if your mobo supports two/three DIMMs per memory channel, you could also use 16 DIMMs across the 8 channels, but this won't increase your RAM throughput (and may actually increase latency, due to having to switch between memory ranks). The low-latency folks in the high-frequency trading world care about this stuff.
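A back-of-the-envelope sketch of why channel count matters (the DDR4-3200 speed is just an assumption for illustration, not a spec of any machine in this thread): theoretical peak bandwidth is roughly channels * transfer rate * 8 bytes per transfer, and real-world throughput will be some fraction of that.

    #include <stdio.h>

    int main(void)
    {
        /* assumption for illustration: DDR4-3200, 64-bit (8-byte wide) channels */
        const double mts = 3200e6;            /* 3200 million transfers per second */
        const double bytes_per_transfer = 8.0;

        for (int channels = 1; channels <= 8; channels *= 2)
            printf("%d channel(s): %6.1f GB/s theoretical peak\n",
                   channels, channels * mts * bytes_per_transfer / 1e9);
        return 0;
    }

With those assumptions you'd get 25.6 GB/s with one populated channel and 204.8 GB/s with all eight - which is why leaving channels empty quietly throws away throughput.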

My 2 x Xeon machine has CPUs with 6 memory channels each, but 8 DIMM slots in the mobo. I filled only 6 slots for each CPU, to avoid unbalancing the RAM access throughput & traffic.
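If you want to see how your sockets/nodes are wired up, a small libnuma sketch like the one below prints the node distance matrix (10 means local memory; bigger numbers mean a remote hop over the inter-socket link). This assumes libnuma is installed; it's just an illustration, not tied to any particular box in this thread.

    /*
     * numadist.c - print the NUMA node distance (SLIT) matrix.
     * Build: gcc numadist.c -o numadist -lnuma
     */
    #include <stdio.h>
    #include <numa.h>

    int main(void)
    {
        if (numa_available() < 0) {
            puts("NUMA API not available on this system");
            return 1;
        }
        int maxnode = numa_max_node();
        printf("node ");
        for (int j = 0; j <= maxnode; j++)
            printf("%5d", j);
        printf("\n");
        for (int i = 0; i <= maxnode; i++) {
            printf("%4d ", i);
            for (int j = 0; j <= maxnode; j++)
                printf("%5d", numa_distance(i, j));   /* relative access cost */
            printf("\n");
        }
        return 0;
    }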

So if you're building a bad-ass (gaming) workstation or a high-performance server, don't buy just one or two large & expensive DIMMs in the hope of adding more later; populate enough DIMM slots so that all of your CPUs' memory channels are in use.

Oh, and the world is changing: PCIe latency and throughput (especially with PCIe 5.0 and the future 6.0) are getting so good that, as far as the transport goes, they're pretty close to RAM speed. So (now that Intel has killed Optane <https://tanelpoder.com/posts/testing-oracles-use-of-optane-persistent-memory/>) it's worth keeping an eye on the Compute Express Link (CXL) standard. With CPU support, it's basically cache-coherent system memory, but accessed over PCIe 5.0+ links. It's even possible to connect large boards full of DRAM to multiple separate compute nodes, so in theory someone could build a CXL-based shared global buffer cache used by an entire rack of servers concurrently, without needing RAC GC/LMS processes to ship blocks around.

--
Tanel Poder
https://learn.tanelpoder.com


On Fri, Aug 19, 2022 at 3:02 AM Lothar Flatz <l.flatz_at_bluewin.ch> wrote:


> Hi,
>
> had somebody ever heard of a ideal CPU/Memory relation for a database
> server?
> A supplier of a customer stated such thing,
> I suppose they made it up.
> Any comments?
>
> Thanks
>
> Lothar
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>
-- http://www.freelists.org/webpage/oracle-l
