That's not going to happen. Extra memory channels are very expensive in die area. Nvidia and AMD achieve these rates with HBM, which has very wide buses (4096 bits across four stacks) and short traces from stacking. I can't see any way CPU memory will compete until they move to HBM. Keep in mind GDDR6 is available in GPUs now, and is faster than DDR5, but much slower than HBM.
So can Intel, but they don't. HBM would likely require them to sell a fixed amount of memory, which can be severely limiting for server applications. Not to mention it's extremely power hungry compared to DDR, so you won't get anywhere near the capacities DDR offers without making power consumption go way up.
EPYC is already a modular architecture; literally nothing stops AMD from replacing a couple of "compute" dies with HBM2 stacks. They could release CPUs that don't require DIMM sockets at all. E.g.: instead of 2 sockets + a bunch of DIMM sockets, the same motherboard space could be used for 4 sockets with embedded memory.
They could, but then you're cutting your FLOPS down to get your memory bandwidth up. And HBM2 doesn't get you much capacity. The 7nm Instinct MI50 has 4 stacks of HBM2 to achieve 32GB in capacity. So other than as a joke toy, what would you do with a 32-core / 64-thread CPU with 32GB of RAM? That's what you'd end up with if you swapped out 4 compute dies for 4 HBM2 stacks.
Assume that in 1-2 years HBM capacity doubles, and it's a quad-socket motherboard. You'd have 64GB per socket, or 256GB total.
Remind me how much memory an NVIDIA accelerator has?
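For anyone following the numbers, here is a rough sketch of the capacity arithmetic in this hypothetical. The per-stack figure comes from the MI50 example above (4 stacks for 32GB); the doubling and the quad-socket board are the thread's assumptions, not anything AMD has announced, and the function name is made up for illustration.

```python
# Back-of-the-envelope HBM capacity for the hypothetical in this thread.
# All inputs are assumptions from the discussion, not product specs.

STACKS_PER_PACKAGE = 4        # compute dies swapped for HBM2 stacks (hypothetical)
GB_PER_STACK_TODAY = 32 / 4   # Instinct MI50: 4 stacks give 32 GB
DENSITY_GROWTH = 2            # assumed doubling of stack capacity in 1-2 years
SOCKETS = 4                   # hypothetical quad-socket motherboard


def hbm_capacity_gb(stacks, gb_per_stack, growth, sockets):
    """Return (capacity per socket, capacity per board) in GB."""
    per_socket = stacks * gb_per_stack * growth
    return per_socket, per_socket * sockets


per_socket_gb, total_gb = hbm_capacity_gb(
    STACKS_PER_PACKAGE, GB_PER_STACK_TODAY, DENSITY_GROWTH, SOCKETS
)
print(f"{per_socket_gb:.0f} GB per socket, {total_gb:.0f} GB per board")
```

Dropping `DENSITY_GROWTH` to 1 gives the present-day case: 32GB per socket.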
To play Devil's advocate, putting HBM2 in the package doesn't magically solve everything. The intra-socket bandwidth could be enormous, but the inter-socket bandwidth would still be whatever it is now, and would be difficult to increase.
Epyc doesn't do quad sockets. Is this just another hypothetical "what if" at this point with no basis in reality?
Because sure, a hypothetical non-existent Epyc re-designed to compete in the double precision floating point space favoring memory bandwidth above all else could be really cool. Then again, so could anything else custom designed exclusively for that use case.
> but the inter-socket bandwidth would still be whatever it is now, and would be difficult to increase.
64 PCI-E 4.0 lanes form the CPU-CPU interconnect currently.
Since we're making stuff up, why not assume that's doubled next generation along with being PCI-E 5.0? So that'd be roughly 500GB/s per direction, give or take.
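A quick sanity check of that estimate, as a sketch: it uses the standard per-lane signalling rates (16 GT/s for PCIe 4.0, 32 GT/s for PCIe 5.0) with 128b/130b encoding, and ignores protocol overhead. The 128-lane PCIe 5.0 link is the made-up next-generation case from this comment, and the result is raw bandwidth per direction.

```python
# Raw per-direction bandwidth of a PCIe-style link, ignoring protocol overhead.

GT_PER_S = {4: 16.0, 5: 32.0}       # per-lane transfer rate by PCIe generation
ENCODING_EFFICIENCY = 128 / 130     # 128b/130b line encoding (gen 3 through 5)


def link_bandwidth_gb_s(gen, lanes):
    """Approximate one-direction bandwidth in GB/s for `lanes` lanes."""
    gb_per_lane = GT_PER_S[gen] * ENCODING_EFFICIENCY / 8  # bits to bytes
    return gb_per_lane * lanes


current = link_bandwidth_gb_s(gen=4, lanes=64)        # today's interconnect
hypothetical = link_bandwidth_gb_s(gen=5, lanes=128)  # doubled lanes + PCIe 5.0

print(f"64 lanes of PCIe 4.0:  {current:.0f} GB/s")
print(f"128 lanes of PCIe 5.0: {hypothetical:.0f} GB/s")
```

Doubling both the lane count and the per-lane rate quadruples the link, which is where the roughly 500GB/s figure comes from.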
That would be great, but to date HBM has always been part of the board. I'm all for selling motherboards with the RAM already on them if it means higher bandwidth, but it's just never happened before.