• xeon vs epyc

    From Maurice Kinal@1:153/7001.2989 to Benny Pedersen on Thu Sep 1 19:25:54 2022
    Hey Benny!

    AMD EPYC 7642 48-Core Processor AuthenticAMD

    You're going to need a 16K monitor methinks. :::evil grin:::

    Life is good,
    Maurice

    ... Fidonet 4K - Sweet Sixteen Penguins of the Apocalypse.
    --- GNU bash, version 5.1.16(1)-release (x86_64-znver2-linux-gnu)
    * Origin: One of us @ (1:153/7001.2989)
  • From Mike Miller@1:154/30 to Benny Pedersen on Tue Sep 6 08:16:41 2022

    Hello Benny!

    01 Sep 22 12:08, you wrote to all:


    Linux mx 5.15.63-gentoo-dist #1 SMP Thu Aug 25 12:40:44 -00 2022 x86_64 Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz GenuineIntel GNU/Linux
    Linux localhost 5.19.6-gentoo-dist #1 SMP PREEMPT_DYNAMIC Wed Aug 31 18:48:13 -00 2022 x86_64 AMD EPYC 7642 48-Core Processor AuthenticAMD GNU/Linux


    We've been switching over to Epyc boxes at work, and it's been a bit of a nightmare, although mostly that's been due to software limitations.

    We started out with dual-CPU boxes using 64-core EPYC CPUs, and ran into limitations with applications that couldn't deal with 256 logical processors. We had to manually pin each thread/application to a core or set of cores.

    We eventually switched over to purchasing single-CPU 64-core EPYC boxes, which resolved our issues with CPU pinning for the most part.
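
For readers who have not done this kind of pinning by hand, here is a minimal sketch of the idea in Python on Linux. It is not what Mike's team ran; the function names `parse_cpu_list` and `pin_to_cpus` and the CPU-list syntax ("0-3,8") are assumptions made for illustration. The only OS calls used are `os.sched_getaffinity` and `os.sched_setaffinity`, which are Linux-only.

```python
import os


def parse_cpu_list(spec):
    """Parse a CPU list such as "0-3,8,10-11" into a set of ints.

    Hypothetical helper; the syntax mirrors what taskset-style tools accept.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_to_cpus(spec, pid=0):
    """Restrict a process (pid 0 = the caller) to the CPUs named in spec.

    Only CPUs the process is already allowed to run on are kept, so a
    spec written for a bigger box does not fail outright on a smaller one.
    """
    wanted = parse_cpu_list(spec)
    allowed = os.sched_getaffinity(pid)
    target = wanted & allowed
    if not target:
        raise ValueError("none of the requested CPUs are available: %r" % spec)
    os.sched_setaffinity(pid, target)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Example: keep this process on the first four logical CPUs.
    print(sorted(pin_to_cpus("0-3")))
```

Affinity set this way is inherited by child processes, so a small wrapper can pin itself and then exec the real application. Which CPU numbers share a socket or cache is machine-specific and worth checking (for example with `lscpu`) before choosing a set.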


    However, every single EPYC box we're running has to have IOMMU disabled in the BIOS. Otherwise, after about 3 months of uptime, the servers will start spewing "AMD-Vi: Completion wait loop timed out" errors. This causes the PCIe devices to rapidly disable and re-enable, which knocks out networking. We've yet to nail down the actual cause of the issue, and it doesn't seem to matter what kernel version we're running. The odd thing is that it happens in short bursts: groups of servers (usually assigned to the same application) will start blowing up every hour or two, one after another. Not a fun thing when I'm on call, because I swear it starts happening overnight every time. :D
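
Since the failures arrive in bursts across a group of servers, catching the first occurrence early may help. Below is a rough sketch of a log check for the message quoted above. The function name `count_iommu_timeouts` is hypothetical, and the exact kernel spelling may differ from the quote (hyphenation, capitalization), so the pattern is deliberately loose. How the log text is obtained (dmesg, journal, syslog file) is left to the reader.

```python
import re

# Loose match: case-insensitive, hyphen or space between "Completion" and "wait".
# The wording is taken from the message quoted in the post; treat it as an assumption.
IOMMU_TIMEOUT = re.compile(
    r"AMD-Vi: Completion[- ]wait loop timed out", re.IGNORECASE
)


def count_iommu_timeouts(log_text):
    """Return how many lines of log_text contain the AMD-Vi timeout message."""
    return sum(1 for line in log_text.splitlines() if IOMMU_TIMEOUT.search(line))


if __name__ == "__main__":
    import sys

    # Usage sketch: pipe kernel log text in on stdin.
    hits = count_iommu_timeouts(sys.stdin.read())
    print(hits)
    sys.exit(1 if hits else 0)
```

A non-zero exit status on any hit makes the script easy to wire into a cron job or an existing monitoring check.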




    Mike


    ... Dancers do it with rhythm.
    === GoldED+/LNX 1.1.5-b20220504
    --- SBBSecho 3.15-Linux
    * Origin: War Ensemble - warensemble.com - Appleton, WI (1:154/30)