Hacker News

"...Intel disabled AVX-512 for its Core 12th, 13th, and 14th Generations of Core processors, leaving owners of these CPUs without them."

Well known to most, but news to me. I've tried to find out the reason why, but couldn't come up with a definitive answer.



They introduced efficiency cores, and those don't have AVX-512. Lots of software breaks if it suddenly gets moved to a core which supports different instructions, so OSes wouldn't be able to move processes between E-cores and P-cores if P-cores supported AVX-512 while E-cores didn't.


As long as the vast majority of processes don't use AVX-512, you could probably catch SIGILL (or whatever the fault is) in the kernel and transparently move the task to a P-core, marking it to avoid being rescheduled on an E-core again in the near future. Probably not very efficient, but tasks which use AVX-512 are usually something you want to run on a P-core anyway.


That's actually an interesting idea, and yeah that should work.

You'd still have the problem that software will use the CPUID instruction to do runtime detection of AVX-512 support. You'd need some mechanism to make CPUID report no AVX-512 support if the OS doesn't support catching SIGILL in the way you describe, and make it report AVX-512 support (even when run on an E-core) if the OS does support catching SIGILL and moving the task to a P-core. That sounds doable, but I have no idea how easy it is. You'd need to be able to configure AVX-512 reporting differently for virtual machine guests than for the host, and you'd need the host to be able to reconfigure AVX-512 support at runtime to support e.g. kexec. There are probably tonnes of other considerations as well which I'm not thinking about.

Given the relatively limited benefit from going 512-bit wide compared to 256-bit, I guess I understand the decision, but you're right that it's not as black and white as I made it out to be.


> As long as the vast majority of processes don't use AVX-512

One very common function used by nearly every process is memcpy, and it's often optimized to use the largest vector size available, so it wouldn't surprise me if the vast majority of processes do use AVX-512.


Ah, you're right. That definitely complicates things.


Wouldn't it have been more reasonable to emulate AVX-512 on the E-cores?


That doesn't sound very E-friendly


Depends. If they can essentially "just" add a bit of microcode to emulate AVX-512 with a 256-bit wide vector pipeline then it shouldn't be worse. I don't know if that's feasible though or if there are other costs (would you need physical 512-bit wide registers for example?).


You can double-pump a lot of the instructions, but emulating some of the rest would lead to severe slowdowns in code that should instead be using an AVX2 implementation.

It's better not to fake it that hard. If those cores don't have it, don't pretend to have it.

But turning it off on the P cores was dumb.


The reason is that those CPUs have two types of cores, performance and efficiency, and only the former supports AVX512. Early on you could actually get access to AVX512 if you disabled the efficiency cores, but they put a stop to that in later silicon revisions, IIRC with the justification that AVX512 didn't go through proper validation on those chips since it wasn't supposed to be used.


Probably to reduce wasted silicon, since very few consumers would see a significant benefit in everyday computing tasks. Also, supposedly they had issues combining it with E-cores. Intel is struggling to get their margins back up. The 11th gen had AVX-512, but the people who cared seemed mostly to be PS3 emulator users.




