48GB is maybe just enough to squeeze in a quantized 70B model like Llama 3.3, but you'll need to raise the GPU memory allocation limit [1], and even then it might not be super fast.
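For what it's worth, on Apple Silicon the limit can usually be raised with sysctl; the value below is just an example for a 48GB machine (leave some headroom for the OS), and the setting resets on reboot:

```shell
# Raise the GPU wired-memory limit on macOS (Apple Silicon).
# 45056 MB (~44 GiB) is an example value, not a recommendation; resets on reboot.
sudo sysctl iogpu.wired_limit_mb=45056
```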
You could also try Qwen 2.5 32B, which should just work with ollama or LM Studio with no config changes.
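With ollama it's a one-liner (model tag taken from ollama's library naming; it pulls a quantized build by default):

```shell
# Download (first run) and chat with Qwen 2.5 32B
ollama run qwen2.5:32b
```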
I've got a 32GB M1 Max and a 24GB 4090, and I hardly ever run models on my Mac, since memory bandwidth and prefill compute are much better on the 4090. But that means I'm essentially locked out of Llama 3 70B-class models, which I only use via API.