Holy crap, the movement to get huge models working well on local hardware has kicked off and the results are impressive.
This morning I used this project: https://github.com/danveloper/flash-moe.git
It lets me run this HUGE Qwen 3 327B-parameter model ON MY LOCAL MACBOOK PRO. It’s a bit slow, but damn if it doesn’t work! SOTA on my LAPTOP.
</freakout>
