Date: March 15th, 2026 5:40 PM Author: Jared Baumeister
MoE or whatever, give it 12 CPU cores and 48 GB of system RAM on top of your RTX 6000 unless you want the context window to be thimble-sized. This is on Ollama, because llama.cpp is still catching up and has to be recompiled every day as bugs get fixed.
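For anyone who wants to try this, here is a rough sketch of asking Ollama for a bigger context window and 12 CPU threads per request. `num_ctx` and `num_thread` are real Ollama request options, but the 32768 context value and the model name in the usage note are placeholders I made up, not settings from this post. The 48 GB of system RAM is something you provision on the box, not something you pass in the request.

```python
import json
import urllib.request

# Ollama's default local endpoint; change it if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, num_ctx=32768, num_thread=12):
    """Build the JSON body for /api/generate with explicit context and thread options.

    num_ctx=32768 is an assumed example value; num_thread=12 mirrors the
    12 CPU cores mentioned in the post.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a token stream
        "options": {
            "num_ctx": num_ctx,        # context window size in tokens
            "num_thread": num_thread,  # CPU threads used for layers kept off the GPU
        },
    }


def generate(model, prompt, **options):
    """Send the request to a running Ollama server and return the generated text."""
    body = json.dumps(build_request(model, prompt, **options)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Usage would look like `generate("your-moe-model", "hello", num_ctx=65536)`, where `your-moe-model` stands in for whatever tag you actually pulled. If the context you ask for does not fit in VRAM, expect part of the model to run from system RAM on the CPU, which is where the extra cores and memory come in.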
(http://www.autoadmit.com/thread.php?thread_id=5846010&forum_id=2most#49746161)