LM Studio 0.4.0 introduces llmster and the lms CLI, which enable headless local inference of Google Gemma 4 26B-A4B on macOS.
The mixture-of-experts (MoE) model activates 3.8B parameters per token. It reaches 51 tokens/second on a 48GB MacBook Pro with an M4 Pro and scores 82.6% on MMLU Pro. Setup involves installing the lms CLI, starting the daemon, and loading the Q4_K_M-quantized model for use with Claude Code.
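
To put the quoted figures in perspective, here is a rough back-of-envelope sketch in Python. It assumes the "26B" in the model name is the total parameter count and that the 51 tokens/second figure is a steady decode rate; the function names are hypothetical and exist only for illustration.

```python
# Back-of-envelope figures derived from the numbers quoted in the text.
TOTAL_PARAMS_B = 26.0      # assumption: "26B" in the model name is the total parameter count
ACTIVE_PARAMS_B = 3.8      # active parameters per token (from the text)
TOKENS_PER_SECOND = 51.0   # reported throughput (from the text)


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for each generated token."""
    return active_b / total_b


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to emit num_tokens at a steady decode rate (ignores prompt processing)."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active per token: {frac:.1%} of parameters")
    print(f"1,000-token reply: {generation_seconds(1000, TOKENS_PER_SECOND):.1f} s")
```

Under these assumptions, each token touches roughly 14.6% of the weights, and a 1,000-token reply takes about 19.6 seconds of decoding. Real latency will be higher once prompt processing is included.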