Expert Parallelism Bookmarks | William Callahan
dev.synthetic.new
Synthetic LLM Hosted Models
Chat with open-source models privately
z.ai
GLM-5: From Vibe Coding to Agentic Engineering
GLM-5 is a 744B-parameter MoE model (40B active) from Zhipu AI, scaled up from GLM-4.5's 355B with 28.5T pre-training tokens and DeepSeek Sparse Attention.
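Since every entry here is a sparse MoE model, a quick sketch of the dispatch step that gives this page its name may help: under expert parallelism, the experts are sharded across devices, and each token's hidden state has to be grouped by the device owning its selected experts before the all-to-all exchange. The function name, the even sharding layout, and the 128-expert / 8-device shape below are illustrative assumptions, not GLM-5's actual configuration.

```python
from collections import defaultdict

def dispatch_plan(selected_experts, n_experts=128, ep_size=8):
    """Group (token, expert) pairs by the device that owns each expert.

    Assumes a hypothetical even sharding: device d owns experts
    [d * n_experts // ep_size, (d + 1) * n_experts // ep_size).
    A real expert-parallel system would follow this plan with an
    all-to-all that ships each token's activations to those devices.
    """
    per_device = n_experts // ep_size
    plan = defaultdict(list)
    for token, experts in enumerate(selected_experts):
        for e in experts:
            plan[e // per_device].append((token, e))
    return dict(plan)

# Token 0 routed to experts 3 and 70 lands on devices 0 and 4;
# token 1 routed to experts 15 and 127 lands on devices 0 and 7.
print(dispatch_plan([[3, 70], [15, 127]]))
```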
openrouter.ai
Trinity Mini (free) - API, Providers, Stats
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for eff...
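The "8 active of 128" figure above comes from a top-k router: a learned gate scores every expert per token and only the k best-scoring experts run. Below is a minimal NumPy sketch assuming the common softmax-over-the-selected-experts normalization; the function name and shapes are illustrative, not Trinity Mini's actual code.

```python
import numpy as np

def top_k_route(router_logits: np.ndarray, k: int = 8):
    """Pick the k highest-scoring experts per token and normalize their weights."""
    # Indices of the top-k experts for each token (order among the k is irrelevant).
    topk_idx = np.argpartition(router_logits, -k, axis=-1)[..., -k:]
    topk_logits = np.take_along_axis(router_logits, topk_idx, axis=-1)
    # Softmax over only the selected experts, a common sparse-MoE convention.
    w = np.exp(topk_logits - topk_logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return topk_idx, w

# Example: 4 tokens routed over 128 experts, 8 active per token (Trinity Mini's shape).
logits = np.random.randn(4, 128)
idx, w = top_k_route(logits, k=8)
assert idx.shape == (4, 8) and np.allclose(w.sum(axis=-1), 1.0)
```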