Repository providing Docker-based configurations for running large language models locally on RTX 3090 GPUs using multiple inference engines (vLLM, llama.cpp, SGLang).
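For example, the vLLM variant could be launched with a one-line `docker run` (a minimal sketch using vLLM's public `vllm/vllm-openai` image; the model id, port, and flags are illustrative assumptions, not this repository's exact configuration):

```bash
# Sketch: serve a model across both RTX 3090s with tensor parallelism.
# <hf-model-id> is a placeholder, not this repo's exact config.
docker run --gpus all --ipc=host -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model <hf-model-id> \
  --tensor-parallel-size 2
```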
Currently supports Qwen3.6-27B in single- or dual-GPU setups, delivering throughput up to 127 tokens/second or a 262K-token context window, depending on the engine chosen. Includes an OpenAI-compatible API, benchmarking tools, and scaling guidance for clusters of 3+ GPUs.
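Once a server is running, the OpenAI-compatible endpoint can be exercised with a plain HTTP request (a sketch; the port and served model name are assumptions):

```bash
# Chat completion against the local OpenAI-compatible endpoint.
# Port 8000 and <served-model-id> are assumptions, not repo specifics.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<served-model-id>",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64
      }'
```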