Together AI publishes token-based pricing for multiple hosted AI models, with distinct input, output, and cached batch rates per 1M tokens.
It also offers single-tenant GPU instances and HGX/GB-series GPU clusters, available on demand or reserved, with hourly rates that decrease for longer reservations. Additional services include per-resource pricing for code sandboxes, a low per-session fee for the code interpreter, storage charges for a shared filesystem, and per-token rates for supervised fine-tuning and DPO, tiered by model size.
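
To make the per-1M-token billing concrete, the following minimal sketch estimates the cost of a single request from separate input and output rates. The rates, the `TokenRates` class, and the `token_cost` function are hypothetical placeholders chosen for easy arithmetic; they are not published Together AI prices or part of any Together AI SDK.

```python
from dataclasses import dataclass

# Rates are quoted per 1M tokens, so token counts are divided by this unit.
TOKENS_PER_UNIT = 1_000_000


@dataclass(frozen=True)
class TokenRates:
    """Hypothetical per-1M-token rates in USD (placeholders, not real prices)."""
    input_per_m: float
    output_per_m: float


def token_cost(rates: TokenRates, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request.

    Input and output tokens are billed at their own rates, then the total
    is scaled from "per 1M tokens" down to the actual token counts.
    """
    weighted = (input_tokens * rates.input_per_m
                + output_tokens * rates.output_per_m)
    return weighted / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Assumed example rates: $1.00 per 1M input tokens, $2.00 per 1M output tokens.
    example_rates = TokenRates(input_per_m=1.00, output_per_m=2.00)
    cost = token_cost(example_rates, input_tokens=500_000, output_tokens=250_000)
    print(f"Estimated cost: ${cost:.2f}")
```

The same pattern would extend to cached or batch rates by adding further fields and terms, assuming each is billed as its own per-1M-token rate.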