Shell + Skills + Compaction: Tips for long-running agents that do real work
Practical patterns for building with skills, hosted shell, and server-side compaction in the Responses API.
OpenAI scaled PostgreSQL to handle millions of queries per second for 800 million ChatGPT users by running a single primary database instance with nearly 50 read replicas across multiple regions, offloading read-heavy traffic to the replicas and keeping writes on the primary to a minimum.
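The primary/replica split is easy to picture as a small query router: writes always go to the single primary, while reads fan out across the replica pool. Below is a minimal sketch with hypothetical DSNs and a naive SELECT-based read check; the article does not describe OpenAI's actual routing logic.

```python
import itertools

# Hypothetical endpoints -- the real primary/replica topology is not public.
PRIMARY_DSN = "postgresql://app@pg-primary.internal:5432/chat"
REPLICA_DSNS = [
    f"postgresql://app@pg-replica-{i}.internal:5432/chat" for i in range(3)
]

class QueryRouter:
    """Send writes to the single primary and spread reads across replicas."""

    def __init__(self, primary: str, replicas: list[str]) -> None:
        self.primary = primary
        # Round-robin over replicas; a production router would also account
        # for replication lag and regional proximity.
        self._replicas = itertools.cycle(replicas)

    def dsn_for(self, sql: str) -> str:
        # Crude read/write detection for the sketch: anything that is not
        # a SELECT is treated as a write and routed to the primary.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._replicas) if is_read else self.primary


router = QueryRouter(PRIMARY_DSN, REPLICA_DSNS)
print(router.dsn_for("SELECT * FROM conversations WHERE user_id = 42"))
print(router.dsn_for("INSERT INTO conversations (user_id) VALUES (42)"))
```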
The system absorbs massive global traffic through a set of optimizations: caching, rate limiting at multiple layers of the stack, query routing, and migrating write-heavy workloads to sharded stores such as Azure Cosmos DB. PostgreSQL handles write-heavy workloads poorly because of its multiversion concurrency control (MVCC) design, yet this architecture still maintains low double-digit millisecond latency and five-nines availability, with only one major incident in the past year.
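Rate limiting is one of the cheaper protections mentioned here, and a common building block for it is a token bucket placed in front of the database or the service calling it. A minimal sketch with made-up limits, not OpenAI's implementation:

```python
import time

class TokenBucket:
    """Simple token-bucket limiter: refill at a fixed rate, reject when empty."""

    def __init__(self, rate_per_sec: float, burst: int) -> None:
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical per-tenant limit; real limits and their placement are not described.
limiter = TokenBucket(rate_per_sec=5, burst=10)
allowed = sum(limiter.allow() for _ in range(20))
print(f"{allowed} of 20 requests admitted")
```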