How We Made 100M Vector Indexing in 20 Minutes Possible on PostgreSQL
1. Introduction
In the past few months, we’ve heard consistent feedback from users and partners: while our goal of providing a scalable, high-performa...
The article explains how SwissTable, an open-addressing hash table with separate control-byte metadata and h1/h2 hash splitting, achieves high speed and memory efficiency through cache-friendly probing and SIMD-friendly control-byte scans.
It describes how this design has become a modern baseline, influencing Rust’s HashMap and Go’s built-in map implementation, which gained significant performance and memory improvements. The author then outlines a Java implementation using the incubating Vector API to vectorize control-byte scans, discussing challenges with Java’s object layout, tombstone handling, and resizing, and reporting competitive throughput and reduced heap usage versus JDK HashMap at high load factors.
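The control-byte and h1/h2 ideas summarized above can be made concrete with a minimal sketch in plain Java. It is not the author's implementation. The class name `SwissTableSketch`, the 8-slot group width, the `mix` hash function, and the fixed capacity are all assumptions chosen for illustration. The sketch also leaves out deletion (so no tombstones) and resizing, and it scans control bytes with an ordinary loop rather than the Vector API.

```java
import java.util.Arrays;

// Illustrative only: a fixed-capacity long-to-long table with SwissTable-style metadata.
final class SwissTableSketch {
    static final byte EMPTY = (byte) 0x80; // high bit set marks an empty slot
    static final int GROUP = 8;            // slots examined per probe step (assumed width)

    final byte[] ctrl;    // one control byte per slot, stored apart from keys and values
    final long[] keys;
    final long[] vals;
    final int groupMask;  // number of groups minus one

    SwissTableSketch(int groups) { // groups must be a power of two
        ctrl = new byte[groups * GROUP];
        keys = new long[groups * GROUP];
        vals = new long[groups * GROUP];
        Arrays.fill(ctrl, EMPTY);
        groupMask = groups - 1;
    }

    // Hypothetical mixer; a real table would use a stronger hash.
    static long mix(long k) {
        k *= 0x9E3779B97F4A7C15L;
        return k ^ (k >>> 32);
    }

    static long h1(long hash) { return hash >>> 7; }          // upper bits pick the starting group
    static byte h2(long hash) { return (byte) (hash & 0x7F); } // low 7 bits become the stored tag

    void put(long key, long val) {
        long h = mix(key);
        byte tag = h2(h);
        int g = (int) (h1(h) & groupMask);
        for (int probes = 0; probes <= groupMask; probes++) {
            int base = g * GROUP;
            // Pass 1: compare control bytes first; touch keys only on a tag match.
            for (int s = base; s < base + GROUP; s++) {
                if (ctrl[s] == tag && keys[s] == key) { vals[s] = val; return; }
            }
            // Pass 2: key is absent from this group, so claim its first empty slot.
            for (int s = base; s < base + GROUP; s++) {
                if (ctrl[s] == EMPTY) { ctrl[s] = tag; keys[s] = key; vals[s] = val; return; }
            }
            g = (g + 1) & groupMask; // group was full: move to the next group
        }
        throw new IllegalStateException("table full (this sketch does not resize)");
    }

    long get(long key, long missing) {
        long h = mix(key);
        byte tag = h2(h);
        int g = (int) (h1(h) & groupMask);
        for (int probes = 0; probes <= groupMask; probes++) {
            int base = g * GROUP;
            boolean sawEmpty = false;
            for (int s = base; s < base + GROUP; s++) {
                if (ctrl[s] == tag && keys[s] == key) return vals[s];
                if (ctrl[s] == EMPTY) sawEmpty = true;
            }
            if (sawEmpty) return missing; // an empty slot ends the probe sequence
            g = (g + 1) & groupMask;
        }
        return missing;
    }

    public static void main(String[] args) {
        SwissTableSketch t = new SwissTableSketch(4); // 4 groups of 8 slots
        t.put(3, 30);
        t.put(3, 99);                       // overwrites the earlier value
        System.out.println(t.get(3, -1));   // prints 99
    }
}
```

The two inner loops over `ctrl` are the part a SIMD implementation would replace: a vectorized compare of the whole group against the tag yields a bitmask of candidate slots in one step, and a second compare against `EMPTY` tells the lookup whether it can stop. Keeping the tags in a separate byte array is what makes that scan cache-friendly, since keys and values are read only after a tag matches.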