One Million Screenshots
Explore the web’s biggest homepage. Discover similar sites. See changes over time. Get web data.
Nathan Rooy created a collection of one million screenshots of small websites. The domains were sourced from Common Crawl, avoiding the popular domains used by onemillionscreenshots.com.
Screenshots were captured with Playwright, and visual embeddings were generated by a custom encoder trained with triplet loss. The layout was organized using self-organizing maps (SOMs) together with a parallel color distribution. Two linked SOMs handle placement: color determines macro-placement and visual similarity determines micro-placement, producing a zoomable map at screenshots.nry.me.
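As a rough illustration of how a self-organizing map places similar items near each other on a grid, here is a minimal pure-Python sketch. It is not the project's implementation: the function names `train_som` and `best_matching_unit`, the grid size, the decay schedules, and the use of RGB triples as stand-ins for screenshot embeddings are all assumptions made for the example.

```python
import math
import random


def best_matching_unit(grid, v):
    """Return (row, col) of the grid cell whose weights are closest to v."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(w, v))  # squared Euclidean distance
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(vectors, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns grid[r][c] -> weight vector.

    Hypothetical helper: linear decay of learning rate and neighborhood
    radius is one common choice, not necessarily the one used in the project.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Random initial weights in [0, 1) for every cell.
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(vectors)
        rng.shuffle(order)
        for v in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # shrinking learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighborhood
            br, bc = best_matching_unit(grid, v)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian falloff
                    w = grid[r][c]
                    for i in range(dim):
                        # Pull the cell toward v, more strongly near the BMU.
                        w[i] += lr * h * (v[i] - w[i])
            step += 1
    return grid


if __name__ == "__main__":
    # Toy "embeddings": RGB colors in [0, 1], standing in for real vectors.
    rng = random.Random(1)
    colors = [[rng.random() for _ in range(3)] for _ in range(50)]
    som = train_som(colors, rows=4, cols=4)
    for color in colors[:3]:
        print(color, "->", best_matching_unit(som, color))
```

After training, each input is assigned to the cell returned by `best_matching_unit`, so inputs with similar vectors tend to land in the same or neighboring cells. A two-level layout like the one described could, under these assumptions, run one such map on color features for coarse position and another on visual embeddings for fine position.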