Qwen/Qwen3-4B-Thinking-2507 · Hugging Face
The Caselaw Access Project (CAP) is a large legal dataset of over 6.7 million U.S. court decisions spanning more than 360 years, digitized by the Harvard Law School Library in collaboration with Ravel Law.
It provides free public access, searchable metadata, and full-text opinions for state and federal courts, supporting research, legal analysis, and access to common law. Data cleaning and processing efforts have improved text quality for use in AI and legal technology applications.
The dataset can be accessed via an API or bulk downloads, and it includes annotated subsets such as the Caselaw4 dataset with binary case outcomes.
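As a rough illustration of working with a subset annotated with binary case outcomes, the sketch below counts how many records carry each label. It assumes the records are available as JSON Lines with hypothetical `text` and `outcome` fields. Those field names, the sample records, and the `outcome_balance` helper are illustrative assumptions, not the documented schema of Caselaw4 or of the CAP API.

```python
import json
from collections import Counter


def outcome_balance(lines):
    """Count binary outcome labels in an iterable of JSON Lines strings.

    Assumes each record has an integer-like "outcome" field (0 or 1).
    This field name is hypothetical and may differ in the real data.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        counts[int(record["outcome"])] += 1
    return counts


# Made-up sample records standing in for a bulk-download file.
sample = [
    '{"text": "Opinion A ...", "outcome": 1}',
    '{"text": "Opinion B ...", "outcome": 0}',
    '{"text": "Opinion C ...", "outcome": 1}',
]

counts = outcome_balance(sample)
total = sum(counts.values())
# Share of records per label, useful for spotting class imbalance.
shares = {label: n / total for label, n in sorted(counts.items())}
print(shares)
```

For a real bulk download, the same function could be given an open file handle in place of `sample`, since it only needs an iterable of lines.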