GitHub - ScrapeGraphAI/Scrapegraph-ai: Python scraper based on AI
LangExtract is a Python library that extracts structured data from unstructured text using large language models and controlled generation techniques.
It provides precise source grounding, interactive HTML visualization, and supports both cloud-based and local LLMs for flexible integration. The tool handles long documents efficiently with text chunking and parallel processing, and does not require model fine-tuning for new domains.
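The chunking, parallel processing, and source grounding described above can be illustrated with a minimal standard-library sketch. This is not LangExtract's implementation or API: `chunk_text`, `extract_from_chunk`, `extract`, and the `Extraction` record are hypothetical names, and a regular expression that finds four-digit years stands in for the LLM call. The point is only to show how overlapping chunks can be processed concurrently while each result keeps character offsets into the original document.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Stand-in for an LLM extractor (assumption): matches years such as 1998 or 2024.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


@dataclass(frozen=True)
class Extraction:
    label: str
    text: str
    start: int  # character offset in the original document
    end: int


def chunk_text(text, size=200, overlap=20):
    """Yield (offset, chunk) pairs covering the text with overlapping windows."""
    step = size - overlap
    for offset in range(0, max(len(text) - overlap, 1), step):
        yield offset, text[offset:offset + size]


def extract_from_chunk(item):
    """Run the stand-in extractor on one chunk and map spans to document offsets."""
    offset, chunk = item
    return [
        Extraction("year", m.group(), offset + m.start(), offset + m.end())
        for m in YEAR.finditer(chunk)
    ]


def extract(text, size=200, overlap=20, workers=4):
    """Process chunks in parallel, then merge and de-duplicate the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_chunk = pool.map(extract_from_chunk, chunk_text(text, size, overlap))
        # The overlap can report the same span twice; a set removes duplicates.
        found = {e for chunk_result in per_chunk for e in chunk_result}
    return sorted(found, key=lambda e: e.start)


if __name__ == "__main__":
    doc = "x " * 150 + "released in 2019 " + "y " * 150 + "updated 2024"
    for e in extract(doc):
        # Each extraction can be verified against the source text by its offsets.
        print(e.label, e.text, e.start, e.end, doc[e.start:e.end] == e.text)
```

The overlap between chunks is there so that a short span falling on a chunk boundary still appears whole in at least one chunk. A thread pool suits this shape of work because real model calls are typically I/O-bound.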