Clean-Web-Scraper
Public
A Node.js web scraper that extracts clean, readable content from websites - perfect for AI/LLM training datasets. Features smart crawling, Mozilla Readability integration, and organized content storage.
ai, artificial-intelligence, clean, crawler, data-preprocessing, dataset, fine-tuning, llm, recursive-crawling, scraper
Created: 2025-01-10T05:52:16
Updated: 2025-03-24T06:55:15
Stars: 2
Stars Increase: 0