MinerU-HTML

Public

MinerU-HTML: An SLM-powered HTML main content extractor that outputs clean HTML bodies. Perfect for Deep Research Agents, RAG applications, and training data generation.

Created: 2025-11-26T18:31:36
Updated: 2025-11-28T20:23:24
https://opendatalab.com/ai-ready/AICC#playground
Stars: 57
Stars Increase: 0