python-code-docstring-scraper
Public
A multi-threaded GitHub scraper that collects Python code with docstrings from public repositories, creating a well-documented dataset for the JaraConverse LLM.
causal-language-modeling, data-scraping, dataset, dataset-generation, dataset-scripts, docst, docstring-generator, github-scraper, llm, llm-training
Created: 2024-07-22T03:50:31
Updated: 2025-01-17T02:00:44
Stars: 3
Stars Increase: 0