cca-gpt-model-trainer

Public

A Python-based command-line tool for Linux that automates fetching, cleaning, and combining data from MediaWiki and Jira, then fine-tunes a language model (GPT) on the combined dataset. The script is designed to handle GPU memory constraints and can fall back to the CPU if needed.
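The GPU-to-CPU fallback described above can be illustrated with a minimal sketch. This is not the project's actual code: `run_with_fallback`, `fake_training_step`, and the device names are hypothetical, and the built-in `MemoryError` stands in for a framework-specific out-of-memory exception (with a recent PyTorch release one would probably pass `torch.cuda.OutOfMemoryError` instead, which is an assumption about how the tool is built).

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def run_with_fallback(
    step: Callable[[str], T],
    devices: Sequence[str] = ("cuda", "cpu"),
    oom_errors: tuple = (MemoryError,),
) -> tuple:
    """Try step(device) on each device in order, moving on when memory runs out.

    Returns (device_used, result). Re-raises the last memory error if
    every device fails.
    """
    last_error = None
    for device in devices:
        try:
            return device, step(device)
        except oom_errors as err:
            # Hypothetical policy: report the failure and try the next device.
            print(f"Out of memory on {device}; trying the next device")
            last_error = err
    if last_error is None:
        raise ValueError("devices must not be empty")
    raise last_error


def fake_training_step(device: str) -> str:
    # Stand-in for a real fine-tuning step; pretends the GPU is too small.
    if device == "cuda":
        raise MemoryError("simulated GPU out-of-memory")
    return f"trained on {device}"


if __name__ == "__main__":
    print(run_with_fallback(fake_training_step))
```

Passing the exception types as a parameter keeps the sketch independent of any particular training library; a real tool would likely also free cached GPU memory or reduce the batch size before giving up on the GPU.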

Created: 2024-06-19T01:02:01
Updated: 2025-01-07T21:38:13
Stars: 0
Stars Increase: 0

Related projects