amharic-gpt

Public

Training and fine-tuning pipeline for a custom GPT-style language model built exclusively for Amharic. The model is pretrained on a corpus of more than 12 GB and adapted on curated datasets; the pipeline supports SentencePiece tokenization and LoRA fine-tuning, and includes efficient inference tools.

Created: 2024-12-06T08:35:25
Updated: 2025-09-26T20:39:44
Stars: 0
Stars increase: 0