Machine-Learning-and-Language-Model
Public
This project explores GPT-2 and Llama models through pre-training, fine-tuning, and Chain-of-Thought (CoT) prompting. It includes memory-efficient optimizations (SGD, LoRA, BAdam) and evaluations on math reasoning datasets (GSM8K, NumGLUE, SimulEq, SVAMP).
Created: 2025-01-04T00:15:17
Updated: 2025-01-15T15:12:34
https://github.com/Ledzy/MDS5210-24fall
Stars: 2
Stars Increase: 0
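As a rough illustration of the memory-efficient fine-tuning mentioned in the description, below is a minimal LoRA sketch using the Hugging Face `peft` library. This is an assumption-laden example, not the repository's actual training code; the base model (`gpt2`), the targeted module (`c_attn`), and all hyperparameters are illustrative choices.

```python
# Hypothetical sketch: LoRA fine-tuning setup for GPT-2 with Hugging Face PEFT.
# Not the repository's actual code; model name and hyperparameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "gpt2"  # assumed base model; the project also covers Llama variants
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA injects low-rank adapter matrices into selected projections, so only a
# small fraction of parameters is trained (the memory-efficient part).
lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor for the adapter output
    target_modules=["c_attn"],  # GPT-2's fused QKV projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of weights are trainable
```

The wrapped model can then be passed to any standard causal-LM training loop (e.g. the `transformers` Trainer) on a math dataset such as GSM8K; only the adapter weights receive gradient updates.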