grokking
Public
Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
artificial-intelligence, deep-learning, grokking, neural-network, python, pytorch, transformer, transformer-models
Created: 2021-11-17T14:45:01
Updated: 2025-03-06T23:02:24
Stars: 78
Stars Increase: 0