dapt
Public
Code for "On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models"
Created: 2023-07-13T06:50:16
Updated: 2024-04-07T22:41:33
Stars: 5
Stars Increase: 0