
MoE-SFT

Public

Official implementation of "Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts"

Created: 2024-06-17T13:33:24
Updated: 2025-02-24T09:11:04
https://arxiv.org/abs/2406.11256
Stars: 41
Stars Increase: 0
