gdGPT
Public
Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.
baichuan2-7b, bloom, chatglm3-6b, deepspeed, flash-attention, full-finetune, llama2, llm, mixtral-8x7b, model-parallization
Created: 2023-06-24T16:51:11
Updated: 2025-03-25T16:22:16
Stars: 97
Stars Increase: 0