Ongoing research on training Transformer models at scale.
retrieva-jp
A Transformer encoder pretrained with Megatron-LM, designed specifically for Japanese-language use.
Muennighoff
A small GPT-2-like model for testing conversion between Megatron-LM and transformers, used mainly for integration tests and for debugging scripts.
bigscience
A small GPT-2-like model for testing conversion between Megatron-LM and transformers, used mainly for integration tests and for debugging scripts.
AI-Nordics
A Swedish BERT Large model built with the Megatron-LM framework, with 340 million parameters, pretrained on 85 GB of Swedish text.