MiMo Audio is an audio language model built on large-scale pre-training. It achieves state-of-the-art results among open-source models on speech intelligence and audio understanding benchmarks. The model shows strong few-shot learning ability and generalizes to tasks absent from its training data, such as voice conversion, style transfer, and speech editing.
Audio Processing
Safetensors