ASAP is a framework for learning agile whole-body skills in humanoid robots; it transfers skills from simulation to hardware by aligning simulated physics with real-world physics.
alexandreacff
This audio classification model is a fine-tuned version of asapp/sew-mid-100k, trained on the alexandreacff/kaggle-fake-detection dataset for fake audio detection.
asapp
SEW-D-mid is a pre-trained speech model developed by ASAPP Research for automatic speech recognition, designed to balance performance and efficiency.
patrickvonplaten
This model is an automatic speech recognition model fine-tuned from asapp/sew-d-mid-400k on the LIBRISPEECH_ASR - CLEAN dataset, achieving a word error rate (WER) of 1.0536 on the evaluation set.
SEW-D-tiny is an efficient pre-trained speech recognition model developed by ASAPP Research, designed to balance performance and efficiency.
SEW-tiny is a compressed, efficient pre-trained speech model developed by ASAPP Research, pre-trained on speech audio sampled at 16 kHz and suitable for various downstream speech tasks.
This model is an automatic speech recognition model fine-tuned from asapp/sew-d-small-100k on the TIMIT_ASR - NA dataset, achieving a word error rate of 0.8061 on the evaluation set.
SEW-D-base+ is an efficient speech recognition model developed by ASAPP Research, pre-trained on speech audio sampled at 16 kHz, that performs well on the LibriSpeech dataset.
This model is an automatic speech recognition model fine-tuned from asapp/sew-d-small-100k on the TIMIT_ASR dataset.
SEW (Squeezed and Efficient Wav2vec) is a pre-trained speech recognition model developed by ASAPP Research that improves on wav2vec 2.0 in both performance and efficiency.
SEW-D-mid-k127 is an efficient pre-trained speech recognition model developed by ASAPP Research that shows significant improvements in performance and efficiency over wav2vec 2.0.
SEW-D is a compressed, efficient pre-trained speech model developed by ASAPP Research, pre-trained on speech audio sampled at 16 kHz and suitable for various downstream speech tasks.
anton-l
This model is a fine-tuned version of asapp/sew-mid-100k on the superb dataset, primarily used for keyword spotting tasks.
This model is an automatic speech recognition model fine-tuned from asapp/sew-small-100k on the TIMIT_ASR - NA dataset.
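
Several entries above report a word error rate (WER), and one of the reported values is greater than 1. As a rough illustration of how that can happen, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` is hypothetical, and this is not the evaluation code behind any listed model; the listing does not say how those figures were computed.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative sketch only: splits on whitespace and applies no
    text normalization (casing, punctuation), which real evaluations
    often do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, a hypothesis much longer than its reference yields a value above 1: `word_error_rate("hello", "hello there big world")` is 3.0, since three inserted words are divided by a single reference word.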