Keyword spotting for audio with attention (KWS model for audio)
??? 60+ Implementations/tutorials of deep learning papers with side-by-side notes ?; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), ? reinforcement learning (ppo, dqn), capsnet, distillation, ... ?
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
Easily train a good VC model with voice data <= 10 mins!
Cross-platform, customizable ML solutions for live and streaming media.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
SoftVC VITS Singing Voice Conversion
Deezer source separation library including pretrained models.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
GUI for a Vocal Remover that uses Deep Neural Networks.
The free and privacy-friendly screen recorder with no limits ?