Transcribe audio and video into SRT subtitles using Gemini AI
Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
Light and Fast AI Assistant. Support: Web | iOS | macOS | Android | Linux | Windows
Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / DeepSeek / Qwen), Knowledge Base (file upload / knowledge management / RAG), Multi-Modals (Plugins/Artifacts) and Thinking. One-click FREE deployment of your private ChatGPT / Claude / DeepSeek application.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Port of OpenAI's Whisper model in C/C++
A deep learning toolkit for Text-to-Speech, battle-tested in research and production
Turn any webpage into a desktop app with Rust. Easily build lightweight, multi-platform desktop applications with Rust.
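A chatbot built on large language models that can be connected to WeChat Official Accounts, WeCom (企业微信) applications, Feishu, DingTalk, and more. Selectable models include GPT3.5/GPT-4o/GPT-o1/DeepSeek/Claude/ERNIE Bot (文心一言)/iFlytek Spark (讯飞星火)/Tongyi Qianwen (通义千问)/Gemini/GLM-4/Kimi/LinkAI. It can handle text, voice, and images, access the operating system and the internet, and supports building customized enterprise customer-service assistants on top of your own knowledge base.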
A generative speech model for daily dialogue.