vision-voice-multimodal-app

Public

An AI-powered chatbot that combines image understanding, voice input, and multilingual speech output. Users can upload an image, ask a question about it by voice, and receive a spoken answer. The project showcases state-of-the-art models (Gemini, Whisper, Kokoro TTS) and robust Python engineering.
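
The flow described above (voice question in, image-grounded answer, spoken reply out) could be wired roughly as in the sketch below. This is not the repository's code: the class `VoiceVisionPipeline`, its three stage names, and the stub functions are hypothetical, and the mapping of Whisper to transcription, Gemini to image question answering, and Kokoro TTS to speech synthesis is an assumption based on what those models are generally used for. The stages are plain callables so the example runs without any model or API key.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class VoiceVisionPipeline:
    """Hypothetical wiring of the three stages; not the repository's actual code."""

    transcribe: Callable[[bytes], str]        # speech-to-text stage (assumed: Whisper)
    answer: Callable[[bytes, str], str]       # image + question -> text (assumed: Gemini)
    synthesize: Callable[[str, str], bytes]   # text + language code -> audio (assumed: Kokoro TTS)

    def run(self, image: bytes, audio: bytes, language: str = "en") -> Tuple[str, bytes]:
        # 1. Turn the spoken question into text.
        question = self.transcribe(audio).strip()
        if not question:
            raise ValueError("no speech detected in the audio input")
        # 2. Ask the multimodal model about the image.
        reply = self.answer(image, question)
        # 3. Convert the text reply to speech in the requested language.
        return reply, self.synthesize(reply, language)


# Stub stages standing in for the real models, for illustration only.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_answer(image: bytes, question: str) -> str:
    return f"Image of {len(image)} bytes; you asked: {question}"


def fake_synthesize(text: str, language: str) -> bytes:
    return f"[{language}] {text}".encode("utf-8")


pipeline = VoiceVisionPipeline(fake_transcribe, fake_answer, fake_synthesize)
reply, speech = pipeline.run(b"\x89PNG", b"What is in this picture?", language="en")
print(reply)
```

Keeping each stage behind a callable means a real model client can replace a stub without changing `run`, and the pipeline can be tested offline.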

Created: 2025-05-25T02:46:41
Updated: 2025-06-10T15:49:43
Stars: 1
Stars increase: 0