viVQA-voice-assistant
Public
Voice assistant using multimodal LLMs: fine-tuned LLaVA-NeXT (Mistral 7B) and PhoWhisper
audio-speech-recognition, llava, lora, mistral-7b, multimodal-large-language-models, text-to-speech, visual-question-answering
Created: 2024-03-10T23:45:00
Updated: 2024-10-29T13:30:26
Stars: 4
Stars Increase: 0