Vision-to-VibeVoice-en

Public

A Gradio-based demo for end-to-end vision-to-speech inference: it extracts text or descriptions from images using Qwen2.5-VL-7B-Instruct, then converts the result to natural speech audio via Microsoft VibeVoice-Realtime-0.5B.
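The two-stage flow described above (image to text, then text to speech) can be sketched as a small pipeline. This is only an illustrative sketch: the class name, the stage signatures, and the stand-in functions are hypothetical and are not taken from the demo's source. In a real setup the two callables would wrap Qwen2.5-VL-7B-Instruct and VibeVoice-Realtime-0.5B respectively.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stage signatures; the demo's real function names are not given.
ImageToText = Callable[[bytes], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class VisionToSpeechPipeline:
    describe: ImageToText  # assumed wrapper around the vision-language model
    speak: TextToSpeech    # assumed wrapper around the text-to-speech model

    def run(self, image: bytes) -> Tuple[str, bytes]:
        """Run both stages and return (extracted text, audio bytes)."""
        text = self.describe(image).strip()
        if not text:
            # Nothing to synthesize if the vision stage produced no text.
            raise ValueError("vision stage returned no text")
        return text, self.speak(text)


# Stand-in stages so the sketch runs without downloading any model.
def fake_describe(image: bytes) -> str:
    return f"An image of {len(image)} bytes."


def fake_speak(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VisionToSpeechPipeline(describe=fake_describe, speak=fake_speak)
text, audio = pipeline.run(b"\x00\x01\x02")
```

Keeping the two stages as injected callables means either model could be swapped or mocked independently, which is one plausible way to structure such a demo.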

accelerate cuda gradio huggingface-hub huggingface-spaces huggingface-transformers matplotlib nvidia-gpu opencv opencv-python

Created: 2025-12-06T00:33:33

Updated: 2025-12-09T11:22:15

https://huggingface.co/spaces/prithivMLmods/Vision-to-VibeVoice-en


Related projects

Stable Diffusion Webui

Hot

Stable Diffusion web UI

158833

1 year ago

+73 today

Vllm

Hot

amd

A high-throughput and memory-efficient inference and serving engine for LLMs

64909

1 year ago

+236 today

Gradio

data-analysis

Build and share delightful machine learning apps, all in Python. Star to support our work!

40854

1 year ago

+40 today

Sglang

Hot

cuda

SGLang is a fast serving framework for large language models and vision language models.

21068

10 months ago

+144 today

HivisionIDPhotos

cnn

HivisionIDPhotos: a lightweight and efficient AI tool for creating ID photos.

20302

10 months ago

+21 today

Stable Diffusion Webui Colab

stable diffusion webui colab

15965

10 months ago

+4 today

Ebook2audiobook

Hot

audiobooks

Convert ebooks to audiobooks with chapters and metadata using dynamic AI models and voice cloning. Supports 1,107+ languages!

15957

10 months ago

+64 today

Kaldi

c-plus-plus

kaldi-asr/kaldi is the official location of the Kaldi project.

15257

10 months ago

+3 today

Open3D

Open3D: A Modern Library for 3D Data Processing

13085

10 months ago

+11 today

TensorRT LLM

blackwell

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

12336

6 months ago

+31 today

