Show-1 combines pixel and latent diffusion models to achieve efficient, high-quality text-to-video generation.
showlab
Show-1 is an efficient text-to-video generation model that combines the advantages of pixel-space and latent-space diffusion models to produce high-quality videos with precise text alignment.