Janus-Pro is an autoregressive framework that unifies multimodal understanding and generation in one model. It decouples visual encoding into separate pathways, one for understanding and one for generation, while processing both with a single, unified Transformer. This separation alleviates the conflict that arises when one visual encoder must serve both roles.
Multimodal
Transformers
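
The decoupled design described above can be illustrated with a toy sketch. This is not Janus-Pro's actual code: the function names (`understanding_encoder`, `generation_encoder`, `shared_backbone`, `forward`), the two-dimensional embeddings, and the running-mean "backbone" are all hypothetical stand-ins, chosen only to show two task-specific visual encoders feeding one shared causal sequence model.

```python
from typing import Callable, Dict, List

Vector = List[float]
Image = List[List[float]]  # toy grayscale image, pixel values in [0, 1]
DIM = 2  # toy embedding width shared by text and visual inputs


def understanding_encoder(image: Image) -> List[Vector]:
    """Hypothetical understanding path: continuous features, one vector per row."""
    return [[sum(row) / len(row), max(row)] for row in image]


def generation_encoder(image: Image, levels: int = 4) -> List[Vector]:
    """Hypothetical generation path: quantize each pixel to a discrete code id."""
    tokens = []
    for row in image:
        for pixel in row:
            code = min(int(pixel * levels), levels - 1)
            tokens.append([float(code), 1.0])  # trivial "embedding" of the code
    return tokens


def shared_backbone(sequence: List[Vector]) -> List[Vector]:
    """Stand-in for the single Transformer: a causal running mean,
    so position i depends only on positions 0..i."""
    outputs, running = [], [0.0] * DIM
    for i, vec in enumerate(sequence, start=1):
        running = [r + v for r, v in zip(running, vec)]
        outputs.append([r / i for r in running])
    return outputs


# Each task gets its own visual encoder; the backbone is shared.
ENCODERS: Dict[str, Callable[[Image], List[Vector]]] = {
    "understanding": understanding_encoder,
    "generation": generation_encoder,
}


def forward(task: str, text_embeddings: List[Vector], image: Image) -> List[Vector]:
    """Route the image through the task's encoder, then run the shared backbone."""
    visual = ENCODERS[task](image)
    return shared_backbone(text_embeddings + visual)


if __name__ == "__main__":
    image = [[0.0, 1.0], [0.5, 0.5]]
    text = [[1.0, 0.0]]
    print(len(forward("understanding", text, image)))
    print(len(forward("generation", text, image)))
```

Because only the entry in `ENCODERS` changes between tasks, each visual path can be swapped or tuned on its own without touching the shared sequence model, which is the flexibility that decoupling is meant to provide.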