Hugging Face has released IDEFICS, an open-source multimodal AI model that accepts both images and text as input and generates coherent text output. An open reproduction of DeepMind's Flamingo visual-language model, IDEFICS is available in two sizes, with 9 billion and 80 billion parameters. The release gives researchers and developers a powerful open-source visual-language model, demonstrates the potential of generative models for handling multimodal inputs, and is expected to accelerate the development of multimodal AI applications.