Google's research team recently introduced PaLI-3, a vision-language model that performs on par with, or better than, models many times its size. Its key ingredient is a contrastively pre-trained image encoder, which enables PaLI-3 to excel at localization and visually situated text understanding tasks. PaLI-3 achieves top results on multiple visual question answering benchmarks, demonstrating strong multimodal understanding. Comparing classification pre-training against contrastive pre-training of the image encoder, the study found that the contrastive approach yields more efficient vision-language models.
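To make the key idea concrete: contrastive pre-training learns image and text encoders jointly so that matched image-text pairs score high and mismatched pairs score low. PaLI-3's encoder follows the SigLIP recipe, which scores every pair with an independent sigmoid loss rather than a batch-wide softmax. The sketch below is purely illustrative, not PaLI-3's actual implementation: the embeddings, batch size, and the temperature/bias values are assumptions for demonstration.

```python
import numpy as np

def sigmoid_contrastive_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    """Illustrative SigLIP-style pairwise sigmoid loss (not PaLI-3's real code).

    img_emb, txt_emb: (batch, dim) arrays where row i of each is a matched pair.
    t, b are a temperature and bias; the values here are arbitrary stand-ins
    (in practice they are learned parameters).
    """
    # L2-normalize so the dot product is cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    # All pairwise logits: matched pairs on the diagonal, mismatches elsewhere
    logits = img @ txt.T * t + b
    # Label +1 for matched (i, i) pairs, -1 for every mismatched (i, j) pair
    labels = 2.0 * np.eye(len(img)) - 1.0
    # Independent log-sigmoid loss per pair, averaged over the whole grid
    return float(np.mean(np.log1p(np.exp(-labels * logits))))

# Toy check: correctly paired embeddings should incur a lower loss
# than deliberately shuffled (mismatched) pairings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
aligned = sigmoid_contrastive_loss(emb, emb)
shuffled = sigmoid_contrastive_loss(emb, np.roll(emb, 1, axis=0))
```

Because each pair is scored independently, this loss avoids the batch-wide normalization of a softmax-based contrastive objective, which is part of what makes the recipe efficient at scale.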