Best HallusionBench AI Tools & Models - Premium HallusionBench News

AI News

Xi Xiaoyao Technology Talk | Stop Saying GPT-4V is Amazing! It Can't Even Recognize Peking Duck, Can You Believe It??

HallusionBench, a newly proposed image reasoning benchmark, is used to examine vision-language models such as GPT-4V and reveals problems with both language hallucination and visual illusion. On HallusionBench, models like GPT-4V show an error rate of up to 90% on language hallucinations, where the model's parametric memory overrides what the image actually shows. These models are also prone to geometric and other visual illusions, which indicates that their current visual capabilities are still limited. Simple image manipulations can easily mislead them, which shows how fragile they are.


