Groq Launches Large Model Inference Chip, Exceeding GPU Speeds at 500 Tokens Per Second

站长之家

Published in AI News · 1 min read · Feb 20, 2024

Groq Inc. has unveiled a large model inference chip that generates 500 tokens per second, outpacing traditional GPUs and Google's TPU. The team, including founder Jonathan Ross, comes from Google's TPU division. The chip is built on Groq's proprietary LPU design and is priced at approximately $20,000; the company aims to surpass NVIDIA within three years. It offers very fast API access and supports a variety of open-source LLMs.

Large Model · Inference Chip · Groq · GPU

This article is from AIbase Daily

Welcome to the [AI Daily] column! This is your daily guide to the world of artificial intelligence. Every day, we present hot topics in the AI field with a focus on developers, helping you understand technical trends and learn about innovative AI product applications.

—— Created by the AIbase Daily Team

AI News Recommendations

MongoDB Launches Voyage AI New Model: Interact with Databases Using Natural Language, Vector Search Accuracy Improved

MongoDB launches the Voyage AI model series, optimizing vector search performance and adding an AI assistant and automatic embedding features, enabling the database to understand semantics and interact intelligently. The core breakthrough lies in improving the accuracy of data semantic understanding, allowing developers to query data using natural language without writing complex query statements.

Jan 16, 2026

OpenAI Releases GPT-5.2-Codex Programming Model API, Officially Opened

OpenAI launches GPT-5.2-Codex, its most advanced agentic programming model, optimized for complex, long-cycle software development. It elevates AI from a coding assistant to an autonomous agent, achieving significant improvements in long-term task performance, reliability, and understanding of massive codebases.....

Jan 16, 2026

Google Veo 3.1 Makes a Major Upgrade! Enhanced Image Consistency + Native Vertical Format + 4K Upscaling

Google DeepMind's AI video generation model Veo 3.1 has received a major update, with core improvements to the "Ingredients to Video" feature that significantly enhance the consistency of characters, objects, textures, and backgrounds. It also adds native vertical output and professional-level 4K upscaling, upgrading AI video from a demonstration tool to a practical production tool.

Jan 16, 2026

Video Conferencing Giant Zoom Surpasses Records with Federated AI in the World's Toughest AI Exam

Video conferencing giant Zoom set a world record in a top AI benchmark test, achieving 48.1% and surpassing giants like Google. The key to its success lies in adopting a federated AI approach rather than directly training the underlying model.

Jan 16, 2026

Valuation Surges to $1.3 Billion! AI Video Company Higgsfield, Founded by Former Snap Executive, Secures Large Funding

AI video startup Higgsfield raised $130M in Series A, reaching a $1.3B valuation and unicorn status. Founded by ex-Snap AI lead, it offers AI video tools for consumers, creators, and marketers.....

Jan 16, 2026

OpenAI's Mental Health Safety Lead Joins Anthropic, Revealing the Debate on AI Model Emotional Defense

AI chatbots are deeply involved in human emotional lives, and addressing user psychological crises has become an urgent ethical challenge in the industry. Recently, Andrea Volonino, the former head of model policy at OpenAI, left the company to join her former supervisor at competitor Anthropic. During her time at OpenAI, she was responsible for the safety policies of GPT-4 and the next-generation reasoning models, and her departure highlights the unprecedented ethical dilemmas in the field of AI emotional interaction.

Jan 16, 2026

Google Launches TranslateGemma Translation Model, Easily Manageable on Mobile Phones

Google launched the TranslateGemma translation model series based on the Gemma 3 architecture, offering three parameter sizes: 4B, 12B, and 27B. It supports translation in 55 core languages and features multimodal image translation, handling both plain text and text within images.

Jan 16, 2026

DeepSeek Launches Engram Module: Implanting Conditional Memory Axes into Sparse Large Models, Efficiency Significantly Improved

The DeepSeek team launched the Engram module, introducing a 'conditional memory axis' into sparse large language models, aiming to address the issue of computational resource waste when traditional Transformers process repetitive knowledge. This module, as a complement to the mixture-of-experts model, integrates N-gram embedding technology into the model, improving efficiency in processing repetitive patterns.

Jan 15, 2026

Billions Invested in Wafer-Scale Giant Chip! OpenAI Teams Up with Cerebras to Build the World's Largest AI Inference Platform, Challenging NVIDIA's Dominance at 15 Times the Speed

OpenAI and Cerebras have partnered to deploy 750 megawatts of Cerebras wafer-scale systems, building the world's largest AI inference platform. The project will start in 2026 and be fully operational by 2028, with a transaction value exceeding $10 billion. Cerebras chips integrate 4 trillion transistors and have a much larger area than traditional GPUs. The move shows that large model developers are accelerating their efforts to reduce reliance on traditional GPUs.

Jan 15, 2026

Baidu ERNIE-5.0-0110 Officially Released, Ranking Second in Mathematical Ability Globally

Baidu released ERNIE-5.0-0110, the new generation of its Wenxin large model, which ranks eighth with 1460 points on the LMArena global text leaderboard, making it the only Chinese domestic large model in the top ten. Its performance on mathematical problems is particularly strong, rising to second place globally, second only to GPT-5.2-High.

Jan 15, 2026
