On Apple devices, AI is showing remarkable potential. According to recent research from Gimlet Labs, AI models can automatically generate optimized Metal kernels that speed up PyTorch inference by an average of 87% (a 1.87x speedup) across 215 tested PyTorch modules, with some workloads running hundreds of times faster.
Researchers selected eight AI models from several top labs, including Anthropic, DeepSeek, and OpenAI, and used them to generate optimized GPU kernels for Apple devices. The process requires no changes to user code and no new framework; it directly improves model performance on Apple hardware.
In the experiment, the research team tested on a Mac Studio equipped with an Apple M4 Max chip, with PyTorch's eager mode as the baseline. The benchmark used 215 PyTorch modules from the KernelBench dataset, divided into three categories ranging from simple matrix multiplications to full model architectures.
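The speedup figures reported against the eager-mode baseline amount to a ratio of wall-clock latencies. A minimal, framework-agnostic sketch of such a measurement (the helper names here are our own illustration, not the authors' actual harness):

```python
import statistics
import time

def measure_latency(fn, warmup=3, runs=10):
    # Discard warmup iterations, then take the median of timed runs,
    # which is more robust to outliers than the mean.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def speedup(baseline_fn, optimized_fn):
    # Speedup = baseline latency / optimized latency;
    # e.g. 1.87 corresponds to "87% faster than eager mode".
    return measure_latency(baseline_fn) / measure_latency(optimized_fn)
```

In the study, `baseline_fn` would run the module under PyTorch eager mode and `optimized_fn` would run it with the AI-generated Metal kernel swapped in.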
The pipeline receives the input and the PyTorch code, generates a Metal kernel, and evaluates its correctness. The data showed that the correctness of AI-generated kernels improved as the number of attempts increased: by the fifth attempt, 94% of kernels were implemented correctly. The models also showed capability across problem levels, and even non-reasoning models could sometimes generate effective kernels.
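The generate-verify-retry loop described above can be sketched as follows. The model call and the correctness check are simulated stubs (hypothetical names of our own; the real system would call an LLM and execute the kernel against the reference module):

```python
import random

def generate_kernel(module_src, attempt, rng):
    # Hypothetical stand-in for an LLM call that emits Metal source.
    # Here we simulate a fixed per-attempt success probability.
    return {"src": f"// attempt {attempt}", "correct": rng.random() < 0.45}

def is_correct(kernel, module_src):
    # Stand-in for running the kernel and comparing its outputs to the
    # reference PyTorch module on random inputs, within tolerance.
    return kernel["correct"]

def generate_with_retries(module_src, max_attempts=5, seed=0):
    # Generate, verify, and retry up to max_attempts times, returning
    # the first kernel that passes verification (or None if all fail).
    rng = random.Random(seed)
    for attempt in range(1, max_attempts + 1):
        kernel = generate_kernel(module_src, attempt, rng)
        if is_correct(kernel, module_src):
            return kernel
    return None
```

Even with a modest per-attempt success rate, a few retries drive the overall correctness rate up sharply, which matches the reported jump to 94% by attempt five.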
The results showed that GPT-5 achieved a 4.65x speedup on certain tasks, and more strikingly, the o3 model reduced latency by a factor of 9,000 in some cases. The study also found that no single model performed best on every task; combining candidates from multiple models produced better kernels.
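The multi-model combination idea reduces to a best-of-n selection: collect candidate kernels from several models, discard the incorrect ones, and keep the fastest survivor. A minimal sketch (field names are illustrative assumptions):

```python
def best_kernel(candidates):
    # Keep only candidates that passed the correctness check, then
    # pick the one with the lowest measured latency.
    correct = [c for c in candidates if c["correct"]]
    return min(correct, key=lambda c: c["latency_ms"]) if correct else None

candidates = [
    {"model": "model-a", "correct": True,  "latency_ms": 5.2},
    {"model": "model-b", "correct": False, "latency_ms": 0.9},
    {"model": "model-c", "correct": True,  "latency_ms": 3.1},
]
print(best_kernel(candidates)["model"])  # model-c
```

This is why an ensemble can beat any single model: each task only needs one model to produce a fast, correct kernel.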
To further improve performance, the researchers added extra context to the generation process, such as CUDA implementations and gputrace performance profiles. With this additional context the average speedup reached 1.87x, roughly three times the incremental gain of the 1.31x achieved by the plain agent.
The researchers emphasized that this work is not meant to demonstrate the ultimate performance ceiling, but to verify the feasibility of AI-driven kernel generation and to reduce developer burden through automation. Overall, the study marks a notable advance for AI in hardware optimization.
GitHub: https://github.com/ScalingIntelligence/KernelBench/
Key Points:
🌟 AI automatically generates Metal kernels, speeding up PyTorch inference by an average of 87%.
⚡️ Achieves an average 1.87x speedup across 215 PyTorch modules, with some workloads hundreds of times faster.
🔍 The study aims to verify the feasibility of AI for kernel generation, advancing automated hardware optimization.