Apple and LM Studio Achieve a Breakthrough Collaboration: Four Mac Studios Successfully Run Trillion-Parameter Large Model

AIbase基地

Published in AI News · 5 min read · Jun 22, 2026

At the recent WWDC (Worldwide Developers Conference), the AI software platform LM Studio partnered with Apple to demonstrate Moonshot AI's flagship model, Kimi K2.6, running on a cluster of four Mac Studios. The demonstration highlighted the potential of the Apple Silicon architecture for handling ultra-large-scale AI models.

Kimi K2.6 uses a MoE (Mixture of Experts) architecture with a total parameter count of up to one trillion. Dynamic expert scheduling reduces the compute needed during inference by activating only about 32 billion parameters, but the full set of weights must still be loaded, which is a demanding memory requirement: approximately 2TB at FP16 precision (two bytes per parameter). In traditional data center environments, this typically requires server clusters of 8 to 16 high-end GPUs, with costs often reaching millions of dollars.
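The 2TB figure follows from simple arithmetic, sketched below in Python. The parameter counts are the approximate ones quoted above; the choice of decimal terabytes (10^12 bytes) and the function name are assumptions made for illustration, and the estimate covers weights only, not the KV cache or activations.

```python
# Rough weight-memory estimate for a model stored at a given precision.
# Figures are the approximate ones quoted in the article; decimal TB
# (1e12 bytes) is an assumption made for this illustration.

TOTAL_PARAMS = 1e12      # about one trillion parameters in total
ACTIVE_PARAMS = 32e9     # about 32 billion parameters activated during inference
FP16_BYTES = 2           # FP16 stores each parameter in two bytes


def weight_memory_tb(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in decimal terabytes."""
    return num_params * bytes_per_param / 1e12


total_tb = weight_memory_tb(TOTAL_PARAMS, FP16_BYTES)
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"Full weights at FP16: {total_tb:.1f} TB")
print(f"Share of parameters active during inference: {active_fraction:.1%}")
```

The sketch shows why MoE helps compute but not capacity: only about 3% of the parameters are active at a time, yet all of them have to be resident in memory.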

The demonstration sidestepped this barrier. Four Mac Studios equipped with M3 Ultra chips were interconnected via Thunderbolt 5, using the RDMA-over-Thunderbolt technology available in the latest version of macOS, which lets the machines access one another's memory directly. Their combined unified memory, approximately 2TB in total, was presented as a single logical "large memory pool" that accommodated the weights of the trillion-parameter model. During the live demonstration, the cluster generated about 28 tokens per second while consuming significantly less power than traditional GPU computing centers.
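To put these numbers in perspective, the short Python sketch below works out each machine's share of the pooled memory and how long a response would take at the reported generation speed. The even four-way split, the use of decimal units, and the 1,000-token response length are assumptions for illustration; the article does not describe how the weights were actually partitioned.

```python
# Back-of-the-envelope figures for the four-node cluster described above.
# Assumptions (not from the article): memory is split evenly across nodes,
# units are decimal (1 TB = 1000 GB), and the example response is 1,000 tokens.

POOL_TB = 2.0            # approximate combined unified memory
NODES = 4                # number of Mac Studios in the cluster
TOKENS_PER_SECOND = 28   # approximate generation speed in the live demo


def per_node_share_gb(pool_tb: float, nodes: int) -> float:
    """Memory each node contributes if the pool is divided evenly."""
    return pool_tb * 1000 / nodes


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_second


share_gb = per_node_share_gb(POOL_TB, NODES)
seconds = generation_seconds(1000, TOKENS_PER_SECOND)

print(f"Per-node share of the pool: {share_gb:.0f} GB")
print(f"Time for a 1,000-token response: {seconds:.1f} s")
```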

LM Studio also released a key component called LM Link as part of this collaboration. The tool is based on the Tailscale mesh VPN architecture and lets users securely access the local Mac Studio cluster remotely through end-to-end encrypted channels. Users do not need to be physically present at the host; from a MacBook or an iPhone, they can call on the cluster's computing power for inference from any network environment. All sensitive data is processed locally in a closed loop, without passing through third-party cloud servers.

The demonstration was not only a technological showcase but also a clear industry signal: Apple Silicon, with its unified memory architecture and efficient multi-device connectivity, is becoming a new option for large model deployment. For enterprises that run large model inference frequently and over long periods, this approach replaces recurring cloud rental costs with a one-time hardware purchase, which offers significant cost advantages over the long term.
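The purchase-versus-rental trade-off can be framed as a break-even calculation, sketched below in Python. The function name and every dollar amount in the example are hypothetical placeholders chosen only to show the arithmetic; the article gives no pricing for either the hardware or the cloud alternative.

```python
import math
from typing import Optional

# Break-even sketch for buying hardware versus renting cloud capacity.
# All amounts used below are hypothetical placeholders, not figures
# from the article.


def break_even_months(
    hardware_cost: float,
    monthly_cloud_cost: float,
    monthly_local_cost: float = 0.0,
) -> Optional[int]:
    """Return the first whole month in which cumulative savings cover the
    hardware purchase, or None if running locally never saves money."""
    monthly_saving = monthly_cloud_cost - monthly_local_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


# Placeholder inputs: a 100,000 purchase, 10,000 per month in cloud rental,
# and 2,000 per month in local running costs (power, maintenance).
months = break_even_months(100_000, 10_000, 2_000)
print(f"Break-even after {months} months")
```

With real quotes substituted for the placeholders, the same function indicates how long a deployment must run before the hardware purchase pays for itself.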

As the performance of "consumer-grade" hardware clusters continues to improve, the barriers for organizations adopting AI are being lowered further. This achievement suggests that cutting-edge artificial intelligence innovation will no longer be limited to a few tech giants with large supercomputing centers. Decentralized computing networks may soon see new opportunities for development.

This article is from AIbase Daily
