A new study subjected 12 mainstream large models to high-pressure tests and found that their performance declined significantly when deadlines were shortened and penalties increased. For example, Gemini 2.5 Pro's failure rate rose from 18.6% to 79%, and GPT-4o's performance also dropped by nearly half. In critical tasks such as biosecurity, the models even made serious errors by skipping key steps.
SemiAnalysis reports that OpenAI halted large-scale pre-training of new frontier models post-GPT-4o due to convergence issues. GPT-5 is an optimized version of GPT-4o without architectural breakthroughs, while Google's TPUv7 excels in large-scale training for models like Gemini 3...
DeepSeek-Math-V2, a 236B MoE model with 21B active parameters, achieves 75.7% on MATH, 30/4 on AIME 2024, and 53.7% on Math Odyssey using self-verification. Open-sourced under Apache 2.0...
OpenAI discontinues GPT-4o API access for developers, urging migration. The model remains available for personal users but is no longer the default. Developers should explore alternatives...
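For developers affected by the deprecation, migrating usually amounts to swapping the model identifier in existing API calls. A minimal sketch with the OpenAI Python SDK; the replacement model name here is only an illustrative assumption, not a recommendation:

```python
# Minimal migration sketch: the change is typically confined to the `model` parameter.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1",  # previously "gpt-4o"; substitute whichever replacement model you target
    messages=[{"role": "user", "content": "Summarize this release note in one sentence."}],
)
print(response.choices[0].message.content)
```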
A collection of chatbot AI products, including GPT-4o, Gemini, Qwen, Deepseek, Claude & Grok.
A powerful open-source Kimi K2 chat platform, powered by Kimi AI, that surpasses GPT-4 on programming and math benchmarks. Enterprise-grade Kimi AI at a 95% cost reduction.
Showcases a diverse collection of AI art images and prompts generated by OpenAI's GPT-4o.
GPT-4.1 is a model with significant improvements in programming, instruction following, and long-text understanding.
Provider   Input tokens/M   Output tokens/M   Context Length
openai     $540             $1080             128k
openai     $2.88            $11.52            1M
openai     $14.4            $57.6             -
openai     $18              $72               -
openai     $0.36            -                 400k
openai     $0.72            -                 -
minimax    $1               $8                4M
deepseek   $1.94            $7.92             -
deepseek   $216             $432              8.2k
deepseek   $1.08            $4.32             -
mistral    $2.16            -                 256k
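To make the per-million-token prices above concrete, here is a small sketch of the cost arithmetic for a single request, using the minimax row ($1 input / $8 output per million tokens) as the example; the token counts are made up:

```python
# Cost of one request under per-million-token pricing.
# Prices taken from the minimax row above; token counts are illustrative.
INPUT_PRICE_PER_M = 1.0   # $ per 1M input tokens
OUTPUT_PRICE_PER_M = 8.0  # $ per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# e.g. a 12,000-token prompt with a 1,500-token completion
print(f"${request_cost(12_000, 1_500):.4f}")  # -> $0.0240
```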
unsloth
GLM-4-32B-0414 is a large language model with 32 billion parameters, comparable in performance to GPT-4o and DeepSeek-V3. It supports both Chinese and English, and excels in code generation, function calling, and complex task processing.
GLM-4-32B-0414 is a new member of the GLM family with 32 billion parameters, offering performance comparable to GPT-4o and DeepSeek-V3, and supports local deployment.
zai-org
GLM-4-32B-Base-0414 is a new member of the GLM family. It has 32 billion parameters and is pre-trained on 15T of high-quality data. Its performance is comparable to that of advanced models such as GPT-4o and DeepSeek-V3. The model supports convenient local deployment and performs excellently in code generation, function calling, search-based Q&A, and more.
GLM-4-32B-0414 is a new member of the GLM family, a high-performance large language model with 32 billion parameters. This model was pre-trained on 15T of high-quality data, including a large amount of synthetic reasoning data, and performs excellently in multiple task scenarios such as code generation, function calls, and search-based Q&A. Its performance can rival that of larger-scale models like GPT-4o and DeepSeek-V3.
Psychotherapy-LLM
This is a specialized psychological counseling model fine-tuned from Llama-3.1-8B-Instruct via preference learning; in counseling sessions it achieves a higher win rate than GPT-4o.
AtlaAI
Atla Selene Mini is currently the most advanced small judge language model (SLMJ), with performance comparable to models 10 times its size, surpassing GPT-4o in multiple benchmarks.
openbmb
MiniCPM-o 2.6 is a GPT-4o-level multimodal large model that runs on mobile devices, supporting vision, voice, and live-stream processing.
VITA-MLLM
VITA-1.5 is a multimodal interaction model designed to achieve GPT-4o level real-time vision and voice interaction capabilities.
CISCai
This is the GGUF quantized version of the Qwen2.5-Coder-32B-Instruct model. It uses an advanced importance-matrix quantization method to significantly reduce storage and compute requirements while preserving the model's quality. The base model is the current state-of-the-art open-source large language model for code, with coding capabilities comparable to GPT-4o.
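As a rough usage sketch, a GGUF quantization like this can be run locally with llama-cpp-python; the exact .gguf filename below is a placeholder for whichever quantization variant is downloaded:

```python
# Sketch of running a GGUF quantization locally with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2.5-Coder-32B-Instruct.IQ4_XS.gguf",  # placeholder filename
    n_ctx=8192,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to GPU if available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that reverses a linked list."}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```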
c01zaut
MiniCPM-V 2.6 is a GPT-4V-level multimodal large language model supporting single-image, multi-image, and video understanding, optimized for the RK3588 NPU.
MiniCPM-V is a mobile GPT-4V-level multimodal large language model that supports single-image, multi-image, and video understanding, equipped with visual and optical character recognition capabilities.
Sami92
A text classification model fine-tuned from XLM-R Large, designed to identify factual and non-factual statements in German text. The model uses weak supervision: it is first trained on a Telegram dataset annotated by GPT-4o and then further trained on a manually annotated dataset, reaching an accuracy of 0.9 on the test set.
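A hedged sketch of how such a classifier is typically called through the transformers pipeline API; the repository id and the label names are assumptions, so check the model card for the real ones:

```python
# Sketch of using the classifier via the transformers text-classification pipeline.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Sami92/xlm-r-large-german-factual",  # placeholder repo id
)

sentences = [
    "Die Inflationsrate lag im März bei 2,2 Prozent.",   # factual-style statement
    "Die Regierung ist ohnehin völlig unfähig.",         # opinion-style statement
]
for s, pred in zip(sentences, classifier(sentences)):
    print(f"{pred['label']} ({pred['score']:.2f}): {s}")
```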
EmergentMethods
Phi-3-mini-4k-instruct-graph is a fine-tuned version of Microsoft's Phi-3-mini-4k-instruct, specifically designed for entity relationship extraction from general text data, aiming to achieve comparable quality and accuracy to GPT-4 in generating entity relationship graphs.
internlm
InternLM-XComposer2.5 is an outstanding text-image understanding and creation model that achieves GPT-4V level performance with just 7B parameters, supporting 24K interleaved text-image context and extendable to 96K long context.
InternLM-XComposer2.5 is an outstanding image-text understanding and creation model that achieves GPT-4V level capabilities with just 7 billion parameters, supporting long-context window expansion.
ruslandev
A language model fine-tuned from Meta-Llama-3-8B-Instruct. It uses GPT-4o to improve training-data quality and focuses on enhancing Russian-language capabilities. In the MT-Bench evaluation, its Russian score exceeds that of GPT-3.5-turbo.
yyupenn
WhyXrayCLIP is a model capable of aligning X-ray images with text descriptions, fine-tuned on the MIMIC-CXR dataset based on OpenCLIP (ViT-L/14), with clinical reports processed by GPT-4.
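A sketch of zero-shot image-text matching with such an OpenCLIP checkpoint via the open_clip library; the hf-hub id, the image path, and the text prompts are assumptions:

```python
# Sketch: score an X-ray image against candidate text descriptions with open_clip.
import torch
import open_clip
from PIL import Image

# Assumed hub id; substitute the actual checkpoint location.
model, preprocess = open_clip.create_model_from_pretrained("hf-hub:yyupenn/whyxrayclip")
tokenizer = open_clip.get_tokenizer("ViT-L-14")

image = preprocess(Image.open("chest_xray.png")).unsqueeze(0)  # placeholder image
texts = tokenizer(["no acute cardiopulmonary abnormality",
                   "right lower lobe consolidation"])

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(texts)
    img_feat /= img_feat.norm(dim=-1, keepdim=True)
    txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)

print(probs)  # similarity of the X-ray to each text description
```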
MiniCPM-V 2.6 is a multimodal large model launched by OpenBMB, surpassing GPT-4V in single-image, multi-image, and video understanding tasks, and supports real-time video understanding on iPad.
KomeijiForce
This is a multilingual classifier based on GPT-4 distillation, specifically designed to evaluate the Natural Language Inference (NLI) relationship between character responses and preset personality statements in role-playing dialogues.
leafspark
WikiChat-v0.2 is a dialogue model currently under training, based on OpenOrca GPT-4 data, cosmopedia, and dolly15k datasets, supporting English text generation tasks.
An image generation and editing tool based on OpenAI's GPT-4o/gpt-image-1 models, supporting image generation from text prompts and image editing (such as inpainting, outpainting, and compositing), and compatible with multiple MCP clients.
A client project based on Python 3.13 that integrates MCP services and the GPT-4 model, providing interactive tool invocation and network search functionality.
This project is a stdio server based on the Model Context Protocol (MCP), used to forward prompts to OpenAI's ChatGPT (GPT-4o), supporting advanced summarization, analysis, and reasoning functions, suitable for assistant integration in the LangGraph framework.
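As an illustration of the protocol side, this is roughly how a host application could connect to such a stdio MCP server with the official mcp Python SDK; the launch command and tool name are assumptions, not this project's actual interface:

```python
# Sketch of an MCP host connecting to a stdio server and calling one of its tools.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    server = StdioServerParameters(command="python", args=["server.py"])  # placeholder launch command
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()            # discover what the server exposes
            print([t.name for t in tools.tools])
            result = await session.call_tool(             # forward a prompt for analysis
                "ask_chatgpt",                            # assumed tool name
                {"prompt": "Summarize the attached design doc in three bullet points."},
            )
            print(result.content)

asyncio.run(main())
```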
An image analysis MCP server based on the GPT-4o-mini model that can analyze image content from URLs or local paths.
A server that interacts with ChatGPT through the MCP protocol for advanced text analysis and reasoning.
An image analysis MCP server based on the GPT-4o-mini model that identifies and describes content by receiving image URLs.
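For context, the kind of request such a server presumably wraps looks like the following sketch: sending an image URL to GPT-4o-mini through the OpenAI chat API (the URL is a placeholder):

```python
# Sketch: ask GPT-4o-mini to describe an image referenced by URL.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the contents of this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```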
The urlDNA MCP Server is a service that provides native tool usage for security-oriented LLM agents (such as OpenAI GPT-4.1 and Claude 3 Desktop), interacting directly with the urlDNA threat intelligence platform through the API. It supports various tools, including URL scanning, searching, and quick checking, and can be accessed via the SSE protocol.
An SSE proxy server based on Node.js that enables real-time streaming of responses from models such as GPT-4 via the Browserbase API.
An intelligent chatbot based on Streamlit that uses GPT-4o to automatically route user requests to different tools (such as chatting, image generation, database queries, voice synthesis, etc.), supporting rapid experimentation with AI tool routing functions.
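A hedged sketch of one common way to implement this kind of routing, using OpenAI function calling; the tool names and schemas are illustrative, not the project's actual ones:

```python
# Sketch of tool routing with OpenAI function calling: the model picks a tool,
# and the application dispatches to it. Tool names here are illustrative only.
from openai import OpenAI

client = OpenAI()

tools = [
    {"type": "function", "function": {
        "name": "generate_image",
        "description": "Create an image from a text prompt.",
        "parameters": {"type": "object",
                       "properties": {"prompt": {"type": "string"}},
                       "required": ["prompt"]}}},
    {"type": "function", "function": {
        "name": "query_database",
        "description": "Run a read-only SQL query.",
        "parameters": {"type": "object",
                       "properties": {"sql": {"type": "string"}},
                       "required": ["sql"]}}},
]

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Draw a watercolor fox, please."}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:                       # the model chose a tool
    call = msg.tool_calls[0]
    print("route to:", call.function.name, call.function.arguments)
else:                                    # plain chat answer, no tool needed
    print(msg.content)
```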