The issue of 'large model hallucination' (outputting factual errors) is particularly severe in high-risk fields such as healthcare and law. The two mainstream methods in the industry to combat hallucination—expanding training data and setting up defense mechanisms—have limitations: data cannot cover all facts, and defense mechanisms often lead to AI being overly cautious.
At Tsinghua University's 'AI Medical New Paradigm' forum, Baichuan Intelligence CEO Wang Xiaochuan released the next-generation medical model 'Baichuan-M4' and the AI family doctor 'Baixiaoyi'. The model topped three authoritative benchmarks, solving the 'factual hallucination' problem in medical AI and marking a breakthrough in the precision and application form of AI in the medical vertical domain.....
Baichuan Intelligence launches next-gen medical AI model Baichuan-M4 and AI family doctor 'Baixiaoyi', addressing AI healthcare's consultation-touch gap. Baichuan-M4 tops medical benchmarks with 3.3% hallucination rate and strong evidence-based reasoning, advancing AI in healthcare.....
Recently, the DeepSeek web version raised concerns about 'conversation privacy leaks' when irrelevant text appeared after users entered special characters. The official team quickly investigated and attributed the behavior to model hallucination rather than a data leak, alleviating user concerns.....
Eliminate hallucinations. Multi-modal RAG doesn't forget information. Intelligently orchestrate cutting-edge models and deliver excellent task performance.
Basin is a reliable coding tool designed to prevent errors and hallucinations generated by AI.
A leaderboard for comparing the hallucination rates of large language models when summarizing short documents.
An open-source evaluation model for detecting hallucinations, based on the Llama-3 architecture with 70 billion parameters.
Provider: input tokens/M, output tokens/M, context length
Alibaba: $15.8, $12.7, 64
Openai: $8.75, $70, 400
Iflytek: $2, -
$525, $1050, 128
Baichuan: $15, 32
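As a rough illustration of how per-million-token prices translate into the cost of a single request, the sketch below uses the Alibaba figures listed above (15.8 per million input tokens, 12.7 per million output tokens). The function name `request_cost` and the token counts are hypothetical, chosen only for this example.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the cost of one request given prices per million tokens."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical request: 200,000 input tokens and 50,000 output tokens
# at the Alibaba prices from the listing (15.8 input, 12.7 output).
cost = request_cost(200_000, 50_000, 15.8, 12.7)
print(round(cost, 3))
```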
openchs
This is a Swahili-English translation model fine-tuned from Helsinki-NLP's opus-mt-mul-en model. It is optimized for the children's hotline service scenario and trained on synthetic hotline dialogue data. It includes a mechanism to prevent hallucinated output, and translation quality is maintained by monitoring the BLEU score with an early stopping strategy.
nightmedia
Qwen3-Next-80B-A3B-Instruct-q2-mlx is an extremely quantized version in MLX format converted from the Qwen3-Next-80B-A3B-Instruct model, mainly used for text generation tasks. This version uses q2 quantization, and the model size is approximately 23GB. As a proof-of-concept version, there may be repetition and hallucination issues.
HugoHE
M-Hood is a series of models specifically designed to mitigate the hallucination phenomenon in object detection. Through novel fine-tuning strategies and a revised benchmark dataset, it significantly reduces false alarms on out-of-distribution data and enhances the safety and reliability of object detection systems.
stelterlab
DeepSeek-R1-0528 is an upgraded large language model launched by DeepSeek. It has significant improvements in reasoning ability and reducing hallucination rate, and its overall performance is close to that of leading models.
QuantTrio
A quantized version of DeepSeek-R1-0528-Qwen3-8B, with significant improvements in reasoning ability and a reduced hallucination rate, suitable for various natural language processing tasks.
Inpris
Humains-Junior is an AI assistant trained by Humains.com based on the Microsoft Phi-3.5-mini-instruct model, specifically optimized for the customer service scenario. This model is fine-tuned using 300 million tokens, with strict instruction-following ability, reduced hallucination, and powerful function call ability, and it implements identity awareness.
grounded-ai
This model is designed to detect hallucination phenomena in language model outputs, i.e., responses that are coherent but factually incorrect or out of context.
TEEN-D
A claim verification model fine-tuned from Llama-3.2-3B-Instruct, specifically designed to detect hallucinations or unsupported statements in AI-generated text.
5CD-AI
Vintern-3B-R-beta is a multimodal large language model focused on complex reasoning tasks based on images, capable of decomposing reasoning steps and effectively controlling hallucination phenomena.
DISLab
Gen-8B-R2 is a generation model focused on reducing hallucination issues in RAG systems, particularly suitable for handling retrieval noise and information overload.
yaxili96
FactCG is a text classification model based on the DeBERTa-v3-large architecture, specifically designed to detect unsubstantiated hallucinations in content generated by large language models.
KRLabsOrg
LettuceDetect is a hallucination detection model based on ModernBERT, specifically designed for RAG applications with support for long-context processing.
LettuceDetect is a hallucination detection model based on ModernBERT, specifically designed for RAG applications, capable of identifying tokens in answers that are not supported by the context.
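To make the idea of flagging answer tokens that the context does not support more concrete, here is a deliberately naive lexical-overlap baseline. It is not how LettuceDetect works (the listing describes it as a ModernBERT-based model); the function name `unsupported_tokens` and the sample strings are assumptions made for illustration only.

```python
import re
from typing import List


def _tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unsupported_tokens(context: str, answer: str) -> List[str]:
    """Return answer tokens that never appear in the context.

    A crude stand-in for a learned token classifier: it only checks
    exact lexical overlap and ignores meaning, paraphrase, and order.
    """
    context_vocab = set(_tokenize(context))
    return [tok for tok in _tokenize(answer) if tok not in context_vocab]


context = "The capital of France is Paris. Its population is about 2 million."
answer = "The population of Paris is 5 million."
print(unsupported_tokens(context, answer))
```

A trained detector would also catch unsupported claims that reuse words from the context, which this baseline cannot.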
SeaLLMs
SeaLLMs-v3 is the latest achievement in the large language model series for Southeast Asian languages. It performs excellently among models of the same scale, can effectively handle various Southeast Asian language tasks, and provides safe and reliable responses. This model is specially optimized to reduce hallucination phenomena and is sensitive to the local context.
SeaLLMs-v3 is the latest version of the large language model series for Southeast Asian languages. It achieves state-of-the-art performance among models of the same scale and excels in tasks such as world knowledge, mathematical reasoning, translation, and instruction following. It is specially optimized for reliability and security, reducing hallucination phenomena.
gokaygokay
An image caption generation model fine-tuned on the DocCI dataset based on PaliGemma-3b, capable of generating detailed descriptions of 200-350 characters with reduced hallucination.
TroyDoesAI
An optimized model based on microsoft/Phi-3-mini-128k-instruct, focusing on improving context adherence and reducing hallucination phenomena, suitable for RAG application scenarios.
blueapple8259
This model is trained using the Korean textbook dataset tiny-textbooks. It has poor performance and severe hallucination issues.
stanford-oval
WikiChat is a large language model fine-tuned from LLaMA-2 (7B). It reduces chatbot hallucination through few-shot grounding on Wikipedia, significantly improving the accuracy and factuality of conversations.
moranyanuka
Official checkpoint of the BLIP base model fine-tuned on the MS-COCO dataset using the MOCHA reinforcement learning framework to mitigate open-vocabulary description hallucination.
Cognee is an open-source project that provides memory functions for AI agents. It builds a dynamic knowledge graph through a modular ECL pipeline, supports multiple data sources and formats, reduces hallucinations, and lowers costs.
Omni-NLI is a self-hosted multi-interface (REST and MCP) server that focuses on natural language inference tasks, used to verify the support, contradiction, or neutral relationship between texts, and can help reduce AI hallucination and improve application reliability.
Hivemind is an Obsidian plugin that provides AI firewall capabilities for fictional world-building, research, and knowledge management. It ensures that AI tools collaborate based on the real information in the user's notes through timeline views, relationship graphs, and canonical workflows, preventing AI hallucinations.
Chainguard is an MCP server that provides task tracking, syntax validation, long-term memory, and intelligent context management for Claude Code, including code semantic search, hallucination prevention, and a kanban system.
An MCP service that prevents AI hallucinations. When the AI is uncertain, it can ask humans for help instead of being overconfident, improving development efficiency through a simple Q&A mechanism.
This is a Quran search engine server based on the MCP protocol, aiming to provide accurate and hallucination-free Quranic verse query services for AI assistants. It uses a dedicated search engine to handle Arabic normalization, root, and lemma matching, ensuring the absolute accuracy of the returned verse texts, while the AI is only responsible for understanding the user's natural language query intentions.
A development documentation server based on the MCP protocol, providing functions such as document crawling, local loading, precise search, and detail retrieval, to solve the document hallucination problem in AI development.
MCP-NixOS is a Model Context Protocol server that prevents AI assistants from hallucinating about the NixOS system by providing real-time access to NixOS packages, system options, Home Manager settings, and nix-darwin configurations.
A development document server based on the MCP protocol, providing accurate framework document retrieval services and solving the API hallucination problem in AI development.
DevDocs-MCP is a localized MCP server that provides fixed-version and authoritative documentation data for AI assistants, eliminating AI hallucinations and ensuring the accuracy of API context.
Libragen is a local RAG library construction tool used to connect AI assistants (such as Claude) with your actual documents and code libraries. It reduces AI hallucinations by creating a searchable knowledge base, supports building from local files or Git repositories, and is directly integrated into the AI workflow through the MCP protocol.
An MCP server based on two-stage SQL generation that converts natural language to SQL, reducing hallucination risk and increasing the trust of non-technical users.
FOCUS DATA MCP Server is an AI assistant service that converts natural language into SQL statements. It uses a two-step generation scheme to control LLM hallucinations and increase the trust of non-technical users in SQL results.
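One plausible reading of a two-step scheme is sketched below: stage one produces a small structured plan that a non-technical user can read and confirm, and stage two renders SQL from that plan deterministically after checking it against the schema. This is an assumption about the general technique, not the actual FOCUS DATA implementation; `QueryPlan`, `ALLOWED_SCHEMA`, and `render_sql` are hypothetical names, and the model call that would produce the plan is omitted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical schema allowlist; a real system would read this from the database.
ALLOWED_SCHEMA = {"orders": {"id", "customer", "amount", "created_at"}}


@dataclass(frozen=True)
class QueryPlan:
    """Stage-one output: a structured, human-readable description of the query."""
    table: str
    columns: Tuple[str, ...]
    filter_column: Optional[str] = None
    filter_value: Optional[str] = None


def validate(plan: QueryPlan) -> None:
    """Reject tables or columns that do not exist in the schema."""
    if plan.table not in ALLOWED_SCHEMA:
        raise ValueError(f"unknown table: {plan.table}")
    if not plan.columns:
        raise ValueError("at least one column is required")
    wanted = set(plan.columns)
    if plan.filter_column is not None:
        wanted.add(plan.filter_column)
    unknown = wanted - ALLOWED_SCHEMA[plan.table]
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")


def render_sql(plan: QueryPlan) -> Tuple[str, tuple]:
    """Stage two: build parameterized SQL from a validated plan."""
    validate(plan)
    sql = f"SELECT {', '.join(plan.columns)} FROM {plan.table}"
    params: tuple = ()
    if plan.filter_column is not None:
        sql += f" WHERE {plan.filter_column} = ?"
        params = (plan.filter_value,)
    return sql, params


# In a full pipeline the plan would come from the model and be shown to the user first.
plan = QueryPlan("orders", ("customer", "amount"), "customer", "Ada")
print(render_sql(plan))
```

Because identifiers are checked against the allowlist and values are passed as parameters, a hallucinated table or column fails validation instead of reaching the database.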
An AI coding assistant security scanner that scans code for vulnerabilities, detects AI-hallucinated packages, and prevents prompt injection attacks through MCP or CLI. It supports 12 languages and more than 1,700 security rules.
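Detecting hallucinated packages can be approximated by comparing the modules a generated file imports against a list of packages known to exist. The sketch below shows that idea for Python source only; it is not the scanner's actual method, and `KNOWN_PACKAGES` and `unknown_imports` are hypothetical names standing in for a lockfile or registry lookup.

```python
import ast
from typing import Iterable, List, Set

# Hypothetical allowlist; in practice this might come from a lockfile or a registry query.
KNOWN_PACKAGES = {"json", "os", "requests", "numpy"}


def imported_top_level_modules(source: str) -> Set[str]:
    """Collect the top-level module names imported by Python source code."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to local code.
            if node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return modules


def unknown_imports(source: str, known: Iterable[str] = KNOWN_PACKAGES) -> List[str]:
    """Return imported modules that are not in the known-package list."""
    return sorted(imported_top_level_modules(source) - set(known))


generated = "import os\nimport numpy as np\nfrom fastjsonx.core import loads\n"
print(unknown_imports(generated))
```

A flagged name is only a candidate: it may be a real package missing from the allowlist, so a follow-up registry check would still be needed.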
RagAlgo is an MCP server that provides mathematically scored financial context (South Korean stocks/cryptocurrencies) for AI agents. It focuses on using daily closing data to build a 'factual state' to prevent AI hallucinations caused by real-time market noise, aiming to build investment advisors rather than high-frequency trading robots.
A physical-layer code inspection tool that uses an MCP server to verify whether radio frequency and physical calculations violate physical limits, catching AI 'physical hallucinations' in the engineering workflow.
A Java class analysis service based on the MCP protocol. It provides accurate code analysis capabilities for LLMs by decompiling dependent JAR packages and solves the dependency hallucination problem in AI coding.
GitMCP is a free and open-source remote MCP server that can transform any GitHub project into a documentation center, enabling AI tools to access the latest documentation and code and reducing hallucinations.