Accuracy is a persistent concern for large language models (LLMs). Retrieval-Augmented Generation (RAG) was developed to improve how LLMs answer complex questions: the model retrieves relevant information from a knowledge base before answering, so its answers are more accurate and better supported. RAG still has weaknesses, however, particularly in handling the diversity of human language, where the same question can be phrased in many ways. Lexical Diversity-aware RAG (DRAG) was introduced to address these issues.


DRAG targets RAG's weaknesses in both the retrieval and generation stages, combining fine-grained relevance analysis with flexible calibration to improve accuracy. In the retrieval stage, DRAG uses a "Diversity-Aware Relevance Analyzer" (DRA) to break the question down into three types of components: invariant, variant, and supplementary. The DRA applies relevance criteria suited to the characteristics of each component type, which filters the retrieved documents down to those most relevant to the core of the question.
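To make the component idea concrete, here is a minimal Python sketch of component-aware relevance scoring. It is an illustration only, not DRAG's actual analyzer: the `Component` class, the substring matching, the per-type weights, and the rule that every invariant component must appear are all assumptions made for this example. The article does not specify how DRA defines its criteria.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One piece of a decomposed question (hypothetical structure)."""
    text: str
    kind: str  # "invariant", "variant", or "supplementary"
    # Alternative phrasings; only consulted for non-invariant components.
    alternatives: list[str] = field(default_factory=list)


# Assumed weights for this sketch; not taken from the DRAG work.
WEIGHTS = {"invariant": 1.0, "variant": 0.5, "supplementary": 0.25}


def component_matches(component: Component, document: str) -> bool:
    """Invariant components need their exact text; others accept any listed phrasing."""
    doc = document.lower()
    if component.kind == "invariant":
        return component.text.lower() in doc
    forms = [component.text, *component.alternatives]
    return any(form.lower() in doc for form in forms)


def relevance_score(components: list[Component], document: str) -> float:
    """Score a document in [0, 1]; a missing invariant component rejects it outright."""
    for c in components:
        if c.kind == "invariant" and not component_matches(c, document):
            return 0.0
    total = sum(WEIGHTS[c.kind] for c in components)
    if total == 0:
        return 0.0
    matched = sum(WEIGHTS[c.kind] for c in components if component_matches(c, document))
    return matched / total


# Example question: "What was Marie Curie's occupation?"
question = [
    Component("Marie Curie", "invariant"),
    Component("occupation", "variant", ["profession", "job", "worked as"]),
    Component("Nobel Prize", "supplementary"),
]
docs = [
    "Marie Curie worked as a physicist and chemist.",
    "Pierre Curie's profession was physicist.",
]
ranked = sorted(docs, key=lambda d: relevance_score(question, d), reverse=True)
```

In this sketch the second document is rejected even though it contains "profession", because the invariant component "Marie Curie" is missing. The first is kept even though it never uses the word "occupation".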

In the generation stage, DRAG introduces a "Risk-Guided Sparse Calibration Strategy" (RSC) to reduce the impact of irrelevant information on the model's answer. RSC estimates the risk of each generated word and calibrates only the high-risk words, those most likely to be affected by noise in the retrieved documents, which improves the quality of the final output.
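The "sparse" part of the strategy can be illustrated with a short Python sketch: only steps whose risk exceeds a threshold are adjusted, and all others pass through unchanged. Everything specific here is an assumption for illustration. The risk score is taken as a given input in [0, 1], the calibration is a simple interpolation toward a reference distribution, and the function name and threshold are hypothetical. The article does not describe how RSC computes risk or performs the calibration.

```python
def calibrate_step(
    context_probs: dict[str, float],
    reference_probs: dict[str, float],
    risk: float,
    threshold: float = 0.5,
) -> dict[str, float]:
    """Return the next-word distribution for one generation step.

    context_probs:   distribution conditioned on the retrieved documents
    reference_probs: distribution from a reference pass (assumed to be available)
    risk:            assumed risk score in [0, 1] for this step
    """
    # Low-risk step: leave the distribution untouched (the "sparse" part).
    if risk < threshold:
        return dict(context_probs)

    # High-risk step: shift probability mass toward the reference distribution.
    vocab = set(context_probs) | set(reference_probs)
    mixed = {
        word: (1 - risk) * context_probs.get(word, 0.0)
        + risk * reference_probs.get(word, 0.0)
        for word in vocab
    }
    total = sum(mixed.values())
    if total == 0:
        return dict(context_probs)
    # Renormalize so the result is a valid probability distribution.
    return {word: p / total for word, p in mixed.items()}


# A noisy retrieved passage pushes the model toward "Lyon".
with_context = {"Paris": 0.3, "Lyon": 0.7}
reference = {"Paris": 0.9, "Lyon": 0.1}

low_risk = calibrate_step(with_context, reference, risk=0.2)
high_risk = calibrate_step(with_context, reference, risk=0.8)
```

With the low risk score the distribution is returned as-is. With the high score the mass moves back toward "Paris". Calibrating only the flagged steps leaves most of the generation untouched, which is the design intuition behind a sparse strategy.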

Together, these two innovations significantly improve accuracy on complex questions. In the reported tests, models using DRAG were 45.5% more accurate than traditional RAG. The result marks an important step forward in AI's ability to understand and generate natural language, especially when questions are expressed in diverse ways.

As DRAG technology continues to develop, future artificial intelligence models are expected to provide more accurate and reliable answers in more application scenarios.

Key Points:   

📝 DRAG improves retrieval accuracy by breaking questions down into invariant, variant, and supplementary components.

🔍 The DRA and the RSC calibration strategy work together to reduce interference from irrelevant information.

🚀 With DRAG, the model's accuracy was reported to be 45.5% higher than with traditional RAG.