DeepEval is a framework for evaluating and unit testing language model applications. It provides metrics that assess generated responses for relevance, consistency, bias, and toxicity. Its offline evaluation workflow is simple to use and integrates quickly into existing pipelines. In addition to its built-in metrics, DeepEval supports custom metrics, and its Web UI lets engineers view and analyze their evaluation results.
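
To make the idea of a custom metric in an offline evaluation concrete, here is a minimal, library-independent sketch in Python. It does not use DeepEval's API: EvalCase, KeywordOverlapMetric, and evaluate are hypothetical names chosen for illustration, and the keyword-overlap score is a toy stand-in for a real relevance metric. The sketch only shows the general pattern of scoring each response and comparing the score against a threshold, as a unit test would.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    """Hypothetical container for one prompt/response pair (not a DeepEval class)."""
    input: str
    actual_output: str


def _words(text: str) -> set:
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip(".,?!").lower() for w in text.split()}


class KeywordOverlapMetric:
    """Toy relevance metric: fraction of input keywords that appear in the output."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold
        self.score = 0.0

    def measure(self, case: EvalCase) -> float:
        # Assumption: "keywords" are input words longer than three characters.
        keywords = {w for w in _words(case.input) if len(w) > 3}
        if not keywords:
            self.score = 0.0
        else:
            self.score = len(keywords & _words(case.actual_output)) / len(keywords)
        return self.score

    def is_successful(self) -> bool:
        # A case passes when its most recent score meets the threshold.
        return self.score >= self.threshold


def evaluate(cases, metric):
    """Score every case offline and return a list of (score, passed) tuples."""
    results = []
    for case in cases:
        score = metric.measure(case)
        results.append((score, metric.is_successful()))
    return results


cases = [
    EvalCase("What is the refund policy?",
             "Our refund policy allows returns within 30 days."),
    EvalCase("What is the refund policy?",
             "The weather is sunny today."),
]
results = evaluate(cases, KeywordOverlapMetric(threshold=0.5))
```

Under these assumptions, the first response repeats two of the three input keywords and passes, while the second repeats none and fails. A real metric would replace the overlap heuristic with a model-based or statistical score, but the score-then-threshold structure stays the same.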