With the rapid development of artificial intelligence technology, data intelligence has become a key factor in corporate competitiveness. However, large models frequently "hallucinate," multimodal applications are held back by data bottlenecks, and enterprise private knowledge remains hard to put to use, so the industry urgently needs a more powerful data management system. To this end, Tongfang CNKI Data Science has officially launched AIKBase Vector Database Management System V2.0, which aims to give AI a smarter "data brain" and reshape intelligent data infrastructure.

AIKBase V2.0 is a multimodal data management system that combines the advantages of search-based and vector-based systems. It has five core characteristics: domestically developed and independently controllable technology, unified management of multimodal data, millisecond-level vector retrieval, accurate semantic fusion queries, and distributed cluster expansion. Together, these features are intended to empower large models and help various industries achieve intelligent upgrades.

In terms of features, the flexible integration capabilities and compatible retrieval engine of AIKBase V2.0 are designed to let it adapt to any large model and connect the data pipelines for scenarios such as RAG and knowledge enhancement. It fully supports domestic platforms, including Kunpeng and Phytium CPUs and the UnionTech and Kylin operating systems, and complies with national information technology application innovation standards, providing dual protection for enterprise data security. The system supports data migration from various mainstream databases, converts unstructured data into vectors, and stores them uniformly in the data warehouse. Both inserting new data and updating existing data are fast operations.

In addition, the vector and scalar fusion retrieval technology of AIKBase V2.0 allows vector, scalar, and full-text retrieval to be combined freely, achieving millisecond-level responses that accurately capture semantics even across billions of records. Its distributed cluster architecture can be scaled out for large data volumes, ensuring high-performance retrieval and reliable service as the business grows.
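To make the idea of combining vector, scalar, and full-text conditions concrete, here is a minimal brute-force sketch in Python. It is not AIKBase's API or its algorithm: the function `fused_search`, the record fields (`year`, `text`, `vec`), and the substring-based keyword match are all hypothetical, and a production system would use dedicated indexes rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def fused_search(records, query_vec, keyword=None, min_year=None, top_k=3):
    """Apply scalar and keyword filters, then rank survivors by vector similarity."""
    hits = []
    for rec in records:
        if min_year is not None and rec["year"] < min_year:
            continue  # scalar condition (hypothetical 'year' field)
        if keyword is not None and keyword.lower() not in rec["text"].lower():
            continue  # naive full-text condition: case-insensitive substring match
        hits.append((cosine(query_vec, rec["vec"]), rec["id"]))
    hits.sort(key=lambda h: h[0], reverse=True)
    return [rec_id for _, rec_id in hits[:top_k]]

# Toy records invented for illustration only.
records = [
    {"id": "a", "year": 2021, "text": "Vector database indexing", "vec": [1.0, 0.0]},
    {"id": "b", "year": 2024, "text": "Vector retrieval for RAG", "vec": [0.9, 0.1]},
    {"id": "c", "year": 2024, "text": "Cooking with tomatoes", "vec": [0.0, 1.0]},
    {"id": "d", "year": 2023, "text": "Hybrid vector and keyword search", "vec": [0.6, 0.8]},
]

result = fused_search(records, [1.0, 0.0], keyword="vector", min_year=2023)
print(result)
```

In this toy data set, record `a` is excluded by the year filter and record `c` by the keyword filter, so only `b` and `d` are ranked by similarity.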

In performance testing, AIKBase V2.0 was compared with open-source databases such as pgvector, Milvus, and Elasticsearch using open-source evaluation tools such as ANN-Benchmarks. The results showed that AIKBase V2.0 achieved higher maximum query throughput (QPS) at a 90% recall rate than these open-source databases. It also delivered higher data write throughput and shorter index construction time, demonstrating its advantages of "fast storage, accurate search, and quick response."
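For readers unfamiliar with the metric, "maximum QPS at 90% recall" can be illustrated with a short Python sketch. The helper names `recall_at_k` and `best_qps_at_recall` are hypothetical, and the numbers below are made up purely to show the calculation; they are not results from the AIKBase tests.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours that the approximate search returned."""
    truth = set(exact_ids[:k])
    found = set(approx_ids[:k]) & truth
    return len(found) / len(truth)

def best_qps_at_recall(runs, min_recall=0.9):
    """Highest QPS among runs whose recall meets the threshold, or None if none do."""
    eligible = [qps for recall, qps in runs if recall >= min_recall]
    return max(eligible) if eligible else None

# Made-up example: exact neighbours vs. what an approximate index returned.
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
recall = recall_at_k(approx, exact, k=10)

# Made-up (recall, QPS) pairs for one system under different index settings.
runs = [(0.85, 5000.0), (0.91, 3200.0), (0.97, 1500.0)]
best = best_qps_at_recall(runs)
print(recall, best)
```

Fixing the recall threshold before comparing throughput matters because an index can usually be made faster by accepting less accurate results, so QPS figures are only comparable at equal recall.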

AIKBase V2.0 has a wide range of application scenarios. It provides private knowledge bases for large models to "eliminate hallucinations," making generated results more accurate and timely. It supports multimodal retrieval, establishing semantic associations between text, images, and videos within seconds and enabling cross-modal text-to-image and image-to-text retrieval. In addition, its hybrid retrieval function combines the "semantic understanding" of vector retrieval with the "precise matching" of full-text retrieval, significantly improving the accuracy of retrieval results.
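The source does not say how AIKBase merges vector and full-text results. One widely used way to combine two ranked lists is reciprocal rank fusion (RRF), sketched below in Python as an illustration of the general idea only. The function name, the document IDs, and the constant `k=60` are assumptions made for this example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked ID lists; each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a deterministic order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from a vector search and a full-text search.
vector_hits = ["doc3", "doc1", "doc2"]
fulltext_hits = ["doc1", "doc4", "doc3"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
print(fused)
```

Documents that appear near the top of both lists (here `doc1` and `doc3`) rise above documents found by only one retriever, which is the intuition behind pairing semantic and exact-match search.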

Currently, AIKBase is deeply integrated into the CNKI product matrix, providing strong support for core functions such as AI-enhanced retrieval and academic research assistants. Multimodal hybrid retrieval makes knowledge acquisition more intelligent, and millisecond-level response times improve the user experience.