AIbase recently learned via social media about Shisa.AI, a company that publishes models on HuggingFace and specializes in Japanese fine-tuning. Its newly released bilingual Japanese-English model has drawn significant attention in the industry. This article looks at Shisa.AI's latest release and what it means for Japanese-language AI.


Shisa V2 405B: The Birth of Japan's Strongest Open Source Model

According to AIbase, Shisa.AI has recently released Shisa V2 405B, a model based on Llama 3.1. This open-source model is billed as "the strongest large language model ever trained in Japan." It performs well on Japanese tasks while retaining strong English capabilities, as expected of a bilingual model.

According to the reported test data, Shisa V2 405B surpasses GPT-4 and GPT-4 Turbo on multiple Japanese benchmarks and is on par with the more recent GPT-4o and DeepSeek-V3 on Japanese tasks. The result marks the rise of Japan's domestic AI laboratories in the global AI competition and opens up new possibilities for Japanese AI applications.

Focused on Japanese Optimization, Fine-Tuning Technology Upgraded

Shisa.AI is a Tokyo-based startup that develops and deploys open-source AI language and speech models for the Japanese market. AIbase learned that, unlike the company's earlier models, the Shisa V2 series drops expensive continued pre-training and tokenizer expansion and focuses instead on optimizing the post-training process. Using a synthetic data-driven approach, the team significantly improved model performance.

Their core dataset, ultra-orca-boros-en-ja-v1, has been filtered, regenerated, and resampled, and is considered one of the most powerful bilingual Japanese-English datasets currently available. It is suitable for enhancing the Japanese capability of almost any base model. The dataset is freely available under the Apache 2.0 license, making it a valuable resource for developers worldwide.

Widely Applicable Model Family, Covering 7B to 405B Parameters

The Shisa V2 series covers models of varying sizes, ranging from 7B to 405B parameters, meeting diverse needs from lightweight devices to high-performance computing. AIbase learned that these models perform exceptionally well in tasks such as Japanese grammar, role-playing, and translation. They outperform their respective base models in tests like shisa-jp-ifeval (Japanese instruction-following test), shisa-jp-rp-bench (Japanese role-playing benchmark), and shisa-jp-tl-bench (Japanese-English translation benchmark).
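To put the 7B-to-405B range in perspective, the following back-of-the-envelope sketch estimates the memory needed just to hold the model weights, using parameters multiplied by bytes per parameter. The function name and the set of precisions are illustrative assumptions and do not come from Shisa.AI; the estimate also ignores activations, KV cache, and runtime overhead, so actual requirements will be higher.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Illustrative only: ignores activations, KV cache, and framework overhead.

# Bytes per parameter for a few common precisions (assumed for illustration).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, dtype: str = "bf16") -> float:
    """Return approximate weight size in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 1e9


if __name__ == "__main__":
    # The two ends of the size range mentioned in the article.
    for size in (7, 405):
        for dtype in BYTES_PER_PARAM:
            gb = weight_memory_gb(size, dtype)
            print(f"{size}B @ {dtype}: ~{gb:.1f} GB")
```

Under these assumptions, a 7B model in bf16 needs roughly 14 GB for weights alone, while a 405B model in bf16 needs roughly 810 GB, which is why the largest model calls for multi-GPU, multi-node hardware.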

Notably, Shisa V2 405B incorporated a small amount of Korean and Traditional Chinese data during training, further enhancing its multilingual capabilities and offering more possibilities for cross-language applications.

Open Source Spirit Driving Global AI Innovation

Shisa.AI's efforts have not only improved the performance of Japanese AI but also contributed to the global AI community through open-source releases. AIbase noticed that the training logs for the Shisa V2 series are publicly available on the Weights & Biases platform. Training ran on a 4-node H100 cluster on AWS SageMaker, using Axolotl, DeepSpeed, and Liger Kernel to keep the process efficient.

In addition, Shisa.AI plans to open-source its Japanese-specific benchmarking tools, which should aid the research and evaluation of Japanese large language models and give developers worldwide more support.

Future Prospects: Japan's Global Competitiveness in AI

Shisa.AI's success demonstrates that even small AI labs can secure a place in the global AI competition. The release of their open-source models and datasets provides strong support for the popularization of Japanese AI applications. AIbase believes that as Shisa.AI continues to update its models and resources, Japan's position in the global AI landscape will be further solidified.

For developers with complex Japanese task requirements, the Shisa V2 series is undoubtedly a powerful tool worth trying. AIbase recommends following Shisa.AI's official website and HuggingFace page for more technical details and opportunities to experience the models.

Through its Shisa V2 series models, Shisa.AI showcases Japan's innovative strength in the AI field. Whether for academic research or commercial applications, these open-source models pave the way for the future development of Japanese AI.