Finnish Modern BERT is a multilingual encoder model based on the modern BERT architecture, pre-trained on Finnish, Swedish, English, code, Latin, and North Sami. The model is trained on 400 billion tokens and supports a context length of up to 128,000 tokens. It is specifically designed for handling Finnish official languages and long document scenarios.
Natural Language Processing
TransformersMultiple Languages