Apple, together with a research team from Ohio State University, has released FS-DFM (Few-Step Discrete Flow-Matching), a language model aimed at long-text generation. With just 8 rapid iterations it produces text whose quality is comparable to that of traditional diffusion models requiring thousands of iterations, while generating up to 128 times faster, breaking the previous efficiency bottleneck in long-text generation.


The design philosophy of FS-DFM differs from that of mainstream language models. Autoregressive models like ChatGPT generate text one token at a time, with each token conditioned on everything before it. Diffusion models instead use a parallel strategy, proposing many tokens at once and gradually refining the result over repeated iterations. FS-DFM pushes the diffusion approach further, aiming for high-quality text generation with far fewer refinement steps.
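The contrast between the two generation styles can be sketched with a toy example. This is not Apple's implementation; the "models" below are random stand-ins, and only the control flow (one-token-at-a-time vs. parallel refinement) reflects the distinction described above:

```python
import random

random.seed(0)
VOCAB = ["the", "cat", "sat", "on", "mat"]

def autoregressive_generate(length):
    """Generate one token per step; each step sees all previous tokens."""
    tokens = []
    for _ in range(length):
        # A real model would condition on `tokens`; this toy picks randomly.
        tokens.append(random.choice(VOCAB))
    return tokens

def iterative_refine_generate(length, steps):
    """Start from a fully masked sequence and refine every position in parallel."""
    tokens = ["[MASK]"] * length
    for _ in range(steps):
        # Each pass re-predicts all positions at once (the diffusion-style update).
        tokens = [random.choice(VOCAB) for _ in tokens]
    return tokens
```

The key difference is the loop structure: the autoregressive loop runs once per token, while the refinement loop runs a fixed number of passes regardless of sequence length, which is why cutting the pass count to 8 yields such a large speedup.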

To achieve this breakthrough, the Apple research team proposed a three-part recipe. First, the model is trained to adapt flexibly to different iteration budgets. Second, a "teacher" model guides the process, ensuring that each iteration's update is both significant and accurate, avoiding overshooting. Finally, the team optimized the iteration mechanism itself, allowing the model to reach the final text in fewer, more robust steps.
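A minimal sketch of what a few-step discrete sampler of this kind can look like follows. The schedule, the `jump_prob` formula, and the stand-in model are all illustrative assumptions, not details from the paper; the point is only that each of the few steps moves every position toward the model's current prediction, with the final step committing fully:

```python
import random

random.seed(1)

def few_step_sample(model, length, num_steps, vocab):
    """Illustrative few-step discrete sampler: at each iteration the model
    predicts a token per position, and each position jumps to that prediction
    with a probability that grows as sampling time runs out, so early steps
    make broad moves and the last step commits every token."""
    tokens = [random.choice(vocab) for _ in range(length)]  # random init
    for step in range(num_steps):
        t = step / num_steps             # current "time" in [0, 1)
        dt = 1.0 / num_steps             # step size
        jump_prob = dt / (1.0 - t)       # reaches 1.0 on the final step
        preds = model(tokens)            # predicted token for each position
        tokens = [p if random.random() < jump_prob else x
                  for x, p in zip(tokens, preds)]
    return tokens

# Hypothetical stand-in "model": always predicts the token "a".
result = few_step_sample(lambda toks: ["a"] * len(toks),
                         length=6, num_steps=8, vocab=["a", "b", "c"])
```

Because `jump_prob` equals 1.0 on the last of the 8 steps, every position is guaranteed to land on a model prediction, which mirrors the idea of making each of the few updates count.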

In performance evaluation, FS-DFM was compared with the 7-billion-parameter Dream model and the 8-billion-parameter LLaDA model. Even though FS-DFM variants have only 170 million to 1.7 billion parameters, they achieved lower perplexity (a measure of text accuracy and fluency; lower is better) and more stable entropy (a measure of the model's confidence in word selection). These results demonstrate FS-DFM's potential for AI long-form writing.
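For readers unfamiliar with the two metrics, here is how they are commonly computed from the probabilities a model assigns to tokens (a generic sketch of the standard definitions, not the paper's evaluation code):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability the model
    assigned to the observed tokens; lower means better, more fluent fit."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

def entropy(distribution):
    """Shannon entropy (in nats) of one predicted token distribution;
    stable values suggest the model is neither collapsing to one token
    nor guessing uniformly."""
    return -sum(p * math.log(p) for p in distribution if p > 0)
```

For example, a model that assigns probability 0.5 to each observed token has perplexity 2, and a uniform two-way distribution has entropy ln 2.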

Project: https://machinelearning.apple.com/research/fs-dfm

**Key Points:**  

📝 **The FS-DFM model needs only 8 iterations to generate text of quality comparable to traditional models that require thousands of iterations.**  

🚀 **Writing speed is improved by up to 128 times, significantly increasing the efficiency of long text generation.**  

🔍 **Performance tests show that the FS-DFM outperforms other large models in key metrics such as perplexity and entropy.**