Small-Model Training Efficiency Jumps Up to 100x! Thinking Machines Introduces On-Policy Distillation, and OpenAI's Former CTO Personally Gives It a Like
Thinking Machines' on-policy distillation improves small-model training efficiency by 50-100x on specific tasks. By combining reinforcement learning (RL) with supervised learning, it overcomes limitations of traditional AI training and acts as a kind of "AI coach", an approach that has drawn industry-wide attention...