Cursor, the AI-powered coding platform, recently announced an upgrade to its Tab model, the system that provides developers with automatic code completion suggestions. The upgrade cuts down on low-quality suggestions and improves accuracy: the new Tab model makes 21% fewer suggestions than the previous version while achieving a 28% higher acceptance rate.

Cursor stated in its blog that achieving a high acceptance rate is not just about making the model smarter, but also about knowing when to offer a suggestion and when to stay silent. One option Cursor considered was training a separate model to predict whether a suggestion would be accepted. The company cited a 2022 study reporting that this approach worked well for GitHub Copilot. The study used a logistic regression filter that scored each suggestion on features such as the programming language, recent acceptance history, and the trailing characters before the cursor, and hid suggestions with low scores.
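To make the filtering idea concrete, here is a minimal sketch of a logistic regression filter of the kind described. It is not the filter from the cited study: the feature names, the weights, and the 0.3 threshold are all hypothetical values chosen for illustration, and a real filter would learn its weights from logged accept/reject data.

```python
import math

# Hypothetical weights for illustration only. A real filter would fit
# these to logged accept/reject events rather than hard-code them.
WEIGHTS = {
    "lang_is_python": 0.3,         # assumed feature: programming language
    "prev_accepted": 1.2,          # assumed feature: last suggestion was accepted
    "ends_with_open_paren": 0.6,   # assumed feature: character before the cursor
}
BIAS = -1.0  # hypothetical intercept


def acceptance_score(features: dict) -> float:
    """Return the predicted probability that the suggestion is accepted."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function


def should_show(features: dict, threshold: float = 0.3) -> bool:
    """Hide the suggestion when its score falls below the threshold."""
    return acceptance_score(features) >= threshold
```

For example, `should_show({"prev_accepted": 1.0})` scores the suggestion higher than `should_show({})`, because the assumed weight for a recent acceptance is positive. Note that the filter only decides whether to display a suggestion the main model has already produced.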

Although such a filter can predict the probability that a user will accept a suggestion, Cursor wanted a more general mechanism that reuses the powerful code representations the Tab model has already learned. Rather than filtering out low-quality suggestions after the fact, Cursor aims to modify the Tab model itself so that it avoids generating them in the first place.

Cursor therefore adopted a policy gradient method, a form of reinforcement learning. When a user accepts a suggestion, the model receives a reward; when a suggestion is rejected, the model is penalized; and when the model chooses to remain silent, it receives neither. This method requires "online" (on-policy) data, i.e., feedback collected from the model currently in use. Cursor meets this requirement by deploying new checkpoints to users multiple times a day and quickly retraining the model on the new interactions.
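The reward scheme above can be sketched as a toy policy gradient (REINFORCE) loop. This is an illustration under stated assumptions, not Cursor's training code: the policy is a single parameter that controls the probability of showing a suggestion, the user is simulated with a fixed acceptance probability, and the reward values 0.75 and -0.25 are assumed for the example.

```python
import math
import random

ACCEPT_REWARD = 0.75    # assumed reward when a shown suggestion is accepted
REJECT_REWARD = -0.25   # assumed penalty when a shown suggestion is rejected
SILENT_REWARD = 0.0     # staying silent earns neither reward nor penalty


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def train_show_policy(accept_prob: float, steps: int = 5000,
                      lr: float = 0.1, seed: int = 0) -> float:
    """Train a one-parameter show/silent policy against a simulated user.

    Returns the learned probability of showing a suggestion.
    """
    rng = random.Random(seed)
    theta = 0.0  # policy parameter; sigmoid(theta) = P(show)
    for _ in range(steps):
        p_show = sigmoid(theta)
        if rng.random() < p_show:
            # Action: show. The simulated user accepts with accept_prob.
            accepted = rng.random() < accept_prob
            reward = ACCEPT_REWARD if accepted else REJECT_REWARD
            grad_log_prob = 1.0 - p_show   # d/dtheta of log P(show)
        else:
            # Action: stay silent. Reward is zero, so theta does not move.
            reward = SILENT_REWARD
            grad_log_prob = -p_show        # d/dtheta of log P(silent)
        # REINFORCE update: step along reward * grad log pi(action).
        theta += lr * reward * grad_log_prob
    return sigmoid(theta)
```

Under these assumed rewards, the expected reward for showing a suggestion is 0.75p - 0.25(1 - p) = p - 0.25, where p is the acceptance probability. The toy policy therefore drifts toward showing when p is above 25% and toward silence when it is below. Because each update uses the action sampled from the current policy, the data must come from the model being trained, which is why fresh feedback from the deployed checkpoint matters.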

Cursor stated that the current cycle, from deploying a checkpoint to collecting data, takes only 1.5 to 2 hours. That is already relatively fast for the AI industry, but the company sees room to accelerate it further. The Tab model processes over 400 million requests per day. Cursor hopes the improvement will enhance the coding experience for developers and plans to develop these methods further.

Online reinforcement learning is one of the most exciting directions in this field. An engineer working on post-training at OpenAI praised the work on social media, stating that Cursor appears to be the first company to implement the technique successfully at scale.

Not long ago, Cursor's parent company Anysphere announced a $900 million funding round that valued the company at $9.9 billion. It also launched a new "Ultra" plan at $200 per month, promising 20 times the usage of the $20-per-month "Pro" plan. In the same month, Cursor updated its platform with features such as automatic code review, memory, and one-click setup of Model Context Protocol (MCP) servers.

Key points:

🌟 After the upgrade, the Cursor Tab model reduced the number of suggestions by 21% and increased the acceptance rate by 28%.  

🤖 It uses online reinforcement learning, so the model keeps adjusting based on user feedback.  

💰 Cursor's parent company, Anysphere, raised $900 million and launched a new plan along with new features.