Genie, an AI programmer released by Cosine, has scored 30.08% on the SWE-Bench benchmark, more than double Devin's 13.8% and SWE-agent + GPT-4's 12.47%. It is being hailed as the world's strongest AI programmer.

Genie did not appear overnight. As early as December 2022, Alistair Pullen, co-founder of Cosine, showed a prototype of Genie at a roadshow at the University of London. His original goal was to build an AI system capable of end-to-end automatic coding and optimization without any human intervention.

Genie's lead over other well-known products is closely tied to its training data and methods. Unlike conventional fine-tuning of large models, Genie is trained on a special dataset that captures the reasoning process of human programmers: how information is traced from start to finish, how knowledge is discovered incrementally, and how decisions are made step by step in real cases.

Genie also uses a "self-improvement mechanism" during training. Initial training on a large amount of high-quality data brings the model to an "optimal" state. Developers then use Genie itself to generate synthetic data, which is fed into later training rounds so that the model sees more errors and complex situations. The process is akin to a mother teaching a child to walk: whenever Genie stumbles or adopts an incorrect posture, it receives a timely correction.
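
The loop described above can be illustrated with a toy sketch. This is not Cosine's pipeline, whose details are not public; the "model" here is only a lookup table, and every name (`Example`, `train`, `predict`, `self_improvement_round`) and every task string is hypothetical. It shows just the shape of the idea: run the current model, pair each mistake with its correction, and train again on the enlarged dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    task: str
    wrong_attempt: str  # what the model produced ("" for seed data)
    solution: str       # the corrected answer


def train(dataset):
    """Toy stand-in for training: memorize the solution for each task seen."""
    return {ex.task: ex.solution for ex in dataset}


def predict(model, task):
    """Return the model's answer, or a placeholder for unseen tasks."""
    return model.get(task, "<no answer>")


def self_improvement_round(model, tasks, reference, dataset):
    """Run the model, turn its mistakes into corrected examples, retrain."""
    synthetic = []
    for task in tasks:
        attempt = predict(model, task)
        if attempt != reference[task]:
            # Keep the failed attempt alongside its correction.
            synthetic.append(Example(task, attempt, reference[task]))
    new_dataset = dataset + synthetic
    return train(new_dataset), new_dataset


def accuracy(model, tasks, reference):
    return sum(predict(model, t) == reference[t] for t in tasks) / len(tasks)


# Hypothetical tasks and reference fixes, for illustration only.
reference = {
    "fix off-by-one": "use range(n)",
    "null check": "guard with 'if x is None'",
    "rename variable": "apply rename refactor",
}
tasks = list(reference)
seed = [Example("fix off-by-one", "", "use range(n)")]

model = train(seed)
before = accuracy(model, tasks, reference)
model, dataset = self_improvement_round(model, tasks, reference, seed)
after = accuracy(model, tasks, reference)
print(f"accuracy before: {before:.2f}, after: {after:.2f}")
```

In a real system the retraining step would be a fine-tuning run and the corrections would come from tests or human review rather than a reference table; the sketch only mirrors the "stumble, then get corrected" cycle.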

Genie's capabilities cover feature development, bug fixing, code refactoring, minor code changes, code testing, and writing and updating code documentation. It supports dozens of mainstream programming languages, including JavaScript, Python, Java, C#, and C++, covering nearly all programming needs.

Renowned developer Mckay said he has high expectations for Genie and hopes to test it soon. Because he already has access to Devin, he is well placed to compare the two.

Genie is still in the testing phase, but Alistair says registration is now open. Test access will be granted within the next 2-3 weeks, and the launch will include some surprise features.

Registration URL: https://cosine.sh/register