AlphaCode, a code generation AI

DeepMind, known for its work in artificial intelligence and for building neural networks that play computer and board games at a human level, recently unveiled the AlphaCode project, a machine learning system for code generation that can take part in programming competitions on the Codeforces platform and achieve an average result.

The project uses the Transformer neural network architecture in combination with sampling and filtering methods to generate a wide variety of code candidates that correspond to a problem statement written in natural language.

AlphaCode's method is based on filtering, grouping, and sorting: from the stream of generated candidates it selects the most promising working code, which is then checked to ensure that it produces the correct result (each competition task includes an example of input data and the corresponding result that the program should produce for that example).
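The filter, group, sort, and select steps described above can be sketched as follows. This is a minimal illustration and not DeepMind's implementation: candidates are modeled as plain Python functions instead of generated source text, and the names `select_candidates`, `probe_inputs`, and `limit` are assumptions introduced only for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in: a candidate maps the task's input text to output text.
Candidate = Callable[[str], str]
Example = Tuple[str, str]  # (input text, expected output text)


def run_safely(candidate: Candidate, text: str) -> str:
    """Run a candidate and turn any crash into a marker string."""
    try:
        return candidate(text).strip()
    except Exception:
        return "<error>"


def passes_examples(candidate: Candidate, examples: Sequence[Example]) -> bool:
    """True if the candidate reproduces every example from the task statement."""
    return all(run_safely(candidate, given) == expected.strip()
               for given, expected in examples)


def select_candidates(candidates: Sequence[Candidate],
                      examples: Sequence[Example],
                      probe_inputs: Sequence[str],
                      limit: int = 3) -> List[Candidate]:
    # 1. Filter: drop candidates that fail the examples given in the task.
    survivors = [c for c in candidates if passes_examples(c, examples)]
    # 2. Group: candidates with identical outputs on extra inputs share a cluster.
    clusters: Dict[Tuple[str, ...], List[Candidate]] = defaultdict(list)
    for c in survivors:
        signature = tuple(run_safely(c, probe) for probe in probe_inputs)
        clusters[signature].append(c)
    # 3. Sort: larger clusters first (an assumed heuristic for this sketch).
    ordered = sorted(clusters.values(), key=len, reverse=True)
    # 4. Select: one representative from each of the largest clusters.
    return [cluster[0] for cluster in ordered[:limit]]


# Toy task (hypothetical): print the sum of two integers.
def add_with_sum(text: str) -> str:
    return str(sum(int(token) for token in text.split()))

def add_pairwise(text: str) -> str:
    a, b = text.split()
    return str(int(a) + int(b))

def multiply(text: str) -> str:  # wrong, but happens to pass the "2 2" example
    a, b = text.split()
    return str(int(a) * int(b))

def always_zero(text: str) -> str:  # fails the example
    return "0"

def crashes(text: str) -> str:  # raises on every input
    raise RuntimeError("bug")

examples = [("2 2", "4")]
probe_inputs = ["3 4", "1 5"]
chosen = select_candidates(
    [always_zero, multiply, add_with_sum, crashes, add_pairwise],
    examples, probe_inputs)
print([c.__name__ for c in chosen])  # ['add_with_sum', 'multiply']
```

In this toy run the two correct solutions behave identically on the probe inputs, so they form the largest cluster and are ranked ahead of the incorrect `multiply`, which only passed the example by coincidence.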

"We detail AlphaCode, which uses transformer-based language models to generate code at an unprecedented scale, then intelligently filters out a small set of promising programs."

"We validate our performance using competitions hosted on Codeforces, a popular platform that hosts regular competitions that attract tens of thousands of entrants from around the world who come to test their coding skills. We selected 10 recent contests for evaluation, each newer than our training data. AlphaCode was roughly level with the average competitor, marking the first time an AI code generation system has reached a competitive level of performance in programming competitions."

To train the machine learning system, a code base available in public GitHub repositories was used. After the initial model was prepared, an optimization phase was carried out on a collection of code with examples of problems and solutions offered to participants in the Codeforces, CodeChef, HackerEarth, AtCoder, and Aizu contests.

In total, 715 GB of GitHub code and more than a million examples of solutions to typical competition problems were used to train AlphaCode. Before code generation, the text of the task went through a normalization phase in which everything superfluous was removed and only the significant parts remained.

To test the system, 10 new Codeforces contests with more than 5,000 participants were selected, all held after training of the machine learning model had been completed.

"I can safely say that the results of AlphaCode exceeded my expectations. I was skeptical because even in simple competitive problems, it is often required not only to implement the algorithm, but also (and this is the hardest part) to invent it. AlphaCode managed to perform at the level of a promising new competitor. I can't wait to see what's to come!"



The results allowed AlphaCode to place approximately in the middle of the rankings of these competitions (54.3%). AlphaCode's estimated overall rating was 1238 points, which places it in the top 28% of all Codeforces participants who have taken part in a competition at least once in the last 6 months.
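One way to read a figure such as 54.3% is as a ranking percentile averaged over several contests. Assuming that reading, the arithmetic can be sketched as below. The contest results are invented purely for illustration and are not AlphaCode's actual placements, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def ranking_percentile(rank: int, participants: int) -> float:
    """Position in the standings as a percentage (lower is better)."""
    if not 1 <= rank <= participants:
        raise ValueError("rank must be between 1 and the number of participants")
    return 100.0 * rank / participants


def average_percentile(results: Sequence[Tuple[int, int]]) -> float:
    """Average the per-contest percentiles over (rank, participants) pairs."""
    if not results:
        raise ValueError("at least one contest result is required")
    return sum(ranking_percentile(r, n) for r, n in results) / len(results)


# Hypothetical (rank, participants) pairs, NOT real AlphaCode data.
hypothetical_results = [(2500, 5000), (3000, 6000), (4200, 7000)]
average = average_percentile(hypothetical_results)
print(f"{average:.1f}%")  # 53.3%
```

A value close to 50% under this definition means the system finished around the middle of the field on average.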

It should be noted that the project is still at an early stage of development. In the future, the plan is to improve the quality of the generated code and to develop AlphaCode toward systems that help write code, or application development tools that people without programming skills can use.

Finally, if you are interested in knowing more about it, a key feature of the system is its ability to generate code in Python or C++ from a problem statement given as English text.

You can check the details at the following link.
