CodeCarbon, an open source tool that tracks the carbon emissions generated by machine learning research

The damage that greenhouse gas emissions cause to the climate is more than evident. To help the research community understand the contribution of artificial intelligence to climate change, and to adopt new research paradigms in which reducing emissions is treated as a critical performance measure, a group of international AI researchers and data scientists has collaborated to design software capable of estimating the carbon footprint of IT operations.

CodeCarbon is open source software designed to help companies monitor their AI carbon footprint.

Comet, a provider of MLOps solutions, has partnered with a group of AI and data science organizations from around the world to create the open source software: MILA, the AI research lab led by Yoshua Bengio in Montreal; BCG GAMMA, the analytics and data science division of Boston Consulting Group; and Haverford College in Pennsylvania.

About CodeCarbon

CodeCarbon is Python-based software that helps programmers make their code more efficient and reduce the amount of CO2 generated by the computing resources it uses, and that motivates them to do so.

The software not only estimates the amount of CO2 produced by the use of computing resources, it also advises developers on how to reduce emissions by hosting their cloud infrastructure in regions that use low-carbon energy sources.
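The region advice can be pictured as a simple comparison of grid carbon intensity. The following is a minimal sketch under stated assumptions, not CodeCarbon's actual implementation: the region names and intensity values are made-up placeholders, and `lowest_carbon_region` and `estimated_savings` are hypothetical helpers introduced only for illustration.

```python
# Illustrative only: region names and carbon intensities (kg CO2-eq per kWh)
# are placeholder values, not real measurements.
PLACEHOLDER_INTENSITY = {
    "region-a": 0.70,
    "region-b": 0.35,
    "region-c": 0.05,
}


def lowest_carbon_region(intensity_by_region):
    """Return the region whose grid has the lowest carbon intensity."""
    if not intensity_by_region:
        raise ValueError("no regions to compare")
    return min(intensity_by_region, key=intensity_by_region.get)


def estimated_savings(energy_kwh, current_region, intensity_by_region):
    """Estimate the kg CO2-eq saved by moving a job to the cleanest region."""
    best = lowest_carbon_region(intensity_by_region)
    delta = intensity_by_region[current_region] - intensity_by_region[best]
    return best, energy_kwh * delta


if __name__ == "__main__":
    best, saved = estimated_savings(100.0, "region-a", PLACEHOLDER_INTENSITY)
    print(f"Move to {best}: about {saved:.1f} kg CO2-eq saved")
```

The same amount of energy produces very different emissions depending on the grid that supplies it, which is why the choice of region matters.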

Yoshua Bengio, MILA founder and Turing Award winner, said:

“AI is a powerful technology and a force for good, but it is important to be aware of its growing environmental impact. The CodeCarbon project aims precisely to achieve this goal and I hope it will inspire the AI community to calculate, disclose and reduce their carbon footprint.”

Sylvain Duranton, Managing Director and Senior Partner at Boston Consulting Group (BCG) and Global Director at BCG GAMMA, said:

“Based on recent history, the use of IT in general, and AI in particular, will continue to grow exponentially around the world. In this context, CodeCarbon can help organizations to ensure that their collective carbon footprint increases as little as possible.”

In the deep learning-focused research environment, advancements in artificial intelligence are largely being achieved by creating larger models, aggregating larger data sets, and harnessing greater computing power.

Training a powerful learning algorithm can require the use of multiple computers over days or weeks.

For architectures such as VGG, BERT, GPT-2 and GPT-3, which have millions or billions of parameters and are trained on multiple GPUs for several weeks, this can mean a difference of several hundred kilograms of CO2-eq.

OpenAI's GPT-2, released in 2019, has 1.5 billion parameters, while its successor GPT-3, released in 2020, has 175 billion parameters, making it more than 100 times larger than its predecessor. As ever larger models continue to advance the field, the amount of energy consumed to train them will also increase.

CodeCarbon has a tracking module that records the amount of energy used by the underlying infrastructure, whether it runs at a major cloud computing provider or in a privately hosted on-premises data center.

The system then uses data from public sources to estimate the volume of CO2 generated, looking up the carbon intensity of the electricity grid to which the hardware is connected.
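The estimate described above boils down to two steps: turn power readings into energy, then multiply that energy by the carbon intensity of the local grid. The following is a minimal sketch of that arithmetic, not CodeCarbon's own code; the function names, the sampling interval and every number are assumptions chosen for illustration.

```python
# Sketch of the estimate described above; all numbers are placeholders.

def energy_kwh(power_samples_watts, interval_seconds):
    """Integrate evenly spaced power readings (watts) into energy (kWh)."""
    joules = sum(power_samples_watts) * interval_seconds
    return joules / 3_600_000  # 1 kWh = 3.6 million joules


def emissions_kg(energy_used_kwh, grid_intensity_kg_per_kwh):
    """Emissions = energy consumed x carbon intensity of the local grid."""
    return energy_used_kwh * grid_intensity_kg_per_kwh


if __name__ == "__main__":
    samples = [250.0] * 240         # 240 readings of 250 W (assumed values)
    used = energy_kwh(samples, 15)  # one reading every 15 seconds = 1 hour
    print(emissions_kg(used, 0.4))  # 0.4 kg CO2-eq/kWh is a placeholder
```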

The tracker estimates the CO2 produced by each experiment run with a particular AI model, and stores the emissions data both per project and for the organization as a whole.
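Storing emissions per experiment and rolling them up by project and organization could look something like the sketch below. The `EmissionsRecord` layout and both helper functions are hypothetical and do not reflect how CodeCarbon stores its data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EmissionsRecord:
    """One experiment's estimated emissions (hypothetical record layout)."""
    project: str
    experiment: str
    kg_co2eq: float


def totals_by_project(records):
    """Sum estimated emissions per project."""
    totals = defaultdict(float)
    for record in records:
        totals[record.project] += record.kg_co2eq
    return dict(totals)


def organization_total(records):
    """Sum estimated emissions across every project."""
    return sum(record.kg_co2eq for record in records)
```

Keeping one record per experiment means the per-project and organization-wide figures are plain sums that can be recomputed at any time.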

The idea is that CodeCarbon will help IT and AI companies limit their carbon footprint as they grow. It will also generate a dashboard that lets companies easily see the emissions generated by training their machine learning models.

Being able to track CO2 emissions is a significant step forward in helping developers use energy resources wisely and thereby reduce the impact of their work on an increasingly fragile environment.

