Masakhane, an open source project working on machine translation for Africa's roughly 2000 languages


When we hear about open source projects, programs or utilities for everyday work usually come to mind. But that is not the whole picture, since open source covers many more areas.

One of them is artificial intelligence, which is currently growing at a remarkable pace, even though a few years ago it was believed that it would not mature for many years.

Artificial intelligence (AI) is currently used in many ways, the most popular being the detection of objects, people and patterns, among other things. It is also used in translators, many of which are proprietary products owned by companies.

But in this case we will talk about an open source project that has aroused the interest of many, since it is being developed to meet a great need in Africa: communication. It is currently estimated that there are around 2000 languages in Africa.

Masakhane, a project that must succeed for the common good

The project in question is "Masakhane", which was founded by South African AI researchers Jade Abbott and Laura Martinus and brings together AI researchers and data scientists from across Africa.

When they met at a conference on machine learning and natural language processing (NLP) this year, they discussed a project to translate African languages using machine learning models and started Masakhane. The name of the project, "Masakhane", is a word that means "to do together" in Zulu.

The languages Masakhane aims to cover with machine translation include not only indigenous African languages, but also Nigerian Pidgin English and the Arabic spoken in North and Central Africa. Unlike European languages, these languages do not have established benchmarks or large data sets.

Beyond the many opportunities it could open up for Africans, Masakhane also lists benefits for the developers who take part: the success of African AI projects is a success for African AI researchers, and it could lead to restrictions on them being relaxed.

Masakhane currently has around 60 developers in Africa (South Africa, Kenya and Nigeria), each of whom collects data in their native language and trains a model.

In Kenya, English is often used in schools and other public places, but in everyday life each tribe uses its own language, so AI developer Siminyu felt there was a communication gap and decided to join Masakhane.

Siminyu believes that translating African languages with machine learning will lead to growth in the use of AI in Africa, helping people there to use AI in their lives. Siminyu argues that continent-wide projects like Masakhane are important for connecting African developers and research communities in long-term, sustainable collaboration.

"Language differences are a barrier, and removing the language barrier will allow many Africans to participate in the digital economy and, ultimately, the AI economy. I feel that it is the responsibility of those who participate in Masakhane to reach people who are not involved in the AI community," said Siminyu.

Masakhane's participants say the developer community in Africa is expanding rapidly and that the benefits of machine translation for African languages are significant.

"We can solve the problem. We have experts, we have knowledge and intelligence… I think they will become a foothold to contribute to the world," says one African developer.

Finally, if you want to know more about the project, you can check the details on its official website.
