OpenBytes, a new Linux Foundation project that aims to make open data more available and accessible

During the Linux Foundation Membership Summit, the Linux Foundation unveiled two major new projects: OpenBytes and the NextArch Foundation.

The first, OpenBytes, is the product of a partnership with the dataset management platform Graviti. The project is intended to be an "open data community" as well as a new standard and data format, primarily for artificial intelligence applications. NextArch, spearheaded by Tencent, is dedicated to creating software development architectures that support a variety of environments.

The goal of the OpenBytes project is to reduce legal risks for organizations and individuals interested in sharing their data sets with other AI/ML projects. Data controllers are often hesitant to share their data sets because of concerns about licensing restrictions.

According to the Linux Foundation, being able to assure data administrators that their data rights are protected and that their data will not be misused will help make more data sets open and accessible.

“The OpenBytes project and community will benefit all AI developers, academics and professionals alike, at large and small companies, by providing access to higher-quality open data sets and making AI deployments faster and easier,” said Mike Dolan, general manager and senior vice president of projects at the Linux Foundation.

The legal risks surrounding artificial intelligence and machine learning are evident in several recent lawsuits. Last year, for example, IBM was sued for allegedly violating the Illinois Biometric Information Privacy Act by including photographs of the plaintiff in its Diversity in Faces dataset. Separate lawsuits were also filed last year against Amazon, Google, Microsoft, and facial recognition company FaceFirst for allegedly using that dataset to train their facial recognition algorithms.

Against this backdrop, OpenBytes will enable a community of developers and data scientists, led by Graviti, to create standards and data formats that allow everyone to contribute.

“For a long time, dozens of artificial intelligence projects have been held back by a widespread lack of high-quality data from real-world use cases,” says Edward Cui, founder of Graviti and a former machine learning expert in Uber's Advanced Technologies Group. “Acquiring better data is essential for AI development to advance. To achieve this, there is an urgent need to create an open data community based on collaboration and innovation. Graviti believes that it is our social responsibility to play our role.”

By creating an open data format and standard, the OpenBytes project can reduce liability risks for data contributors. Owners of data sets are often reluctant to share them publicly because they are unfamiliar with the various data licenses. If data providers understand that ownership of their data is well protected and that it will not be misused, more open data will become accessible.

The OpenBytes project will also create a standard format for data published, shared and exchanged on its open platform. A unified format will help data providers and consumers easily find the relevant data they need and facilitate collaboration. These OpenBytes features will make high-quality data more available and accessible, which is valuable to the entire AI community and will save resources otherwise spent on repetitive data collection.

“The OpenBytes project and community will benefit all AI developers, whether academic or professional, in companies large or small, by enabling access to more high-quality open data sets and making AI implementation quicker and easier,” says Mike Dolan, general manager and senior vice president of projects at the Linux Foundation.

Finally, if you are interested in learning more, you can consult the details at the following link.
