At the beginning of the month, the developers of Manjaro Linux (an Arch Linux-based distribution) announced that they have started testing a new telemetry system called MDD (Manjaro Data Donor).
According to the developers, the service is designed to collect statistics from the system and send them to a central project server, in order to obtain more precise data on the actual number of users and their settings.
The initial proposal was for MDD telemetry to be enabled by default, following an opt-out model (users would have to deactivate it manually). However, this idea has been criticized both by some developers and by the user community, who consider that it could go against the principles of privacy and transparency that many Linux users value.
In response to these concerns, it appears that Manjaro Data Donor is more likely to be implemented under an opt-in model, where the user explicitly gives consent. The idea is to include an option to activate MDD in the welcome wizard shown after the first login.
As for the motivation, Manjaro currently estimates the number of users by analyzing the requests sent to its server via NetworkManager. While this method provides some data, it is not precise enough, as it has several limitations:
- It does not provide an exact estimate because of how IP addresses behave (users with dynamic IPs or working behind NAT).
- It does not allow for reliable statistical tracking of users over time.
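To see why counting by IP address is imprecise, consider a small made-up request log. The sketch below counts distinct addresses the way an IP-based estimate would. The log format, the addresses (taken from documentation ranges), and the `count_unique_ips` helper are all hypothetical and are not Manjaro's actual method.

```shell
#!/bin/sh
# Hypothetical request log: one line per request, "<ip> <day>".
# Four real users produce these five requests:
#   - user A has a dynamic IP and appears under two addresses
#   - users B, C and D sit behind the same NAT and share one address
sample_log() {
    cat <<'EOF'
203.0.113.10 day1
203.0.113.77 day2
198.51.100.5 day1
198.51.100.5 day1
198.51.100.5 day1
EOF
}

# Count distinct IP addresses (first field) read from standard input.
count_unique_ips() {
    awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

echo "Unique IPs: $(sample_log | count_unique_ips)"
```

The count comes out as three addresses for four users: the dynamic IP inflates the figure while the shared NAT address deflates it, and nothing in the log links user A's two addresses over time.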
For this reason, the project seeks to overcome these limitations with MDD by collecting more precise and specific data about each system, such as its hardware configuration, the desktop environment used, and the Manjaro version in use.
If implemented correctly, MDD could offer significant benefits to the development team, helping them to:
- Prioritize features and optimizations based on the most commonly used hardware and environments.
- Gain a better understanding of distribution growth and usage.
- Analyze usage trends to adjust distribution development.
- Improve planning for new releases, based on observed performance in various configurations.
As for how MDD works, the developers explain that it uses the inxi tool, running it with the -Fxxx parameter, which generates a detailed system report. This report includes:
- General information: host name, kernel version, and desktop component versions.
- Hardware: data about processor, GPU, RAM, storage, partitions, and disks (including serial numbers).
- Screens: size, resolution, and configuration.
- Network: MAC addresses of network devices.
- Software and processes: base tool versions (such as systemd, gcc, bash, PipeWire), installed packages, and number of running processes.
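Because the report includes serial numbers and MAC addresses, it can be useful to check which lines of a report carry such identifiers. The following is a minimal sketch, assuming the report labels these fields as `serial:` and `mac:`; the `list_sensitive_lines` helper is hypothetical and is not part of MDD, and the sample report lines are invented.

```shell
#!/bin/sh
# Print only the report lines that carry a "serial:" or "mac:" field.
# Reads the report from standard input; prints nothing if there is no match.
list_sensitive_lines() {
    grep -E '(serial|mac): ' || true
}

# Invented sample lines, loosely shaped like a system report.
printf '%s\n' \
    'Kernel: example-kernel' \
    'ID-1: /dev/sda vendor: Example serial: ABC123' \
    'IF: eth0 state: up mac: 00:00:5e:00:53:01' |
    list_sensitive_lines
```

On a machine with inxi installed, the same filter can be applied to the real report with `inxi -Fxxx -c 0 | list_sensitive_lines`, where `-c 0` turns off color codes.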
As for privacy and anonymization, the level of detail of the data collected, such as disk serial numbers or MAC addresses, has raised concerns about potential privacy risks. Although the developers assure that the data is anonymized and that no IP addresses are stored, the inclusion of certain sensitive elements could be seen as unnecessary.
As for how Manjaro will handle the collected data, it will be sent to the project server and stored in a ClickHouse database, while the analysis and visualization of the statistics is done with Grafana, a tool known for its ability to create interactive and dynamic dashboards.
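The announcement does not detail how the anonymization is performed. One common approach, shown here purely as an assumption and not as confirmed MDD behavior, is to replace an identifier with a salted hash before it leaves the machine. The `pseudonymize` helper, the salt, and the MAC address below are all hypothetical.

```shell
#!/bin/sh
# Replace an identifier with a salted SHA-256 digest (hex string).
# Usage: pseudonymize <salt> <identifier>
# Assumption: this is an illustration only, not how MDD processes data.
pseudonymize() {
    printf '%s:%s' "$1" "$2" | sha256sum | awk '{ print $1 }'
}

# Example with a made-up salt and a documentation-range MAC address.
pseudonymize "example-salt" "00:00:5e:00:53:01"
```

Note that this is pseudonymization rather than full anonymization: the same input always yields the same digest, so records from one machine can still be linked to each other.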
Finally, users interested in reviewing the data can run the tool in dry-run mode to see what would be sent:
mdd --dry-run
If the user agrees to the data being transmitted, they simply need to run MDD again, this time without any arguments:
mdd
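The two steps can be combined into a small wrapper that shows the dry-run output first and only transmits after an explicit confirmation. This is a sketch that relies only on the two invocations mentioned above; the `review_and_send` function and its prompt are hypothetical and are not shipped with MDD.

```shell
#!/bin/sh
# Show what MDD would send, then transmit only after explicit confirmation.
# The answer is read from standard input; anything other than y/Y cancels.
review_and_send() {
    mdd --dry-run || return 1
    printf 'Send this data to the project server? [y/N] '
    read -r answer
    case "$answer" in
        y|Y) mdd ;;
        *)   echo "Nothing was sent." ;;
    esac
}
```

After sourcing the snippet, calling `review_and_send` runs the review and asks before sending. Defaulting to "no" mirrors the opt-in model discussed earlier.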
If you are interested in learning more, you can check the details at the following link.