Sapling, a Git-compatible source code control system


Sapling emphasizes ease of use while scaling to the world's largest repositories.

Facebook has unveiled, in a blog post, Sapling, the source code management system used to develop the company's internal projects. The system aims to provide a familiar version control interface that can scale to very large repositories spanning tens of millions of files, commits, and branches.

The main idea of the system is that, by interacting with a dedicated server component that provides repository storage, all operations scale with the number of files actually used in the code the developer is working on, and do not depend on the total size of the repository.

For example, a developer may use only a small portion of the code in a very large repository, and only that portion, not the entire repository, is transferred to their system. The working directory is populated dynamically as repository files are accessed. On the one hand, this significantly speeds up work with your part of the code; on the other hand, the first access to new files is slower and constant network access is required (an offline mode for preparing commits is provided separately).
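The on-demand behavior described above can be illustrated with a minimal Python sketch. It is a toy model, not Sapling's implementation: the class LazyWorkingCopy, the SERVER_FILES dictionary, and the fetch_count counter are all hypothetical names introduced for illustration, and a dictionary lookup stands in for the network request to the server.

```python
from typing import Callable, Dict


class LazyWorkingCopy:
    """Toy model of a working directory that is filled on first access.

    All names are hypothetical; this is not Sapling's API.
    """

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                  # stands in for a request to the server
        self._local: Dict[str, bytes] = {}   # files already materialized locally
        self.fetch_count = 0                 # number of (slow) remote fetches made

    def read(self, path: str) -> bytes:
        # The first access to a file pays the network cost;
        # every later access is served from the local copy.
        if path not in self._local:
            self._local[path] = self._fetch(path)
            self.fetch_count += 1
        return self._local[path]


# A pretend server holding a very large repository.
SERVER_FILES = {
    f"project{i}/main.py": f"# project {i}\n".encode() for i in range(100_000)
}

wc = LazyWorkingCopy(SERVER_FILES.__getitem__)
wc.read("project7/main.py")
wc.read("project7/main.py")    # already local, no second fetch
wc.read("project42/main.py")
print(wc.fetch_count, "fetches for", len(SERVER_FILES), "files on the server")
```

In this model the work done depends on the two files the developer touched rather than on the 100,000 files held by the server, which is the trade-off the paragraph describes: fast work on your own part of the code, at the price of a slower first access to each new file.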

In addition to adaptive data loading, Sapling also implements optimizations aimed at reducing the amount of change-history data that has to be loaded (for example, three quarters of the data in the Linux kernel repository is change history).

To work efficiently with the change history, the associated data is stored in a segmented form, which makes it possible to download individual parts of the commit graph from the server. The client can ask the server about the relationship between several commits and download only the part of the graph it needs.
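One way to picture a segmented commit graph is the Python sketch below. It rests on assumptions that do not come from the article: commits are given integer ids in which every parent has a smaller id than its child, and a segment is a linear run of consecutive ids whose first commit lists its parents explicitly. The names Segment and SegmentedGraph are hypothetical, and Sapling's actual data structure is more elaborate.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    low: int                  # first commit id in the linear run
    high: int                 # last commit id in the linear run
    parents: Tuple[int, ...]  # parents of `low`; any other id's parent is id - 1


class SegmentedGraph:
    """Toy commit graph stored as linear segments (hypothetical model)."""

    def __init__(self, segments: List[Segment]) -> None:
        self.segments = sorted(segments, key=lambda s: s.low)
        self._lows = [s.low for s in self.segments]

    def segment_of(self, commit: int) -> Segment:
        # Binary search for the segment whose id range contains `commit`.
        seg = self.segments[bisect_right(self._lows, commit) - 1]
        assert seg.low <= commit <= seg.high, "unknown commit id"
        return seg

    def is_ancestor(self, ancestor: int, descendant: int) -> bool:
        """Return True if `ancestor` is `descendant` or one of its ancestors."""
        stack, seen = [descendant], set()
        while stack:
            commit = stack.pop()
            if commit < ancestor:
                continue              # ancestors always have smaller ids
            seg = self.segment_of(commit)
            if seg.low <= ancestor:
                return True           # both lie on the same linear run
            if seg.low not in seen:   # hop a whole segment in one step
                seen.add(seg.low)
                stack.extend(seg.parents)
        return False


# main line 0..4, a branch 5..6 forked from commit 2, merged at 7, then 8.
graph = SegmentedGraph([
    Segment(0, 4, ()),
    Segment(5, 6, (2,)),
    Segment(7, 8, (4, 6)),
])
print(graph.is_ancestor(1, 6))   # commit 1 precedes the fork point
print(graph.is_ancestor(3, 6))   # commit 3 is on main, after the fork
print(graph.is_ancestor(5, 8))   # the branch was merged before commit 8
```

Each query here touches a handful of segments rather than every commit, and segments that are never visited would not need to be present on the client, which gives a feel for why a segmented layout lets the client fetch only part of the graph.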

The project has been in development for the past 10 years and was created to solve the problems of accessing very large monolithic repositories with a master branch, where "rebase" is used instead of "merge".

At that time, there were no open solutions for working with such repositories, and Facebook engineers decided to create a new version control system that would meet the company's needs rather than split projects into small repositories, which would have complicated dependency management (Microsoft, facing a similar problem, created the GVFS layer).

Initially, Facebook used Mercurial, and Sapling began as an add-on to it. Over time, the system became an independent project with its own protocol, storage format, and algorithms, and was also extended with the ability to interact with Git repositories.

The "sl" command-line utility is provided for working with the system; it implements typical concepts, workflows, and an interface familiar to developers who know Git and Mercurial. The terminology and commands in Sapling differ slightly from Git and are closer to Mercurial.

Among Sapling's additional features is the "smart log" (smartlog), which lets you visually assess the state of your repository by highlighting the most important information and filtering out minor details. For example, when you run the sl utility with no arguments, it displays only your own local changes (other people's are collapsed), along with the status of external branches, changed files, and new versions of commits. In addition, an interactive web interface is provided for quick navigation through the smart log, change tree, and commits.

Another notable improvement in Sapling is that it makes fixing and analyzing errors and reverting to a previous state much easier. For example, the commands "sl undo", "sl redo", "sl uncommit" and "sl unmend" are provided for reversing many operations, "sl hide" and "sl unhide" temporarily hide commits, and there is support for interactive navigation through states. Sapling also supports the concept of a commit stack, which lets you organize a review step by step by breaking complex functionality down into a smaller, more understandable set of incremental changes (from a basic framework to a final feature).
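A common way to make operations reversible is to record the repository state after each command and move back and forth through that record. The Python sketch below illustrates that general idea only; it is an assumption for illustration, not a description of how "sl undo" and "sl redo" are implemented, and the names UndoLog and State are hypothetical.

```python
from typing import Dict, List

# Hypothetical model of repository state: bookmark name -> commit id.
State = Dict[str, str]


class UndoLog:
    """Toy operation log that supports stepping backward and forward."""

    def __init__(self, initial: State) -> None:
        self._states: List[State] = [dict(initial)]
        self._pos = 0   # index of the state currently in effect

    @property
    def current(self) -> State:
        return dict(self._states[self._pos])

    def record(self, new_state: State) -> None:
        # A new operation discards any states that had been undone.
        del self._states[self._pos + 1:]
        self._states.append(dict(new_state))
        self._pos += 1

    def undo(self) -> State:
        if self._pos > 0:
            self._pos -= 1
        return self.current

    def redo(self) -> State:
        if self._pos < len(self._states) - 1:
            self._pos += 1
        return self.current


log = UndoLog({"main": "a1"})
log.record({"main": "a1", "feature": "b2"})   # e.g. after creating a commit
log.record({"main": "a1", "feature": "c3"})   # e.g. after amending it
print(log.undo())   # back to the state before the amend
print(log.redo())   # forward again to the amended state
```

Because every state is kept rather than overwritten, stepping back never destroys information, which is what makes it safe to experiment and then return to an earlier point.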

Separately, a server component was developed for efficient remote work with repositories, along with a virtual file system that lets you work with a local slice of the repository as if it were the complete repository (the developer sees the entire repository, but only the data actually accessed is copied to the local system).

The code for these components as used in Facebook's infrastructure is not open yet, but the company has promised to release it in the future. However, prototypes of the Mononoke server (in Rust) and the EdenFS virtual file system (in C++) can already be found in the Sapling repository. These components are optional: the Sapling client alone is enough to work with, and it supports cloning Git repositories, interacting with Git LFS-based servers, and working with Git hosts such as GitHub.

Several plugins have been prepared for Sapling, including the ReviewStack interface for reviewing changes (code under GPLv2), which lets you work with pull requests on GitHub using a stacked view of changes.

If you are interested in knowing more about it, you can consult the details in the following link.
