Micron Technology (a company specializing in the production of DRAM and flash memory) has unveiled a new storage engine called "HSE" (Heterogeneous-memory Storage Engine), designed around the specifics of solid state drives (X100, TLC and QLC 3D NAND) and persistent memory (NVDIMM).
The engine takes the form of a library that can be embedded in other applications and supports processing data in key-value format. The HSE code is written in C and distributed under the Apache 2.0 license.
The intended applications of the engine include low-level data storage in NoSQL DBMSs, software-defined storage (SDS) systems such as Ceph and Scality RING, platforms for processing large amounts of data (Big Data), high-performance computing (HPC) systems, Internet of Things (IoT) devices and solutions for machine learning systems.
HSE is optimized not only for maximum performance, but also for the durability of various kinds of SSDs. The high speed is achieved through a hybrid storage model: the most relevant data is cached in memory, which reduces the number of accesses to the drive.
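The effect of caching the most relevant data can be illustrated with a small sketch. It does not reflect how HSE implements its cache: it assumes a simple least-recently-used policy, and the `CachedStore` class, the `demo` function and the dictionary standing in for the drive are all hypothetical names invented for this example.

```python
from collections import OrderedDict


class CachedStore:
    """Toy read-through LRU cache in front of a slow backing store."""

    def __init__(self, backing, capacity):
        self.backing = backing          # dict standing in for the SSD (assumption)
        self.capacity = capacity        # maximum number of cached entries
        self.cache = OrderedDict()      # key -> value, least recently used first
        self.disk_reads = 0             # how often a read had to go to "disk"

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)      # mark as most recently used
            return self.cache[key]
        self.disk_reads += 1                 # cache miss: read the backing store
        value = self.backing[key]
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used entry
        return value


def demo():
    backing = {f"key{i}": f"value{i}" for i in range(100)}
    store = CachedStore(backing, capacity=10)
    # Skewed workload: 90 reads spread over 5 hot keys, then 10 reads of cold keys.
    workload = [f"key{i % 5}" for i in range(90)]
    workload += [f"key{i}" for i in range(50, 60)]
    for key in workload:
        store.get(key)
    return len(workload), store.disk_reads
```

In this toy workload, `demo()` performs 100 reads but only 15 of them reach the backing store, because the five hot keys are served from the cache after their first access.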
As an example of integrating the new engine into third-party projects, a version of the document-oriented DBMS MongoDB has been prepared that was ported to use HSE.
Technically, HSE relies on an additional kernel module, mpool, which implements a specialized object storage interface for solid state drives. The interface takes the capabilities and characteristics of these drives into account, which makes it possible to achieve fundamentally different levels of speed and durability. Mpool is also a Micron Technology development and was open-sourced at the same time as HSE, but it is maintained as a separate infrastructure project. Mpool is designed with persistent memory and zoned storage in mind, but currently only traditional SSDs are supported.
Performance testing with the YCSB (Yahoo Cloud Serving Benchmark) package showed a significant increase in performance when using a 2 TB store with 1 KB data blocks. The increase is especially pronounced in the test with an even distribution of read and write operations.
For example, MongoDB with the HSE engine turned out to be about 8 times faster than the version with the standard WiredTiger engine, and HSE outperformed the RocksDB DBMS by more than 6 times. Strong results are also seen in the test with 95% read operations and 5% updates or inserts.
Another test, which involves only read operations, shows a gain of about 40%. The increase in SSD endurance during write operations compared to the RocksDB-based solution is estimated at 7 times.
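The three operation mixes described above (an even read/write split, 95% reads with 5% writes, and reads only) can be sketched as a small generator. This is not YCSB itself and does not use its configuration format; the names `MIXES`, `generate_ops` and `read_share` are hypothetical and serve only to make the proportions concrete.

```python
import random

# Read proportions for the three mixes described in the text.
# The labels are made up for this sketch and are not YCSB identifiers.
MIXES = {
    "balanced": 0.50,     # even split of reads and writes
    "read_mostly": 0.95,  # 95% reads, 5% updates or inserts
    "read_only": 1.00,    # reads only
}


def generate_ops(read_fraction, count, seed=0):
    """Return a list of 'read'/'write' operations with the given read share."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return ["read" if rng.random() < read_fraction else "write"
            for _ in range(count)]


def read_share(ops):
    """Fraction of the operations in ops that are reads."""
    return ops.count("read") / len(ops)
```

Feeding such a sequence to a store under test and timing it gives a rough idea of how a benchmark exercises different read/write ratios, although a real benchmark also controls key distribution, record size and concurrency.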
Key Features of HSE:
- Support for standard and advanced operators to process data in key-value format.
- Full transaction support, with the ability to isolate storage segments by creating snapshots (snapshots can also be used to maintain separate collections in a store).
- Ability to use cursors to traverse data in snapshot-based views.
- A data model optimized for mixed workload types in a single store.
- Flexible mechanisms to manage storage reliability.
- Customizable data orchestration schemes (distribution across the different types of memory present in the store).
- A library with a C API which can be dynamically linked to any application.
- The ability to scale to terabytes of data and hundreds of billions of keys in storage.
- Efficient processing of thousands of parallel operations.
- Significant increase in throughput, reduced latency, and increased read/write endurance for various types of workloads compared to typical alternative solutions.
- The ability to use different classes of SSDs in the same storage to optimize performance and durability.
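To make the snapshot and cursor features in the list above more concrete, here is a minimal in-memory sketch of the general idea: a snapshot freezes a view of the store, and a cursor walks that view in key order. This is not the HSE C API. `ToyKVStore`, `Snapshot` and their methods are hypothetical names, and copying the whole dictionary is a simplification chosen for brevity.

```python
class ToyKVStore:
    """In-memory key-value store with point-in-time snapshots (illustration only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)

    def snapshot(self):
        # Copying the dict is the simplest way to freeze a view; it is an
        # assumption of this sketch, not how a production engine works.
        return Snapshot(dict(self._data))


class Snapshot:
    """Immutable view of the store as of the moment it was taken."""

    def __init__(self, frozen):
        self._frozen = frozen

    def get(self, key):
        return self._frozen.get(key)

    def cursor(self, prefix=""):
        """Yield (key, value) pairs in key order, optionally filtered by prefix."""
        for key in sorted(self._frozen):
            if key.startswith(prefix):
                yield key, self._frozen[key]


def demo():
    store = ToyKVStore()
    store.put("user:1", "alice")
    store.put("user:2", "bob")
    snap = store.snapshot()          # freeze the current state
    store.put("user:3", "carol")     # later writes do not affect the snapshot
    store.delete("user:1")
    frozen = list(snap.cursor("user:"))
    live = list(store.snapshot().cursor("user:"))
    return frozen, live
```

In `demo()`, the cursor over the earlier snapshot still sees `user:1` and `user:2`, while a cursor over a fresh snapshot sees `user:2` and `user:3`, which is the isolation property that snapshot-based cursors are meant to provide.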
You can access the engine code from the link below.