A JIT compiler for Python based on the Copy-and-Patch technique has been proposed

Python is a high-level programming language.

Recently, one of the core CPython developers unveiled a new JIT compiler for Python based on Copy-and-Patch, a recent compilation technique that stands out for its speed, ease of maintenance, and complete integration with the existing interpreter.

Copy-and-Patch relies on a predefined library of binary code fragments, known as "templates", to emit optimized machine code. These templates are prebuilt implementations of AST (Abstract Syntax Tree) nodes or bytecode opcodes that contain missing values, such as immediate literals, stack variable offsets, and branch and call targets.

The technique makes it possible to systematically generate variants of the binary templates in C++ in a clean way, and it uses the Clang+LLVM compiler infrastructure to hide low-level, platform-specific details.

At runtime, optimization and code generation become simple tasks: look up the appropriate template in a data table, create an instance of it, and place it in the desired position using the Copy-and-Patch process, filling in the missing values as the code is emitted.
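The lookup, copy, and patch steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `TEMPLATES` table, the `copy_and_patch` helper, and the byte values are hypothetical stand-ins, not the templates a real compiler would produce. The result is only a byte string and is never executed.

```python
import struct

# Hypothetical template table (illustrative bytes only). Each entry holds
# precompiled code plus the offsets of the 4-byte gaps left to be patched.
GAP = b"\x00\x00\x00\x00"

TEMPLATES = {
    "LOAD_CONST": {"code": b"\xb8" + GAP, "gaps": [1]},
    "RETURN": {"code": b"\xc3", "gaps": []},
}


def copy_and_patch(name, values):
    """Instantiate the template `name`, writing `values` into its gaps."""
    template = TEMPLATES[name]                 # look up the template
    if len(values) != len(template["gaps"]):
        raise ValueError("wrong number of values for template " + name)
    code = bytearray(template["code"])         # Copy
    for offset, value in zip(template["gaps"], values):
        # Patch: store the value as a little-endian signed 32-bit integer.
        struct.pack_into("<i", code, offset, value)
    return bytes(code)


instance = copy_and_patch("LOAD_CONST", [42])
print(instance.hex())  # b82a000000
```

Note that no instruction selection or register allocation happens at runtime in this sketch: all of that work is assumed to have been done when the template was built, which is where the speed of the approach comes from.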

Put more simply, it consists of copying existing precompiled code (Copy) and then adjusting the missing values or making specific modifications (Patch).

Copy-and-Patch greatly simplifies the automatic conversion of an interpreter written in C into a JIT compiler, eliminating the need to write code generation logic and compilation representations separately. Because both share a common code generator, fixing a bug in the interpreter automatically fixes the same problem in the JIT.

The Copy-and-Patch approach relies on the observation that relocating code in memory when the linker loads object files and substituting machine instructions for bytecode in a JIT are similar tasks. During program execution, the bytecode instructions generated by the interpreter are walked through, and the precompiled machine code for each instruction is copied into an executable memory area; the copied instructions are then modified on the fly to fill in the data being processed. In the case of the JIT, predefined templates are copied from already compiled functions and patched with the necessary values, such as arguments and constants.
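The analogy with linker relocations can be sketched as follows. This is a toy model under stated assumptions: the opcode names, the `Stencil` and `Hole` types, the `compile_trace` function, and the placeholder bytes are all hypothetical, and jump targets are written as plain offsets into the output buffer rather than real addresses.

```python
import struct
from typing import NamedTuple


class Hole(NamedTuple):
    offset: int  # position of the 4-byte gap inside the stencil
    kind: str    # "oparg" = instruction argument, "target" = jump destination


class Stencil(NamedTuple):
    code: bytes   # precompiled bytes (placeholders here, not real machine code)
    holes: tuple  # relocation-like records describing what to patch


# Hypothetical stencils for a toy three-opcode bytecode.
STENCILS = {
    "PUSH": Stencil(b"\x01" + b"\x00" * 4, (Hole(1, "oparg"),)),
    "ADD": Stencil(b"\x02", ()),
    "JUMP": Stencil(b"\x03" + b"\x00" * 4, (Hole(1, "target"),)),
}


def compile_trace(bytecode):
    """Copy one stencil per bytecode instruction and patch its holes."""
    # First pass: record where each instruction's code will start, so that
    # jumps can be resolved the way a linker resolves symbol addresses.
    starts, position = [], 0
    for op, _ in bytecode:
        starts.append(position)
        position += len(STENCILS[op].code)

    out = bytearray()
    for op, arg in bytecode:
        stencil = STENCILS[op]
        base = len(out)
        out += stencil.code                     # Copy
        for hole in stencil.holes:              # Patch
            value = arg if hole.kind == "oparg" else starts[arg]
            struct.pack_into("<I", out, base + hole.offset, value)
    return bytes(out)


# PUSH 2; PUSH 3; ADD; JUMP to instruction index 2 (the ADD).
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("JUMP", 2)]
machine_code = compile_trace(program)
print(machine_code.hex())  # 0102000000010300000002030a000000
```

In a real JIT the output buffer would be executable memory and the patched values would be actual addresses and constants; the sketch only shows the bookkeeping that makes the process resemble relocation.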

Implementing a JIT with the Copy-and-Patch technique involves compiling an object file in ELF format using LLVM. This object file contains information about the bytecode instructions and details about the data that must be substituted. During execution, the JIT replaces the bytecode instructions generated by the interpreter with their machine code representations, while patching in the data needed for the calculations. Although the JIT requires LLVM as a build-time dependency, the runtime components have no external dependencies and come down to approximately 300 lines of handwritten C code and 3000 lines of generated C code.

In terms of performance, the proposed JIT with the Copy-and-Patch technique shows notable improvements over traditional approaches. Compared with a conventional JIT (LLVM -O0), code generation is 100 times faster and the resulting code is 15% more efficient. Compared with WebAssembly compilation (Liftoff), code generation is 5 times faster and the resulting code runs 50% faster.

When compared with an optimizing JIT such as LuaJIT, which uses hand-written assembly code, the proposed JIT came out ahead in 13 of 44 tests. Although on average it trailed in performance by 35%, this difference is offset by significantly simpler maintenance and lower implementation complexity. This balance between performance and maintainability makes the proposed JIT an attractive alternative.

Finally, if you are interested in learning more about it, you can check the details in the following link.

