PaSh passes into the hands of the Linux Foundation

Several days ago the PaSh project (which develops tools for the parallel execution of shell scripts) and the Linux Foundation announced that the project will move to the Foundation, which will provide the infrastructure and services needed to continue development.

PaSh has made great strides in parallelizing shell scripts, achieving significant performance improvements. On modern multiprocessor computers, PaSh can complete tasks such as web crawling and indexing, COVID-19-related analytics, natural language processing, and other workloads in a fraction of their original time.

The Linux Foundation, the non-profit organization that enables massive innovation through open source, announced today that it will host the PaSh project. PaSh is a system for automatically parallelizing POSIX shell scripts that optimizes programs and speeds up execution times, generating faster results for data scientists, engineers, biologists, economists, administrators, and programmers.

The project is supported by MIT, Rice University, Stevens Institute of Technology, and the University of Pennsylvania and is governed by a Technical Steering Committee that includes Nikos Vasilakis, a research scientist at MIT; Michael Greenberg, assistant professor at the Stevens Institute of Technology; and Konstantinos Kallas, Ph.D. student at the University of Pennsylvania.

PaSh includes a JIT compiler, a runtime, and an annotation library:

  • The runtime provides a set of primitives that support the parallel execution of scripts.
  • The annotation library defines a set of properties describing the situations in which individual POSIX and GNU Coreutils commands can be parallelized.
  • The compiler parses the given shell script on the fly into an abstract syntax tree (AST), divides it into fragments suitable for parallel execution, and builds from them a new version of the script whose parts can run simultaneously.
    The compiler takes the information about which commands can be parallelized from the annotation library. While generating the parallel version of the script, it inserts additional runtime constructs into the code.
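To give a feel for the kind of property such annotations describe, here is a small hand-written sketch. It is not PaSh output and does not use PaSh's annotation format; the sample data, file names, and variable names are all made up for illustration. It contrasts a command that treats every line independently (tr), where chunk outputs can simply be concatenated in order, with a command whose per-chunk results need a merge step (wc -l, where the counts are summed).

```shell
#!/bin/sh
# Hand-written illustration of data parallelism; NOT generated by PaSh.
set -eu
LC_ALL=C; export LC_ALL

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical sample input, one word per line.
printf '%s\n' pear Apple fig apple Pear kiwi Fig plum > "$workdir/input.txt"

# Split the input into two chunks of four lines each (chunk.aa, chunk.ab).
split -l 4 "$workdir/input.txt" "$workdir/chunk."

# Stateless command: each line is handled on its own, so the chunks can be
# processed concurrently and the outputs concatenated in the original order.
for chunk in "$workdir"/chunk.??; do
    tr '[:upper:]' '[:lower:]' < "$chunk" > "$chunk.lower" &
done
wait
stateless_seq=$(tr '[:upper:]' '[:lower:]' < "$workdir/input.txt")
stateless_par=$(cat "$workdir"/chunk.??.lower)

# Command that needs a merge step: per-chunk line counts are summed.
for chunk in "$workdir"/chunk.??; do
    wc -l < "$chunk" > "$chunk.count" &
done
wait
count_seq=$(wc -l < "$workdir/input.txt" | tr -d ' ')
count_par=$(cat "$workdir"/chunk.??.count | awk '{ total += $1 } END { print total }')

if [ "$stateless_seq" = "$stateless_par" ]; then
    echo "lowercase outputs match"
fi
echo "line counts: sequential=$count_seq parallel=$count_par"
```

A command such as sort falls into the second group as well: the chunks can be sorted independently, but the results have to be combined with a merge (for example sort -m) rather than plain concatenation.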

"The Linux Foundation provides the technical governance infrastructure and services that PaSh has come to require as it has grown more mature," said Nikos Vasilakis, Chairman of the PaSh Project Technical Steering Committee. "We built the project to improve and speed up shell script execution in the face of new crawling, indexing, and natural language processing changes."

"Shell scripts have been widely used for half a century, and recent trends toward 'containerization' have only increased their importance," said Michael Greenberg, member of the PaSh Project Technical Steering Committee. "Correct and automated parallelization of shell scripts has been a problem for several decades. PaSh promises a speed boost for shell users of all kinds."

To speed up shell scripts, PaSh provides a source-to-source parallelization compiler, a program that takes a programmer's shell script as input and returns a new program that is significantly faster than the original program. 

Because PaSh is source-to-source, the optimized shell script can be inspected and executed using the same tools, in the same environment, and with the same data as the original script.
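As a rough sketch of what that means in practice, the example below runs an original script and a parallel counterpart with the same interpreter on the same input and compares the results with cmp. The file rewritten.sh is written by hand here and only stands in for an optimized script; it is not what PaSh would actually emit, and all file names and data are hypothetical.

```shell
#!/bin/sh
# Illustration only: rewritten.sh is written by hand, not emitted by PaSh.
set -eu
LC_ALL=C; export LC_ALL

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample input, one word per line.
printf '%s\n' pear apple fig kiwi plum lime date > input.txt

# The original script: a plain sequential sort.
cat > original.sh <<'EOF'
sort input.txt
EOF

# A hand-parallelized counterpart: sort the chunks concurrently, then merge.
cat > rewritten.sh <<'EOF'
split -l 4 input.txt part.
for part in part.??; do
    sort "$part" > "$part.sorted" &
done
wait
sort -m part.??.sorted
EOF

# Both files are ordinary shell scripts, so the same interpreter and the same
# comparison tools work on each of them.
sh original.sh > expected.txt
sh rewritten.sh > actual.txt

if cmp -s expected.txt actual.txt; then
    result="identical"
else
    result="different"
fi
echo "outputs are $result"
```

Since the rewritten version is itself readable shell, it can also be opened in an editor or traced with sh -x like any other script.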

A small runtime library and a set of annotations for programs commonly used in shell scripts complete the picture, providing the PaSh compiler with high-performance primitives and supporting its key functions.

"The PaSh Project represents innovation in computer science and open source software," said Mike Dolan, general manager and senior vice president of Projects at the Linux Foundation. "As software development evolves to address machine learning, containerization, artificial intelligence and more, PaSh appears to support developers and data scientists who need more from their scripting tools. We are happy to host this important work at the Linux Foundation, a natural home for a project like this."

Finally, if you are interested in learning more about the announcement, you can consult the details at the following link.

