A Short PaSh Tutorial

Quick jump: Introduction | Running Scripts | What Next?

This short tutorial covers PaSh’s main functionality. Before proceeding, make sure you have installed PaSh.

Introduction

PaSh is a system for parallelizing POSIX shell scripts. It has been shown to achieve order-of-magnitude performance improvements on shell scripts.

N.b.: PaSh is still under heavy development.

Example Script

Consider the following spell-checking script, applied to two large markdown files f1.md and f2.md (line 1):
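The listing itself does not appear here, so the following is a reconstruction based on the description in this section and on the stages visible in the compiler output later in this tutorial; it is not the original file. The tiny sample inputs, the dictionary contents, and the scratch directory are assumptions made so that the sketch runs on its own, and the line numbers cited in the text refer to the original listing rather than to this sketch.

```sh
export LC_ALL=C                      # byte-wise ordering so sort and comm agree
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Stand-in inputs (assumption: the real f1.md and f2.md are large files).
printf 'Hello wrold, this is a test.\n' > f1.md
printf 'Another tset of the shell!\n' > f2.md
# comm expects a sorted dictionary, one word per line.
printf '%s\n' a another hello is of shell test the this > dict.txt

# First pipeline: normalize, sort, deduplicate, drop dictionary words.
cat f1.md f2.md |
  tr A-Z a-z |
  tr -cs A-Za-z '\n' |
  sort |
  uniq |
  comm -13 dict.txt - > out

# Second pipeline: report how many words were not in the dictionary.
cat out | wc -l | sed 's/$/ misspelled words!/'
```

With these sample inputs, out ends up containing the two words that are absent from the dictionary, tset and wrold.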

The first cat streams two markdown files into a pipeline that converts characters in the stream into lower case, removes punctuation, sorts the stream in alphabetical order, removes duplicate words, and filters out words from a dictionary file (lines 1–7). A second pipeline (line 7) counts the resulting lines to report the number of misspelled words to the user.

If you’re new to shell scripting, try to run each part of the pipeline separately and observe the output. For example, run cat f1.md f2.md | tr A-Z a-z in your terminal to witness the lower-case conversion.
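One way to follow that advice is to capture the output of each stage in turn. The sketch below does so on a one-line sample string (a hypothetical input, used here so that no files are needed) and prints the intermediate results.

```sh
export LC_ALL=C
sample='Hello, World! Hello again.'   # hypothetical one-line input

# Each variable holds the stream as it looks after one more stage.
lower=$(printf '%s\n' "$sample" | tr A-Z a-z)
words=$(printf '%s\n' "$lower" | tr -cs A-Za-z '\n')
sorted=$(printf '%s\n' "$words" | sort)
unique=$(printf '%s\n' "$sorted" | uniq)

printf 'after tr A-Z a-z:\n%s\n\n' "$lower"
printf 'after tr -cs A-Za-z:\n%s\n\n' "$words"
printf 'after sort:\n%s\n\n' "$sorted"
printf 'after uniq:\n%s\n' "$unique"
```

Comparing consecutive outputs shows what each command contributes: case folding, one word per line, ordering, and finally deduplication.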

Visually, the script can be thought of as executing sequentially as follows:

The first pipeline (left; parts omitted) processes f1.md and f2.md sequentially through all pipeline stages and writes to out. After it executes to completion, the second pipeline starts its sequential execution.

Parallelizing Scripts with PaSh

PaSh transforms and executes each pipeline in a data-parallel fashion. Visually, the parallel script would look like this for 2x-parallelism (i.e., assuming that the computer on which we execute the script has at least two CPUs and that PaSh is invoked with a -w value of 2).

Given a script, PaSh converts it to a dataflow graph, performs a series of semantics-preserving program transformations that expose parallelism, and then converts the dataflow graph back into a POSIX script. The new parallel script has POSIX constructs added to explicitly guide parallelism, coupled with PaSh-provided Unix-aware runtime primitives for addressing performance- and correctness-related issues.
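To make the idea of a data-parallel pipeline concrete, here is a hand-written sketch of the split, process, and merge shape for a small two-stage pipeline. It is not PaSh's output and does not use PaSh's runtime primitives; the file names and the fixed three-line split are assumptions chosen for illustration.

```sh
export LC_ALL=C
workdir=$(mktemp -d)
printf '%s\n' pear Apple fig Cherry banana date > "$workdir/in.txt"

# Sequential baseline: a single pipeline over the whole input.
tr A-Z a-z < "$workdir/in.txt" | sort > "$workdir/seq.out"

# Split the input into two halves (three lines each in this example).
head -n 3 "$workdir/in.txt" > "$workdir/part1"
tail -n +4 "$workdir/in.txt" > "$workdir/part2"

# Run the same stages on each half in the background, then wait for both.
tr A-Z a-z < "$workdir/part1" | sort > "$workdir/sorted1" &
tr A-Z a-z < "$workdir/part2" | sort > "$workdir/sorted2" &
wait

# Merge the two sorted halves; sort -m merges without re-sorting.
sort -m "$workdir/sorted1" "$workdir/sorted2" > "$workdir/par.out"

cmp -s "$workdir/seq.out" "$workdir/par.out" && echo "outputs match"
```

The merge step matters: simply concatenating the two sorted halves would not produce a sorted stream, which is why a sort stage needs a different combiner than a stage such as tr.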

Running Scripts

All scripts in this guide assume that $PASH_TOP is set to the top directory of the PaSh codebase (e.g., /opt/pash on Docker).

To run scripts in this section of the tutorial, make sure you are in the intro directory of the evaluation:

In the following examples, you can avoid including $PASH_TOP before pa.sh by adding $PASH_TOP to your PATH, which amounts to adding the line export PATH=$PATH:$PASH_TOP to your shell configuration file.
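The setup described above could look roughly like the sketch below. The default of /opt/pash comes from the Docker example earlier in this section, while the evaluation/intro path is an assumption about the repository layout; adjust both to match your checkout.

```sh
# Assumption: fall back to the Docker location if PASH_TOP is not already set.
PASH_TOP=${PASH_TOP:-/opt/pash}
export PASH_TOP

# Append PASH_TOP to PATH only if it is not there yet.
case ":$PATH:" in
  *":$PASH_TOP:"*) ;;
  *) PATH="$PATH:$PASH_TOP"; export PATH ;;
esac

# Assumed location of the intro directory inside the evaluation directory.
intro_dir="$PASH_TOP/evaluation/intro"
if [ -d "$intro_dir" ]; then
  cd "$intro_dir"
else
  echo "intro directory not found at $intro_dir" >&2
fi
```

The case guard keeps PATH from growing each time the configuration file is sourced.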

Intro: Hello, Parallel World!

The simplest script to try out PaSh is hello-world.sh, which applies an expensive regular expression over the system’s dictionary file.
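The contents of hello-world.sh are not shown here. As a rough stand-in for the kind of workload described, the sketch below counts the words that match a backreference pattern, a class of regular expression that tends to be costly to match. The pattern and the inline word list are assumptions for illustration, not the script's actual contents; a real run would read the system dictionary instead.

```sh
# Hypothetical word list standing in for the system dictionary file.
words='apple
banana
cat
dog
letter'

# \(.\).*\1 matches any line in which some character occurs at least twice.
count=$(printf '%s\n' "$words" | grep -c '\(.\).*\1')
echo "$count words contain a repeated character"
```

Because grep decides each line independently of the others, this kind of stage is a natural candidate for splitting the input and processing the pieces in parallel.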

To run hello-world.sh sequentially, you would call it using bash:

To run it in parallel with PaSh:

At this point, you might be interested in running pa.sh --help to get a first sense of the available options. Of particular interest is --width or -w, which specifies the degree of parallelism sought by PaSh (e.g., -w 2).
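Putting the invocations from this subsection together, a session might look like the following sketch. It assumes that PASH_TOP points at a PaSh installation and that hello-world.sh is in the current directory; when either is missing, the block only reports that and does nothing else.

```sh
# Assumption: fall back to the Docker location if PASH_TOP is not already set.
PASH_TOP=${PASH_TOP:-/opt/pash}
script=hello-world.sh

if [ -x "$PASH_TOP/pa.sh" ] && [ -f "$script" ]; then
  bash "$script"                     # sequential run
  "$PASH_TOP/pa.sh" "$script"        # parallel run with default settings
  "$PASH_TOP/pa.sh" -w 2 "$script"   # parallel run, asking for 2x parallelism
  status=ran
else
  echo "pa.sh or $script not found; skipping" >&2
  status=skipped
fi
```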

A More Interesting Script: Demo Spell

We will use demo-spell.sh, a pipeline based on the original Unix spell program by Johnson, to confirm that the infrastructure works as expected. We need to set up the appropriate input files for this script to execute:

After inputs are configured, let’s take a quick look at demo-spell.sh:

The script streams the input file into a pipeline that converts characters to lower case, removes punctuation, sorts in alphabetical order, removes duplicate words, and filters out words from a dictionary file.

Next, let’s run it sequentially:

We prefix the script with the time command, which also reports how long the script took to execute. On our evaluation infrastructure, the script takes about 41s.

To execute it using PaSh with 2x-parallelism:

On our evaluation infrastructure, the 2x-parallel script takes about 28s.

You can check that the results are correct by comparing the outputs of the sequential and parallel runs:
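A minimal sketch of such a comparison is shown below. The helper compare_runs and the output file names seq.out and par.out are hypothetical, introduced only for this example; the sketch assumes that PASH_TOP points at a PaSh installation and returns early when pa.sh or the script is missing.

```sh
# Assumption: fall back to the Docker location if PASH_TOP is not already set.
PASH_TOP=${PASH_TOP:-/opt/pash}

# compare_runs SCRIPT WIDTH: time both runs, then diff their outputs.
compare_runs() {
  script=$1
  width=$2
  if [ ! -x "$PASH_TOP/pa.sh" ] || [ ! -f "$script" ]; then
    echo "pa.sh or $script not found; nothing to compare" >&2
    return 2
  fi
  time bash "$script" > seq.out                             # sequential run
  time "$PASH_TOP/pa.sh" -w "$width" "$script" > par.out    # parallel run
  diff seq.out par.out && echo "outputs are identical"
}

compare_runs demo-spell.sh 2 || echo "comparison skipped or outputs differ" >&2
```

An empty diff means the parallel run produced the same output, byte for byte, as the sequential one. Passing a different width as the second argument repeats the check at another degree of parallelism.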

Assuming you have more than 8 CPUs, you could also execute it with 8x-parallelism using:

On our evaluation infrastructure, the 8x-parallel script takes about 14s.

To view the parallel code emitted by the compiler, you can inspect the log:

The contents of the parallel script are shown after the line (4) Executing script in ... and for 2x parallelism (--width 2) they should look like this:

rm -f "#file2"
...
mkfifo "#file2"
...
{ cat scripts/input/100M.txt >"#file2" & }
{ tr -cs A-Za-z "\\n" <"#file4" >"#file6" & }
{ /home/eurosys21/pash/runtime/auto-split.sh "#file2" "#file14" "#file15" & }
{ tr A-Z a-z <"#file32" >"#file17" & }
{ tr A-Z a-z <"#file15" >"#file18" & }
{ cat "#file33" "#file34" >"#file4" & }
{ /home/eurosys21/pash/runtime/auto-split.sh "#file6" "#file19" "#file20" & }
{ sort <"#file35" >"#file22" & }
{ sort <"#file20" >"#file23" & }
{ sort -m "#file36" "#file37" >"#file8" & }
{ /home/eurosys21/pash/runtime/auto-split.sh "#file8" "#file25" "#file26" & }
{ uniq <"#file38" >"#file28" & }
{ uniq <"#file26" >"#file29" & }
{ cat "#file39" "#file40" >"#file30" & }
{ uniq <"#file30" >"#file10" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file14" "#file32" "/tmp/pash_eager_intermediate_#file1" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file17" "#file33" "/tmp/pash_eager_intermediate_#file2" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file18" "#file34" "/tmp/pash_eager_intermediate_#file3" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file19" "#file35" "/tmp/pash_eager_intermediate_#file4" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file22" "#file36" "/tmp/pash_eager_intermediate_#file5" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file23" "#file37" "/tmp/pash_eager_intermediate_#file6" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file25" "#file38" "/tmp/pash_eager_intermediate_#file7" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file28" "#file39" "/tmp/pash_eager_intermediate_#file8" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file29" "#file40" "/tmp/pash_eager_intermediate_#file9" & }
{ /home/eurosys21/pash/runtime/eager.sh "#file10" "#file41" "/tmp/pash_eager_intermediate_#file10" & }
{ comm -13 scripts/input/dict.txt "#file41" & }
source /home/eurosys21/pash/runtime/wait_for_output_and_sigpipe_rest.sh ${!}
rm -f "#file2"
...

Note that most stages in the pipeline are repeated twice and proceed in parallel (i.e., using &). This completes the “quick-check”.
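The emitted script wires its stages together with named pipes and starts each stage in the background. The sketch below reproduces that shape by hand for a short pipeline; it is a simplified illustration that leaves out PaSh's runtime helpers (auto-split.sh, eager.sh), and the FIFO names are hypothetical.

```sh
export LC_ALL=C
workdir=$(mktemp -d)
in_fifo="$workdir/fifo_in"
mid_fifo="$workdir/fifo_mid"

# Same preamble as the emitted script: clear stale FIFOs, then create them.
rm -f "$in_fifo" "$mid_fifo"
mkfifo "$in_fifo" "$mid_fifo"

# Each stage runs in the background and talks to its neighbours through a FIFO.
{ printf 'Pear\napple\nFig\n' > "$in_fifo" & }
{ tr A-Z a-z < "$in_fifo" > "$mid_fifo" & }
{ sort < "$mid_fifo" > "$workdir/out.txt" & }

wait                                 # block until every background stage exits
rm -f "$in_fifo" "$mid_fifo"
cat "$workdir/out.txt"
```

Since every stage is started with &, the order in which the stages are listed does not matter; each one blocks on its FIFO until its neighbour is ready, which is why the emitted script can list its commands in an order that differs from the data flow.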

What Next?

This concludes the first PaSh tutorial. This section includes pointers for further exploration, depending on your needs.

The PaSh Repo

PaSh consists of three main components and a few additional “auxiliary” files and directories. The three main components are:

  • annotations: DSL characterizing commands, parallelizability study, and associated annotations. More specifically, (i) a lightweight annotation language allows command developers to express key parallelizability properties about their commands; (ii) an accompanying parallelizability study of POSIX and GNU commands guides the annotation language and the optimized aggregator library.

  • compiler: Shell-dataflow translations and associated parallelization transformations. Given a script, the PaSh compiler converts it to a dataflow graph, performs a series of semantics-preserving program transformations that expose parallelism, and then converts the dataflow graph back into a POSIX script.

  • runtime: Runtime components such as eager, split, and associated combiners. Apart from POSIX constructs added to guide parallelism explicitly, PaSh provides Unix-aware runtime primitives for addressing performance- and correctness-related issues.

These three components implement the contributions presented in the EuroSys paper. They are expected to be usable with minimal effort, through a few different installation means presented below.

The auxiliary directories are:

  • docs: Design documents, tutorials, installation instructions, etc.
  • evaluation: Shell pipelines and scripts used for the evaluation of PaSh.

PaSh Concepts in Depth

Academic papers associated with PaSh offer substantially deeper overviews of the concepts underpinning several PaSh components.