Stacks¶
A stack is a reusable software environment (Python venv, Julia depot, Conda env, …) installed once on a backend and reused across many task runs. ScriptHut handles the bookkeeping: detecting when the stack needs to rebuild, where to put it on each backend, and how to expose it to tasks.
You write the install script. ScriptHut hashes its inputs, caches the result, and runs the script again only when something changes.
When to use one¶
You probably want a stack when:
- A task's first step is "set up the runtime" (pip install -r requirements.txt, julia --project=. -e 'Pkg.instantiate()', conda env create -f env.yml).
- The same runtime is needed by many tasks (a parameter sweep, a multi-step pipeline).
- You want collaborators to get the same environment without coordinating manual setup.
You don't need one if the task's runtime is already pre-built (a container image with everything baked in, or modules pre-installed on the cluster).
Defining a stack¶
Stacks live under the top-level stacks: section of scripthut.yaml. They can be defined globally or in a project-local config (see From a project).
stacks:
  - name: julia-1.11
    backends: [mercury-nb, pythia-nb]   # empty = every SSH backend
    cache_dir: /scratch/me/stacks       # parent dir on the backend
    inputs:
      julia_version: "1.11.3"           # literal values; any change forces rebuild
    input_files:
      - Manifest.toml                   # contents hashed; resolved relative to this yaml
      - Project.toml
    prep: |
      module load julia/1.11.3
      mkdir -p ${STACK_DIR}/depot
      JULIA_DEPOT_PATH="${STACK_DIR}/depot:" \
        julia --project=. -e 'using Pkg; Pkg.instantiate()'
    init: |
      module load julia/1.11.3
      export JULIA_DEPOT_PATH="${STACK_DIR}/depot:"
Fields¶
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | required | Identifier referenced by CLI commands. |
| backends | list of strings | [] (all SSH backends) | Which backends this stack is available on. Non-SSH backends (Batch, EC2) are skipped silently for now. |
| cache_dir | string | ~/.cache/scripthut/stacks | Parent directory on the backend. ~ is expanded remotely, so it refers to the backend user's $HOME. |
| inputs | dict | {} | Named literal values hashed into the stack identity. Use this for versions, flags, anything the install depends on that isn't a file. |
| input_files | list of paths | [] | Local files whose contents are hashed in. Relative paths are resolved against the directory of the YAML that declared the stack. |
| prep | string (bash) | "" | One-time install script. Runs with STACK_DIR set to the stack's hash directory (<cache_dir>/<name>/<hash>/); set -euo pipefail is applied so any failure aborts. |
| init | string (bash) | "" | Per-task env layer exported on every task that uses this stack. The same ${STACK_DIR} is available. |
Lifecycle¶
A stack lives at <cache_dir>/<name>/<hash>/ on the backend, where <hash> is a SHA-256 digest of (name, prep, sorted inputs, input_files contents) truncated to 12 hex characters. A successful install leaves a .ready sentinel file inside; that's the marker ScriptHut uses to know it can skip the install.
~/.cache/scripthut/stacks/julia-1.11/
├── 9b2f1c7d4ae3/ ← old build (different Manifest.toml)
└── e6a08f3b15c2/ ← current build
├── .ready ← sentinel; only written on success
└── depot/ ← whatever `prep` populated
When you change any input (literal value, file contents, prep script, or stack name), the hash changes, the old directory becomes "stale", and the next install builds into a new directory next to it. Old hash directories don't auto-evict — you remove them with scripthut stack delete <name> (which clears every hash for that stack at once).
States¶
| State | Meaning |
|---|---|
| missing | No directory at the current hash. Next install runs prep from scratch. |
| installing | The hash directory exists but has no .ready sentinel: a previous prep started and didn't finish (most often because it failed). scripthut stack install --rebuild recovers. |
| ready | Directory + sentinel both present. Safe to reference from tasks. |
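The three states can be read straight off the filesystem. The following is a minimal sketch of that classification, assuming you are logged in to the backend and already know the hash directory; stack_state is a hypothetical helper, not a ScriptHut command.

```shell
# Hypothetical helper: classify a hash directory as the States table describes.
# $1 is the full hash directory, e.g. <cache_dir>/<name>/<hash>.
stack_state() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo missing        # no directory at this hash
  elif [ -f "$dir/.ready" ]; then
    echo ready          # directory and sentinel both present
  else
    echo installing     # directory exists, sentinel absent
  fi
}
```

For example, stack_state ~/.cache/scripthut/stacks/julia-1.11/e6a08f3b15c2 would print one of the three state names.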
CLI¶
There are four commands. check, install, and delete accept --backend X to scope to one backend; without it, they run against every backend the stack declares (or every SSH-based backend if the list is empty). list only reads the config and never contacts a backend.
scripthut stack list # configured stacks (no SSH)
scripthut stack check [<name>] [--backend X] # per-backend state table
scripthut stack install <name> [--backend X] [--rebuild]
scripthut stack delete <name> [--backend X]
list¶
Prints the configured stacks and their declared backends. Doesn't touch any backend — pure config view.
NAME        BACKENDS              INPUTS
julia-1.11  mercury-nb,pythia-nb  julia_version=1.11.3
python-ml   (all SSH)             python_version=3.12
check¶
For each (stack × backend) pair, opens an SSH connection and probes for the cache directory and sentinel. Reports:
STACK       BACKEND     STATE       HASH          BUILT   SIZE  NOTE
julia-1.11  mercury-nb  ready       e6a08f3b15c2  2d ago  1.4G
julia-1.11  pythia-nb   missing     e6a08f3b15c2  -       -
python-ml   mercury-nb  installing  1ad3f7c92e58  -       -     prep exit 1: ...
check exits non-zero if any stack is missing or installing — handy as a CI gate before submitting work.
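Because check reports readiness through its exit status, a pre-submit script can gate on it. The sketch below is one possible wrapper; require_stack_ready is a hypothetical name, and the only ScriptHut behavior it relies on is the non-zero exit described above.

```shell
# Hypothetical wrapper around `scripthut stack check`.
# Usage: require_stack_ready <stack> <backend>
require_stack_ready() {
  stack="$1"
  backend="$2"
  if scripthut stack check "$stack" --backend "$backend"; then
    echo "stack $stack is ready on $backend"
  else
    # check exits non-zero when the stack is missing or installing
    echo "stack $stack is not ready on $backend; run: scripthut stack install $stack --backend $backend" >&2
    return 1
  fi
}
```

In a CI job this might be chained as require_stack_ready julia-1.11 mercury-nb && ./submit.sh, where submit.sh stands in for whatever submits your work.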
install¶
Idempotent: a no-op if the stack is already ready at the current hash. Otherwise:
- Wipes the hash directory if it's in installing state (or if --rebuild is passed).
- Creates the hash directory.
- Runs prep via bash -s with STACK_DIR exported and set -euo pipefail in effect.
- Writes .ready only if prep exits 0.
Failed installs leave the directory in installing state so check reports it; recover with install --rebuild.
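The install steps above can be sketched as a local shell function. This is an illustration under stated assumptions, not ScriptHut's implementation: install_stack and its arguments are hypothetical, it runs on the machine where it is called rather than over SSH, and it prepends set -euo pipefail to the script text as one way of putting strict mode in effect.

```shell
# Sketch of the documented install flow; not ScriptHut's code.
# Usage: install_stack <hash_dir> <prep_script_text> [rebuild]
install_stack() {
  stack_dir="$1"
  prep="$2"
  rebuild="${3:-no}"
  [ -n "$stack_dir" ] || return 2

  # Already ready at this hash: no-op unless a rebuild was requested.
  if [ -f "$stack_dir/.ready" ] && [ "$rebuild" != "rebuild" ]; then
    return 0
  fi

  # Wipe any half-finished (or to-be-rebuilt) directory, then recreate it.
  rm -rf "$stack_dir"
  mkdir -p "$stack_dir"

  # Feed prep to bash on stdin with STACK_DIR exported and strict mode on.
  if printf 'set -euo pipefail\n%s\n' "$prep" | STACK_DIR="$stack_dir" bash -s; then
    touch "$stack_dir/.ready"      # sentinel is written only on success
  else
    echo "prep failed; $stack_dir left without .ready" >&2
    return 1
  fi
}
```

A failed prep leaves the directory behind without the sentinel, which is exactly the installing state that check reports.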
delete¶
Removes the whole <cache_dir>/<name>/ directory, not just the current hash. Use this to free disk on a backend or to force a clean reinstall.
Hashing in detail¶
The hash is computed from:
name=<stack name>
prep=<full prep script text>
inputs=<sorted key=value pairs>
files=<for each input_file: path + contents>
A few corollaries that matter in practice:
- Whitespace in prep matters. Reformatting the script changes the hash. Use comments to record intent rather than restructuring the script.
- Sorting of inputs is stable. YAML dict order doesn't affect the hash; you can reorder freely.
- Missing input files hash to <missing>. A typo in input_files: [requirments.txt] (sic) produces a stable but wrong hash; once you fix the typo, the hash changes and the stack reinstalls.
- The hash is 12 hex chars (~48 bits of entropy). That is plenty for cache distinctness on a single backend, but not a cryptographic guarantee.
You can inspect the current hash for a stack without touching any backend by reading scripthut stack check <name> --json.
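To make the corollaries concrete, here is a minimal sketch of a hash built from the same four components. The byte layout, the separators, and the choice of the first 12 hex characters are assumptions, so the output will not match ScriptHut's real hashes; stack_hash is a hypothetical helper and sha256sum is assumed to come from GNU coreutils.

```shell
# Illustrative only: the exact bytes ScriptHut hashes are not documented here.
# Usage: stack_hash <name> <prep> <inputs, one key=value per line> [input_file ...]
stack_hash() {
  name="$1"
  prep="$2"
  inputs="$3"
  shift 3
  {
    printf 'name=%s\n' "$name"
    printf 'prep=%s\n' "$prep"
    # Sort so the declaration order of inputs does not matter.
    printf 'inputs=%s\n' "$(printf '%s\n' "$inputs" | LC_ALL=C sort)"
    for f in "$@"; do
      printf 'files=%s\n' "$f"
      # A missing file contributes a fixed placeholder instead of contents.
      if [ -f "$f" ]; then cat "$f"; else printf '<missing>'; fi
      printf '\n'
    done
  } | sha256sum | cut -c1-12
}
```

With this sketch, swapping the order of two inputs yields the same 12-character value, while adding a single space inside prep yields a different one.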
Tying stacks to tasks (future)¶
Today the CLI verbs are the integration point: you install the stack manually before submitting work, and tasks reference its installed binaries via ${STACK_DIR} or by repeating the same module load calls in their own commands. A follow-up will let tasks declare stack: julia-1.11 and have ScriptHut:
- Lazily install the stack on the target backend if it's not ready.
- Splice the stack's init: block into the task's resolved env (after backend env, before workflow env).
- Expose STACK_DIR to the task automatically.
Until that lands, treat stacks as a managed cache you populate explicitly with scripthut stack install.
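Since install is a no-op when the stack is already ready, one way to approximate the planned lazy path is to run it right before each submission. The sketch below assumes that pattern; run_with_stack is a hypothetical helper, and the command after the backend name is whatever you normally use to submit work.

```shell
# Hypothetical helper: install (idempotently), then run the given command.
# Usage: run_with_stack <stack> <backend> <command> [args...]
run_with_stack() {
  stack="$1"
  backend="$2"
  shift 2
  # Stop here if the install fails; the command is never started.
  scripthut stack install "$stack" --backend "$backend" || return 1
  "$@"
}
```

For example, run_with_stack julia-1.11 mercury-nb ./submit.sh would only run the placeholder submit.sh once the install step has succeeded.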
What's not supported yet¶
- EC2 / AWS Batch backends. Stacks are filesystem-based for v1; a Docker-image variant keyed on the same hash is planned for these backends but not implemented.
- Lazy auto-install at submission. You currently install explicitly; the lazy path will come with task-stack binding.
- Locks across concurrent installs. If two scripthut stack install invocations race on the same hash, they may both write into the same directory. This is rare in practice; for now, run one install at a time per backend.
- Garbage-collecting old hashes. Only the current hash is kept fresh; old ones accumulate until you delete. A future GC command will prune everything except the current hash for each named stack.
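Until a GC command exists, old hash directories can be pruned by hand on the backend. The following is a sketch of that manual cleanup, not a ScriptHut feature: prune_old_hashes is a hypothetical helper, and it assumes you have already read the current hash from the check output.

```shell
# Hypothetical manual cleanup; run it on the backend itself.
# Usage: prune_old_hashes <cache_dir>/<name> <current_hash>
prune_old_hashes() {
  stack_root="$1"
  keep="$2"
  # Refuse to run with an empty argument so rm never sees a bare glob.
  [ -n "$stack_root" ] && [ -n "$keep" ] || return 2
  for dir in "$stack_root"/*/; do
    [ -d "$dir" ] || continue      # glob matched nothing
    hash=$(basename "$dir")
    if [ "$hash" != "$keep" ]; then
      echo "removing $dir"
      rm -rf "$dir"
    fi
  done
}
```

Unlike scripthut stack delete, which clears every hash for the stack, this keeps the directory named by the second argument.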