Environments¶
ScriptHut resolves the environment for each task by walking an ordered chain of env rules from five layers — backend → server → workflow (config) → workflow (document) → task — against a seed of SCRIPTHUT_* runtime variables. Every layer contributes rules to the same list; later rules see earlier rules' effects, so conditionals can branch on any value set upstream (including the seed).
The EnvRule shape¶
Every entry in any env: list is an EnvRule:
env:
- set: # always-applied
PROJECT: training
LOG_LEVEL: info
- if: # optional guard
SCRIPTHUT_BACKEND: mercury # single value = equality
set:
SCRATCH: /scratch/${USER} # ${name} expanded against env-so-far
init: "module load gcc/12 cuda/11"
- if:
SCRIPTHUT_BACKEND: [anvil, delta] # list value = OR
set:
SCRATCH: /tmp/work/${USER}
init: "module load gcc cuda-toolkit"
- if:
SCRIPTHUT_BACKEND: mercury
GPU: "1" # multiple keys = AND
append:
PATH: /opt/cuda/bin # joined with ":" to existing value
| Field | Type | Description |
|---|---|---|
if |
object | Optional guard. Each key must match the env-so-far. A list value matches if the actual value is in the list (OR). When all keys match, the rule applies. |
set |
object | Variables to write. Overwrites any prior value. ${name} is expanded against env-so-far. |
append |
object | Variables to extend. Joined to the existing value with : (creates the value if absent). ${name} is expanded. |
init |
string | Bash text appended (newline-joined) into the extra_init block. Runs before set:/append: exports in the generated script, so user-set vars override anything module load / source placed into the env. Followed by cd <working_dir> and the task command. ${name} is expanded. |
include |
list of strings | Names of env_groups to inline at this position. See Reusable groups. |
Where rules live¶
env: is a list of rules accepted at five locations, evaluated in this order:
| Layer | Defined in | Typical use |
|---|---|---|
| Backend | env: on each backend entry in scripthut.yaml |
Cluster-specific facts — scratch paths, module-system bootstrap |
| Server | top-level env: in scripthut.yaml |
Org-wide defaults across all clusters |
| Workflow (config) | env: on each workflow entry in scripthut.yaml |
Workflow-specific overrides for ops-controlled config |
| Workflow (document) | top-level env: inside the JSON the generator emits |
Project-controlled config — lives in your repo, ships with your code. See Task JSON → Environment Variables |
| Task | env: on each task in the JSON generator output |
Per-task adjustments |
Rules from each layer are concatenated in that order, then evaluated top-to-bottom. There is no separate "merge" step — set overwrites and append extends, so layer X overrides layer X-1 simply by being written later.
SCRIPTHUT_* runtime seed¶
Before any user rule runs, ScriptHut seeds the env with run-context variables. Any rule's if: can branch on these:
| Variable | Always present | Description |
|---|---|---|
SCRIPTHUT_BACKEND |
yes | Name of the backend this task runs on. |
SCRIPTHUT_WORKFLOW |
yes | Name of the workflow. |
SCRIPTHUT_RUN_ID |
yes | Unique 8-char run identifier. |
SCRIPTHUT_CREATED_AT |
yes | ISO 8601 timestamp of run creation. |
SCRIPTHUT_GIT_REPO |
git workflows only | Repository URL. |
SCRIPTHUT_GIT_BRANCH |
git workflows only | Branch name. |
SCRIPTHUT_GIT_SHA |
git workflows only | Resolved commit hash. |
These keys are protected: any rule attempting to set: or append: to a key starting with SCRIPTHUT_ is rejected with a warning and ignored.
${name} expansion¶
String values in set:, append:, and init: may reference other variables with ${name}. Expansion happens against the env as resolved so far, so it sees the seed plus everything earlier rules have written. Unknown names expand to an empty string and log a warning. Only the ${name} form is interpolated — bare $name is left alone, so shell expressions written into init: are passed through unchanged.
Reusable groups¶
Define a rule list once with env_groups: and inline it from any env: rule using include::
env_groups:
monitoring:
- set:
OTEL_EXPORTER: otlp
OTEL_ENDPOINT: "http://collector:4318"
backends:
- name: mercury
env_groups:
gpu-stack: # mercury's flavor of "gpu-stack"
- init: "module load gcc/12 cuda/11"
- append: { PATH: /opt/cuda/bin }
env:
- include: [gpu-stack, monitoring]
env_groups: is accepted on each backend, on the server (top-level), on each workflow in scripthut.yaml, and on the workflow JSON document itself (top-level alongside tasks: — see Task JSON → Environment Variables). Scoping: a group defined at layer X is visible to that layer's own env: and to all later layers. Names defined at a later layer shadow earlier ones, so each backend can declare its own gpu-stack group and any workflow's include: [gpu-stack] resolves to the right one based on SCRIPTHUT_BACKEND.
Includes can be guarded — an if: clause on the include-rule is inherited by every inlined rule (ANDed with any guard the inlined rule already carries):
Cycles between groups are detected and raise a clear error. Unknown group names log a warning and are skipped.
Inspecting the resolved env¶
The task detail page in the UI has an Env tab that shows the resolved env for the selected task, with per-key provenance — every set / append operation lists its source layer (and which group, if any) and contribution. The same information is available as JSON at:
Use this when a value isn't what you expect: the provenance pinpoints which layer wrote it.
Worked example: cluster-specific module loads¶
The canonical case — "Mercury needs module load gcc/12 cuda/11, Anvil needs module load gcc cuda-toolkit, the laptop needs nothing":
backends:
- name: mercury
type: slurm
ssh: { host: mercury.example.edu, user: alice }
env:
- set: { SCRATCH: /scratch/${USER} }
- init: "source /etc/profile.d/modules.sh"
- name: anvil
type: pbs
ssh: { host: anvil.example.edu, user: alice }
env:
- set: { SCRATCH: /tmp/work/${USER} }
workflows:
- name: train
backend: mercury # this workflow targets mercury
command: "python generate_tasks.py"
env:
- if: { SCRIPTHUT_BACKEND: mercury }
init: "module load gcc/12 cuda/11"
- if: { SCRIPTHUT_BACKEND: anvil }
init: "module load gcc cuda-toolkit"
The same workflow definition, ported to a different backend (e.g. by adding a second train-anvil workflow that points to anvil), gets the right module loads automatically — no per-backend duplication of the workflow.