Writing a Task Generator¶
A task generator is any executable (script, binary, etc.) that prints valid JSON to stdout. The JSON must match the Task JSON Format — typically the wrapped object form {"tasks": [...]}, optionally with top-level env: / env_groups:.
This page shows generator skeletons in several languages. Each example produces a simple parameter sweep that you can adapt.
Python¶
#!/usr/bin/env python3
"""Generate tasks for ScriptHut."""
import argparse
import json
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--count", type=int, default=5)
parser.add_argument("--partition", default="normal")
args = parser.parse_args()
tasks = []
for i in range(1, args.count + 1):
tasks.append({
"id": f"task-{i:03d}",
"name": f"Task {i}",
"command": f"python process.py --index {i}",
"partition": args.partition,
"cpus": 2,
"memory": "8G",
"time_limit": "2:00:00",
})
print(json.dumps({"tasks": tasks}, indent=2))
if __name__ == "__main__":
main()
Configure in scripthut.yaml:
workflows:
- name: batch-processing
backend: hpc-cluster
command: "python /path/to/generate_tasks.py --count 20 --partition gpu"
max_concurrent: 5
description: "Batch processing pipeline"
Bash¶
#!/bin/bash
# Generate tasks as JSON using heredoc
cat <<'EOF'
{
"tasks": [
{
"id": "step-1",
"name": "Download",
"command": "wget https://example.com/data.csv",
"time_limit": "00:30:00"
},
{
"id": "step-2",
"name": "Process",
"command": "python process.py data.csv",
"deps": ["step-1"],
"cpus": 4,
"memory": "16G"
}
]
}
EOF
Static JSON File¶
The simplest approach — just cat a pre-existing JSON file:
workflows:
- name: static-tasks
backend: hpc-cluster
command: "cat /shared/tasks/my_pipeline.json"
description: "Run predefined pipeline"
Julia¶
#!/usr/bin/env julia
using JSON
tasks = [
Dict(
"id" => "sim-$i",
"name" => "Simulation $i",
"command" => "julia run_sim.jl --seed $i",
"partition" => "normal",
"cpus" => 4,
"memory" => "8G",
"time_limit" => "6:00:00",
"env" => [Dict("include" => ["julia-1.10"])]
)
for i in 1:10
]
println(JSON.json(Dict("tasks" => tasks), 2))
R¶
#!/usr/bin/env Rscript
library(jsonlite)
tasks <- lapply(1:10, function(i) {
list(
id = sprintf("analysis-%03d", i),
name = sprintf("Analysis %d", i),
command = sprintf("Rscript run_analysis.R --chunk %d", i),
partition = "normal",
cpus = 1,
memory = "4G",
time_limit = "1:00:00"
)
})
cat(toJSON(list(tasks = tasks), auto_unbox = TRUE, pretty = TRUE))