Job parameters#

When instantiating a SLURM client or executor, you can pass default SLURM job parameters through the parameters argument:

from pyslurmutils.client import SlurmScriptRestClient

parameters = {"time_limit": "02:00:00"}
client = SlurmScriptRestClient(
    url=url,
    user_name=user_name,
    token=token,
    log_directory=log_directory,
    parameters=parameters,
)

These parameters can also be overridden for an individual job at submission time:

future = executor.submit(..., slurm_arguments={"parameters": parameters})

Documentation on all available SLURM job parameters can be found here.
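The override described above can be pictured as a dictionary merge in which the values given at submission win over the client or executor defaults. This is only a sketch: merge_parameters is a hypothetical helper, not part of pyslurmutils, and whether the library merges key by key or replaces the whole dictionary is an assumption to verify against its actual behavior.

```python
def merge_parameters(defaults, overrides=None):
    """Return a new dict in which per-job overrides win over the defaults.

    Hypothetical helper for illustration; not a pyslurmutils function.
    """
    merged = dict(defaults)          # copy so the defaults stay untouched
    merged.update(overrides or {})   # per-job values replace clashing keys
    return merged


client_parameters = {"time_limit": "02:00:00", "partition": "gpu_long"}
job_parameters = {"time_limit": "00:30:00"}

effective = merge_parameters(client_parameters, job_parameters)
print(effective)
```

Under this assumption the job would run with the shorter time limit while keeping the partition chosen when the client was created.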

Environment variables#

SLURM job environment variables can be passed in two ways.

Pass them explicitly through the environment parameter. For example:

parameters = {"environment": {"MYVAR1": "MYVALUE1"}}

Alternatively, define local environment variables that start with SLURM_ENV_. For example:

export SLURM_ENV_MYVAR2=MYVALUE2

With both in place, the environment parameter of the submitted SLURM job will be:

environment = {
    "MYVAR1": "MYVALUE1",   # from 'environment' parameter
    "MYVAR2": "MYVALUE2",   # from SLURM_ENV_*
}

Note that the SLURM_ENV_ prefix is stripped from the variable name before the variable is passed to SLURM.

Priority of environment variables, from high to low, when the same variable is defined both ways:

  1. parameters["environment"] (explicit SLURM environment parameter)

  2. SLURM_ENV_* (local environment variables prefixed with SLURM_ENV_)
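The prefix stripping and the priority order above can be sketched as follows. The function resolve_environment is hypothetical and only mirrors the documented rules; it is not the library's implementation.

```python
import os

PREFIX = "SLURM_ENV_"


def resolve_environment(explicit=None, local_env=None):
    """Combine SLURM_ENV_* variables with an explicit 'environment' mapping.

    Hypothetical helper for illustration; not a pyslurmutils function.
    """
    local_env = os.environ if local_env is None else local_env
    # Low priority: local variables, with the SLURM_ENV_ prefix stripped.
    resolved = {
        name[len(PREFIX):]: value
        for name, value in local_env.items()
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }
    # High priority: the explicit parameter overwrites clashing names.
    resolved.update(explicit or {})
    return resolved


# A stand-in for os.environ so the example does not depend on the shell.
local = {
    "SLURM_ENV_MYVAR2": "MYVALUE2",
    "SLURM_ENV_MYVAR1": "IGNORED",
    "HOME": "/home/user",
}
environment = resolve_environment({"MYVAR1": "MYVALUE1"}, local)
print(environment)
```

Here MYVAR1 keeps the value from the explicit parameter even though SLURM_ENV_MYVAR1 is also defined, and HOME is ignored because it lacks the prefix.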

Common SLURM Job Parameters#

SLURM Job Parameters Mapping#

| srun/sbatch | REST | Description |
|---|---|---|
| --array=0,6,16-32 | array="0,6,16-32" | Job array index specification |
| --job-name=my_process | name="my_process" | Specify a name for the job |
| --input=input.txt | standard_input="input.txt" | Path to stdin file |
| --output=output.txt | N/A | Path to stdout file. Cannot be used directly from the API. Use the parameters log_directory and std_split of SlurmRestExecutor instead. |
| --error=error.txt | N/A | Path to stderr file. Cannot be used directly from the API. Use the parameters log_directory and std_split of SlurmRestExecutor instead. |
| --time-min=30 or --time-min=10:02:03 | time_minimum=30 or time_minimum="10:02:03" | Minimum run time for the job (minutes or HH:MM:SS) |
| --time=30 or --time=10:02:03 | time_limit=30 or time_limit="10:02:03" | Maximum run time for the job |
| --begin=16:00:00 | begin_time=1735689600 (Unix timestamp) | Defer job allocation until specified time |
| --deadline=18:00:00 | end_time=1735689600 (Unix timestamp) | Expected end time for the job |
| --gres=gpu:2 | N/A | Generic consumable resources (legacy) |
| --gpus=3 or --gpus=volta:3 | tres_per_job="gres/gpu=2" | Number of GPUs required for the job |
| --gpus-per-node=3 or --gpus-per-node=volta:3 | tres_per_node="gres/gpu=2" | Number of GPUs required per node |
| --gpus-per-task=3 or --gpus-per-task=volta:3 | tres_per_task="gres/gpu=2" | Number of GPUs required per task |
| --cpus-per-task=4 | cpus_per_task=4 | Number of CPUs required per task |
| --cpus-per-gpu=8 | cpus_per_tres="gres/gpu=8" | Number of CPUs allocated per GPU |
| --nodes=4 or --nodes=1,4,8 | nodes="4" or nodes="1,4,8" | Node count or range specification |
| --exclude=hpc1,hpc8 | excluded_nodes="hpc1,hpc8" | Exclude specific nodes from allocation |
| --exclusive | exclusive="true" | Request exclusive node allocation |
| --nodelist=node042 | allocation_node_list="node042" | Request specific nodes |
| --partition=gpu_long | partition="gpu_long" | Request specific partition |
| --mem-per-cpu=1024 or --mem-per-cpu=1G | memory_per_cpu=1024 or memory_per_cpu="1G" | Memory required per CPU |
| --mem-per-gpu=1024 or --mem-per-gpu=1G | memory_per_tres="gres/gpu=1024" or memory_per_tres="gres/gpu=1G" | Memory required per GPU |
| --mem=1024 or --mem=1G | memory_per_node=1024 or memory_per_node="1G" | Total memory per node |
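Unlike --begin and --deadline, the REST fields begin_time and end_time in the mapping above take Unix timestamps. A minimal sketch of building such a parameters dictionary is shown here, assuming the values are passed as plain integers as in the mapping. The helper to_unix_timestamp is hypothetical, and the chosen date is just an example.

```python
from datetime import datetime, timedelta, timezone


def to_unix_timestamp(moment):
    """Convert a timezone-aware datetime to whole seconds since the epoch."""
    if moment.tzinfo is None:
        # Naive datetimes are interpreted in the local zone, which is easy
        # to get wrong when the submit host and cluster differ.
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return int(moment.timestamp())


start = datetime(2025, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

parameters = {
    "name": "my_process",
    "partition": "gpu_long",
    "time_limit": "10:02:03",
    "cpus_per_task": 4,
    "begin_time": to_unix_timestamp(start),
    "end_time": to_unix_timestamp(start + timedelta(hours=2)),
}
print(parameters["begin_time"])  # 1735689600
```

The resulting dictionary can then be passed as the parameters argument, in the same way as the smaller dictionaries in the earlier examples.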

Example Usage#

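For example, the following script reports how many CPUs are available to the job. The client is created with cpus_per_task set to 4: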

from pyslurmutils.client import SlurmScriptRestClient

SCRIPT = """#!/usr/bin/env python3
import os
import socket
host = socket.gethostname()
pid = os.getpid()
ncpus = len(os.sched_getaffinity(pid))
print(f'{host=}, {pid=}, {ncpus=}')
"""

parameters = {"cpus_per_task": 4}
client = SlurmScriptRestClient(
    url=url,
    user_name=user_name,
    token=token,
    log_directory=log_directory,
    parameters=parameters,
)

Submit the script and wait for it to finish. Parameters passed at submission override those given to the client:

job_id = client.submit_script(SCRIPT, parameters=parameters)
try:
    print(client.wait_finished(job_id))
    client.print_stdout_stderr(job_id)
finally:
    client.clean_job_artifacts(job_id)

The output confirms that 4 CPUs are available to the job:

COMPLETED
STDOUT/STDERR: /tmp_14_days/<username>/slurm_logs/pyslurmutils.<hostname>.15120368.outerr
-------------------------------------------------------------------------------------
host='<slurmnode>', pid=1774392, ncpus=4
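The script counts CPUs with os.sched_getaffinity rather than os.cpu_count. The former returns the set of CPUs the process is allowed to run on, while the latter returns the number of logical CPUs of the whole machine. The sketch below compares the two. It assumes a Linux host, since os.sched_getaffinity is not available on every platform, and that the cluster enforces the allocation through CPU affinity, which depends on its configuration.

```python
import os


def available_cpus():
    """Number of CPUs the current process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU affinity restrictions; pid 0 means the current process.
        return len(os.sched_getaffinity(0))
    # Fallback where sched_getaffinity is missing: total logical CPUs.
    return os.cpu_count() or 1


total = os.cpu_count() or 1
usable = available_cpus()
print(f"{total=}, {usable=}")
```

Inside a job restricted to 4 CPUs on a larger node, usable would be expected to be 4 while total reflects the full node.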