slurm_script_generator.squeue
Functions
main() | Entry point for the slurm-queue command-line tool.
Classes
SAcct | Interface to SLURM job accounting via sacct.
SAcctJob | A single job record from SLURM accounting (sacct).
SQueue | Interface to the SLURM job queue via squeue.
SQueueJob | A single job entry from the SLURM queue.
- class slurm_script_generator.squeue.SAcct(user: str | None = None, days: int = 7, partition: str | None = None)
Bases: object
Interface to SLURM job accounting via sacct.
- Parameters:
user (str, optional) – If given, fetch only jobs for this user.
days (int) – Number of days of history to look back (default: 7).
partition (str, optional) – If given, filter to this partition.
Examples
>>> a = SAcct(user='alice', days=30)
>>> a.summary()
{'total': 42, 'completed': 30, 'failed': 5, ...}
- jobs(user: str | None = None, state: str | None = None, partition: str | None = None) -> List[SAcctJob]
Return accounting records matching the given criteria.
- jobs_by_partition() -> Dict[str, List[SAcctJob]]
Return a mapping of partition -> list of jobs in that partition.
- jobs_by_state() -> Dict[str, List[SAcctJob]]
Return a mapping of state -> list of jobs in that state.
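The two grouping methods above return plain dictionaries of lists. A minimal sketch of that shape, using a hypothetical `JobRecord` stand-in and a hypothetical `group_by` helper (neither is part of the package, and the library's own implementation may differ):

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JobRecord:
    """Hypothetical stand-in for SAcctJob with only the fields used here."""
    job_id: int
    state: str
    partition: str


def group_by(jobs: List[JobRecord], attr: str) -> Dict[str, List[JobRecord]]:
    """Group job records by the value of one attribute, keeping input order."""
    groups: Dict[str, List[JobRecord]] = defaultdict(list)
    for job in jobs:
        groups[getattr(job, attr)].append(job)
    return dict(groups)


records = [
    JobRecord(1, "COMPLETED", "gpu"),
    JobRecord(2, "FAILED", "gpu"),
    JobRecord(3, "COMPLETED", "cpu"),
]
by_state = group_by(records, "state")          # state -> list of records
by_partition = group_by(records, "partition")  # partition -> list of records
```

With real data, the same access pattern would apply to the dictionaries returned by `jobs_by_state()` and `jobs_by_partition()`.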
- class slurm_script_generator.squeue.SAcctJob(job_id: int, user: str, name: str, state: str, partition: str, num_nodes: int, num_cpus: int, elapsed: str, cpu_time_raw: int, exit_code: str)
Bases: object
A single job record from SLURM accounting (sacct).
- property cpu_hours: float
- cpu_time_raw: int
- elapsed: str
- exit_code: str
- property is_cancelled: bool
- property is_completed: bool
- property is_failed: bool
- property is_timeout: bool
- job_id: int
- name: str
- num_cpus: int
- num_nodes: int
- partition: str
- state: str
- user: str
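The derived properties of SAcctJob are listed without descriptions. The sketch below shows one plausible reading, and it rests on stated assumptions: that `cpu_time_raw` holds CPU-seconds (as sacct's `CPUTimeRAW` field does), that `cpu_hours` is that value divided by 3600, and that the `is_*` flags compare against sacct state names. `AcctRecord` is a hypothetical stand-in, not the library's class.

```python
from dataclasses import dataclass


@dataclass
class AcctRecord:
    """Hypothetical stand-in for SAcctJob; not the library's class."""
    state: str
    cpu_time_raw: int  # assumed to be CPU-seconds, like sacct's CPUTimeRAW field

    @property
    def cpu_hours(self) -> float:
        # Assumption: hours = CPU-seconds / 3600.
        return self.cpu_time_raw / 3600.0

    @property
    def is_completed(self) -> bool:
        return self.state == "COMPLETED"

    @property
    def is_failed(self) -> bool:
        return self.state == "FAILED"

    @property
    def is_timeout(self) -> bool:
        return self.state == "TIMEOUT"

    @property
    def is_cancelled(self) -> bool:
        # sacct may report "CANCELLED by <uid>", so match on the prefix.
        return self.state.startswith("CANCELLED")


record = AcctRecord(state="CANCELLED by 1000", cpu_time_raw=7200)
```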
- class slurm_script_generator.squeue.SQueue(user: str | None = None, partition: str | None = None)
Bases: object
Interface to the SLURM job queue via squeue.
- Parameters:
user (str, optional) – If given, only fetch jobs belonging to this user by default.
Examples
>>> q = SQueue()
>>> q.summary()
{'total_jobs': 42, 'running': 30, 'pending': 12, 'users': {...}, 'by_state': {...}}
>>> q.wait_until_done(job_name='training_*')
>>> q.wait_until_done(job_id=12345)
>>> q.wait_until_done(user='alice')
- jobs(job_name: str | None = None, job_id: int | str | None = None, user: str | None = None, state: str | None = None, partition: str | None = None) -> List[SQueueJob]
Return jobs matching the given criteria.
- Parameters:
job_name (str, optional) – Job name or glob pattern (e.g. 'train_*').
job_id (int or str, optional) – Exact job ID.
user (str, optional) – Username to filter by.
state (str, optional) – SLURM state code, e.g. 'R' or 'PD'.
partition (str, optional) – Partition name to filter by.
- Return type:
list of SQueueJob
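The filter semantics described for `jobs()` (glob matching on the name, exact matching elsewhere) can be sketched with the standard-library `fnmatch` module. This is an illustration only: `QueueEntry` and `filter_jobs` are hypothetical names, and the assumptions that all supplied filters are combined with AND and that job IDs compare equal as int or str are mine, not confirmed by the documentation.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple, Optional, Union


class QueueEntry(NamedTuple):
    """Hypothetical stand-in for SQueueJob with only the filterable fields."""
    job_id: int
    user: str
    name: str
    state: str
    partition: str


def filter_jobs(
    entries: List[QueueEntry],
    job_name: Optional[str] = None,
    job_id: Union[int, str, None] = None,
    user: Optional[str] = None,
    state: Optional[str] = None,
    partition: Optional[str] = None,
) -> List[QueueEntry]:
    """Keep entries that satisfy every filter that was actually supplied."""
    selected = []
    for e in entries:
        # Glob match on the job name (* and ? wildcards).
        if job_name is not None and not fnmatchcase(e.name, job_name):
            continue
        # Compare IDs as strings so that 12345 and '12345' are equivalent.
        if job_id is not None and str(e.job_id) != str(job_id):
            continue
        if user is not None and e.user != user:
            continue
        if state is not None and e.state != state:
            continue
        if partition is not None and e.partition != partition:
            continue
        selected.append(e)
    return selected


queue = [
    QueueEntry(101, "alice", "train_a", "R", "gpu"),
    QueueEntry(102, "alice", "eval_a", "PD", "gpu"),
    QueueEntry(103, "bob", "train_b", "PD", "cpu"),
]
training = filter_jobs(queue, job_name="train_*")
pending_gpu = filter_jobs(queue, state="PD", partition="gpu")
```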
- jobs_by_partition() -> Dict[str, List[SQueueJob]]
Return a mapping of partition name -> list of jobs in that partition.
- jobs_by_state() -> Dict[str, List[SQueueJob]]
Return a mapping of state code -> list of jobs in that state.
- jobs_by_user() -> Dict[str, List[SQueueJob]]
Return a mapping of username -> list of their jobs.
- refresh() -> SQueue
Re-run squeue and update the cached job list.
- Returns:
self, for chaining.
- Return type:
SQueue
- summary() -> dict
Return a summary dict with total counts, per-user counts, and per-state counts.
- Returns:
Keys: total_jobs, running, pending, users (dict of user -> job count), by_state (dict of state code -> job count).
- Return type:
dict
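The documented key layout of the summary can be reproduced with `collections.Counter`. The sketch below is not the library's implementation: `summarize` is a hypothetical helper working on plain `(user, state_code)` pairs, and it assumes that `running` and `pending` count the `'R'` and `'PD'` state codes respectively.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize(jobs: List[Tuple[str, str]]) -> Dict[str, object]:
    """Summarize (user, state_code) pairs into the documented key layout."""
    by_state = Counter(state for _, state in jobs)
    users = Counter(user for user, _ in jobs)
    return {
        "total_jobs": len(jobs),
        "running": by_state.get("R", 0),   # assumption: 'R' means running
        "pending": by_state.get("PD", 0),  # assumption: 'PD' means pending
        "users": dict(users),
        "by_state": dict(by_state),
    }


snapshot = [("alice", "R"), ("alice", "PD"), ("bob", "R"), ("bob", "CG")]
report = summarize(snapshot)
```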
- wait_until_done(job_name: str | None = None, job_id: int | str | None = None, user: str | None = None, poll_interval: float = 30.0, timeout: float | None = None, verbose: bool = True) -> None
Block until all matching jobs leave the active queue.
Supports glob patterns in job_name (* and ? wildcards). At least one filter argument must be provided.
- Parameters:
job_name (str, optional) – Job name or glob pattern, e.g. 'train_*'.
job_id (int or str, optional) – A specific job ID to wait for.
user (str, optional) – Wait for all jobs belonging to this user to finish.
poll_interval (float) – Seconds between queue polls. Defaults to 30.
timeout (float, optional) – Maximum seconds to wait before raising TimeoutError.
verbose (bool) – Print progress messages. Defaults to True.
- Raises:
ValueError – If no filter is specified.
TimeoutError – If timeout is exceeded before all jobs finish.
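The poll-and-timeout behaviour described above follows a common pattern, sketched here under assumptions. `wait_until_empty` and its `count_active` callback are hypothetical: the callback stands in for re-running squeue and counting the matching jobs, and the real method may order its checks differently.

```python
import time
from typing import Callable, Optional


def wait_until_empty(
    count_active: Callable[[], int],
    poll_interval: float = 30.0,
    timeout: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until count_active() returns 0 and return the number of polls.

    Raises TimeoutError if jobs are still active once `timeout` seconds
    have elapsed since the first poll.
    """
    start = time.monotonic()
    polls = 0
    while True:
        polls += 1
        if count_active() == 0:
            return polls
        if timeout is not None and time.monotonic() - start >= timeout:
            raise TimeoutError(f"jobs still active after {timeout} s")
        sleep(poll_interval)


# Simulated queue: two jobs active, then one, then none.
remaining = iter([2, 1, 0])
polls = wait_until_empty(lambda: next(remaining), poll_interval=0.0)
```

Using `time.monotonic()` rather than `time.time()` keeps the timeout unaffected by wall-clock adjustments during a long wait.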
- class slurm_script_generator.squeue.SQueueJob(job_id: int, user: str, name: str, state: str, partition: str, num_nodes: int, num_cpus: int, time_used: str, time_limit: str, reason: str, priority: int)
Bases: object
A single job entry from the SLURM queue.
- property is_active: bool
- property is_pending: bool
- property is_running: bool
- job_id: int
- name: str
- num_cpus: int
- num_nodes: int
- partition: str
- priority: int
- reason: str
- state: str
- property state_name: str
- time_limit: str
- time_used: str
- user: str
- wait_until_done(poll_interval: float = 30.0, timeout: float | None = None, verbose: bool = True) -> None
Block until this specific job leaves the active queue.
- Parameters:
poll_interval (float) – Seconds between queue polls. Defaults to 30.
timeout (float, optional) – Maximum seconds to wait before raising TimeoutError.
verbose (bool) – Print progress messages. Defaults to True.
- slurm_script_generator.squeue.main() -> None
Entry point for the slurm-queue command-line tool.
Sub-commands
- show (default)
Print a per-user queue summary table.
- list
Print individual jobs, optionally filtered and sorted.
- stats
Print partition and state breakdown statistics.
- wait
Block until matching jobs leave the active queue.
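A sub-command layout with a default, as listed above, is commonly built with `argparse` sub-parsers. The sketch below is an assumption about structure only: `build_parser` and `resolve_command` are hypothetical names, no per-command options are shown because the page does not document any, and the actual `main()` may be organized differently.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the four documented sub-commands and no options."""
    parser = argparse.ArgumentParser(prog="slurm-queue")
    sub = parser.add_subparsers(dest="command")
    for name in ("show", "list", "stats", "wait"):
        sub.add_parser(name)
    return parser


def resolve_command(argv: Optional[List[str]] = None) -> str:
    """Return the chosen sub-command, falling back to 'show' when none is given."""
    args = build_parser().parse_args(argv)
    return args.command or "show"


default_cmd = resolve_command([])       # no sub-command on the command line
wait_cmd = resolve_command(["wait"])
```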