Starting and managing jobs with PBS
Batch processing is a method of running software programs, called jobs, in batches automatically. Users must submit the jobs, but no further interaction is required to process the batch. Batches can be run automatically at scheduled times and on whatever computing resources are available. For additional background, see Batch Computing Overview. The GSDC TEM computing cluster uses the Portable Batch System as implemented in Altair's PBS Pro across shared resources.
Job scripts
Job scripts form the basis of batch jobs. A job script is simply a text file with instructions for the work to
execute. Job scripts are usually written in bash and thus mimic commands a user would execute interactively through a shell,
but instead are executed on specific resources allocated by the scheduler when available.
Scripts can also be written in other languages, commonly Python.
See our job scripts page for a detailed discussion of job scripts and examples.
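As a minimal sketch of what such a script can look like, the example below requests one CPU on cpuQ and prints the node it runs on. The job name, resource amounts, and walltime are illustrative assumptions, not site defaults.

```shell
#!/bin/bash
# Job name (hypothetical) and destination queue.
#PBS -N example_job
#PBS -q cpuQ
# One chunk with 1 CPU and 2 GB of memory; the amounts are assumed values.
#PBS -l select=1:ncpus=1:mem=2gb
# Wall-clock limit (assumed value).
#PBS -l walltime=00:10:00
# Merge standard error into standard output.
#PBS -j oe

# PBS sets PBS_O_WORKDIR to the directory where qsub was run;
# fall back to the current directory when the script runs outside PBS.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

echo "Running on $(uname -n) in $workdir"
```

Because the #PBS lines are ordinary comments to bash, the same file can also be run directly in a shell for a quick syntax check before it is submitted.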
Submitting jobs
In the examples that follow, job.pbs, script_name, etc. represent job script files submitted for batch execution.
PBS Pro can be used to schedule both interactive jobs and batch compute jobs.
To submit a batch job, use the qsub command followed by the name of your PBS batch script file.
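A submission might look like the sketch below, which uses the placeholder name job.pbs and keeps the job ID that qsub prints so it can be passed to later commands. The guard around qsub is an addition for illustration; it only skips the call on machines without the PBS client tools.

```shell
# job.pbs is the placeholder script name used throughout this section.
job_script="job.pbs"

if command -v qsub >/dev/null 2>&1; then
    # On success, qsub prints the ID of the newly created job.
    job_id=$(qsub "$job_script")
    echo "Submitted job: $job_id"
else
    echo "qsub not found; run this on a login node." >&2
fi
```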
Propagating environment settings
Some users find it useful to set environment variables in their login environment that can be temporarily used for multiple batch jobs without modifying the job script. This practice can be particularly useful during iterative development and debugging work.
PBS Pro has two approaches to propagation:
- Specific variables can be forwarded to the job upon request.
- The entire environment can be forwarded to the job.
In general, the first approach is preferred because the second may have unintended consequences.
These settings are controlled by qsub arguments that can be used at the command line or as directives within job scripts. Here is an example of the first approach at the command line:
# Selectively forward runtime variables to the job (lower-case v)
$> qsub -v DEBUG=true,CASE_NAME job.pbs
With the selective option (lower-case v), you can either give only the variable name to propagate its current value (as with CASE_NAME in the example),
or set it explicitly to a given value at submission time (as with DEBUG).
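The same options can be written as directives inside the job script. The sketch below is an assumed example that reuses the DEBUG and CASE_NAME names from above; it forwards them selectively and notes in a comment how the upper-case -V form (forward the entire environment) would be written instead.

```shell
#!/bin/bash
# Forward DEBUG with a fixed value and CASE_NAME with its current value.
#PBS -v DEBUG=true,CASE_NAME
# To forward the entire environment instead, replace the line above with:
#   #PBS -V

# Print what the job received; "unset" marks a variable that was not forwarded.
received="DEBUG=${DEBUG:-unset} CASE_NAME=${CASE_NAME:-unset}"
echo "$received"
```

Printing the values at the top of a job is a cheap way to confirm that propagation worked before the real workload starts.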
Managing jobs
Here are some of the most useful commands for managing and monitoring jobs that have been launched with PBS. Most of these commands will only modify or query data from jobs that are active on the same system.
qdel
Canceling a single job
Run qdel with the job ID to kill a pending or running job.
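For example, assuming a job ID of 838 (taken from the sample qstat listing in this section; substitute your own):

```shell
# Replace with the ID reported by qsub or qstat (838 is an example value).
job_id="838"

# The guard skips the call on machines without the PBS client tools.
if command -v qdel >/dev/null 2>&1; then
    qdel "$job_id"
fi
```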
Stopping all of your own jobs
Kill all of your own pending or running jobs. (Be sure to use backticks as shown.)
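A sketch of this, assuming the standard PBS qselect command is available to list your job IDs:

```shell
# Fall back to id(1) if USER is not set in the environment.
target_user="${USER:-$(id -un)}"

if command -v qselect >/dev/null 2>&1; then
    # The backticks run qselect first; its output (your job IDs)
    # becomes the argument list for qdel.
    qdel `qselect -u $target_user`
fi
```

Running qselect -u on its own first shows exactly which jobs would be removed.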
qstat
Status of all your own jobs
Run this to see the status of all of your own unfinished jobs.
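A likely form of this command uses the -u option to restrict the listing to one user:

```shell
# Fall back to id(1) if USER is not set in the environment.
target_user="${USER:-$(id -un)}"

# The guard skips the call on machines without the PBS client tools.
if command -v qstat >/dev/null 2>&1; then
    qstat -u "$target_user"
fi
```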
Your output will be similar to what is shown just below. Most column headings are self-explanatory – NDS for nodes, TSK for tasks, and so on.
In the status (S) column, most jobs are either queued (Q) or running (R). Sometimes jobs are held (H), which might mean
they are dependent on the completion of another job.
tem-ce-al9.sdfarm.kr:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
838.tem-ce-al9* USERID   cpuQ     cryosparc*  95965   1   1  8000m    -- R 00:01
840.tem-ce-al9* USERID   gpuQ     cryosparc*  84478   1   6   16gb    -- R 00:00
841.tem-ce-al9* USERID   gpuQ     cryosparc*  84594   1   6   16gb    -- R 00:00
The following are examples of qstat with some other commonly used options and arguments.
Status of an unfinished job
Get a long-form summary of the status of an unfinished job.
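In PBS Pro the long-form listing is requested with -f. A sketch, assuming a job ID of 838 as an example value:

```shell
# Replace with your own job ID (838 is an example value).
job_id="838"

if command -v qstat >/dev/null 2>&1; then
    # -f prints every attribute of the job, one per line.
    qstat -f "$job_id"
fi
```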
Warning
Use the long-form status query only sparingly; it places a high load on PBS Pro.
Status of an unfinished or recently completed job
Get a single-line summary of the status of an unfinished or recently completed job (within 72 hours).
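A sketch using the -x option, which makes qstat include finished jobs that are still in the server's job history (the job ID is an example value):

```shell
# Replace with your own job ID (838 is an example value).
job_id="838"

if command -v qstat >/dev/null 2>&1; then
    # -x also reports jobs that have already finished.
    qstat -x "$job_id"
fi
```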
Status of jobs on a specified queue
Get information about unfinished jobs in a specified execution queue.
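Passing a queue name as the argument limits the listing to that queue. The sketch below assumes cpuQ, one of the queue names that appears in the sample output above:

```shell
# Queue to inspect; cpuQ and gpuQ both appear in this section.
queue_name="cpuQ"

if command -v qstat >/dev/null 2>&1; then
    qstat "$queue_name"
fi
```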
Status of jobs by queue
See job activity by queue (e.g., pending, running) in terms of numbers of jobs.
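In PBS Pro this per-queue summary is produced by the upper-case -Q option. A sketch that also keeps the output in a variable, so it can be filtered afterwards:

```shell
queue_summary=""

if command -v qstat >/dev/null 2>&1; then
    # -Q reports one line per queue with counts of jobs in each state.
    queue_summary=$(qstat -Q)
fi

printf '%s\n' "$queue_summary"
```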
Status of all of your jobs
Display information for all of your pending, running, and finished jobs.
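Combining -x (include finished jobs) with -u (restrict to one user) gives this view. A sketch:

```shell
# Fall back to id(1) if USER is not set in the environment.
target_user="${USER:-$(id -un)}"

if command -v qstat >/dev/null 2>&1; then
    qstat -x -u "$target_user"
fi
```

How far back finished jobs remain visible depends on the job history retention configured on the server.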
Status of all your own jobs with comments
Display information for all of your unfinished jobs with exec_host and any scheduler_comment below the basic information.
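One way to get this view, assuming the standard PBS Pro options -n (show exec_host) and -s (show comments), is sketched below; the exact option combination used on this cluster's original page is not preserved here.

```shell
# Fall back to id(1) if USER is not set in the environment.
target_user="${USER:-$(id -un)}"

if command -v qstat >/dev/null 2>&1; then
    # -n adds the exec_host line, -s adds any scheduler or admin comment.
    qstat -n -s -u "$target_user"
fi
```

The comment line is often the quickest way to see why a job is still queued.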
Status of all jobs
Display information for all jobs, including other users' jobs.
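Run without options, qstat lists every job the server lets you see. A sketch (whether other users' jobs are visible depends on the server configuration):

```shell
if command -v qstat >/dev/null 2>&1; then
    qstat
fi
```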
Interactive jobs
Interactive jobs provide an interactive session on a compute node, useful for debugging, testing code, and running short tasks that require user interaction.
Users can start an interactive job on GSDC TEM login nodes using the qsub -I command.
The -I flag is used to request an interactive session. The following example shows how to start an interactive job with specified resources on cpuQ:
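A sketch of such a request is shown here. The queue name comes from this page; the CPU, memory, and walltime values are assumptions chosen for illustration.

```shell
# Requested resources: one chunk with 1 CPU and 2 GB (assumed values).
resources="select=1:ncpus=1:mem=2gb"
walltime="01:00:00"

if command -v qsub >/dev/null 2>&1; then
    # -I requests an interactive session; -q selects the queue.
    qsub -I -q cpuQ -l "$resources" -l "walltime=$walltime"
fi
```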
A session started this way looks like the following:
qsub: waiting for job 850.tem-ce-al9.sdfarm.kr to start
qsub: job 850.tem-ce-al9.sdfarm.kr ready
[USERID@tem-cpu00-al9 ~]$ Do something
...
[USERID@tem-cpu00-al9 ~]$ exit
logout
qsub: job 850.tem-ce-al9.sdfarm.kr completed
Interactive jobs with GUI (X11)-based applications
Users can also start an interactive job that supports GUI (X11)-based applications using the qsub -X -V -I command.
The following example shows how to start an interactive job with specified resources on cpuQ, with the login environment forwarded and X11 forwarding enabled:
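A sketch of such a request is shown here. The resource amounts are assumed values, and the DISPLAY check is an added precaution: X11 forwarding generally only works if the login session itself was opened with X11 forwarding.

```shell
# Requested resources: one chunk with 1 CPU and 2 GB (assumed values).
resources="select=1:ncpus=1:mem=2gb"

if [ -z "${DISPLAY:-}" ]; then
    echo "DISPLAY is not set; log in with X11 forwarding first." >&2
elif command -v qsub >/dev/null 2>&1; then
    # -X forwards X11, -V forwards the environment, -I makes it interactive.
    qsub -X -V -I -q cpuQ -l "$resources"
fi
```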
The result of the above command is as follows: