Running a Condor Job: Overview
- Make it batch-ready
- Choose a Universe
- Create a submit file
- Submit the job
- Monitor your job's status
Running a Job
Step 1: Make it Batch-Ready
Only non-interactive jobs can be submitted to Condor
Jobs must:
- be able to run "in the background";
- have no GUI;
- be non-interactive: no user input at run time (input can be taken from an input file);
- use STDIN, STDOUT and STDERR plus input/output data files only.
These are the same restrictions as under traditional batch systems.
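The requirements above can be sketched in shell. This is a minimal illustration only: `my_prog` here is a hypothetical stand-in function, not a program from the course, and the file names are made up for the example. It shows all three standard streams bound to files so the program never waits on a terminal.

```shell
# Hypothetical stand-in for a batch-ready program: reads one value from STDIN,
# reports on STDOUT, and sends diagnostics to STDERR.
my_prog() {
  read -r n
  echo "result for input $n"
  echo "my_prog: finished" >&2
}

# Input comes from a file, so nobody has to type anything at run time.
workdir=$(mktemp -d)
printf '3000\n' > "$workdir/my_prog.in"

# Bind STDIN, STDOUT and STDERR to files, as a batch system would.
my_prog < "$workdir/my_prog.in" > "$workdir/my_prog.out" 2> "$workdir/my_prog.err"
cat "$workdir/my_prog.out"
```

If a program can be driven this way from the command line, it is usually ready to be described in a submit file.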
Choose a Universe
- Vanilla: any batch-style computation will run here; Condor-related features are restricted.
- Standard: full-fat Condor, with checkpointing and job migration, and remote I/O.
- Parallel: e.g., MPI jobs can be run under Condor, but...?!
- Others: Java, ...
More on Vanilla and Standard later...
Create a Submit File
- A small ASCII text file.
- Comparable to an SGE qsub file.
- Specifies: executable; universe; STDIN, STDOUT and STDERR files to use; data files, if any; and more...
Example
executable = my_prog.exe
universe = standard
output = my_prog.$(Process).out
error = my_prog.$(Process).err
log = my_prog.log
arguments = 3000
queue
arguments = 4500
queue
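In the example, each `queue` statement creates one job, and `$(Process)` is replaced by that job's number within the submission, counting from 0. The following sketch only mimics that expansion in shell to show which files the two jobs would write; `expand_jobs` is a hypothetical helper, not a Condor command.

```shell
# Hypothetical helper: print, for each argument value, the job number and the
# output/error file names that $(Process) would expand to.
expand_jobs() {
  process=0
  for args in "$@"; do
    echo "job $process: my_prog.exe $args -> my_prog.$process.out, my_prog.$process.err"
    process=$((process + 1))
  done
}

# The two "arguments = ..." / "queue" pairs from the example submit file.
expand_jobs 3000 4500
```

Both jobs share the single log file, my_prog.log, because the `log` line does not use `$(Process)`.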
Submit the Job
- Set up the environment
prompt> export CONDOR_CONFIG=<path_to_condor>/etc/condor_config
# Ensure Condor programs can find the Condor configuration!
prompt> export PATH=$PATH:/<path_to_condor_exes>/bin
# Add the Condor programs to your PATH (e.g., condor_submit, condor_q...)
- Submit and monitor the job
prompt> condor_submit my_job.cond_sub
prompt> condor_q
# To query every job queue in the pool, use: condor_q -global
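The submit-and-monitor step can be wrapped in a small shell function that fails early with a readable message. This is a sketch under stated assumptions: `submit_job` and its return codes are hypothetical, and it assumes the environment has already been set up so that `condor_submit` and `condor_q` are on the PATH.

```shell
# Hypothetical wrapper: check the submit file and the Condor tools before
# submitting, then show the queue.
submit_job() {
  submit_file=$1
  if [ ! -f "$submit_file" ]; then
    echo "submit file not found: $submit_file" >&2
    return 1
  fi
  if ! command -v condor_submit >/dev/null 2>&1; then
    echo "condor_submit not on PATH; check CONDOR_CONFIG and PATH" >&2
    return 2
  fi
  # Submit, and list the queue only if the submission succeeded.
  condor_submit "$submit_file" && condor_q
}
```

Usage would be `submit_job my_job.cond_sub`; a return code of 1 or 2 points to a missing file or an incomplete environment rather than a problem with the job itself.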
First Practical Session 1/3
Try Condor for yourselves!
- Log in to man2condor.nw-grid.ac.uk using OpenSSH or PuTTY, with
the username and password you have been given.
- Set up your environment:
source /opt/condor-7.4.2/condor.sh
- Check the status of the pool using condor_status.
- Notice that some nodes are busy with non-Condor activity, while others
are available to Condor.
- Change directory to first-practical:
cd first-practical
- Notice that there are three examples to run: hello.cmd, hello-2.cmd
and loop.cmd.
First Practical Session 2/3
Run two vanilla universe jobs
- Examine the two vanilla universe jobs, the hello* files:
- Look in the Fortran source files, hello*.f90, and notice that one
prints a message to STDOUT while the second writes to a file,
myfile.txt.
- In the Condor submit files, hello*.cmd, notice the file-transfer-related
commands.
- Compile and submit the two vanilla universe jobs:
gfortran -o hello hello.f90
condor_submit hello.cmd
gfortran -o hello-2 hello-2.f90
condor_submit hello-2.cmd
- If you are quick you may catch your jobs in the Condor pool queue by
using condor_q.
- Check the output and error files, and also the newly-created
file myfile.txt.
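For orientation, here is what file-transfer commands in a vanilla universe submit file can look like. This is a hedged sketch, not the course's hello-2.cmd: the file name hello-sketch.cmd and the output, error and log names are assumptions, and the real files in first-practical may differ. `should_transfer_files` and `when_to_transfer_output` are standard submit commands.

```shell
# Write a hypothetical submit file into a scratch directory so that nothing
# in first-practical is overwritten.
sketch_dir=$(mktemp -d)
cat > "$sketch_dir/hello-sketch.cmd" <<'EOF'
# Hypothetical vanilla universe submit file (not the course's hello-2.cmd)
universe                = vanilla
executable              = hello-2
output                  = hello-2.out
error                   = hello-2.err
log                     = hello-2.log
# Copy the executable to the execute node, and copy files the job creates
# (such as myfile.txt) back when the job exits.
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
queue
EOF
cat "$sketch_dir/hello-sketch.cmd"
```

With settings like these, a file written by the job in its working directory on the execute node is returned to the submit directory when the job finishes.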
First Practical Session 3/3
- Examine the standard universe job submit file, loop.cmd.
- Notice that we submit more than one computation.
- Compile and submit the standard universe job:
condor_compile gcc -o loop.remote loop.c
condor_submit loop.cmd
- Quickly check the queue using condor_q and notice your jobs
waiting or running.
- You can check the progress of one of your jobs with, for example:
tail -f loop.4.out
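To glance at all of the loop jobs at once rather than following a single file, a short shell function can print the last line of each output file. This is a sketch that assumes the outputs are named loop.N.out, as the loop.4.out example suggests; `show_progress` is a hypothetical helper.

```shell
# Hypothetical helper: print the most recent line of every loop.*.out file in
# the current directory.
show_progress() {
  for f in loop.*.out; do
    # Skip the unexpanded pattern when no output files exist yet.
    [ -e "$f" ] || continue
    printf '%s: %s\n' "$f" "$(tail -n 1 "$f")"
  done
}

show_progress
```

Run it from the directory the jobs were submitted from; it prints nothing until at least one job has produced an output file.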