Sun Grid Engine Job Arrays

1. Why?

Suppose you wish to run a large number of largely identical jobs: you may wish to run the same program many times with different arguments or parameters, or perhaps process a thousand different input files. One might write a Perl script to generate all the required qsub files and a Bash script to submit them all. However, this is not a good use of your time, and submitting that many separate jobs puts a heavy load on the submit (login) node of a cluster.

Much better to use an SGE Array Job!

2. What?

An SGE array job might be described as a job with a for-loop built in. Here is a simple example:

  #!/bin/bash

  #$ -cwd
  #$ -S /bin/bash

  #$ -t 1-1000
      # ...tell SGE that this is an array job, with "tasks" numbered from 1
      #    to 1000...

  ./myprog < data.$SGE_TASK_ID > results.$SGE_TASK_ID

Computationally, this is equivalent to 1000 individual queue submissions in which SGE_TASK_ID takes the values 1, 2, 3, ..., 1000, and where input and output files are indexed by the ID. However:

only one qsub command is issued (and only one qdel command would be required to delete all jobs);
only one entry appears in qstat output;
the load on the SGE submit node (i.e., the cluster login node) is vastly less than that of submitting 1000 separate jobs!
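Before submitting, it can help to exercise the script body on a workstation. The sketch below emulates a three-task array by setting SGE_TASK_ID by hand in an ordinary loop. It is an illustration only: run_task is a hypothetical stand-in for the body of the qsub script, tr plays the role of ./myprog, and the data files are created in a scratch directory. On the cluster no loop is needed, since SGE sets SGE_TASK_ID for each task.

```shell
#!/bin/bash
# Local emulation of a 3-task array job (illustration only; on the cluster
# SGE sets SGE_TASK_ID for each task and no loop is required).

workdir=$(mktemp -d)          # scratch directory so no real files are touched
cd "$workdir" || exit 1

# Create small stand-in input files data.1 .. data.3
for i in 1 2 3; do
    echo "input $i" > "data.$i"
done

# Hypothetical stand-in for the body of the qsub script; tr plays the
# role of ./myprog, reading data.N and writing results.N
run_task() {
    tr 'a-z' 'A-Z' < "data.$SGE_TASK_ID" > "results.$SGE_TASK_ID"
}

for SGE_TASK_ID in 1 2 3; do
    export SGE_TASK_ID
    run_task
done

cat results.1 results.2 results.3
```

Each pass through the loop reads and writes only the files carrying its own ID, which is exactly the property that lets SGE run the real tasks independently of one another.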

A slight variation — run each job in a separate directory (folder):

  #!/bin/bash

  #$ -cwd
  #$ -S /bin/bash

  #$ -t 1-1000

  mkdir -p "myjob-$SGE_TASK_ID"
  cd "myjob-$SGE_TASK_ID" || exit 1
  ../myprog-one > one.output
  ../myprog-two < one.output > two.output

3. More

More on SGE Job Arrays can be found at:

4. A More General For Loop

It is not necessary that SGE_TASK_ID starts at 1; nor must the increment be 1. For example:

  #$ -t 100-995:5

so that SGE_TASK_ID takes the values 100, 105, 110, 115... 995.
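The IDs produced by a given range can be previewed with seq, whose first, increment and last arguments mirror the first-last:step form of -t. A small sketch, with illustrative variable names that are not part of SGE:

```shell
#!/bin/bash
# Preview the task IDs that "-t 100-995:5" would generate.
first=100; last=995; step=5

ids=$(seq "$first" "$step" "$last")      # seq FIRST INCREMENT LAST
count=$(( $(echo "$ids" | wc -l) ))      # one ID per line

echo "first ID: $(echo "$ids" | head -n 1)"
echo "last ID:  $(echo "$ids" | tail -n 1)"
echo "tasks:    $count"
```

For this range the array contains 180 tasks, which is worth checking before submission if each task expects its own input file.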

Incidentally, in the case in which the upper bound is not equal to the lower bound plus an integer multiple of the increment, for example

  #$ -t 1-42:6

SGE automatically lowers the upper bound to the last ID actually reached (here 37), as the qstat output shows:

  prompt> qsub array.qsub
  Your job-array 2642.1-42:6 ("array.qsub") has been submitted

  prompt> qstat
  job-ID   prior  name        user      state  submit/start at      queue    slots ja-task-ID 
  -------------------------------------------------------------------------------------------
   2642  0.00000  array.qsub  simonh    qw     04/24/2009 12:29:29           1 1-37:6
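The adjusted bound can be worked out in advance with integer arithmetic. The sketch below uses a hypothetical helper, effective_last, which is not an SGE command; the formula is inferred from the qstat output shown here and assumes a positive increment.

```shell
#!/bin/bash
# Largest ID of the form first + k*step that does not exceed last.
# Bash integer division rounds down for these positive values.
effective_last() {
    local first=$1 last=$2 step=$3
    echo $(( first + (last - first) / step * step ))
}

effective_last 1 42 6       # prints 37, matching the qstat output above
effective_last 100 995 5    # prints 995: already reachable, so unchanged
```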

5. Related Environment Variables

SGE also sets three further environment variables for each task, SGE_TASK_FIRST, SGE_TASK_LAST and SGE_TASK_STEPSIZE, as illustrated by this simple qsub script:

  #!/bin/bash

  #$ -cwd 
  #$ -S /bin/bash

  #$ -t 1-37:6

  echo "The ID increment is: $SGE_TASK_STEPSIZE"

  if [[ $SGE_TASK_ID == $SGE_TASK_FIRST ]]; then
      echo "first"
  elif [[ $SGE_TASK_ID == $SGE_TASK_LAST ]]; then
      echo "last"
  else
      echo "neither"
  fi
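The same variables can be combined to find a task's position within the array, which is handy when the IDs do not run 1, 2, 3 and so on. A minimal sketch: the default values below are assumptions chosen to match the 1-37:6 range so that the snippet also runs outside SGE; inside an array job SGE supplies the real values and the defaults are ignored.

```shell
#!/bin/bash
# Defaults let the sketch run outside SGE; inside an array job SGE sets
# these variables itself and the defaults are ignored.
: "${SGE_TASK_ID:=13}"
: "${SGE_TASK_FIRST:=1}"
: "${SGE_TASK_LAST:=37}"
: "${SGE_TASK_STEPSIZE:=6}"

# Position of this task within the array, counting from 0
index=$(( (SGE_TASK_ID - SGE_TASK_FIRST) / SGE_TASK_STEPSIZE ))
# Total number of tasks in the array
total=$(( (SGE_TASK_LAST - SGE_TASK_FIRST) / SGE_TASK_STEPSIZE + 1 ))

echo "task $SGE_TASK_ID is number $((index + 1)) of $total"
```

With the default values this reports task 13 as number 3 of 7.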

6. A List of Input Files

One can be sneaky — suppose we have a list of input files, rather than input files explicitly indexed by suffix:

  #!/bin/bash

  #$ -cwd
  #$ -S /bin/bash

  #$ -t 1-42

  INFILE=$(awk "NR==$SGE_TASK_ID" my_file_list.text)
      #
      # ...or use sed:
      #        INFILE=$(sed -n "${SGE_TASK_ID}p" my_file_list.text)
      #

  ./myprog < "$INFILE"
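The upper bound given to -t has to match the number of lines in the list. One way to keep the two in step is to count the lines and pass the range on the qsub command line. The sketch below assumes that approach: the input-*.dat names are placeholders created in a scratch directory, and the qsub command is printed rather than run, since qsub exists only on the cluster.

```shell
#!/bin/bash
# Build a file list and derive the matching -t range from its length.
# The input-*.dat names are placeholders created in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch input-a.dat input-b.dat input-c.dat

ls input-*.dat > my_file_list.text          # one file name per line
ntasks=$(( $(wc -l < my_file_list.text) ))  # number of lines = number of tasks

# Printed rather than run, since qsub only exists on the cluster
echo "qsub -t 1-$ntasks array.qsub"

# What task 2 would pick up, using the same awk lookup as the script above
SGE_TASK_ID=2
INFILE=$(awk "NR==$SGE_TASK_ID" my_file_list.text)
echo "task $SGE_TASK_ID reads $INFILE"
```

If the range runs past the end of the list, the lookup returns an empty string, so a real script may wish to test that INFILE is non-empty before running the program.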
