Serial and Parallel, with or without Interactive, All on One Hostgroup

What we have:
  • A few dozen nodes.
  • Each has 12 cores.
What we want:
  • Users to be able to run serial, SMP and MPI jobs on the FAT nodes.
  • MPI jobs to be placed either wherever slots are free, so that they start as quickly as possible, or on whole nodes only, so that they run as efficiently as possible.
  • To be able to run interactively too.
  • To be sure that a user running an interactive job does not get a shell on a node on which a batch job has been running for a long time, and then disrupt that node. Mutual subordination is therefore definitely necessary.
Implementation — three queues. . .
  C6100-STD.q
  C6100-STD-serial.q
  C6100-STD-interactive.q
. . .a complex attribute. . .
  #name               shortcut     type        relop requestable consumable default  urgency
  #------------------------------------------------------------------------------------------
  interactive         inter        BOOL        ==    FORCED      NO         0        0
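
A minimal sketch, in Python, of what a FORCED boolean complex means for queue selection. It assumes (this is not stated on this page) that only C6100-STD-interactive.q sets interactive=TRUE in its complex_values. The function name eligible_queues is made up for illustration, and parallel environments are ignored.

```python
# Hypothetical model of queue selection with the FORCED boolean "interactive".
# Assumption: only the interactive queue sets interactive=TRUE in complex_values.
QUEUE_COMPLEX_VALUES = {
    "C6100-STD.q": {},
    "C6100-STD-serial.q": {},
    "C6100-STD-interactive.q": {"interactive": True},
}


def eligible_queues(requests_inter):
    """Return the queues a job may use, given whether it passed '-l inter'."""
    eligible = []
    for queue, values in QUEUE_COMPLEX_VALUES.items():
        has_inter = values.get("interactive", False)
        # FORCED: a queue that sets the attribute only accepts jobs that
        # request it. A job that requests it only matches queues where the
        # value is TRUE (relop ==).
        if has_inter == requests_inter:
            eligible.append(queue)
    return eligible


if __name__ == "__main__":
    print(eligible_queues(True))   # job submitted with -l inter
    print(eligible_queues(False))  # job submitted without it
```

In this model a job without -l inter can never land in the interactive queue, and a job with -l inter can land nowhere else.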
. . .plus a resource quota. . .
  {
    name         C6100-STD.rqs
    description  NONE
    enabled      TRUE
    limit        hosts {@C6100-STD} to slots=12
  }
. . .with the resource quota and some subordination:
  • The resource quota limits each host in @C6100-STD to 12 slots in total, one per core, so multiple queues can run jobs on the same hosts at the same time without overloading them.
  • Each of the three queues subordinates the two others.
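
The effect of the per-host quota can be sketched in Python as simple slot accounting. This is only a model of the slots=12 limit from the resource quota above. It deliberately ignores subordination, and the Host class and the host name node001 are hypothetical.

```python
SLOTS_PER_HOST = 12  # from the quota: limit hosts {@C6100-STD} to slots=12


class Host:
    """Hypothetical per-host bookkeeping of slots used by each queue."""

    def __init__(self, name):
        self.name = name
        self.used = {}  # queue name -> slots in use on this host

    def total_used(self):
        # The quota counts slots across all queues on the host.
        return sum(self.used.values())

    def try_start(self, queue, slots):
        """Start a job needing `slots` slots in `queue` if the quota allows."""
        if self.total_used() + slots > SLOTS_PER_HOST:
            return False  # quota would be exceeded, so the job stays pending
        self.used[queue] = self.used.get(queue, 0) + slots
        return True


if __name__ == "__main__":
    host = Host("node001")
    print(host.try_start("C6100-STD.q", 8))              # fits
    print(host.try_start("C6100-STD-serial.q", 4))       # fills the host
    print(host.try_start("C6100-STD-interactive.q", 1))  # refused
```

Without such a limit, each queue would count its own slots separately, so three queues on one 12-core host could between them start more than 12 slots' worth of jobs.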
Why all those queues?
  • We want to be able to impose different limits on each queue.
  • We may want the interactive queue to be serial only.
Usage
  C6100-STD-serial.q
      qsub ...          (no flags needed at all)

  C6100-STD-interactive.q
      qrsh -l inter ...

  C6100-STD.q
      qsub -pe smp.pe ...
      qsub -pe orte.pe ...
      qsub -pe orte-12.pe ...
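
The two MPI placement styles listed under "What we want" can be sketched in Python as follows. The allocation rules of orte.pe and orte-12.pe are not shown on this page, so this assumes that orte.pe takes free slots wherever it finds them and that orte-12.pe takes exactly 12 slots per host. The function names, host names and host ordering are hypothetical, and the real scheduler orders hosts by its own criteria.

```python
def allocate_fill_up(free_slots, wanted):
    """Assumed orte.pe behaviour: take free slots host by host."""
    allocation = {}
    remaining = wanted
    for host, free in free_slots.items():
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            allocation[host] = take
            remaining -= take
    # None means the job cannot start yet.
    return allocation if remaining == 0 else None


def allocate_whole_nodes(free_slots, wanted, per_host=12):
    """Assumed orte-12.pe behaviour: only completely free hosts are used."""
    if wanted % per_host != 0:
        return None  # the request must be a multiple of 12 slots
    allocation = {}
    remaining = wanted
    for host, free in free_slots.items():
        if remaining == 0:
            break
        if free >= per_host:
            allocation[host] = per_host
            remaining -= per_host
    return allocation if remaining == 0 else None


if __name__ == "__main__":
    free = {"node001": 5, "node002": 12, "node003": 12}
    print(allocate_fill_up(free, 24))      # spreads over three hosts
    print(allocate_whole_nodes(free, 24))  # uses two whole hosts
```

In this model the first style can start sooner because it accepts partly used hosts, while the second keeps each 12-slot chunk on a host of its own.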