Danzek SGE Journal

2012 March 12

Why are resource reservations not working properly? Discussion lists (see below) suggest that there is a bug, or at least a surprising feature: the default value of default_duration is INFINITY, and this sometimes causes resource reservations to fail. Change it to a finite value, the queue length for example. . .

  qconf -ssconf
  .
  .
  default_duration                  168:00:00
  .
. . .and problem goes away?
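
The change above could be scripted roughly as follows. This is only a sketch: the function name set_default_duration and the /tmp file names are made up for illustration, and it assumes the usual "name value" layout of the qconf -ssconf output. The function itself only rewrites text; the qconf calls are left as comments.

```shell
# set_default_duration FILE HH:MM:SS
# Print FILE (a saved copy of "qconf -ssconf") with the default_duration
# line replaced by the given value. Nothing is changed on the qmaster.
set_default_duration() {
  sed "s/^default_duration[[:space:]].*/default_duration                  $2/" "$1"
}

# Possible usage on the qmaster host (not run here):
#   qconf -ssconf > /tmp/sconf.txt
#   set_default_duration /tmp/sconf.txt 168:00:00 > /tmp/sconf.new
#   qconf -Msconf /tmp/sconf.new
```

Editing interactively with qconf -msconf achieves the same thing; the file-based route is just easier to repeat and to keep a record of.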

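Diagnostics (the schedule file is only written when the scheduler params include MONITOR=1, see the quoted message below):

  tail -f $SGE_ROOT/default/common/schedule | grep RESERV
or
  tail -f $SGE_ROOT/default/common/schedule | grep <JOB_NUM_WITH_RR>
      # ...that's the id/number of the job on which a resource reservation was set...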
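
To see at a glance which jobs the scheduler is reserving for, the schedule file can be summarised rather than tailed. A minimal sketch, assuming the colon-separated record layout described in sched_conf(5) (job id in field 1, state in field 3); the function name count_reservations is made up for illustration.

```shell
# count_reservations FILE
# Print "<job_id> <count>" for every job that has RESERVING records in a
# scheduler monitoring file, sorted by job id.
count_reservations() {
  awk -F: '$3 == "RESERVING" { n[$1]++ } END { for (j in n) print j, n[j] }' "$1" | sort -n
}

# Possible usage (not run here):
#   count_reservations "$SGE_ROOT/default/common/schedule"
```

A job submitted with a reservation that never appears in this list is a candidate for the default_duration problem described above.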

http://permalink.gmane.org/gmane.comp.clustering.gridengine.users/19485:

Re: 6.2u4: resource reservation not working for some jobs

Hi Richard,

rems0 wrote:
> Hi list, hi reuti,
> 
> resource reservation is still not working for me.
> In the scheduler configuration (qconf -ssconf)
> max_reservation is set to 20.
> default_duration is set to 8760:00:00.

are you monitoring your scheduler?

------------------------- >8 -------------------------
qconf -ssconf | grep params
params            MONITOR=1
------------------------- 8< -------------------------

The location of the schedule file is:

<sge_root>/<cell>/common/schedule

Do you find any RESERVING statements in the log file?

> As mentioned in the bug above, I changed default_duration in the
> scheduler configuration from INFINITY to 8760 hours (1 year).
> 
> Should this be enough or should I also change s_rt and/or h_rt in the
> queue definitions? These are both also set to INFINITY.

This should be enough.

2012 March 12

Problems with one or two MACE users hogging the IB-connected queues. They do this by exploiting the urgency policy that was in force before this date, which was:

  Weight Deadline     :  zero
  Weight Waiting Time :  zero
with
  bash> qconf -sc
  
  #name               shortcut     type        relop requestable consumable default  urgency
  #------------------------------------------------------------------------------------------
  .                   .            .           .     .           .          .        .
  .                   .            .           .     .           .          .        .
  slots               s            INT         <=    YES         YES        1        1000
  .                   .            .           .     .           .          .        .
  .                   .            .           .     .           .          .        .

Urgency Policy Change:

  Weight Waiting Time :  0.1
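
A rough sketch of why the old policy could be exploited and what the new weight changes. It assumes the urgency formula described in sge_priority(5): the resource contribution is the complex's urgency value (1000 for slots, as above) times the slots requested, and the waiting-time contribution is Weight Waiting Time times the seconds waited. The function name urgency is made up, Weight Deadline is taken as zero, and the scheduler's later normalisation of the value is ignored.

```shell
# urgency SLOTS SECONDS_WAITED WEIGHT_WAITING_TIME
# Simplified urgency value: 1000 per requested slot plus the
# waiting-time contribution, truncated to an integer.
urgency() {
  awk -v s="$1" -v w="$2" -v k="$3" 'BEGIN { printf "%d\n", 1000 * s + k * w }'
}

urgency 64 0 0        # 64-slot job, just submitted, old policy
urgency 1 86400 0     # 1-slot job after a day, old policy
urgency 1 86400 0.1   # 1-slot job after a day, new policy
```

Under these assumptions, with a waiting-time weight of zero the urgency depends only on the number of slots requested, however long a job has waited; with 0.1 a job gains 360 per hour in the queue.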

Resource Quota Changes:

  -- added a global limit of 256 slots for each user
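
For reference, a per-user slot limit of this kind might look like the resource quota set below. This is a sketch only: the rule name max_slots_per_user, the description text and the function name write_user_slot_rqs are made up, and the rule actually installed on Danzek may differ. The function just writes the rule to a file; loading it is left as a comment.

```shell
# write_user_slot_rqs FILE
# Write a resource quota set limiting every user to 256 slots cluster-wide.
# "users {*}" applies the limit to each user separately, not to all users combined.
write_user_slot_rqs() {
  cat > "$1" <<'EOF'
{
   name         max_slots_per_user
   description  "Global limit of 256 slots for each user"
   enabled      TRUE
   limit        users {*} to slots=256
}
EOF
}

# Possible usage on the qmaster host (not run here):
#   write_user_slot_rqs /tmp/max_slots_per_user.rqs
#   qconf -Arqs /tmp/max_slots_per_user.rqs
#   qconf -srqs            # show the installed rules
```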