UoM::RCS::Talby



Condor Config on SMP Machines

Background Info

On SMP machines, Condor divides the machine up into virtual machines (VMs) and treats each one separately. A consequence is that if a non-Condor process starts on such a machine while two Condor jobs are running on it, only one Condor job will be suspended. Much more on this can be found in "Configuring The Startd for SMP Machines" in the Condor Manual.

Debugging

First, we need to know what Condor thinks is going on:

  STARTD_DEBUG = D_LOAD
means that condor_startd will now log lines such as
  2/6 14:33:25 vm1: SystemLoad: 1.00  CondorLoad: 0.00  OwnerLoad: 1.00
  2/6 14:33:25 vm2: SystemLoad: 1.58  CondorLoad: 1.29  OwnerLoad: 0.29
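which is what Condor sees on a two-CPU machine when one Condor job is running at 99% CPU and two other (non-Condor) processes are each running at 49% CPU. Note that the owner load is not spread evenly: vm1 is charged 1.00 and vm2 only 0.29.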
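
To compare these figures across VMs it can help to pull them out of the log mechanically. The following is a minimal Python sketch, not part of Condor itself; the regular expression is an assumption based only on the two sample lines above, and the name `parse_load_line` is hypothetical.

```python
import re

# Pattern inferred from the sample D_LOAD lines above (an assumption, e.g.):
#   2/6 14:33:25 vm1: SystemLoad: 1.00  CondorLoad: 0.00  OwnerLoad: 1.00
LOAD_LINE = re.compile(
    r"(?P<vm>vm\d+): SystemLoad: (?P<system>[\d.]+)\s+"
    r"CondorLoad: (?P<condor>[\d.]+)\s+OwnerLoad: (?P<owner>[\d.]+)"
)


def parse_load_line(line):
    """Return (vm, system, condor, owner), or None if the line has no load figures."""
    m = LOAD_LINE.search(line)
    if m is None:
        return None
    return (
        m.group("vm"),
        float(m.group("system")),
        float(m.group("condor")),
        float(m.group("owner")),
    )


sample = [
    "2/6 14:33:25 vm1: SystemLoad: 1.00  CondorLoad: 0.00  OwnerLoad: 1.00",
    "2/6 14:33:25 vm2: SystemLoad: 1.58  CondorLoad: 1.29  OwnerLoad: 0.29",
]
loads = [parse_load_line(line) for line in sample]
for vm, system, condor, owner in loads:
    print(vm, "system:", system, "condor:", condor, "owner:", owner)
```

Lines that do not carry load figures simply return None, so the same function can be run over a whole startd log.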

New Config

This configuration aims to chuck all Condor jobs off an SMP machine when just one (significant) non-Condor process is running:

# Non-Condor load summed over the whole machine, rather than per VM
TotalNonCondorLoadAvg   = (TotalLoadAvg - TotalCondorLoadAvg)

KeyboardBusy            = (KeyboardIdle < $(MINUTE))
ConsoleBusy             = (ConsoleIdle  < $(MINUTE))
CPUIdle                 = ($(NonCondorLoadAvg) <= $(BackgroundLoad))
#
# Per-VM variants, commented out:
####CPUBusy             = ((VirtualMachineID == 1) && ($(NonCondorLoadAvg) >= $(HighLoad))) || ((VirtualMachineID == 2) && ($(NonCondorLoadAvg) >= $(HighLoad)))
####CPUBusy             = ($(NonCondorLoadAvg) >= $(HighLoad))
# Whole-machine variant, in use:
CPUBusy                 = ($(TotalNonCondorLoadAvg)  >= $(HighLoad))
#
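
To see why the whole-machine CPUBusy behaves differently from the per-VM one, the arithmetic can be worked through with the load figures logged in the Debugging section. This is only an illustrative Python sketch under stated assumptions: that SystemLoad and CondorLoad in the log correspond to the per-VM LoadAvg and CondorLoadAvg, that the Total* values are the sums over both VMs, and that HighLoad is 0.5 (check the value in your own config).

```python
# Per-VM figures from the startd log lines in the Debugging section.
# Assumption: SystemLoad -> LoadAvg, CondorLoad -> CondorLoadAvg.
vms = {
    "vm1": {"LoadAvg": 1.00, "CondorLoadAvg": 0.00},
    "vm2": {"LoadAvg": 1.58, "CondorLoadAvg": 1.29},
}
HIGH_LOAD = 0.5  # assumed value of $(HighLoad); check your own config

# Assumption: the Total* attributes are the sums over all VMs.
total_load_avg = sum(v["LoadAvg"] for v in vms.values())
total_condor_load_avg = sum(v["CondorLoadAvg"] for v in vms.values())
total_non_condor_load_avg = total_load_avg - total_condor_load_avg

# Per-VM variant: CPUBusy = ($(NonCondorLoadAvg) >= $(HighLoad))
per_vm_busy = {
    name: (v["LoadAvg"] - v["CondorLoadAvg"]) >= HIGH_LOAD
    for name, v in vms.items()
}

# Whole-machine variant: CPUBusy = ($(TotalNonCondorLoadAvg) >= $(HighLoad))
total_busy = total_non_condor_load_avg >= HIGH_LOAD

for name in vms:
    print(name, "per-VM CPUBusy:", per_vm_busy[name],
          "whole-machine CPUBusy:", total_busy)
```

Under these assumptions the per-VM expression is true only for vm1, so the Condor job on vm2 would carry on, whereas the whole-machine expression is true for both VMs, which is the behaviour this configuration is after.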