Resource-based SLA examples

A host-type guarantee SLA

Hosts owned by specific departments or projects can be guaranteed to their users with SLAs. Guarantee SLAs let you configure guaranteed resources while keeping unused resources accessible to other users, within reason. This is achieved by loaning unused resources to short jobs: the longest that pending guarantee SLA jobs wait for guaranteed resources to become available is the configured loan duration.

lsb.resources configuration:
Begin GuaranteedResourcePool
NAME = Proj2Pool
TYPE = hosts
DISTRIBUTION = [productsSLA, 30%] [accountingSLA, 20]
LOAN_POLICIES = QUEUES[shortJobs] DURATION[10]
End GuaranteedResourcePool

lsb.serviceclasses configuration:
Begin ServiceClass
NAME = productsSLA
GOALS = [GUARANTEE]
ACCESS_CONTROL = FAIRSHARE_GROUPS[products]
AUTO_ATTACH = Y
End ServiceClass
 
Begin ServiceClass
NAME = accountingSLA
GOALS = [GUARANTEE]
ACCESS_CONTROL = USERS[accountingUG]
AUTO_ATTACH = Y
End ServiceClass

bresources -g output shows the configured guaranteed resource pool:

> bresources -g
                                                 GUAR    GUAR
POOL_NAME           TYPE    STATUS  TOTAL  FREE  CONFIG  USED
Proj2Pool           hosts   ok      50     50    35      0

bsla output shows the configured SLAs:

> bsla
SERVICE CLASS NAME: productsSLA
ACCESS CONTROL: FAIRSHARE_GROUPS[products/]
AUTO ATTACH: Y
 
GOAL:  GUARANTEE 
 
                  GUAR    GUAR  TOTAL
POOL NAME  TYPE   CONFIG  USED   USED
Proj2Pool  hosts  15      0      0
 
SERVICE CLASS NAME: accountingSLA
ACCESS CONTROL: USERS[accountingUG/]
AUTO ATTACH: Y
 
GOAL:  GUARANTEE 
 
                  GUAR    GUAR  TOTAL
POOL NAME  TYPE   CONFIG  USED   USED
Proj2Pool  hosts  20      0      0

Jobs submitted to fairshare queues by users in the fairshare group products are auto-attached to the guarantee SLA productsSLA, and allocated up to 30% of hosts in the resource pool. Jobs submitted by users in the accountingUG user group are auto-attached to the guarantee SLA accountingSLA, and allocated up to 20 hosts. Because the guaranteed resource pool configuration does not include a list of hosts or host groups, all available hosts are included in the resource pool.
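
The guarantee figures in the bsla output follow from the DISTRIBUTION entries: a percentage is taken of the pool total, while a plain number is an absolute count. The Python sketch below reproduces that arithmetic for the 50-host pool reported by bresources -g. The function name is hypothetical, and rounding fractional shares down is an assumption made for illustration rather than documented LSF behavior.

```python
def guaranteed_amounts(pool_total, distribution):
    """Return {service_class: guaranteed_count} for one pool.

    distribution maps a service class name to a share string, either a
    percentage of the pool ("30%") or an absolute count ("20").
    Fractional results are rounded down, which is an assumption.
    """
    amounts = {}
    for sla, share in distribution.items():
        share = share.strip()
        if share.endswith("%"):
            amounts[sla] = int(pool_total * float(share[:-1]) / 100)
        else:
            amounts[sla] = int(share)
    return amounts


# DISTRIBUTION = [productsSLA, 30%] [accountingSLA, 20] on a 50-host pool
proj2 = guaranteed_amounts(50, {"productsSLA": "30%", "accountingSLA": "20"})
print(proj2)                # {'productsSLA': 15, 'accountingSLA': 20}
print(sum(proj2.values()))  # 35
```

The two per-SLA values match the GUAR CONFIG column of bsla, and their sum matches the GUAR CONFIG column of bresources -g.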

Once each guarantee is met, jobs can run outside the guarantee based on the overall job priority. Unused resources can be borrowed by jobs from queue shortJobs with runtimes (or estimated runtimes) of 10 minutes or less.

In this example, loans could be enabled for all queues; however, restricting loans to a single queue that contains only short jobs may improve scheduling performance.
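
The loan rule for this pool can be written as a simple predicate. This is a minimal sketch of the policy as this section describes it, not of the LSF scheduler itself. The function and parameter names are hypothetical, the queue name normal is invented for the example, and treating a job with no known runtime as ineligible is an assumption.

```python
def may_borrow(job_queue, runtime_minutes, loan_queues, max_duration=None):
    """Return True if a job may borrow unused guaranteed resources.

    loan_queues: set of queue names from QUEUES[...], or None for all queues.
    max_duration: DURATION[...] in minutes, or None if no limit is set.
    runtime_minutes: run limit or estimated runtime, or None if unknown.
    """
    if loan_queues is not None and job_queue not in loan_queues:
        return False
    if max_duration is None:
        return True
    # Assumption: a job with no known runtime cannot be shown to be short.
    return runtime_minutes is not None and runtime_minutes <= max_duration


# LOAN_POLICIES = QUEUES[shortJobs] DURATION[10]
print(may_borrow("shortJobs", 8, {"shortJobs"}, 10))   # True
print(may_borrow("shortJobs", 30, {"shortJobs"}, 10))  # False
print(may_borrow("normal", 5, {"shortJobs"}, 10))      # False
```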

A slot-type guarantee SLA

In some cases fairshare distributions consider only slot usage, so slot-type guarantee SLAs can be used to guarantee resources.

lsb.resources configuration:
Begin GuaranteedResourcePool
NAME = linuxPool
TYPE = slots
HOSTS = linuxHG
DISTRIBUTION = [engSLA, 30%] [devSLA, 35%]
LOAN_POLICIES = QUEUES[all] CLOSE_ON_DEMAND
End GuaranteedResourcePool
 
Begin GuaranteedResourcePool
NAME = solarisPool
TYPE = slots
HOSTS = solarisHG
DISTRIBUTION = [engSLA, 50%] [devSLA, 20%]
LOAN_POLICIES = QUEUES[all] CLOSE_ON_DEMAND
End GuaranteedResourcePool

lsb.serviceclasses configuration:
Begin ServiceClass
NAME = engSLA
GOALS = [GUARANTEE]
ACCESS_CONTROL = FAIRSHARE_GROUPS[eng]
AUTO_ATTACH = Y
End ServiceClass
 
Begin ServiceClass
NAME = devSLA
GOALS = [GUARANTEE]
ACCESS_CONTROL = FAIRSHARE_GROUPS[dev]
AUTO_ATTACH = Y
End ServiceClass

bresources -g output shows the configured guaranteed resource pools:

> bresources -g
                                            GUAR     GUAR
POOL_NAME    TYPE   STATUS   TOTAL   FREE   CONFIG   USED
linuxPool    slots  ok       100     90     65       65
solarisPool  slots  ok       100     100    70       70

bsla output shows the configured SLAs:

> bsla
SERVICE CLASS NAME: engSLA
AUTO ATTACH: Y
ACCESS CONTROL: FAIRSHARE_GROUPS[eng/]
 
GOAL:  GUARANTEE 
 
                     GUAR    GUAR  TOTAL
POOL NAME    TYPE    CONFIG  USED   USED
linuxPool    slots   30      0      0
solarisPool  slots   50      0      0
 
SERVICE CLASS NAME: devSLA
AUTO ATTACH: Y
ACCESS CONTROL: FAIRSHARE_GROUPS[dev/]
 
GOAL:  GUARANTEE 
 
                     GUAR    GUAR  TOTAL
POOL NAME    TYPE    CONFIG  USED   USED
linuxPool    slots   35      0      0
solarisPool  slots   20      0      0

Jobs submitted by users in fairshare groups eng and dev are auto-attached to guarantee SLAs. Each guarantee SLA has a share in both the linuxPool resource pool and the solarisPool resource pool. Once the guarantees are met, additional SLA jobs can run on slots not reserved for guarantees.

Unused resources can be borrowed by jobs of any length from any queue, so long as there is no pending load on the guarantees.
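
The effect of CLOSE_ON_DEMAND in the configuration above can be illustrated with a small check. This is a sketch of the behavior as described in this section, not of LSF internals: the function name is hypothetical, and interpreting "pending load on the guarantees" as an SLA that has pending jobs while still below its configured guarantee is an assumption.

```python
def loans_open(slas, close_on_demand):
    """Return True if unused guaranteed slots may currently be lent out.

    slas: iterable of (guar_config, guar_used, pending_jobs) tuples, one per
    service class that has a share in the pool.
    """
    if not close_on_demand:
        return True
    # Assumption: "pending load" means an SLA has pending jobs while it is
    # still below its configured guarantee.
    return not any(pending > 0 and used < config
                   for config, used, pending in slas)


# linuxPool: engSLA is guaranteed 30 slots, devSLA 35 slots.
print(loans_open([(30, 0, 0), (35, 0, 0)], True))    # True: nothing pending
print(loans_open([(30, 10, 4), (35, 0, 0)], True))   # False: engSLA has demand
print(loans_open([(30, 30, 4), (35, 0, 0)], True))   # True: guarantee already met
print(loans_open([(30, 10, 4), (35, 0, 0)], False))  # True: policy not set
```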

Large memory jobs in a guarantee SLA

Clusters running large memory jobs can use guaranteed resource pools to limit the number of large memory jobs running on each host. When a slot-type resource pool is limited to one slot on each host, guarantee SLA jobs using the pool run on a single slot per host, which spreads out memory consumption and improves performance.

lsb.resources configuration:
Begin GuaranteedResourcePool
NAME = bigMemPool
TYPE = slots
HOSTS = bigMemHG
SLOTS_PER_HOST = 1
DISTRIBUTION = [bigMemSLA, 50%]
LOAN_POLICIES = QUEUES[all]
End GuaranteedResourcePool

lsb.serviceclasses configuration:
Begin ServiceClass
NAME = bigMemSLA
GOALS = [GUARANTEE]
ACCESS_CONTROL = QUEUES[bigMem]
AUTO_ATTACH = Y
End ServiceClass

bresources -g output shows the configured guaranteed resource pool:

> bresources -g
                                            GUAR     GUAR
POOL_NAME    TYPE   STATUS   TOTAL   FREE   CONFIG   USED
bigMemPool   slots  ok       100     90     50       50

bsla output shows the configured SLAs:

> bsla
SERVICE CLASS NAME: bigMemSLA
ACCESS CONTROL: QUEUES[bigMem]
AUTO ATTACH: Y
 
GOAL:  GUARANTEE 
 
                     GUAR    GUAR  TOTAL
POOL NAME    TYPE    CONFIG  USED   USED
bigMemPool   slots   50      0      0

Jobs submitted to queue bigMem are auto-attached to the guarantee SLA bigMemSLA, and allocated a single slot on one of the hosts in the host group bigMemHG (as configured in the guaranteed resource pool bigMemPool). Unused slots can be borrowed by jobs of any length from any queue.
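
To see how SLOTS_PER_HOST = 1 shapes the pool, the sketch below counts the slots a pool contributes when each host is capped. The host names and the 8 slots per host are hypothetical; only the 100-slot pool total and the 50% share come from the example above. Modeling the setting as a per-host cap is an assumption made for illustration.

```python
def pool_slots(host_slots, slots_per_host=None):
    """Return the number of slots a slot-type pool contributes.

    host_slots: {host_name: slots configured on that host}
    slots_per_host: SLOTS_PER_HOST cap, or None to use every slot.
    """
    if slots_per_host is None:
        return sum(host_slots.values())
    return sum(min(n, slots_per_host) for n in host_slots.values())


# Hypothetical host group: 100 hosts with 8 slots each.
big_mem_hg = {f"bigmem{i:03d}": 8 for i in range(100)}

total = pool_slots(big_mem_hg, slots_per_host=1)
print(total)              # 100 slots in the pool, one per host
print(total * 50 // 100)  # 50 slots guaranteed to bigMemSLA
```

With the cap in place the pool size equals the number of hosts, so each guaranteed slot necessarily lands on a different host.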