Adding Resources
Contents
- About Configured Resources
- Add New Resources to Your Cluster
- Static Shared Resource Reservation
- External Load Indices
- Modifying a Built-In Load Index
About Configured Resources
LSF schedules jobs based on available resources. There are many resources built into LSF, but you can also add your own resources, and then use them in the same way as built-in resources.
For maximum flexibility, you should characterize your resources clearly enough so that users have satisfactory choices. For example, if some of your machines are connected to both Ethernet and FDDI, while others are only connected to Ethernet, then you probably want to define a resource called fddi and associate the fddi resource with machines connected to FDDI. This way, users can specify resource fddi if they want their jobs to run on machines connected to FDDI.
Add New Resources to Your Cluster
- Log in to any host in the cluster as the LSF administrator.
- Define new resources in the Resource section of lsf.shared. Specify at least a name and a brief description, which will be displayed to a user by lsinfo.
See Configuring lsf.shared Resource Section.
- For static Boolean resources and static or dynamic string resources, for all hosts that have the new resources, add the resource name to the RESOURCES column in the Host section of lsf.cluster.cluster_name.
See Configuring lsf.cluster.cluster_name Host Section.
- For shared resources, for all hosts that have the new resources, associate the resources with the hosts (you might also have a reason to configure non-shared resources in this section).
See Configuring lsf.cluster.cluster_name ResourceMap Section.
- Reconfigure your cluster.
Configuring lsf.shared Resource Section
Configured resources are defined in the Resource section of lsf.shared. There is no distinction between shared and non-shared resources.
You must specify at least a name and description for the resource, using the keywords RESOURCENAME and DESCRIPTION.
- A resource name cannot begin with a number.
- A resource name cannot contain any of the following characters:
: . ( ) [ + - * / ! & | < > @ =
- A resource name cannot be any of the following reserved keywords:
cpu cpuf io logins ls idle maxmem maxswp maxtmp type model status it mem ncpus nprocs ncores nthreads define_ncpus_cores define_ncpus_procs define_ncpus_threads ndisks pg r15m r15s r1m swap swp tmp ut
- To avoid conflict with inf and nan keywords in 3rd-party libraries, resource names should not begin with inf or nan (upper case or lower case). Resource requirement strings such as -R "infra" or -R "nano" will cause an error. Use -R "defined(infxx)" or -R "defined(nanxx)" to specify these resource names.
- Resource names are case sensitive.
- Resource names can be up to 39 characters in length.
You can also specify:
- The resource type (TYPE = Boolean | String | Numeric). The default is Boolean.
- For dynamic resources, the update interval (INTERVAL, in seconds)
- For numeric resources, where a higher value indicates greater load (INCREASING = Y)
- For numeric shared resources, where LSF releases the resource when a job using the resource is suspended (RELEASE = Y)
When the optional attributes are not specified, the resource is treated as static and Boolean.
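The naming rules for RESOURCENAME listed in this section can be checked before editing lsf.shared. The following is a minimal sketch, not part of LSF: the function name check_resource_name and the message strings are hypothetical, and the check covers only the rules stated above (LSF itself may enforce others).

```python
# Characters that the rules above forbid in a resource name.
FORBIDDEN_CHARS = set(":.()[+-*/!&|<>@=")

# Reserved keywords listed above; names are case sensitive, so the match is exact.
RESERVED = set(
    "cpu cpuf io logins ls idle maxmem maxswp maxtmp type model status it mem "
    "ncpus nprocs ncores nthreads define_ncpus_cores define_ncpus_procs "
    "define_ncpus_threads ndisks pg r15m r15s r1m swap swp tmp ut".split()
)

MAX_LEN = 39


def check_resource_name(name):
    """Return a list of rule violations; an empty list means none were found."""
    if not name:
        return ["is empty"]
    problems = []
    if name[0].isdigit():
        problems.append("begins with a number")
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append("contains forbidden characters: " + " ".join(bad))
    if name in RESERVED:
        problems.append("is a reserved keyword")
    # Advisory rule: inf/nan prefixes (any case) conflict with 3rd-party libraries.
    if name.lower().startswith(("inf", "nan")):
        problems.append("begins with inf or nan")
    if len(name) > MAX_LEN:
        problems.append("is longer than 39 characters")
    return problems


if __name__ == "__main__":
    for candidate in ("fddi", "2fast", "mem", "infra", "a-b"):
        print(candidate, check_resource_name(candidate))
```

The inf/nan check is reported like the other rules here, although the text above phrases it as a recommendation rather than a hard restriction.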
Defining consumable resources
Specify resources as consumable in the CONSUMABLE column of the RESOURCE section of lsf.shared to explicitly control if a resource is consumable. Static and dynamic numeric resources can be specified as consumable. CONSUMABLE is optional. The defaults for the consumable attribute are:
- Built-in indices:
  - The following are consumable: r15s, r1m, r15m, ut, pg, io, ls, it, tmp, swp, mem.
  - All other built-in static resources are not consumable (for example, ncpus, ndisks, maxmem, maxswp, maxtmp, cpuf, type, model, status, rexpri, server, hname).
- External shared resources:
  - All numeric resources are consumable.
  - String and boolean resources are not consumable.
You should only specify consumable resources in the rusage section of a resource requirement string. Non-consumable resources are ignored in rusage sections.
A non-consumable resource should not be releasable. Non-consumable numeric resources can be used in the order, select, and same sections of a resource requirement string.
When LSF_STRICT_RESREQ=Y in lsf.conf, LSF rejects resource requirement strings where an rusage section contains a non-consumable resource.
Viewing consumable resources
Use lsinfo -l to view consumable resources. For example:
lsinfo -l switch
RESOURCE_NAME:  switch
DESCRIPTION:  Network Switch
TYPE     ORDER  INTERVAL  BUILTIN  DYNAMIC  RELEASE  CONSUMABLE
Numeric  Inc    0         No       No       No       No
lsinfo -l specman
RESOURCE_NAME:  specman
DESCRIPTION:  Specman
TYPE     ORDER  INTERVAL  BUILTIN  DYNAMIC  RELEASE  CONSUMABLE
Numeric  Dec    0         No       No       Yes      Yes
Example
Begin Resource
RESOURCENAME  TYPE     INTERVAL  INCREASING  CONSUMABLE  DESCRIPTION  # Keywords
patchrev      Numeric  ()        Y           ()          (Patch revision)
specman       Numeric  ()        N           ()          (Specman)
switch        Numeric  ()        Y           N           (Network Switch)
rack          String   ()        ()          ()          (Server room rack)
owner         String   ()        ()          ()          (Owner of the host)
elimres       Numeric  10        Y           ()          (elim generated index)
End Resource
Resources required for JSDL
The following resources are pre-defined to support the submission of jobs using JSDL files.
Begin Resource
RESOURCENAME  TYPE     INTERVAL  INCREASING  DESCRIPTION
osname        String   600       ()          (OperatingSystemName)
osver         String   600       ()          (OperatingSystemVersion)
cpuarch       String   600       ()          (CPUArchitectureName)
cpuspeed      Numeric  60        Y           (IndividualCPUSpeed)
bandwidth     Numeric  60        Y           (IndividualNetworkBandwidth)
End Resource
Configuring lsf.cluster.cluster_name Host Section
The Host section is the only required section in lsf.cluster.cluster_name. It lists all the hosts in the cluster and gives configuration information for each host.
Define the resource names as strings in the Resource section of lsf.shared. You may list any number of resources, enclosed in parentheses and separated by blanks or tabs.
If you need to define shared resources across hosts, you must use the ResourceMap section.
String resources cannot contain spaces. Static numeric and string resources use the following syntax:
resource_name=resource_value
The resource_value must be alphanumeric.
For dynamic numeric and string resources, use resource_name directly.
If resources are defined in both the resource column of the Host section and the ResourceMap section, the definition in the resource column takes effect.
Example
Begin Host
HOSTNAME  model  type  server  r1m  mem  swp  RESOURCES  #Keywords
hostA     !      !     1       3.5  ()   ()   (mg elimres patchrev=3 owner=user1)
hostB     !      !     1       3.5  ()   ()   (specman=5 switch=1 owner=test)
hostC     !      !     1       3.5  ()   ()   (switch=2 rack=rack2_2_3 owner=test)
hostD     !      !     1       3.5  ()   ()   (switch=1 rack=rack2_2_3 owner=test)
End Host
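To make the RESOURCES column syntax in the example above concrete, the sketch below splits one entry into names listed without a value (static Boolean or dynamic resources) and resource_name=resource_value pairs (static numeric and string resources). It is an illustration only: parse_resources_column is a hypothetical helper, not an LSF tool, and it assumes the entry is a single parenthesized, whitespace-separated list as described in this section.

```python
def parse_resources_column(column):
    """Split a RESOURCES entry such as "(mg patchrev=3)" into
    (names listed without a value, {name: value} for name=value pairs)."""
    text = column.strip()
    # Drop the enclosing parentheses if present.
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    bare, valued = [], {}
    for token in text.split():  # resources are separated by blanks or tabs
        name, sep, value = token.partition("=")
        if sep:
            valued[name] = value  # values are kept as strings
        else:
            bare.append(name)
    return bare, valued


if __name__ == "__main__":
    print(parse_resources_column("(mg elimres patchrev=3 owner=user1)"))
```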
Configuring lsf.cluster.cluster_name ResourceMap Section
Resources are associated with the hosts for which they are defined in the ResourceMap section of lsf.cluster.cluster_name.
For each resource, you must specify the name and the hosts that have it.
If the ResourceMap section is not defined, then any dynamic resources specified in lsf.shared are not tied to specific hosts, but are shared across all hosts in the cluster.
Example
A cluster consists of hosts host1, host2, and host3.
Begin ResourceMap
RESOURCENAME  LOCATION
verilog       (5@[all ~host1 ~host2])
synopsys      (2@[host1 host2] 2@[others])
console       (1@[host1] 1@[host2] 1@[host3])
xyz           (1@[default])
End ResourceMap
In this example:
- 5 units of the verilog resource are defined on host3 only (all hosts except host1 and host2).
- 2 units of the synopsys resource are shared between host1 and host2. 2 more units of the synopsys resource are defined on host3 (shared among all the remaining hosts in the cluster).
- 1 unit of the console resource is defined on each host in the cluster (assigned explicitly). 1 unit of the xyz resource is defined on each host in the cluster (assigned with the keyword default).
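The interpretation of the example can be reproduced with a small sketch that expands a LOCATION string into resource instances for the three-host cluster. This is an assumption-laden illustration, not how LSF parses the file: expand_location is a hypothetical helper, it handles only the value@[...] form used for static resources, and it does not handle host groups or the form without parentheses and values.

```python
import re

# One instance looks like value@[host list or keyword].
INSTANCE = re.compile(r"([^\s()@\[\]]+)@\[([^\]]*)\]")


def hosts_for_all(tokens, cluster_hosts):
    """Resolve "all ~hostX ..." to every cluster host that is not excluded."""
    excluded = {t[1:] for t in tokens[1:] if t.startswith("~")}
    return [h for h in cluster_hosts if h not in excluded]


def expand_location(location, cluster_hosts):
    """Return a list of (value, hosts sharing that instance) pairs."""
    parsed = [(value, body.split()) for value, body in INSTANCE.findall(location)]

    # First pass: collect hosts named by explicit lists or by "all",
    # so that "others" can be resolved to the remaining hosts.
    listed = set()
    for _, tokens in parsed:
        if tokens and tokens[0] == "all":
            listed.update(hosts_for_all(tokens, cluster_hosts))
        elif tokens not in (["others"], ["default"]):
            listed.update(tokens)

    instances = []
    for value, tokens in parsed:
        if tokens == ["default"]:
            # "default" means each host individually: one instance per host.
            instances.extend((value, [h]) for h in cluster_hosts)
        elif tokens == ["others"]:
            instances.append((value, [h for h in cluster_hosts if h not in listed]))
        elif tokens and tokens[0] == "all":
            instances.append((value, hosts_for_all(tokens, cluster_hosts)))
        else:
            instances.append((value, tokens))
    return instances


if __name__ == "__main__":
    cluster = ["host1", "host2", "host3"]
    for loc in ("(5@[all ~host1 ~host2])", "(2@[host1 host2] 2@[others])",
                "(1@[host1] 1@[host2] 1@[host3])", "(1@[default])"):
        print(loc, "->", expand_location(loc, cluster))
```

Each returned pair is one instance: the hosts in the list share the stated quantity, which matches the distinction between a shared instance (synopsys on host1 and host2) and per-host instances (console, xyz).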
restriction:
For Solaris machines, the keyword int is reserved.
Resources required for JSDL
If you plan to submit jobs using JSDL files, you must uncomment the following lines:
RESOURCENAME  LOCATION
osname        [default]
osver         [default]
cpuarch       [default]
cpuspeed      [default]
bandwidth     [default]
RESOURCENAME
The name of the resource, as defined in lsf.shared.
LOCATION
Defines the hosts that share the resource. For a static resource, you must define an initial value here as well. Do not define a value for a dynamic resource.
Possible states of a resource:
- Each host in the cluster has the resource
- The resource is shared by all hosts in the cluster
- There are multiple instances of a resource within the cluster, and each instance is shared by a unique subset of hosts.
Syntax
([resource_value@][host_name ... | all [~host_name] ... | others | default] ...)
- For resource_value, square brackets are not valid.
- For static resources, you must include the resource value, which indicates the quantity of the resource. Do not specify the resource value for dynamic resources because information about dynamic resources is updated by ELIM.
- Type square brackets around the list of hosts, as shown. You can omit the parentheses if you only specify one set of hosts.
- Each set of hosts within square brackets specifies an instance of the resource. The same host cannot be in more than one instance of a resource. All hosts within the instance share the quantity of the resource indicated by its value.
- The keyword all refers to all the server hosts in the cluster, collectively. Use the not operator (~) to exclude hosts or host groups.
- The keyword others refers to all hosts not otherwise listed in the instance.
- The keyword default refers to each host in the cluster, individually.
Non-batch configuration
The following items should be taken into consideration when configuring resources.
- In lsf.cluster.cluster_name, the Host section must precede the ResourceMap section, since the ResourceMap section uses the host names defined in the Host section.
- Use the RESOURCES column in the Host section of the lsf.cluster.cluster_name file to associate static Boolean resources with particular hosts.
- Most resources specified in the ResourceMap section are interpreted by LSF commands as shared resources, which are displayed using lsload -s or lshosts -s. The exceptions are:
  - Non-shared static resources
  - Dynamic numeric resources specified using the default keyword. These are host-based resources and behave like the built-in load indices such as mem and swp. They are viewed using lsload -l or lsload -I.
Static Shared Resource Reservation
You must use resource reservation to prevent over-committing static shared resources when scheduling.
The usual situation is that you configure single-user application licenses as static shared resources, and make that resource one of the job requirements. You should also reserve the resource for the duration of the job. Otherwise, LSF updates resource information, assumes that all the static shared resources can be used, and places another job that requires that license. The additional job cannot actually run if the license is already taken by a running job.
If every job that requests a license also reserves it, LSF updates the number of licenses at the start of each new dispatch turn, subtracts the number of licenses that are reserved, and only dispatches additional jobs if there are licenses available that are not already in use.
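The accounting described above can be illustrated with a toy model. This is only a sketch of the arithmetic (configured quantity minus reserved units), not LSF's scheduler: the function dispatch_turn, its arguments, and the job names are all hypothetical.

```python
def dispatch_turn(total_licenses, running_reservations, pending_requests):
    """Return the names of pending jobs that can be dispatched this turn.

    total_licenses: configured quantity of the static shared resource
    running_reservations: {job_name: units reserved by that running job}
    pending_requests: list of (job_name, units requested), in dispatch order
    """
    # Start from the configured total and subtract what running jobs reserved.
    available = total_licenses - sum(running_reservations.values())
    dispatched = []
    for job, units in pending_requests:
        if units <= available:
            dispatched.append(job)
            available -= units  # the newly dispatched job reserves its units
        # Jobs that do not fit are skipped in this toy model.
    return dispatched


if __name__ == "__main__":
    # One of two licenses is reserved by a running job: only one more job fits.
    print(dispatch_turn(2, {"jobA": 1}, [("jobB", 1), ("jobC", 1)]))
    # The running job did not reserve its license, so the model over-commits.
    print(dispatch_turn(1, {}, [("jobB", 1)]))
```

The second call shows the failure mode the text warns about: with no reservation recorded, the single license still appears free and another job is placed.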
Reserving a static shared resource
To indicate that a shared resource is to be reserved while a job is running, specify the resource name in the rusage section of the resource requirement string.
Example
You configured licenses for the Verilog application as a resource called verilog_lic. To submit a job that will run on a host when there is a license available:
bsub -R "select[defined(verilog_lic)] rusage[verilog_lic=1]" myjob
If the job can be placed, the license it uses will be reserved until the job completes.
External Load Indices
If you have specific workload or resource requirements at your site, the LSF administrator can define external resources. You can use both built-in and external resources for LSF job scheduling and host selection.
External load indices report the values of dynamic external resources. A dynamic external resource is a site-specific resource with a numeric value that changes over time, such as the space available in a directory. Use the external load indices feature to make the values of dynamic external resources available to LSF, or to override the values reported for an LSF built-in load index. For detailed information about the external load indices feature, see the Platform LSF Configuration Reference.
Modifying a Built-In Load Index
An elim executable can be used to override the value of a built-in load index. For example, if your site stores temporary files in the /usr/tmp directory, you might want to monitor the amount of space available in that directory. An elim can report the space available in the /usr/tmp directory as the value for the tmp built-in load index. For detailed information about how to use an elim to override a built-in load index, see the Platform LSF Configuration Reference.
Platform Computing Inc.
www.platform.com |