Understanding resource groups

Resource groups overview

Resource groups organize a heterogeneous resource pool.

Resource groups are logical groups of hosts. They provide a simple way of organizing resources (hosts) for convenience: instead of creating policies for individual resources, you can create one policy and apply it to an entire group. A group can contain resources that satisfy a specific static requirement (OS, memory, swap space, CPU factor, and so on), or resources that are explicitly listed by name.

The cluster administrator can define multiple resource groups, assign them to consumers, and configure a distinct resource plan for each group. For example:

  • Define multiple resource groups: A major benefit of defining resource groups is the flexibility to group your resources based on attributes that you specify. For example, if you run workload units or applications that need a Linux OS and at least 1000 MB of memory, you can create a resource group that includes only resources meeting those requirements.

    Note:

    Hosts should not overlap between resource groups. Resource groups are used to plan resource distribution in your resource plan; overlapping groups cause hosts to be counted two or more times in the plan, resulting in recurring under-allocation of some consumers.

  • Configure a resource plan based on individual resource groups: Tailoring the resource plan to each resource group takes several steps: add the resource group to each desired top-level consumer (making it available to sub-consumers within that branch), configure ownership, enable lending and borrowing, specify share limits and the share ratio, and assign each consumer a rank within the resource plan.

Resource groups are specified either by host name or by resource requirement, using a select string.
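
For example (illustrative only; resource attribute names vary by cluster), a requirement-based group for the Linux hosts described above might use a select string such as:

    select(LINUX86 && maxmem >= 1000)

A name-based group, by contrast, simply lists its member hosts (hostA, hostB, and so on).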

Tip:

You can use the egoconfig addresourceattr command to add a custom tag to a host, and then specify that tag when creating a resource group. See the reference for more information.
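
For example, the following sketch tags a big-memory host and then selects tagged hosts when defining a group. The tag name bigmem is hypothetical, and the exact egoconfig argument syntax may differ by version, so check the reference:

    egoconfig addresourceattr bigmem

You could then define a resource group with the requirement select(bigmem).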

By default, EGO comes configured with three resource groups: InternalResourceGroup, ManagementHosts, and ComputeHosts.

InternalResourceGroup and ManagementHosts should be left untouched, but ComputeHosts can be kept, modified, or deleted as required.

Note:

To support resource-aware allocation policies (where only resources that meet specified requirements are allocated to a consumer), group resources with common features and configure them as dedicated resource groups with their own resource plans.

Notes on setting CPU slots

When you create a resource group in the Platform Management Console, you must choose whether to set the CPU slots in your resource group to 1 slot per CPU or x slots per host.

  • Choose 1 slot per CPU if the workload is expected to contain long-running, compute-intensive workload units.

    Dedicating a CPU to this type of workload unit makes best use of each CPU's resources and contributes to better performance.

  • Choose x slots per host if the workload is expected to contain short-running, I/O-intensive workload units, or if workload units are not expected to be compute intensive.

    For example, choose this setting if you want more slots than CPUs for workload units that each require less than one CPU to run. Setting x slots per host yields multiple slots per CPU when x is greater than the number of CPUs on the host. If x is less than the number of CPUs on the host (for example, 1 slot per host on a host with many CPUs), throughput suffers because only one CPU is used while the others sit idle.
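
As a rough illustration, assume a host with 4 CPUs:

    1 slot per CPU    -> 4 slots; one compute-intensive unit per CPU
    8 slots per host  -> 2 slots per CPU; suits short, I/O-intensive units
    1 slot per host   -> 1 slot in total; 3 CPUs sit idle while it runs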

Note the following:

  • There is a 1-to-1 mapping between small workload units (for example, a job, a task, or a VM) and slots.

  • If an individual host has its own "slots per host" value, that value overrides the x slots per host setting you configured for the resource group: the host-level setting always takes precedence over the group-level configuration.

    For example, if a resource group contains 10 hosts and you set 5 slots per host in the Console, you would normally expect to see 50 slots (10 x 5) listed in the Member Hosts Summary section of the resource group's properties page. If a different number shows in the summary (for example, 45), an administrator has manually overridden the setting for one or more hosts (for instance, one host set to 0 slots: 9 x 5 + 0 = 45). The individual host value overrides the group setting configured in the Console.

    In some cases, even if an administrator has not manually changed the "slots per host" value, you may still see an unexpected number in the Member Hosts Summary section. This can mean that certain hosts in the resource group are double-allocated, that is, allocated to more than one resource group. In that case, the Member Hosts Summary section displays the sum of the allocated slots across groups, not the number of slots for this resource group alone. Avoid allocating a host to more than one resource group.

  • To change the number of slots per CPU, specify it on the workload management side (outside of EGO).

  • The value for the number of CPUs per host is automatically detected during installation.

Built-in InternalResourceGroup and ManagementHosts groups

The InternalResourceGroup and ManagementHosts groups should never be deleted. They are special resource groups that contain hosts used for system services.

When you build the cluster, you use egoconfig mghost to declare which hosts you want to use as management hosts instead of compute hosts. Among other things, the command defines the mg tag on the host. Technically, the ManagementHosts group consists of all hosts that have mg defined.
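
For example (the shared directory path is illustrative; see the egoconfig reference for the exact syntax), running the following on a host tags it with mg and makes it a management host:

    egoconfig mghost /share/ego

Conceptually, the ManagementHosts group then behaves as if it were defined by the requirement select(mg).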

The built-in ClusterServices and ManagementServices consumers are configured to use only the InternalResourceGroup and ManagementHosts groups. Do not change this configuration.

Distributing management hosts

Each Symphony application's SSM should run on a management host. When you configure the resource plan, distribute a slot from the ManagementHosts resource group to every consumer. In the application profile, specify the ManagementHosts resource group for SSM.
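
A minimal sketch of the corresponding application profile fragment appears below. The element and attribute names shown are assumptions that vary by Symphony version, so check your application profile reference:

    <SSM resourceGroupName="ManagementHosts" ... >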

Lost-and-found resource groups

When host slots are allocated to a client, the vemkd detects the resource group to which each host belongs. When the vemkd restarts, there is a brief interval (while host information is updated) during which it may not immediately detect a host's resource group. During this interval, Platform EGO creates a temporary resource group called "LOST_AND_FOUND", and the vemkd adds to it any host with a current allocation whose assigned group it cannot immediately detect. Once the vemkd completes its update of host information and detects the host's assigned resource group, the host automatically rejoins it.

Note:

This only happens if the host is already allocated and the vemkd must trace its resource group. If the host does not currently belong to an allocation, the vemkd does not search for its resource group.

Similarly, if a host with allocated slots is permanently removed from its resource group (so it can never rejoin its original group when the vemkd restarts), the vemkd adds the host to the "LOST_AND_FOUND" group. The host remains there until the cluster administrator frees the allocation on it.

Heterogeneous compute hosts

If resources are not homogeneous, and some consumers should not have access to certain resources, create multiple groups in place of the ComputeHosts resource group.

Remove the ComputeHosts group so that group membership does not overlap. This also prevents you from accidentally configuring consumers to use that group when you set up the resource plan.

For a consumer that should not use any hosts from a particular group, configure the consumer properties and disable the resource group.
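
For example (the group names and select strings are illustrative), you might replace ComputeHosts with a LinuxHosts group defined by select(LINUX86) and a WindowsHosts group defined by select(NTX86), and then disable WindowsHosts in the properties of any consumer whose workload must run only on Linux.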

Customizing host slots in a resource group

By default, a host has one slot per CPU. This limits how much work can be started on the host.

For any resource group, you can configure a fixed number of slots per host, so that every host in the group has the same number of slots. That number can be more or fewer than the number of CPUs on a host.
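
For example, a group of 4 dual-CPU hosts configured with 6 slots per host exposes 4 x 6 = 24 slots in the resource plan, even though the group has only 8 CPUs; configuring 1 slot per host would expose only 4 slots.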