Fix UNKNOWN or DEFAULT Matched Models and Matched Types

Fixing UNKNOWN Matched Type or Matched Model

A Matched Model or Matched Type of UNKNOWN indicates that the host, or the load information manager (lim) on the host, is down. Take immediate action.

  1. Start the host.
  2. With root (UNIX) or administrator (Windows) permission, run egosh ego start host_name to start the load information manager (lim) on the host.

    You can specify more than one host name to start up the lim on multiple hosts. If you do not specify a host name, the lim is started up on the host from which the command is submitted.

    You must be a cluster administrator to run this command.

    On UNIX, to start up the lim remotely, you must be root or listed in ego.sudoers and be able to run the rsh command across all hosts without entering a password.

  3. Wait a few seconds, then run egosh resource view [resource_name …].
    You should now see either a matched model or type for the host, or the result DEFAULT. DEFAULT means that automatic detection of the host type or model failed and the host type configured in ego.shared could not be found. EGO still works on the host, but there are disadvantages:
    • A DEFAULT Matched Type may cause binary incompatibility, because a job from a host of DEFAULT type can be migrated to a host of another type.

    • A DEFAULT Matched Model may be inefficient because of incorrect CPU factors.
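The steps above can be scripted. The following is a minimal Python sketch, not an official tool: it builds the egosh ego start command line for zero or more hosts, waits, and returns the raw output of egosh resource view. The function names, the 10-second wait, and the action strings are illustrative assumptions; only the egosh commands themselves come from the steps above. The layout of the egosh resource view output is not parsed here because it is not described in this section.

```python
import subprocess
import time


def start_lim_command(host_names=()):
    """Build the `egosh ego start` command line for zero or more hosts.

    With no host names, the lim is started on the host where the
    command is submitted.
    """
    return ["egosh", "ego", "start", *host_names]


def recommended_action(matched_value):
    """Map a Matched Type or Matched Model value to the follow-up step.

    The returned strings are illustrative summaries, not EGO output.
    """
    value = matched_value.strip()
    if value == "UNKNOWN":
        return "start the host and its lim"
    if value == "DEFAULT":
        return "run lim -t and update ego.shared"
    return "no action needed"


def start_lim_and_view(host_names=(), wait_seconds=10):
    """Start the lim, pause, then return the raw `egosh resource view` output.

    Assumes egosh is on PATH and the caller is a cluster administrator.
    wait_seconds is an arbitrary illustrative delay.
    """
    subprocess.run(start_lim_command(host_names), check=True)
    time.sleep(wait_seconds)
    result = subprocess.run(
        ["egosh", "resource", "view"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, start_lim_command(["hostA", "hostB"]) yields the argument list for starting the lim on two hosts (hostA and hostB are placeholder names), and recommended_action("DEFAULT") points to the DEFAULT procedure.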

Fixing DEFAULT Matched Type or Matched Model

If automatic detection of host type or model fails, and the host type configured in ego.shared cannot be found, then Matched Type is set to DEFAULT. A Matched Type reported as DEFAULT may contribute to binary incompatibilities; a Matched Model reported as DEFAULT may be inefficient because of an incorrect CPU factor. To correct this, run lim -t to detect the real type or model for the host, and then update ego.shared accordingly.

  1. Run lim -t on the host whose type is DEFAULT.
  2. Edit ego.shared.
    1. In the HostType section, enter a new host type. Use the host type name detected with lim -t.
    2. In the HostModel section, add the new model with its architecture and CPU factor. Add the host model to the end of the host model list. The HostModel section can hold at most 127 entries; lines commented out with # do not count toward this limit.

      Use the architecture detected with lim -t.

  3. Save changes to ego.shared.
  4. Run egosh ego restart on the master host.
  5. Wait a few seconds, then run egosh resource view [resource_name …] to check the type or model for a host.
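Before adding a model in step 2, it can help to check how close the HostModel section is to the 127-entry limit. The following Python sketch counts active entries and skips lines commented out with #. It rests on assumptions that this section does not confirm: that the section is delimited by Begin HostModel and End HostModel lines, and that the first active line inside it is a column header that does not count toward the limit. The function names are hypothetical.

```python
MAX_HOST_MODELS = 127  # limit on host model entries stated in step 2


def count_host_model_entries(ego_shared_text):
    """Count active HostModel entries, skipping blank and # comment lines.

    Assumes `Begin HostModel` / `End HostModel` delimiters and that the
    first active line in the section is a column header (not counted).
    """
    in_section = False
    header_seen = False
    count = 0
    for raw_line in ego_shared_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # commented-out lines do not count toward the limit
        first_two = [token.lower() for token in line.split()[:2]]
        if first_two == ["begin", "hostmodel"]:
            in_section, header_seen = True, False
        elif first_two == ["end", "hostmodel"]:
            in_section = False
        elif in_section:
            if not header_seen:
                header_seen = True  # skip the assumed column header line
            else:
                count += 1
    return count


def can_add_host_model(ego_shared_text):
    """Return True if one more host model entry fits under the limit."""
    return count_host_model_entries(ego_shared_text) < MAX_HOST_MODELS
```

To use it, read ego.shared into a string and pass it to can_add_host_model. If the result is False, comment out or remove an unused model before adding the new one.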