IBM Books

Administering Satellites Guide and Reference


Identifying and Fixing a Failed Satellite

The sections that follow describe how to use the Satellite Administration Center to identify and fix problems on failed satellites. For information about how to use the windows and notebooks of the Satellite Administration Center, and for information about the different views and icons that are available, refer to the online help that is available from the Satellite Administration Center.

The following sections provide information on:

  1. Identifying the Failed Satellite
  2. Obtaining Information About the Failure
  3. Assigning a Fix Batch to the Satellite
  4. Debugging the Fix Batch
  5. Returning the Repaired Satellite to Production

Identifying the Failed Satellite

The Satellite Administration Center uses roll-up views, which provide easy access to a quick, high-level snapshot of a group or of an application version. The roll-up views display information about any failures that occur. That is, if a satellite reports a failure when it uploads its status during a synchronization session, you do not need to have the satellite details open to find out that a failure occurred. Instead, the icons that are displayed in the object tree (the left-hand side of the Satellite Administration Center) indicate whether an error has occurred.

If you want to view basic information about the icons that are used in the Satellite Administration Center, use the Show/Hide Legend icon on the tool bar to open the Legend window. For information about the different possible states of these icons, refer to the legend, which is available from the Satellite Administration Center help.

The following example describes how to identify a production satellite that reports an error, and what the error is. To find the failed production satellite, and the error that it reported:

  1. Expand the Groups folder in the object tree.

    The Group folder for the group that contains the failed satellite will have a red "X" superimposed over it to indicate that at least one production satellite reported an error while executing the group batches of a specific application version.

  2. Select the Group folder.

    So that you can easily identify when a production satellite reports an error, the icons that represent both the Satellites folder and the Application Version folder associated with the group contain red within the folder.
    Note: The Application Version folder associated with the group only has red within the folder when one or more production satellites report an error while executing a group batch. The Satellites icon associated with the group has red within the folder if any satellite, either test or production, reports an error.

  3. Within the Group folder:
    1. Click the Application Version folder to determine which application version was being executed by the production satellite (or satellites) that reported an error. The application version details view opens in the contents pane (the right-hand side of the display) of the Satellite Administration Center. For information about this view, refer to the online help that is available from the Satellite Administration Center.

      The icon representing the application version that was being executed by a production satellite that reported an error has a red "X" superimposed over it, as follows:
      [Icon image: satapppd]

    2. Click the Satellites folder to determine which production satellite (or satellites) reported an error. The satellite details view opens. For information about this view, refer to the online help that is available from the Satellite Administration Center.

      The icon that represents the production satellite that reported the error is as follows:
      [Icon image: satsatpd]

      The X over the icon indicates that the production satellite has failed when executing group batches. The gradation on the sphere in the icon indicates that the satellite is disabled.
      Note: If a test satellite reports an error, it is also disabled from executing group batches.
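
The roll-up rules described above can be summarized in a small sketch. This is only an illustrative model, not part of the Satellite Administration Center: the `Satellite` class, its fields, and the `rollup` function are hypothetical names chosen for this example. It assumes that each satellite is known to be either a test or a production satellite and that the error was reported while executing a group batch.

```python
from dataclasses import dataclass


@dataclass
class Satellite:
    """Hypothetical record for one satellite in a group (illustration only)."""
    name: str
    is_production: bool   # False means a test satellite
    reported_error: bool  # True if the uploaded status shows a failure
    app_version: str      # application version the satellite was executing


def rollup(satellites):
    """Return which icons for one group would indicate a failure."""
    failed_production = [
        s for s in satellites if s.is_production and s.reported_error
    ]
    return {
        # Red "X" on the Group folder: at least one production satellite failed.
        "group_red_x": bool(failed_production),
        # Satellites folder: any satellite, test or production, failed.
        "satellites_folder_red": any(s.reported_error for s in satellites),
        # Application Version folder: only production satellites count.
        "app_version_folder_red": bool(failed_production),
        # Application versions whose icon would carry the red "X".
        "failed_app_versions": sorted({s.app_version for s in failed_production}),
    }
```

In this model, a group in which only a test satellite has failed shows red in the Satellites folder but no red "X" on the Group folder, which matches the note in step 2.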

Obtaining Information About the Failure

When you identify the failed satellite, the next step is to obtain information about the failure that occurred. To perform this task, you view the logs for the satellite:

  1. Right click on the failed satellite.
  2. Select Show Logs from the pop-up menu.

    The Show Logs window opens. The logs in the window are organized by date in descending order.

  3. Select the log that records the failure and right click. Because the logs are organized by date in descending order, the log that records the failure is typically the first log in the list.
  4. Select View Details from the pop-up menu.

    The Log Details window opens. This window displays the full set of information for the log that you selected. The information includes the batch that was being executed, as well as the batch step and the script that did not execute successfully. The log information also indicates whether the error resulted in an external or an internal return code. For more information, see Internal and External Error Return Codes.

You can also view the logs for the failed satellite by starting from the Logs folder in the object tree. To perform this task:

  1. Select the Logs folder.

    The Log Details view opens in the contents pane. The logs in the view are organized by date in descending order. You can use the sort and filter facilities that are available from the Satellite Administration Center to modify the details view to reveal failed satellites. For example, you can filter the view to display the logs for a specific satellite, or only the logs for failed satellites.

  2. Select the log that records the failure and right click.
  3. Select View Details from the pop-up menu.

    The Log Details window opens. This window displays the full set of information for the log record that you selected. The information includes the batch that was being executed, as well as the batch step and the script that did not execute successfully. The log information also indicates whether the error resulted in an external or an internal return code. For more information, see Internal and External Error Return Codes.
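
The effect of sorting and filtering the logs can be sketched as follows. The `LogRecord` class and the `failure_logs` function are hypothetical and exist only for this illustration; they do not reflect how the Satellite Administration Center stores or retrieves log records. The fields are limited to the items that the text above says appear in the log details.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogRecord:
    """Hypothetical log entry (illustration only)."""
    satellite: str
    timestamp: datetime
    batch: str
    batch_step: int
    failed: bool
    return_code_kind: str = ""  # "internal" or "external" for a failure


def failure_logs(records, satellite=None):
    """Return failure records, newest first, optionally for one satellite."""
    selected = [
        r for r in records
        if r.failed and (satellite is None or r.satellite == satellite)
    ]
    # Descending date order, so the most recent failure comes first.
    return sorted(selected, key=lambda r: r.timestamp, reverse=True)
```

Calling `failure_logs(records, "sat1")[0]` on a non-empty result gives the most recent failure for the hypothetical satellite `sat1`, which corresponds to the first log in the list in the procedure above.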

Assigning a Fix Batch to the Satellite

When you have determined the problem that caused the satellite to fail, the next step is to attempt to fix the problem. If the satellite reported an error that occurred on other satellites, you may already have a fix batch to fix that error. Otherwise, use the information from the log to create a fix batch. For information on creating a batch, refer to the online help that is available from the Satellite Administration Center.

To assign a fix batch to the satellite:

  1. Open the Satellite Details view. For information on performing this task, refer to the online help that is available from the Satellite Administration Center.
  2. Select the failed satellite in the Satellite Details view and right click.
  3. Select Fix from the pop-up menu. The Fix Satellite window opens.

    Use the Fix Satellite window to assign the fix batch that you want the satellite to execute, and the batch step where you want the satellite to begin executing the batch. You can use the ... push button to display the list of fix batches and unassigned batches that are available. If you do not have a batch that is suitable to fix the problem, create an unassigned batch and use it.
    Note: The satellite will only be able to execute fix batches. Because the satellite is in fix mode, it cannot execute group batches. If the user attempts to synchronize the failed satellite before you enable it to execute the fix batch, the SQLCODE -3934W is returned at the satellite.

  4. Click OK.
  5. Select the failed satellite in the Satellite Details view and right click.
  6. Select Enable from the pop-up menu.

    When a satellite reports an error, its state changes to Failed in the Satellite Details view. In addition, the satellite is disabled. That is, the satellite cannot execute batches. The satellite must be enabled to execute the fix batch.

  7. Click OK.
  8. Have the user synchronize.
  9. View the results of the execution of the fix batch.

    To perform this task, you should view the logs for the satellite to determine whether the satellite executed the fix batch successfully. In addition, you can query the results of the fix batch. To perform the query, you can add a batch step to the fix batch that the satellite executed and have the satellite execute only that batch step, or you can use a different fix batch. If you are satisfied with the results of the fix, you are ready to promote the satellite back to executing its group batches. For details, see Returning the Repaired Satellite to Production. If you are not satisfied with the results of the fix, see Debugging the Fix Batch.
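
The order of the steps above matters: the satellite must be both assigned a fix batch and enabled before a synchronization executes the fix. The following sketch models that ordering. It is a simplified illustration under stated assumptions, not the product's implementation: the `SatelliteModel` class, its method names, and the returned strings are hypothetical, and the model assumes that assigning a fix batch is what places the satellite in fix mode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SatelliteModel:
    """Hypothetical model of a satellite's repair state (illustration only)."""
    mode: str = "production"          # "production" or "fix"
    enabled: bool = True
    fix_batch: Optional[str] = None
    start_step: int = 1

    def report_error(self):
        # A satellite that reports an error is disabled automatically.
        self.enabled = False

    def assign_fix(self, batch, start_step=1):
        # Corresponds to the Fix Satellite window (assumed to set fix mode).
        self.mode = "fix"
        self.fix_batch = batch
        self.start_step = start_step

    def enable(self):
        # Corresponds to selecting Enable from the pop-up menu.
        self.enabled = True

    def synchronize(self):
        if not self.enabled:
            # The text above reports SQLCODE -3934W at the satellite here.
            return "warning: satellite is disabled; no batch executed"
        if self.mode == "fix":
            return f"execute fix batch {self.fix_batch} from step {self.start_step}"
        return "execute group batches"
```

For example, with a hypothetical fix batch named `fix_tables`, calling `report_error()`, then `assign_fix("fix_tables", 2)`, then `synchronize()` returns the warning string; only after `enable()` does `synchronize()` report that the fix batch executes from step 2.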

Debugging the Fix Batch

If the satellite reported an error when it executed the fix batch, or the results of the fix batch are not satisfactory, you need to debug the fix batch.

To debug the fix batch:

  1. Determine the problem with the fix batch.

    To perform this task, you should examine the logs for the satellite that executed the fix batch. If the log shows that an error occurred, you can begin the process of debugging the fix batch based on the error. If, however, an error did not occur, you can have the satellite execute a fix batch that queries the state of the satellite. You may have to try different queries to determine the problem.

  2. Edit the fix batch to make the changes that you require.

    For information about this task, refer to the online help that is available from the Satellite Administration Center.

  3. Open the satellite details view.
  4. Select the satellite that you want and right click.
  5. Select Edit from the pop-up menu.

    The Edit Satellite notebook opens.

  6. On the Batches page, specify the fix batch that you want the satellite to execute, and the batch step where you want the satellite to begin executing the batch.
  7. Enable the satellite to execute the fix batch, if necessary.

    This step is only required if the satellite reported an error when it executed the fix batch. When a satellite reports an error, that satellite is automatically disabled from executing batches. If the satellite successfully executed the fix batch, but the results of the fix batch are not satisfactory, the satellite remains enabled to execute fix batches. You can check the satellite details view in the Satellite Administration Center to determine whether the satellite is enabled or disabled.

  8. Have the user synchronize.
  9. View the results of the execution of the fix batch.

    To perform this task, you should view the logs for the satellite to determine whether the satellite executed the fix batch successfully. In addition, you can have the satellite execute another fix batch to query the results of the previous fix batch. If you are satisfied with the results of the fix, you are ready to promote the satellite back to executing its group batches. For details, see Returning the Repaired Satellite to Production. If you are not satisfied with the results of the fix, return to step 2 and repeat the procedure.

Returning the Repaired Satellite to Production

When the fix that you apply to the satellite produces the results that you want, the satellite can return to production. That is, the satellite can return to executing its group batches when it synchronizes. To return the satellite to production:

  1. Open the satellite details view.
  2. Select the satellite that you fixed and right click.
  3. Select Promote from the pop-up menu.

    The Promote Satellite window opens.

  4. Depending on the fix that you applied, you may need to specify that the satellite resume execution of one or more of its group batches at a batch step other than the next one to be executed. The fields of the Promote Satellite window indicate which group batches the satellite executes, and the batch step at which the satellite begins executing each group batch. Use the ... push button to specify the batch step where the satellite is to begin executing its group setup, update, or cleanup batch, as required.
  5. Click OK.

    If the satellite is already enabled (that is, it did not report an error when executing the fix batch), the next time that the user synchronizes, the satellite will download and execute its group batches, beginning execution at the batch steps that you specified. If the satellite reported an error when it executed the fix batch and the error is not important, you must enable the satellite before it can execute its group batches.

  6. If the satellite is disabled, select it from the satellite details view and right click.
  7. Select Enable from the pop-up menu.
  8. Click OK when the Enable Satellite window opens to confirm that you want to enable this satellite to execute its group batches.
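
The promotion procedure can be pictured as overriding the resume step for selected group batches and then checking whether the satellite still needs to be enabled. The sketch below is a hypothetical model for illustration only: the `RepairedSatellite` class, the `promote` and `needs_enable` functions, and the default step value of 1 are assumptions made for this example and do not describe the Promote Satellite window's internals.

```python
from dataclasses import dataclass, field


def _default_steps():
    # Placeholder starting steps; the real values come from the satellite's
    # progress through its group batches.
    return {"setup": 1, "update": 1, "cleanup": 1}


@dataclass
class RepairedSatellite:
    """Hypothetical record of where a satellite resumes (illustration only)."""
    enabled: bool
    next_step: dict = field(default_factory=_default_steps)


def promote(sat, resume_at=None):
    """Return the satellite to its group batches, optionally changing steps."""
    for batch, step in (resume_at or {}).items():
        if batch not in sat.next_step:
            raise KeyError(f"unknown group batch: {batch}")
        sat.next_step[batch] = step
    return sat


def needs_enable(sat):
    """True if the satellite must still be enabled after promotion."""
    return not sat.enabled
```

In this model, `promote(sat, {"update": 4})` changes only the resume step of the update batch, and `needs_enable(sat)` corresponds to the check in step 6 of the procedure.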

