This page last changed on Jul 15, 2009 by mmcgarry.

Topics marked with * relate to HQ Enterprise-only features.

On the Alert Definition page, you can edit an alert definition's properties and condition set, and define and edit the alert actions HQ should perform when the alert fires.

The Alert Definition page, which appears when you save a new alert definition or edit an existing alert definition, is similar to the New Alert page, with the addition of Edit controls in the "Alert Properties" and "Condition Set" sections, and a set of tabs at the bottom of the page for defining alert actions.

Edit Alert Properties

Click Edit in the "Alert Properties" section.

  • Name - Name assigned by the user creating an alert definition. A fired alert is identified, in the HQ user interface and alert notifications, by the alert definition name and a timestamp. An alert definition name should clearly communicate the nature of the problem. For example, "Down" for an alert on availability, or "Low Memory" for an alert on free memory.
  • Description - Description entered by the user creating the alert definition.
  • Priority - The severity of the problem, as defined by the person creating the alert definition:  "Low", "Medium", or "High".  A consistent policy for defining an alert definition priority makes it easier to triage problems appropriately.  An alert's priority is shown in HQ pages that present alert status and in alert notifications. You can sort alerts by priority in HQ Enterprise's  Alert Center or Operations Center.
  • Active - The current enabled/disabled status of the alert definition. Alerts only fire for enabled alert definitions. When an alert definition is disabled, HQ does not evaluate its condition or fire alerts for it.

Edit Alert Condition Set

Click Edit in the "Condition Set" section.

Condition Set

An alert condition specifies a resource metric value or event that will initiate the alert firing process.

The condition types you can choose when you define an alert vary by resource type and HQ version. If a condition type is not supported by your version of HQ or is not valid for the target resource, it will not appear as an option.

To define a condition, choose one of the following condition types, and supply required parameter values.

  • Metric condition -  To base the alert on the value of a metric that HQ collects for the resource:
    1. Metric - Select a metric from the selector list.  Only currently enabled metrics are listed.  (If the metric you're looking for is not listed, see the note below.)
    2. Define the rule for evaluating the metric value.  You can:
      • Compare metric value to a specified value. Select a comparison operator: > (greater than), < (less than), = (equal to), or != (not equal to), and enter the absolute value.
      • Fire upon change in metric value. Click value changes.
      • Compare metric value to baseline, in HQ Enterprise only. Select an operator: > (greater than), < (less than), = (equal to), or != (not equal to), and choose "Baseline Value" from the pulldown. Baselining must be enabled. For more information, see Baselines.*
To Enable Collection of a Metric

If you want to base a metric condition on a metric that is not currently collected, you have to enable collection of that metric. To do so, update the metric collection settings for the resource type (choose Monitoring Defaults from the Administration tab), or for the specific resource (click Metrics on the Monitor tab for the resource).

  • Inventory Property Condition - To define a condition that is triggered when the value of an inventory property for the resource changes, select an inventory property. The pulldown contains only those inventory properties that are valid for the type of the resource to which the alert applies.
  • Control Action Condition - When you define an alert for a resource that supports control actions, you can define a condition that is triggered when a particular control action is performed. If desired, you can base the condition on a control action with a particular result status: "in progress", "completed", or "failed".  Pulldowns allow you to select a control action that the resource supports, and a result status if desired.
  • Events/Log Level Condition - To define a condition that is triggered by a log event, select a message severity level ("error", "warn", "info", "debug", "all") and optionally a match string. The condition is satisfied each time a message of the selected severity that contains the match string (if one was specified) is written to a log file that HQ is tracking. Log tracking must be enabled for the resource. To determine the log files that HQ monitors for the resource, see the Configuration Properties section of the resource's Inventory tab. The log files that HQ monitors for a resource are defined using the server.log_track.files property. For configuration instructions, see Log Tracking.
  • Config Changed... Condition - This type of condition is triggered by a change to a configuration file that HQ is configured to monitor for the resource. To limit the condition to a single file, enter its filename in the "match filename" field. If you don't specify a filename, a change to any monitored file will trigger the alert. To determine the configuration files that HQ monitors for the resource, see the Configuration Properties section of the resource's Inventory tab. The files that HQ monitors for a resource are defined using the server.config_track.files property. The filename you enter can be at most 25 characters long. For configuration instructions, see Configuration Tracking.
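
As a rough illustration of how the metric and log-event condition types above behave, the following Python sketch models them as plain functions. It is not HQ code: the function names are invented for the example, and it assumes that "all" matches every severity, that any other selection must match the message severity exactly, and that the match string is a plain substring test.

```python
import operator

# Comparison operators offered for a metric condition.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
    "!=": operator.ne,
}


def metric_condition_met(value, op, threshold):
    """Return True when `value <op> threshold` holds."""
    return OPERATORS[op](value, threshold)


def value_changed(previous, current):
    """Model of the "value changes" rule: any difference satisfies it."""
    return previous != current


def log_condition_met(severity, message, selected_severity, match_string=None):
    """Model of an Events/Log Level condition.

    Assumption: "all" matches every severity, and the match string,
    when given, is tested as a plain substring of the message.
    """
    if selected_severity != "all" and severity != selected_severity:
        return False
    return match_string is None or match_string in message
```

For example, `metric_condition_met(95.0, ">", 90.0)` is satisfied, while `log_condition_met("warn", "disk nearly full", "error")` is not, because the severities differ.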

Define Additional Conditions*

In HQ Enterprise, you can define up to three conditions for an alert. To add another condition, click Add Another Condition and specify whether both the new condition and the preceding one must be satisfied for the alert to be triggered ("AND") or only one must be satisfied ("OR").
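
The way "AND" and "OR" link each condition to the preceding one can be sketched as below. The page does not say how HQ groups three conditions, so strict left-to-right evaluation is an assumption of this example, and `combine_conditions` is a hypothetical helper, not an HQ function.

```python
def combine_conditions(results, connectors):
    """Combine condition results left to right.

    results    -- list of booleans, one per condition (one to three)
    connectors -- list of "AND"/"OR", one between each adjacent pair
    """
    if not 1 <= len(results) <= 3:
        raise ValueError("an alert definition has one to three conditions")
    if len(connectors) != len(results) - 1:
        raise ValueError("need one connector between each pair of conditions")

    outcome = results[0]
    for connector, result in zip(connectors, results[1:]):
        if connector == "AND":
            outcome = outcome and result
        elif connector == "OR":
            outcome = outcome or result
        else:
            raise ValueError(f"unknown connector: {connector!r}")
    return outcome
```

Under this assumption, `combine_conditions([True, False, True], ["AND", "OR"])` is evaluated as `(True AND False) OR True`.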

Define Recovery Alert Behavior*

To designate the alert you're defining as a recovery alert, select the primary alert definition from the pulldown.

A recovery alert condition should detect when the condition that fired the primary alert is no longer true. When a recovery alert fires, it marks the primary alert "Fixed", and the primary alert definition is re-enabled. The primary alert definition should be configured to Generate one alert and then disable alert definition until fixed, as described below. For more information, see Recovery Alerts.

Enable Actions

You can have the alert fire on a single occurrence of the condition ("one strike, you're out") or only after the condition occurs repeatedly. Choose either:

  • Each time conditions are met.  The alert fires upon a single occurrence of the condition, or
  • Once every __ times conditions are met within a time period of __ minutes. This option configures an alert to fire when the condition(s) occur multiple times over a period of time.  Enter the number of  occurrences and period of time.
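
The "Once every __ times conditions are met within a time period of __ minutes" option can be pictured as a sliding time window over recent occurrences. The sketch below is one possible reading, not HQ's implementation: the class name is invented, timestamps are plain seconds, and restarting the count after the alert fires is an assumption.

```python
from collections import deque


class RepeatedOccurrenceRule:
    """Fire once the condition has been met `count` times within `minutes`."""

    def __init__(self, count, minutes):
        self.count = count
        self.window_seconds = minutes * 60
        self.hits = deque()  # timestamps (seconds) of recent occurrences

    def condition_met(self, timestamp):
        """Record one occurrence; return True if the alert should fire."""
        self.hits.append(timestamp)
        # Drop occurrences that have fallen out of the time window.
        while timestamp - self.hits[0] > self.window_seconds:
            self.hits.popleft()
        if len(self.hits) >= self.count:
            self.hits.clear()  # assumption: counting restarts after firing
            return True
        return False
```

With `RepeatedOccurrenceRule(3, 5)`, occurrences at 0, 60, and 120 seconds fire on the third one, whereas occurrences at 0, 200, and 400 seconds do not, because the first has left the five-minute window by the time the third arrives.
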
Option removed in 4.1

In versions of HQ Enterprise earlier than 4.1, you could configure an alert definition to fire when its conditions had been met continuously for a specified portion of a period of time. The option - "When conditions are exceeded for x within a time period of y minutes" - was removed in HQ 4.1.

Enable Action Filters

An action filter can be used to control alert firing and alert actions.

Disable an Alert Definition upon Firing

Click Generate one alert and then disable alert definition until fixed to disable the alert definition after it fires and re-enable it when the resulting alert is marked "Fixed".

This option eliminates redundant firing for the same problem. If you do not choose this option, the alert will fire repeatedly as long as the triggering condition is still true.  

In HQ Enterprise, this configuration option - used in conjunction with recovery alerts - automates the process of disabling and re-enabling an alert definition. Result: (1) no redundant alerts for the same problem, and (2) you don't have to manually "fix" an alert triggered by a transient problem. For more information, see Recovery Alerts.
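
The interplay between this option and a recovery alert can be modeled as a small state machine. The class below is a toy model written for this page, not an HQ class; its attribute and method names are assumptions.

```python
class AlertDefinition:
    """Toy model of an alert definition; not an HQ class."""

    def __init__(self, name, disable_until_fixed=False, recovers=None):
        self.name = name
        self.enabled = True
        self.disable_until_fixed = disable_until_fixed
        self.recovers = recovers  # primary definition, if this is a recovery alert
        self.fired = 0

    def condition_met(self):
        """Fire if enabled; return True when an alert was generated."""
        if not self.enabled:
            return False  # disabled definitions are not evaluated
        self.fired += 1
        if self.disable_until_fixed:
            self.enabled = False
        if self.recovers is not None:
            self.recovers.mark_fixed()  # recovery alert fixes the primary
        return True

    def mark_fixed(self):
        """Marking the alert "Fixed" re-enables the definition."""
        self.enabled = True
```

In this model, a "Down" definition created with `disable_until_fixed=True` fires once and then stays silent until an "Up" definition created with `recovers=down` fires, at which point "Down" is enabled again.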

Disregard Control Actions for Related Alerts

The Disregard control actions that are defined for related alerts option appears on New Alert Definition pages for resources that support control actions. This option only applies when:

  1. The current alert definition will include an alert action
  2. The resource associated with the alert is a member of an application
  3. There are other members of the same application with alerts that fire control actions (ideally the same control action)

Under these circumstances, this configuration option ensures that if multiple alerts are fired within a short period for resources that are members of the same application, only one control action will be executed. For example, this would prevent a server from being restarted several times in a short period of time for the same alert conditions. You might have one alert with an action to restart a Tomcat server if the JVM Free Memory gets too low, and another alert with an action to restart the same server if the JVM Active Thread count gets too high. If both alerts fire at the same time and both are filtering control actions, only one restart control action is executed, not two.
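
The effect of the filter can be sketched as a per-application quiet period. This is only an illustration of the behavior described above: the class is hypothetical, and because the page says only "a short period", the length of the quiet period is an assumption supplied by the caller.

```python
class ControlActionFilter:
    """Suppress repeat control actions for the same application."""

    def __init__(self, quiet_seconds):
        # The page only says "a short period"; the length is an assumption.
        self.quiet_seconds = quiet_seconds
        self.last_run = {}  # application name -> time of last control action

    def should_run(self, application, timestamp):
        """Return True if a control action may run for this application now."""
        last = self.last_run.get(application)
        if last is not None and timestamp - last < self.quiet_seconds:
            return False
        self.last_run[application] = timestamp
        return True
```

In the Tomcat example, the low-memory alert would get `True` from `should_run` and trigger the restart, and the thread-count alert firing moments later for the same application would get `False`.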

Option removed in 4.2

Versions of HQ earlier than 4.2 also had a Filter notification actions that are defined for related alerts option to prevent multiple notifications when alerts fire for resources on the same platform. In HQ 4.2, the option was removed. HQ 4.2 provides enhanced functionality for global control of notification volume. For more information, see Set a Notification Throttle.

Create or Edit Alert Actions

You assign actions to an alert definition on the Alert Definition page, which appears when you save a new alert definition or edit an existing alert definition.

The Alert Definition page is similar to the New Alert page, with the addition of Edit controls in the "Alert Properties" and "Condition Set" sections, and tabs at the bottom of the page for defining alert actions.

You can specify multiple actions to be performed automatically when an alert fires. The types of actions available in the Alert Definition page vary based on: (1) the type of resource the alert applies to, (2) your version of HQ, and (3) whether you've configured HQ for the types of actions that must be enabled before you can use them, such as escalations, OpenNMS trap actions, and in HQ Enterprise, SNMP notifications.

To define an alert action, select one of the tabs and supply the required information:

Escalation

Select an escalation from the "Escalation Scheme" pulldown; the tab refreshes and shows the escalation steps. You must define an escalation before you can assign it to an alert definition. Using an escalation that is configured to repeat until the alert is fixed is a good way to prevent redundant alerts firing for the same problem. To create an escalation, click Escalation Schemes Configuration on the Administration tab. For more information about escalations, see Understanding Escalations.

Control Action*

In HQ Enterprise, you can define a resource control action for HQ to perform when the alert fires. The control action can target the current resource (the one to which the alert definition is assigned) or a different resource on the same platform, as long as the resource type has HQ-supported control actions. To configure a control action for the alert, select the Control Action tab, click Edit, and follow the instructions on the associated help page. You can only assign a single control action to an alert definition. Note: You cannot assign a control action to a resource type alert.

Notify Roles*

In HQ Enterprise you can specify one or more HQ roles as notification recipients. HQ users with a role you specify will be notified when an alert is fired.  Click Add to List on the Notify Roles tab.  On the roles selection page, choose the role(s) to be notified when the alert fires. The help page has instructions. 

For information about creating roles specifically for use in notification actions, see Role-Based Alert Notifications.

Notify HQ Users

Click Add to List on this tab to specify one or more HQ users as notification recipients. On the user selection page, choose the users to be notified when the alert fires. The help page has instructions.

Notify Other Recipients

Click Add to List on this tab to specify non-HQ user email recipients for alert notifications.  The help page has instructions. 

Script*

In HQ Enterprise, to assign a script action to the alert definition, click the Script tab, enter the full path to the script, and click Set.  HQ will run the script when the alert fires. Scripts can reference alert-related HQ environment variables to perform custom notification logic. For information, see Define a Script Action for an Alert.
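
A script action could look something like the following Python sketch, which builds a message from environment variables. The variable names `EXAMPLE_ALERT_NAME` and `EXAMPLE_RESOURCE_NAME` are placeholders invented for this example, not documented HQ names; see Define a Script Action for an Alert for the variables HQ actually sets.

```python
import os


def build_message(env=os.environ):
    """Compose a notification line from alert-related environment variables.

    The variable names below are placeholders, not documented HQ names.
    """
    name = env.get("EXAMPLE_ALERT_NAME", "unknown alert")
    resource = env.get("EXAMPLE_RESOURCE_NAME", "unknown resource")
    return f"Alert '{name}' fired for {resource}"


if __name__ == "__main__":
    # A real script would forward this to a pager, chat room, or ticket system.
    print(build_message())
```

Passing the environment as a parameter keeps the message logic testable without HQ running the script.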

OpenNMS

If HQ Server is configured for OpenNMS integration, you can use this tab to configure HQ to send an SNMP trap to OpenNMS when the alert fires. The notification will be generated by the opennms_notify.gsp alert notification template.

To configure an OpenNMS trap action, enter:

  • Server - Listen address for the OpenNMS server
  • Port - Port for the OpenNMS server

For more information, see Enabling OpenNMS Integration.

SNMP Notification*

If the HQ Server is configured to send SNMP notifications to your NMS, you can use this tab to configure a trap notification action. See SNMP Server Configuration Properties for more information.

The notification sent when the alert fires will contain three variable bindings:

  • sysUptimeOID.0 - No configuration is required for this binding.
  • snmpTrapOID.0 - This binding is configured on the HQ Server settings page.
  • A variable binding for the alert data specified in the snmp_trap.gsp alert notification template - the alert definition name and the "short reason" for firing. Note that alert templates may be customized, as described in Tailoring Alert Notification Templates.
Including more variable bindings in SNMP messages
For richer capability, you can configure an SNMP notification as a step in an escalation. An SNMP notification in an escalation can be configured with additional variable bindings. For more information, see Understanding Escalations.
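
The three variable bindings of a basic alert notification can be pictured as a list of (OID, value) pairs. In the sketch below the two numeric OIDs are the standard ones for sysUpTime.0 and snmpTrapOID.0; the function name, its parameters, and the "name: reason" layout of the third value are assumptions, since the actual text comes from the snmp_trap.gsp template.

```python
SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"       # sysUpTime.0 (standard MIB-2 object)
SNMP_TRAP_OID = "1.3.6.1.6.3.1.1.4.1.0"    # snmpTrapOID.0 (SNMPv2-MIB)


def build_varbinds(uptime_ticks, trap_oid, alert_oid, alert_name, short_reason):
    """Return the three variable bindings as (oid, value) pairs.

    trap_oid  -- value configured on the HQ Server settings page
    alert_oid -- OID entered for the notification action
    The "name: reason" layout of the third value is an assumption.
    """
    return [
        (SYS_UPTIME_OID, uptime_ticks),
        (SNMP_TRAP_OID, trap_oid),
        (alert_oid, f"{alert_name}: {short_reason}"),
    ]
```

A call such as `build_varbinds(12345, "1.3.6.1.4.1.99999.0.1", "1.3.6.1.4.1.99999.1", "Down", "Availability < 100%")` uses placeholder enterprise OIDs; substitute the values configured for your NMS.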

To configure an SNMP notification action on HQ Enterprise 4.3, enter:

  • IP Address - the address and port of the target NMS.
  • OID - The OID of the notification to send, which will contain the alert details specified in the snmp_trap.gsp template.
  • Notification Mechanism - The type of SNMP notification to send:
    • v1 Trap
    • v2c Trap
    • Inform

To configure an SNMP notification action on HQ Enterprise 4.2 and earlier, enter:

  • Address - Address of the target SNMP engine.
  • OID - The OID of the notification to send.

Document generated by Confluence on Apr 20, 2010 15:01