2.2 Alarm Detection
An Alarm Detection configuration enables you to define criteria for the generation of alarm messages. You select a condition, or combine a set of conditions, that generates an alarm message when specific limits are reached. To monitor the system alarms, you use the Web Interface. Alarm messages can also be delivered to SNMP monitoring systems.
An Alarm can be in one of two states: new or closed. A new (open) Alarm indicates an occurrence or situation that has not yet been resolved. A closed Alarm is an indication that has been resolved.
To create a new Alarm Detection configuration, click the button in the upper left part of the Desktop window, and then select Alarm Detection from the menu.
To open an existing configuration, double-click on the configuration in the Configuration Navigator, or right-click a configuration and then select Open Configuration(s)... .
Alarm Detection Menus
The contents of the menus in the menu bar may change depending on which configuration type has been opened in the currently displayed tab. Alarm Detection uses the standard menu items that are visible for all configurations, and these are described in the section Configuration Menus in 2.1 Menus and Buttons.
There is one menu item that is specific for Alarm Detection, and it is described in the following section, The Edit Menu.
The Edit Menu
Item | Description |
---|---|
Workflow Alarm Value Names | Select this option to define a variable to use in the APL code. For further information, see the APL Reference Guide and the section below, Workflow Alarm Value. |
Alarm Detection Buttons
The contents of the button panel may change depending on which configuration type has been opened in the currently displayed tab. Alarm Detection uses the standard buttons that are visible for all configurations, and these are described in the section Configuration Buttons in 2.1 Menus and Buttons.
Defining an Alarm Detection
An Alarm Detection definition is made up of:
A condition, or a set of conditions, see the section below, Alarm Conditions
An object such as host, pico instance, or workflow, that the alarm should supervise
The parameter that you want the alarm to supervise, for example, the Statistics value
Time and value limits of supervision
To create a valid Alarm Detection configuration, make sure that:
The Alarm Detection includes at least one condition.
If the Alarm Detection includes two or more conditions, they all guard the same object: host, pico instance, or workflow.
If the Alarm Detection includes two or more conditions, they are all set to the same time interval criteria.
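The validity rules above can be illustrated with a short Python sketch. It is not part of the product and assumes one reading of the rules, namely that all conditions in one Alarm Detection must share the same supervised object and the same time interval. The `Condition` class, its fields, and `validation_errors` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Condition:
    """Hypothetical stand-in for one alarm condition (not a product class)."""
    kind: str                                  # e.g. "Host Statistic Value"
    supervised_object: str                     # host, pico instance, or workflow
    during_last_minutes: Optional[int] = None  # None means no time limit


def validation_errors(conditions: List[Condition]) -> List[str]:
    """Return the rule violations found; an empty list means valid."""
    if not conditions:
        return ["at least one condition is required"]
    errors = []
    # Assumed rule: every condition guards the same object.
    if len({c.supervised_object for c in conditions}) > 1:
        errors.append("conditions guard different objects")
    # Assumed rule: every condition uses the same time interval.
    if len({c.during_last_minutes for c in conditions}) > 1:
        errors.append("conditions use different time intervals")
    return errors
```

Under these assumptions, two Pico Instance Statistic Value conditions that both supervise EC1 without a time limit produce no errors, whereas an empty condition list is reported as invalid.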
To define an alarm:
Create an Alarm Detection configuration, as described at the beginning of this section.
The Alarm Detection configuration
Click on the Edit menu and select the Validate option to check if your configuration is valid.
Click on the Edit menu and select the Workflow Alarm Value Names option to define a variable that you can use in the APL code (see the APL Reference Guide) and in the Workflow Alarm Value condition (see the section below, Workflow Alarm Value).
Enter a statement that describes the Alarm Detection that you are defining in the Description field.
Select the priority that the alarm should have in the Severity drop-down list.
Use the Alarm Detection Enabled check box to turn alarm detection on or off.
At the bottom of the Alarm Detection configuration, click the Add button.
The Add Alarm Condition dialog box opens.
The Add Alarm Condition dialog box
Select a condition in the Alarm Condition drop-down list.
Alarm Conditions
The Alarm conditions enable you to define specific situations or events for which you want the system to produce an alarm. You configure a condition to produce an alarm whenever a certain behavior occurs, within specific limits.
Note!
An alarm is generated only if ALL conditions in the Alarm Detection are met.
The Alarm condition limits are reset:
Every time you restart the Platform
Every time you save the alarm configuration
When you resolve the alarm
The Alarm Conditions that you can choose from are:
Host Statistic Value
System Event
Pico Instance Statistic Value
Unreachable Execution Context
Workflow Alarm Value
Workflow Execution Time
Workflow Group Execution Time
Workflow Throughput
Host Statistic Value
The Host Statistic Value condition enables you to configure an alarm detection for the Host Statistic parameters. For further information see Host Statistics in 6.16 System Statistics.
The Host Statistic Value condition
Item | Description |
---|---|
Host | Select a host server from the drop-down list |
Statistic Value | Select from the drop-down list the parameter that you want the alarm to watch over. For a detailed description of every Statistic Value, see 6.16 System Statistics. |
Limits | Select a limit, either Exceeds or Falls Below, upon which the alarm should be triggered. Check During Last to specify the time frame during which the Limits value should be compared. If a match is detected, an alarm is invoked. |
Example - Configuring a Host Statistic Value condition
Note!
The parameters in the following example do not apply to any specific system and are only presented here to enhance understanding of the alarm condition.
You want the system to generate a warning if the primary host is being overworked.
Configure an Alarm Detection with the Host Statistic Value condition.
Select the statistic value Swapped in from Disk (blocks/s).
Enter a limit of Exceeds 1200 (blocks/s), during the last 3 hours.
Configure an Alarm Detection
Configure the Alarm Condition
The alarm will be triggered only if the statistic value has been higher than 1200 throughout the last 3 hours. Note that if a momentary drop in value has occurred during the last 3 hours, the alarm will not be triggered.
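The "throughout the last 3 hours" behavior described above can be sketched in Python. This is only an illustration of the described semantics, not the product's implementation; the sample format, the function name `exceeded_throughout`, and the choice to report no match when the window holds no samples are all assumptions.

```python
from typing import Iterable, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, statistic value)


def exceeded_throughout(samples: Iterable[Sample], now: float,
                        window_seconds: float, limit: float) -> bool:
    """True only if every sample inside the window is above the limit."""
    in_window = [value for ts, value in samples
                 if now - window_seconds <= ts <= now]
    # Assumption: with no samples in the window, nothing confirms the limit.
    return bool(in_window) and all(value > limit for value in in_window)
```

With hourly samples of 1300, 1250, 1400, and 1500 blocks/s the check succeeds for a limit of 1200, while a single sample of 1100 inside the window makes it fail, which mirrors the momentary drop mentioned above.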
System Event
The System Event condition enables you to set up an Alarm Detection for the various Event types.
The System Event condition
Item | Description |
---|---|
Type | Select an event-related reason for an alarm to be invoked. For a detailed description of every event type, see 4.3. Event Types. |
Filter | Use this table to define filter criteria for the alarm messages that you are interested in. To define an entry, double-click on the row. The Edit Match Value dialog box opens. Click the Add button to add a value. |
Limits | Specify the condition for the alarm to be triggered. The options are based on the number and frequency of occurrence of the event: Occurred Once, Occurred More Than, Occurred Less Than. In During Last, specify the time frame during which the Limits value should be compared. If a match is detected, an alarm is invoked. |
Example - Configuring a System Event condition
Note!
The parameters in the following example do not apply to any specific system and are only presented here to enhance understanding of the alarm condition.
Configure an Alarm Detection that applies the System Event condition.
Configure an Alarm Detection
On the Edit Alarm Condition dialog box, from the Event Type drop-down list, select Workflow State Event.
On the Filter table double-click workflowName; the Edit Match Value dialog box opens.
Click Add to browse and look for the specific workflow.
Enter a limit of Occurred More Than 3 times during the last 24 hours.
Select an Alarm Condition
The alarm is triggered when a "Workflow State Event" has occurred more than 3 times, that is, on the 4th occurrence, during the last 24 hours.
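The Occurred More Than limit in this example can be pictured as a count over a sliding time window. The Python sketch below is illustrative only: the class `OccurredMoreThan` and its `record` method are hypothetical names, and the sketch does not model how the product resets condition limits.

```python
from collections import deque
from typing import Deque


class OccurredMoreThan:
    """Counts events inside a sliding time window (illustrative only)."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def record(self, timestamp: float) -> bool:
        """Register one event; return True if the limit is now exceeded."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.limit
```

With a limit of 3 and a 24-hour window, the first three events return False and a fourth event within the same 24 hours returns True.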
Pico Instance Statistic Value
The Pico Instance Statistic Value condition enables you to configure an Alarm Detection that guards the Pico instance statistic value of a specific EC. For further information about the Pico Instance, see 6.10 Pico Viewer.
The Pico Instance Statistic Value Condition
Item | Description |
---|---|
Pico Instance | From the drop-down list, select the pico instance for which you want to collect statistical data. |
Statistic Value | See Pico Instance in 6.16 System Statistics. |
Example - Configuring a Pico Instance Statistic Value condition
Note!
The parameters in the following example do not apply to any specific system and are only presented here to enhance understanding of the alarm condition.
A telecom provider wants the system to generate an alarm if the following two events occur simultaneously:
The relevant pico instance (EC) memory is overloaded.
Too many files are open on that same particular Pico instance.
Configure an Alarm Detection
Configure an Alarm Detection that supervises EC1 with the Pico Instance Statistic Value condition. Use this condition twice:
With the Used Memory statistic value
With the Open Files Count statistic value
Select the Alarm Condition Pico Instance Statistic Value.
Select an Alarm Condition
From the Statistic Value drop-down list, select Used Memory.
Enter a limit of 900000 KB. Note! Do not set a time limit. This means that whenever this limit is exceeded, and the other conditions are met, an alarm is generated.
From the Alarm Detection dialog select the alarm condition Pico Instance Statistic Value once again.
Select Another Alarm Condition
This time use the statistic value Open Files Count.
Enter a limit of 10000 files, without any time limit.
An alarm is triggered by every simultaneous occurrence of overloaded memory on EC1 AND too many open files, at any time.
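The rule that all conditions must be met at the same time can be expressed as a logical AND over the individual checks. The following Python sketch uses the limits from the example above; the statistic keys `used_memory_kb` and `open_files_count`, and the helper names, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Stats = Dict[str, float]
Condition = Callable[[Stats], bool]


def exceeds(statistic: str, limit: float) -> Condition:
    """Build a condition that is met when a statistic is above its limit."""
    return lambda stats: stats[statistic] > limit


def alarm_raised(conditions: List[Condition], stats: Stats) -> bool:
    """An alarm is raised only if every condition is met."""
    return all(condition(stats) for condition in conditions)


# The two conditions of the example, both supervising the same pico instance.
ec1_conditions = [
    exceeds("used_memory_kb", 900000),
    exceeds("open_files_count", 10000),
]
```

High memory usage alone, or a high open files count alone, does not raise the alarm in this sketch; both limits must be exceeded in the same set of statistics.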
Unreachable Execution Context
The Unreachable Execution Context condition enables you to configure an Alarm Detection that alerts you if the connection between the Platform and the EC that the alarm supervises fails.
The Unreachable Execution Context Condition
Item | Description |
---|---|
Pico Instance | See the section above, Pico Instance Statistic Value. Note: Selecting Any from the drop-down list applies the condition to all the clients. |
Unreachable due to normal shutdown | Check to invoke an alarm whenever the connection between the Platform and the client fails due to a normal shutdown of the client. |
Example - Configuring an Unreachable EC condition
Note!
The parameters in the following example do not apply to any specific system and are only presented here to enhance understanding of the alarm condition.
A telecom provider wants the system to generate an alarm if the connection to any EC cannot be re-established within 10 minutes.
Configure an Alarm Detection that uses the Unreachable Execution Context condition.
Configure an Alarm Detection
From the Pico Instance drop-down list, select Any.
Define the Alarm Condition
Set the time limit to During Last 10 minutes.
The alarm will be triggered whenever the system detects a loss of connection between the platform and one of its ECs, for a period that is longer than 10 minutes.
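As a rough illustration of the 10-minute limit in this example, the check can be reduced to comparing the time since the connection was lost with the configured limit. The function name and its parameters are hypothetical, and the sketch says nothing about how the product detects the loss of connection.

```python
from typing import Optional


def unreachable_too_long(lost_at: Optional[float], now: float,
                         limit_seconds: float) -> bool:
    """True if the connection has been lost for longer than the limit.

    lost_at is the time (in seconds) when the connection was lost,
    or None if the EC is currently reachable.
    """
    if lost_at is None:
        return False
    return now - lost_at > limit_seconds
```

For a limit of 600 seconds, a connection lost 601 seconds ago satisfies the check, while one lost exactly 600 seconds ago does not yet.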
Workflow Alarm Value
The Workflow Alarm Value condition is a customizable alarm condition. It enables you to have the Alarm Detection watch over a variable that you create and assign through the APL code. To apply the Workflow Alarm Value condition, use the following guidelines:
Create a variable.
Assign a value to the variable.
Set up the Workflow Alarm Value condition.
To Create a Variable name:
From the Edit menu in the Alarm Detection configuration menu, select Workflow Alarm Value Names. The Workflow Alarm Value dialog box opens.
Click the Add button and enter a variable name, for example, CountBillingFiles.
Click OK and then close the Workflow Alarm Value dialog box.
To Assign a Value to the Value Name:
In the APL code, include the dispatchAlarmValue command. For example:
consume {
    dispatchAlarmValue("CountBillingFiles", 1);
    udrRoute(input);
}
To Configure the Workflow Alarm Value Condition:
- At the bottom of the Alarm Detection configuration, click Add; the Add Alarm Condition dialog box opens.
- From the Alarm Condition drop-down list, select Workflow Alarm Value.
- From the Value drop-down list, select the name of the variable that you created.
- Click Browse... to select the Workflow that the Alarm Detection should guard.
- Configure the Limits according to the description in the table below, The Workflow Alarm Value configuration, and click OK.
The Workflow Alarm Value configuration
Item | Description |
---|---|
Value | Select an alarm value from the drop-down list. |
Workflow | Click Browse... to enter the workflow instance(s) that you want to apply the alarm to. |
Limits | Summation: Select this check box to add up the values. Note: Selecting Summation means that the During Last entry refers to the time period during which the sum is added up. Once the set period has ended, the sum is compared with the limit value. For All Workflows: Select this check box to add up the values (see Summation above) of all the workflows that the alarm supervises. The Alarm Detector compares this total value with the alarm limit (exceeds or falls below), and generates an alarm message accordingly. Note: This check box can only be selected when Workflow is set to Any. For further information about Limits, see the section above, Host Statistic Value. |
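The Summation and For All Workflows options can be pictured with the following Python sketch. It is an interpretation of the table above, not the product's logic: the tuple layout, the function names, and the per-workflow comparison used when For All Workflows is not selected are assumptions.

```python
from typing import Dict, Iterable, Tuple

Dispatch = Tuple[float, str, int]  # (timestamp in seconds, workflow, value)


def summed_values(dispatches: Iterable[Dispatch], period_start: float,
                  period_seconds: float) -> Dict[str, int]:
    """Add up the dispatched values per workflow for one During Last period."""
    totals: Dict[str, int] = {}
    period_end = period_start + period_seconds
    for timestamp, workflow, value in dispatches:
        if period_start <= timestamp < period_end:
            totals[workflow] = totals.get(workflow, 0) + value
    return totals


def exceeds_limit(totals: Dict[str, int], limit: int,
                  for_all_workflows: bool) -> bool:
    """Compare the grand total, or each workflow's own sum, with the limit."""
    if for_all_workflows:
        return sum(totals.values()) > limit
    # Assumption: without For All Workflows, each workflow is checked alone.
    return any(total > limit for total in totals.values())
```

If one workflow dispatches the value 1 twice and another once within the period, the grand total of 3 exceeds a limit of 2, although neither workflow exceeds it on its own.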
Workflow Execution Time
The Workflow Execution Time condition enables you to generate an alarm whenever the execution time of a particular workflow, or of all workflows, exceeds or falls below the time limit that you specify.
The Workflow Execution Time configuration
Item | Description |
---|---|
Workflow | The default workflow value is Any. Use this value when you want to apply the condition to all the Workflows. Otherwise, click Browse... to select the Workflow to which you want to apply the condition. |
Example - Configuring a Workflow Execution Time condition
Note!
The parameters in the following example do not apply to any specific system and are only presented here to enhance understanding of the alarm condition.
A telecom provider wants the system to identify a workflow that has recently run out of input, and to generate an alarm that warns about a processing time that is too short.
Configure an Alarm Detection to use the Workflow Execution Time condition.
Configure an Alarm Detection
Click Browse...; the Workflow Instance Selection dialog box opens.
At the bottom of the dialog box click Any.
Set a limit of Falls below 2 seconds.
Configure the Alarm Condition
An alarm is generated whenever an active workflow seems to process data too fast (in less than 2 seconds).
Workflow Group Execution Time
The Workflow Group Execution Time alarm condition enables you to generate an alarm whenever the execution time of a workflow group exceeds or falls below the time limit that you specify.
The Workflow Group Execution Time configuration
Item | Description |
---|---|
Workflow Group | Click Browse... to enter the address of the workflow group to which you want to apply the alarm. |
Example - Configuring a Workflow Group Execution Time condition
Note!
The parameters in the following example do not apply to any specific system and are only presented here to enhance understanding of the alarm condition.
You want the system to generate an alarm if a billing workflow group has been active longer than 3 hours.
Configure an Alarm Detection that uses the Workflow Group Execution Time condition.
Configure an Alarm Detection
On the Edit Alarm Condition dialog box click Browse... to enter the workflow group you want the alarm detection to supervise.
Configure the Alarm Condition
Enter a limit of Exceeds 3 hours.
The alarm will be triggered if the workflow group has been active longer than 3 hours.
Workflow Throughput
The Workflow Throughput alarm condition enables you to create an alarm if the volume-per-time processing rate of a particular workflow exceeds, or falls below, the throughput limit that you specify.
The Workflow Throughput configuration
Item | Description |
---|---|
Workflow | Select a workflow with the throughput value, and the processing speed, that you want to supervise. For further information about the throughput value calculation, see Throughput Calculation in 3.1.8 Workflow Properties. An alarm is generated if the throughput value is not within the condition limits. |
Limits | For information about Limits see the section above, Host Statistic Value. |
Example - Configuring a Workflow Throughput condition
Note!
The parameters in the following example do not apply to any specific system and are only presented here to enhance understanding of the alarm condition.
You want the system to warn you when a decreased processing rate is detected.
Configure an Alarm Detection to use the workflow throughput condition.
Configure an Alarm Detection
On the Edit Alarm Condition dialog box click Browse... to select the workflow with the processing rate that you want to supervise.
Enter a limit of Falls Below 50000 units (batches, UDRs, or bytearrays).
Configure an Alarm Condition
The alarm will be triggered by every occurrence of a workflow slowing down its processing rate to a throughput that is lower than 50000 units per second.
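The Falls Below comparison in this example can be sketched as a simple rate check in Python. The sketch assumes a naive throughput of units divided by elapsed seconds; the calculation that the product actually uses is described in Throughput Calculation in 3.1.8 Workflow Properties and may differ. The function names are hypothetical.

```python
def throughput(units_processed: int, elapsed_seconds: float) -> float:
    """Naive throughput in units per second (an assumed simplification)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return units_processed / elapsed_seconds


def falls_below(units_processed: int, elapsed_seconds: float,
                limit: float) -> bool:
    """True if the measured throughput is lower than the limit."""
    return throughput(units_processed, elapsed_seconds) < limit
```

For instance, 2400000 units processed in 60 seconds give 40000 units per second, which falls below the limit of 50000 used in the example.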