Alarm Service

Overview

ZStack Cube Virtualization Edition supports setting up alarms for resources based on load and capacity usage, as well as configuring event alarms for predefined events occurring within the platform. When critical resources experience abnormalities, the platform instantly pushes alarm messages to designated endpoints to help quickly identify and resolve issues, minimizing potential business disruptions.

Alarm Service Infrastructure

The ZStack Cube Virtualization Edition alarm service consists of two main components: the monitoring system and the notification service.
  • Monitoring System
    • Provides time-series data monitoring and alarms, including resource load and capacity data.
    • Captures predefined events from the platform and triggers alarms.
    • Supports custom alarms and alarm message templates.
    • Allows viewing of alarm messages through multiple entry points.
  • Notification Service
    • Sends alarm notifications to designated endpoints, such as system, email, DingTalk, Lark, WeCom, SMS, HTTP applications, Microsoft Teams, and SNMP Trap receivers.

Usage Recommendations

Because collecting and storing monitoring data consumes system resources, it is recommended to plan ZStack Cube Virtualization Edition resources as follows:
  • Plan a dedicated physical server as the management node for the platform.
  • Given that monitoring data may periodically consume system disk I/O resources, it is recommended to use an SSD for the management node's system disk.
  • To avoid excessive system disk usage due to large monitoring data, plan for a system disk space of at least 1TB.
  • If your system disk space is less than 500GB, you can modify the following configurations in System Parameters:
    • Monitoring Data Retention Period: Set to 1 month.
    • Monitoring Data Retention Size: Set to a power of 2, such as 32GB, 64GB, or 128GB.
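
The power-of-2 requirement for Monitoring Data Retention Size can be checked before a value is entered. The following is a minimal sketch; both function names are hypothetical helpers and are not part of the platform.

```python
def is_power_of_two_gb(size_gb: int) -> bool:
    """Return True if size_gb is a positive power of 2 (for example 32, 64, 128)."""
    return size_gb > 0 and (size_gb & (size_gb - 1)) == 0


def largest_valid_size_gb(budget_gb: int) -> int:
    """Return the largest power of 2 that does not exceed budget_gb."""
    if budget_gb < 1:
        raise ValueError("budget must be at least 1 GB")
    # bit_length() - 1 is the position of the highest set bit.
    return 1 << (budget_gb.bit_length() - 1)
```

For example, with roughly 100GB available for monitoring data, `largest_valid_size_gb(100)` suggests 64GB as the retention size.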

Endpoint

New Endpoint

Endpoints are the foundation for the notification service to push alarm messages. ZStack Cube Virtualization Edition provides a system alarm endpoint as the default option, and you can also create custom endpoints of various types.

Email

Before you begin

An email server needs to be added in advance. For more information, see Add Email Server.

Procedure

  1. In the navigation pane, choose O&M Management > Endpoint.
  2. On the Endpoint page, click New Endpoint.
  3. In the New Endpoint dialog, set the following parameters:
    • Name: Set the name for the endpoint.
    • Description: Optionally fill in a description for the endpoint.
    • Type: Select Email.
    • Email Server: Choose an email server that has been added.
    • Email Address: Enter the email address.
    • Message Language: Select the notification language for alarm messages, including Simplified Chinese and English.
  4. Review the configuration and click OK.

What to do next

  • You can proceed to set up alarm message templates to ensure that alarm messages are sent out in a uniform format as specified. For more information, see Message Template.
  • You can proceed to create a new alarm to push resource alarm messages to designated endpoints. For more information, see Alarm.

DingTalk

Before you begin

  • Make sure ZStack Cube Virtualization Edition has an installed Advanced Edition license, and that the license is in a valid state.
  • Add a DingTalk group bot in advance and configure security settings as needed. After adding, obtain the bot's Webhook URL. For more information, refer to the official DingTalk Open Platform documentation.

Procedure

  1. In the navigation pane, choose O&M Management > Endpoint.
  2. On the Endpoint page, click New Endpoint.
  3. In the New Endpoint dialog, set the following parameters:
    • Name: Set the name for the endpoint.
    • Description: Optionally fill in a description for the endpoint.
    • Type: Select DingTalk.
    • Address: Enter the Webhook URL of the DingTalk bot.
    • Security Setting: Choose the security settings configured for the DingTalk group bot, including Signature or Other (keywords or IP addresses).
      • Custom Keywords: Alarm messages must contain at least one custom keyword to be sent successfully. If you choose this method, make sure you add "Alarm" as the keyword. Otherwise, alarm messages will fail to send.
      • IP Address: Only requests from within the specified IP address range will be processed by third-party applications. If you choose this method, add the management node IP address and VIP of the platform to the bot's IP allowlist to ensure that third-party applications receive alarm messages correctly.
    • Mention Member: Set whether to notify specific members when alarm messages are pushed to the DingTalk group. Options include Nobody, All, or Specified Members. When choosing Specified Members, add the mobile phone numbers of the group members.
    • Message Language: Select the notification language for alarm messages, including Simplified Chinese and English.
  4. Review the configuration and click OK.
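
Before creating the endpoint, it can help to confirm that the bot's security settings work as expected. The sketch below builds a signed Webhook URL and a test payload, following the signing scheme commonly documented for DingTalk custom bots (HMAC-SHA256 over the timestamp and secret, Base64-encoded, then URL-encoded). The function names are hypothetical, the Webhook URL is assumed to already carry its `access_token` query parameter, and the payload format the platform itself sends is not described here.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse
from typing import List, Optional


def sign_dingtalk_url(webhook_url: str, secret: str,
                      timestamp_ms: Optional[int] = None) -> str:
    """Append timestamp and sign parameters to a DingTalk bot Webhook URL."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    string_to_sign = f"{timestamp_ms}\n{secret}"
    digest = hmac.new(secret.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    sign = urllib.parse.quote_plus(base64.b64encode(digest))
    # Assumes webhook_url already contains "?access_token=...".
    return f"{webhook_url}&timestamp={timestamp_ms}&sign={sign}"


def build_test_message(content: str,
                       at_mobiles: Optional[List[str]] = None,
                       at_all: bool = False) -> dict:
    """Build a text message; include "Alarm" in content if keywords are used."""
    return {
        "msgtype": "text",
        "text": {"content": content},
        "at": {"atMobiles": list(at_mobiles or []), "isAtAll": at_all},
    }
```

Posting the result of `build_test_message("Alarm: test")` as JSON to the signed URL should deliver a message to the group if the Signature or Custom Keywords setting is configured correctly.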

What to do next

  • You can proceed to set up alarm message templates to ensure that alarm messages are sent out in a uniform format as specified. For more information, see Message Template.
  • You can proceed to create a new alarm to push resource alarm messages to designated endpoints. For more information, see Alarm.

Lark

Before you begin

  • Make sure ZStack Cube Virtualization Edition has an installed Advanced Edition license, and that the license is in a valid state.
  • Add a Lark group bot in advance and configure security settings as needed. After adding, obtain the bot's Webhook URL. For more information, refer to the official Lark Open Platform documentation.

Procedure

  1. In the navigation pane, choose O&M Management > Endpoint.
  2. On the Endpoint page, click New Endpoint.
  3. In the New Endpoint dialog, set the following parameters:
    • Name: Set the name for the endpoint.
    • Description: Optionally fill in a description for the endpoint.
    • Type: Select Lark.
    • Address: Enter the Webhook URL of the Lark bot.
    • Security Setting: Choose the security settings configured for the Lark group bot, including Signature or Other (keywords or IP addresses).
      • Custom Keywords: Alarm messages must contain at least one custom keyword to be sent successfully. If you choose this method, make sure you add "Alarm" as the keyword. Otherwise, alarm messages will fail to send.
      • IP Address: Only requests from within the specified IP address range will be processed by third-party applications. If you choose this method, add the management node IP address and VIP of the platform to the bot's IP allowlist to ensure that third-party applications receive alarm messages correctly.
    • Mention Members: Set whether to notify specific members when sending alarm messages via the bot. Options include Nobody, All, or Specified Members. When choosing Specified Members, add the user IDs of the designated members.
    • Message Language: Select the notification language for alarm messages, including Simplified Chinese and English.
  4. Review the configuration and click OK.
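
If the IP Address security setting is used, the management node IP address and the VIP must both fall inside the bot's allowlist. The following sketch checks this with Python's standard `ipaddress` module. The function names and the sample addresses are hypothetical.

```python
import ipaddress
from typing import List


def is_allowed(source_ip: str, allowlist: List[str]) -> bool:
    """Return True if source_ip matches any address or CIDR range in allowlist."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(entry, strict=False)
               for entry in allowlist)


def missing_from_allowlist(sources: List[str], allowlist: List[str]) -> List[str]:
    """Return the source addresses that the allowlist does not cover."""
    return [ip for ip in sources if not is_allowed(ip, allowlist)]
```

For example, `missing_from_allowlist(["10.0.0.5", "10.0.0.100"], ["10.0.0.5"])` reports that `10.0.0.100` (say, the VIP) still needs to be added.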

What to do next

  • You can proceed to set up alarm message templates to ensure that alarm messages are sent out in a uniform format as specified. For more information, see Message Template.
  • You can proceed to create a new alarm to push resource alarm messages to designated endpoints. For more information, see Alarm.

WeCom

Before you begin

  • Make sure ZStack Cube Virtualization Edition has an installed Advanced Edition license, and that the license is in a valid state.
  • Add a WeCom group bot in advance. After adding, obtain the bot's Webhook URL. For more information, refer to the official WeCom documentation.

Procedure

  1. In the navigation pane, choose O&M Management > Endpoint.
  2. On the Endpoint page, click New Endpoint.
  3. In the New Endpoint dialog, set the following parameters:
    • Name: Set the name for the endpoint.
    • Description: Optionally fill in a description for the endpoint.
    • Type: Select WeCom.
    • Address: Enter the Webhook URL of the WeCom bot.
    • Mention Members: Set whether to notify specific members when sending alarm messages via the bot. Options include Nobody, All, or Specified Members. When choosing Specified Members, add the user IDs of the designated members.
    • Message Language: Select the notification language for alarm messages, including Simplified Chinese and English.
  4. Review the configuration and click OK.
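
To verify a WeCom group bot independently of the platform, a test message can be posted to its Webhook URL. The sketch below builds a text payload in the format commonly documented for WeCom group bots, where `mentioned_list` carries user IDs or `@all`. The function name and the `mention` values are hypothetical and only mirror the Mention Members options above; the payload the platform itself sends is not described here.

```python
from typing import List, Optional


def build_wecom_text(content: str, mention: str = "nobody",
                     user_ids: Optional[List[str]] = None) -> dict:
    """Build a WeCom bot text message.

    mention: "nobody", "all", or "specified" (hypothetical labels).
    user_ids: WeCom user IDs, used only when mention == "specified".
    """
    text = {"content": content}
    if mention == "all":
        text["mentioned_list"] = ["@all"]
    elif mention == "specified":
        text["mentioned_list"] = list(user_ids or [])
    elif mention != "nobody":
        raise ValueError(f"unknown mention option: {mention}")
    return {"msgtype": "text", "text": text}
```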

What to do next

  • You can proceed to set up alarm message templates to ensure that alarm messages are sent out in a uniform format as specified. For more information, see Message Template.
  • You can proceed to create a new alarm to push resource alarm messages to designated endpoints. For more information, see Alarm.

SMS

Before you begin

Obtain an AccessKey that includes SMS services from a third-party provider in advance.

Procedure

  1. In the navigation pane, choose O&M Management > Endpoint.
  2. On the Endpoint page, click New Endpoint.
  3. In the New Endpoint dialog, set the following parameters:
    • Name: Set the name for the endpoint.
    • Description: Optionally fill in a description for the endpoint.
    • Type: Select SMS.
    • AccessKey ID: Enter the AccessKey ID that identifies the user.
    • AccessKey Secret: Enter the secret key used to authenticate the user.
    • SMS Address: Enter the phone number to receive SMS messages.
  4. Review the configuration and click OK.

What to do next

  • You can proceed to set up alarm message templates to ensure that alarm messages are sent out in a uniform format as specified. For more information, see Message Template.
  • You can proceed to create a new alarm to push resource alarm messages to designated endpoints. For more information, see Alarm.

HTTP Application

Before you begin

Prepare the Webhook URL for the HTTP application in advance, and configure the username and password as needed.

Procedure

  1. In the navigation pane, choose O&M Management > Endpoint.
  2. On the Endpoint page, click New Endpoint.
  3. In the New Endpoint dialog, set the following parameters:
    • Name: Set the name for the endpoint.
    • Description: Optionally fill in a description for the endpoint.
    • Type: Select HTTP Application.
    • Address: Enter the URL of the HTTP service.
    • Username: Enter the username configured for the HTTP application.
    • Password: Enter the password corresponding to the username.
  4. Review the configuration and click OK.
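
On the receiving side, the HTTP application needs to accept the pushed message and check the configured credentials. The sketch below assumes that the platform sends a POST request and authenticates with HTTP Basic authentication; neither is stated in this section, so treat both as assumptions. The credentials, class name, and function name are placeholders.

```python
import base64
import hmac
from http.server import BaseHTTPRequestHandler
from typing import Optional

EXPECTED_USER = "alarm-user"        # placeholder credentials
EXPECTED_PASSWORD = "change-me"


def check_basic_auth(header: Optional[str], username: str, password: str) -> bool:
    """Validate an HTTP Basic Authorization header value."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[6:], validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, sep, pw = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparison on bytes.
    return (hmac.compare_digest(user.encode(), username.encode())
            and hmac.compare_digest(pw.encode(), password.encode()))


class AlarmHandler(BaseHTTPRequestHandler):
    """Minimal receiver that prints each authenticated alarm message."""

    def do_POST(self):
        auth = self.headers.get("Authorization")
        if not check_basic_auth(auth, EXPECTED_USER, EXPECTED_PASSWORD):
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="alarms"')
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        print(body.decode("utf-8", errors="replace"))
        self.send_response(200)
        self.end_headers()
```

To try it locally, serve the handler with `http.server.HTTPServer(("", 8080), AlarmHandler).serve_forever()` and enter the server's URL in the Address field.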

What to do next

  • You can proceed to set up alarm message templates to ensure that alarm messages are sent out in a uniform format as specified. For more information, see Message Template.
  • You can proceed to create a new alarm to push resource alarm messages to designated endpoints. For more information, see Alarm.

Microsoft Teams

Before you begin

  • Make sure ZStack Cube Virtualization Edition has an installed Advanced Edition license, and that the license is in a valid state.
  • Add the Incoming Webhook app to Microsoft Teams in advance. After adding, obtain the Webhook URL. For more information, refer to the official Microsoft Teams documentation.

Procedure

  1. In the navigation pane, choose O&M Management > Endpoint.
  2. On the Endpoint page, click New Endpoint.
  3. In the New Endpoint dialog, set the following parameters:
    • Name: Set the name for the endpoint.
    • Description: Optionally fill in a description for the endpoint.
    • Type: Select Microsoft Teams.
    • Address: Enter the Webhook URL obtained from Microsoft Teams.
    • Message Language: Select the notification language for alarm messages, including Simplified Chinese and English.
  4. Review the configuration and click OK.
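
To confirm that the Webhook URL works before adding the endpoint, a test card can be posted to it. The sketch below builds a payload in the legacy MessageCard format accepted by Incoming Webhook connectors. The function name and the default color are hypothetical, and the payload the platform itself sends is defined by the message template, not by this example.

```python
def build_teams_card(title: str, text: str, theme_color: str = "D70000") -> dict:
    """Build a minimal MessageCard payload for a Teams Incoming Webhook."""
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "summary": title,            # shown in notifications
        "themeColor": theme_color,   # hex color without the leading '#'
        "title": title,
        "text": text,
    }
```

Posting the returned dictionary as JSON to the Webhook URL should produce a card in the target channel.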

What to do next

  • You can proceed to set up alarm message templates to ensure that alarm messages are sent out in a uniform format as specified. For more information, see Message Template.
  • You can proceed to create a new alarm to push resource alarm messages to designated endpoints. For more information, see Alarm.

SNMP Trap Receiver

Before you begin

  • Make sure ZStack Cube Virtualization Edition has an installed Advanced Edition license, and that the license is in a valid state.
  • Enable SNMP management and add an SNMP Trap receiver in advance. For more information, see Enable SNMP Management.

Procedure

  1. In the navigation pane, choose O&M Management > Endpoint.
  2. On the Endpoint page, click New Endpoint.
  3. In the New Endpoint dialog, set the following parameters:
    • Name: Set the name for the endpoint.
    • Description: Optionally fill in a description for the endpoint.
    • Type: Select SNMP Trap Receiver.
    • SNMP Trap Receiver: Choose the SNMP Trap receiver that has been added.
  4. Review the configuration and click OK.

Manage Endpoint

Modify Basic Information

If you only need to modify the name and description of an endpoint, you can do so on the Endpoint page by clicking Actions > Edit Name and Description.

If you need to modify the configuration information of an endpoint, you can do so on the Endpoint page by clicking Actions > Modify Configuration.

Enable/Disable Endpoints

You can enable or disable one or more endpoints so that alarm messages reach only the relevant personnel, who can then respond in a timely manner. On the Endpoint page, select the desired endpoints and click Enable or Disable.

Add/Remove Alarms for Endpoints

You can add or remove alarms for an endpoint so that it receives only relevant alarm messages. On the Endpoint page, select an endpoint, click Actions > Add Alarm or Actions > Remove Alarm, and then choose the alarms to add or remove.

Delete Endpoints

To delete an existing endpoint, you can select the endpoint you wish to delete on the Endpoint page, then click Actions > Delete to remove it.
Note: You cannot delete system endpoints.

Message Template

New Message Template

Message templates are text templates used by alarms to push notification messages to endpoints. You can specify a default message template for each type of endpoint, and alarm messages will be sent using the format of the default template.

Before you begin

Before creating message templates for DingTalk, Lark, WeCom, or Microsoft Teams, make sure ZStack Cube Virtualization Edition has an installed Advanced Edition license, and that the license is in a valid state.

Procedure

  1. In the navigation pane, choose O&M Management > Message Template.
  2. On the Message Template page, click New Message Template.
  3. In the New Message Template dialog, set the following parameters:
    Basic Information
    • Name: Enter the name for the message template.
    • Description: Provide a description for the message template.
    Template Information

    When Type is set to Email, DingTalk, Lark, WeCom, HTTP Application, or Microsoft Teams, configure the template as follows:

    • Type: Select the message template type, choosing from Email, DingTalk, Lark, WeCom, HTTP Application, or Microsoft Teams.
    • Alarm Type: Select resource alarms or event alarms.
    • Alarm Message Title: The title displayed in the alarm message.
    • Alarm Message Text: The text displayed in the alarm message.
    • Recovery Message Title: The title of the recovery message sent when a monitored resource's status returns to normal. This parameter is only supported for resource alarms.
    • Recovery Message Text: The text of the recovery message. This parameter is only supported for resource alarms.
    • Default Template: Unchecked by default. Checking this option sets the current template as the default template.

    When Type is set to SMS, configure the template as follows:

    • Type: Select SMS as the message template type.
    • Signature: Enter the SMS signature name applied for on the third-party platform.
    • Resource Alarm Template: Set the resource alarm message template and enter the resource alarm template CODE.
    • Event Alarm Template: Set the event alarm message template and enter the event alarm template CODE.
    • Default Template: Set this template as the default template. After setting, all SMS messages will be sent in this template format.
    Note:
    1. To create an Email or Lark message template, follow the Text syntax.
    2. To create a DingTalk or WeCom message template, follow the Markdown syntax.
    3. To create an HTTP Application message template, follow the JSON syntax.
    4. To create a Microsoft Teams message template, follow the Webhook syntax requirements listed on the Microsoft Teams official website.
    5. To create an SMS message template, apply for the SMS signature and templates from the third-party service in advance. Currently, you can use the Alibaba Cloud SMS service. Any template change requires re-applying through the third-party service.
  4. Review the configuration and click OK.
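
Because an HTTP Application message template must follow JSON syntax, it can be useful to check the template text before saving it. The following is a minimal sketch using Python's standard `json` module. The function name is hypothetical, and the check assumes the template text is itself valid JSON; if the template contains platform variables outside quoted strings, this check may reject it.

```python
import json
from typing import Tuple


def validate_json_template(text: str) -> Tuple[bool, str]:
    """Return (True, "") if text parses as JSON, else (False, error location)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return True, ""
```

A common mistake is using single quotes: `validate_json_template("{'title': 'x'}")` fails, while the double-quoted form passes.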

Manage Message Template

Modify Basic Information

If you only need to modify the name and description of a message template, you can do so on the Message Template page by clicking Actions > Edit Name and Description.

If you need to modify other configurations of the message template, including basic information and template information, you can do so on the Message Template page by clicking Actions > Modify Configuration.

Set Default Message Template / Unset Default

If you have added multiple message templates, you can specify one of them as the default. Alarm messages are then sent using the format of the default template. To set a template as the default, on the Message Template page, click Actions > Set as Default.

If you need to unset the default message template, you can do so on the Message Template page by clicking Actions > Unset Default.

Delete Message Templates

To delete one or more message templates, select them on the Message Template page and click Bulk Actions > Delete.

Alarm

Alarm Rules

Before creating a new alarm, you may want to familiarize yourself with the alarm rules. This section will demonstrate how to configure alarm rules for resource-based and event-based alarms.

Resource Alarm Rules

| Parameter | Description | Example |
|---|---|---|
| Resource Type | The type of resource monitored by the alarm. | Virtual Machine |
| Metric Item | Types and names of various monitoring metrics. | CPU Usage |
| Resource | The specific resource object monitored by the alarm. | / |
| Alarm Trigger Rule: Comparison Relation | Defines how the detected metric value compares to the threshold. Comparison relations include >, ≥, <, and ≤. | |
| Alarm Trigger Rule: Threshold | The threshold value and unit that triggers an alarm. | 85% |
| Alarm Trigger Rule: Duration | The duration for which the alarm must continuously trigger before sending an alarm message. Durations include 30 seconds, 1 minute, 5 minutes, 10 minutes, 30 minutes, 1 hour, custom. | 5 minutes |
| Alarm Interval | When an alarm is triggered, it notifies at a specific interval. Intervals include once only, every 1 hour, custom. | Every 1 hour |
| Emergency Level | Alarm message levels include Emergent, Major, Info. | Major |
| Recovery Notification | Sends a recovery notification when the monitored resource returns from an alarm state to a normal state. | / |
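
The way the comparison relation, threshold, and duration combine can be sketched as follows. This is an illustrative model only, not the platform's implementation: it assumes the rule fires once the metric has breached the threshold at every sample for at least the configured duration. The class and function names are hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import List, Tuple

COMPARATORS = {
    ">": operator.gt,
    "≥": operator.ge,
    "<": operator.lt,
    "≤": operator.le,
}


@dataclass
class TriggerRule:
    comparison: str    # one of ">", "≥", "<", "≤"
    threshold: float   # for example 85.0 for 85%
    duration_s: int    # for example 300 for 5 minutes


def should_fire(rule: TriggerRule, samples: List[Tuple[int, float]]) -> bool:
    """samples: (timestamp in seconds, metric value), sorted by timestamp."""
    compare = COMPARATORS[rule.comparison]
    run_start = None
    for ts, value in samples:
        if compare(value, rule.threshold):
            if run_start is None:
                run_start = ts            # breach run begins
            if ts - run_start >= rule.duration_s:
                return True               # breached continuously long enough
        else:
            run_start = None              # any normal sample resets the run
    return False
```

With the example values from the table (CPU Usage > 85% for 5 minutes), a single sample that drops back to or below 85% restarts the 5-minute window.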

Event Alarm Rules

| Parameter | Description | Example |
|---|---|---|
| Resource Type | The type of resource monitored by the alarm. | Data Storage |
| Metric Item | Name of the monitored event or metric. | Data Storage Disconnected |
| Emergency Level | Alarm message levels include Emergent, Major, Info. | Emergent |

New Resource Alarm

Resource alarms are used to monitor time-series data of resources within the platform. For example, you can set an alarm for the CPU usage of a virtual machine. If the CPU usage exceeds 80% and persists for 5 minutes, it will trigger a system alarm.

Before you begin

  • ZStack Cube Virtualization Edition provides system parameter functionality to globally control the default behavior of platform settings. You can customize parameters related to alarms within the system parameters. For more information, see System Parameters.
  • Some alarm items require VMTools for monitoring and alerting. For more information about VMTools, see Virtual Machine VMTools.

Procedure

  1. In the navigation pane, choose O&M Management > Alarm > Resource Alarm.
  2. On the Resource Alarm page, click New Resource Alarm.
  3. In the New Resource Alarm dialog, set the following parameters:
    Basic Information:
    • Name: Enter the name for the resource alarm.
    • Description: Provide a description for the resource alarm.
    Configuration Information:
    • Resource Type: Select the type of resource.
    • Metric Item: Choose alarm items based on the selected resource type. Some alarm items require selecting a resource and setting corresponding alarm trigger rules.
    • Alarm Interval: Choose the alarm interval type, including one-time and repeated alarms.
    • Emergency Level: Different levels of alarms will issue messages according to their severity, including Emergent, Major, and Info.
    • Alarm Recovery Notification: Disabled by default. If enabled, you will receive a recovery notification when any monitored resource returns to a normal state from an alarm state.

      Recovery notifications are sent using the default message template, but you can also customize the message template. For more information, see New Message Template.

    • Endpoint: After the alarm is triggered, the alarm message will be pushed to the specified endpoint.

      The system provides the default endpoint. You can also create custom endpoints. For more information, see New Endpoint.

  4. Review the configuration and click OK.
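
The difference between a one-time and a repeated Alarm Interval can be modeled with a small decision function. This is a sketch of the behavior described above under the assumption that a repeated alarm re-notifies once the interval has elapsed since the last notification; the function name and parameters are hypothetical.

```python
from typing import Optional


def should_notify(now_s: float, last_sent_s: Optional[float],
                  interval_s: Optional[float]) -> bool:
    """Decide whether a still-triggered alarm should send another message.

    last_sent_s: time of the previous notification, or None if none was sent.
    interval_s: repeat interval in seconds, or None for a one-time alarm.
    """
    if last_sent_s is None:
        return True                 # the first notification always goes out
    if interval_s is None:
        return False                # one-time alarm: never repeat
    return now_s - last_sent_s >= interval_s
```

For an alarm set to repeat every 1 hour, `should_notify(7200, 3600, 3600)` returns True, while a one-time alarm stays silent after its first message.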

New Event Alarm

Event alarms are used to monitor predefined events within the platform. For example, a host disconnection event alarm will trigger a system alarm when the host becomes unreachable.

Before you begin

  • ZStack Cube Virtualization Edition provides system parameter functionality to globally control the default behavior of platform settings. You can customize parameters related to alarms within the system parameters. For more information, see System Parameters.
  • Some alarm items require VMTools for monitoring and alerting. For more information about VMTools, see Virtual Machine VMTools.

Procedure

  1. In the navigation pane, choose O&M Management > Alarm > Event Alarm.
  2. On the Event Alarm page, click New Event Alarm.
  3. In the New Event Alarm dialog, set the following parameters:
    • Resource Type: Select the type of resource.
    • Metric Item: Choose alarm items based on the selected resource type.
    • Emergency Level: Different levels of alarms will issue messages according to their severity, including Emergent, Major, and Info.
    • Endpoint: After the alarm is triggered, the alarm message will be pushed to the specified endpoint.

      The system provides the default endpoint. You can also create custom endpoints. For more information, see New Endpoint.

  4. Review the configuration and click OK.

Manage Alarm

Modify Basic Information

If you only need to modify the name and description of an alarm, you can do so on the Alarm page by clicking Actions > Edit Name and Description.

If you need to modify the basic information and configuration of a resource alarm, or the configuration information of an event alarm, you can do so on the Alarm page. Select the target resource alarm or event alarm, then click Actions > Modify Configuration to make changes.

Enable/Disable Alarms

To enable or disable one or more alarms, you can do so on the Alarm page by selecting the alarms and then clicking Enable or Disable.

Add/Remove Endpoints for Alarms

You can add or remove endpoints for an alarm so that each endpoint receives only relevant alarm messages. On the Alarm page, click Actions > Add Endpoint or Actions > Remove Endpoint, and then select the endpoints to add or remove.

Delete Alarms

To delete one or more alarms, you can do so on the Alarm page by selecting the alarms and then clicking Delete.
Note: You cannot delete default alarms.

Alarm Message

View Alarm Messages

About this task

Triggered alarms are visible in several locations throughout the platform.
  • View from the alarm message page: The alarm messages page presents the overall platform alarm information on a single dashboard. You can compare and view alarm messages based on different dimensions, helping you intuitively and comprehensively understand the platform's resource status and identify potential issues and bottlenecks.
  • View from the resource's monitoring tab: You can focus on a specific resource to view its alarm messages, gaining a more detailed understanding of the alarm conditions for that resource. This allows for more targeted optimization and adjustments.
  • View from the bottom task and alarm pane: You can check alarm messages from the pane at the bottom of the platform. The pane displays up to 50 recent alarm messages, and you can also click More to navigate to the alarm messages page.

Procedure

  1. In the navigation pane, choose O&M Management > Alarm Message.
  2. View the triggered alarm messages.
    The alarm message page consists of weekly alarm statistics, weekly alarm distribution, and an alarm message list.
    • Alarm Statistics in Recent 1 Week: Displays alarm statistics for the past 7 days in a bar chart format, with a sampling interval of 8 hours. Hover over the bar chart to view the number of alarms at different levels.
    • Alarm Distribution in Recent 1 Week: Shows the percentage of resource alarms within the last 7 days using a bar chart. Hover over the bar chart to view the number of alarms for different types of resources.
    • Alarm Message List: Displays up to 1,000 alarm messages in a list format. You can filter the display by resource type and time.
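
The weekly statistics chart groups alarms from the past 7 days into 8-hour sampling intervals, which yields 21 bars. The sketch below reproduces that grouping for a list of alarm records. It is an illustration of the bucketing arithmetic only; the function name and the record shape (timestamp, level) are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def bucket_alarms(alarms: Iterable[Tuple[datetime, str]], end: datetime,
                  days: int = 7, bucket_hours: int = 8) -> List[Counter]:
    """Count alarms per level in fixed-width buckets, oldest bucket first."""
    bucket = timedelta(hours=bucket_hours)
    count = days * 24 // bucket_hours          # 21 buckets for 7 days at 8 hours
    start = end - count * bucket
    buckets = [Counter() for _ in range(count)]
    for ts, level in alarms:
        if start <= ts < end:                  # ignore alarms outside the window
            buckets[(ts - start) // bucket][level] += 1
    return buckets
```

Each returned `Counter` corresponds to one bar, and its per-level counts correspond to the values shown when hovering over that bar.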

Acknowledge Alarm Messages

Acknowledging an alarm lets other users know that you are taking ownership of the issue. Administrators acknowledge alarm messages to make it easier for O&M personnel to identify and respond to them promptly, ensuring no critical alarm information is missed.

Procedure

  1. In the navigation pane, choose O&M Management > Alarm Message.
  2. In the Triggered Alarms list, select an alarm message and click Acknowledge.
    1. Acknowledged alarms will not be displayed in the triggered alarms tab but can be viewed in the all alarms tab.
    2. After acknowledging, if the alarm issue is not resolved promptly, the alarm system will continue to trigger and push messages according to the rules. To avoid repeated notifications, you can set a silence period as needed.

Set Silence Period for Alarm Messages

If you need to temporarily suspend the push of a specific alarm message for a certain period, you can set a silence period for it. During the silence period, the alarm message will not be pushed. Once the silence period ends, if the alarm remains triggered, the alarm message will be pushed again.

Procedure

  1. In the navigation pane, choose O&M Management > Alarm Message.
  2. In the Triggered Alarms or All Alarms list, select an alarm message and click Actions > Set Silence Period.
  3. In the Set Silence Period dialog, choose a silence period.
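
The effect of a silence period on message pushing can be summarized in a short decision function. This sketch only models the behavior described above (no push during the silence period, push again afterwards if the alarm is still triggered); the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_push(now: datetime, alarm_triggered: bool,
                silence_start: Optional[datetime] = None,
                silence_length: Optional[timedelta] = None) -> bool:
    """Return True if an alarm message should be pushed at time now."""
    if not alarm_triggered:
        return False
    if silence_start is not None and silence_length is not None:
        if silence_start <= now < silence_start + silence_length:
            return False            # inside the silence period: suppress
    return True                     # no silence period, or it has ended
```

Restoring an alarm manually corresponds to clearing `silence_start` and `silence_length`, after which a still-triggered alarm is pushed again.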

Restore Alarms

To resume pushing alarm messages before the silence period ends, you can manually restore the alarm.

Procedure

  1. In the navigation pane, choose O&M Management > Alarm Message.
  2. In the Triggered Alarms or All Alarms list, select an alarm message and click Restore Alarm.