Configure Alerts

Overview

Note

Because alerting depends on cluster monitoring, configure monitoring before configuring alerts in Grafana.

Working alongside cluster monitoring, this feature alerts you to changes in the state of a cluster so that you can proactively remediate issues that could adversely affect both the cluster and the applications that depend on it.

Pre-configured alerts are included that can be configured for, and triggered by, individual clusters.

These pre-configured alerts, most of which can be customized, have the following components:

  • A trigger condition (or alert rule) refers to what is being monitored.

  • Each alert has configurable parameters that, when taken together, constitute a threshold.

  • When an alert rule's threshold is reached, an alert is sent to the list of configured contact points.

Prerequisites

  • A SingleStore 7.3 or later Metrics cluster configured to monitor one or more Source clusters via Configure Monitoring.

  • A Grafana instance with the monitoring data source configured via Connect Data Visualization Tools.

  • Grafana version 9.5.6 or later. This is a later version than the one required for monitoring; to update, install version 9.5.6 or later and restart the Grafana server (see the version check below).
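To confirm that the installed Grafana version meets this requirement, you can check the server binary or the health endpoint. This is a quick, optional check that assumes Grafana is running on the default port (3000):

    grafana-server -v
    curl -s http://localhost:3000/api/health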

Configure Contact Points

Contact points define the connections to the external services that deliver alerts, such as email, PagerDuty, and Slack. This section describes how to configure these three contact points. Grafana supports a number of other integrations; refer to Configure contact points for additional information.

Email

Configure the SMTP Server (Optional)

Note

If you already have an SMTP server configured, you can skip this example.

While the Postfix SMTP server is used in this example, email alerts can be configured with any SMTP server.

Red Hat Distribution
  1. Install Postfix.

    sudo yum install postfix
  2. Start Postfix.

    sudo systemctl start postfix
  3. Enable the Postfix service so that it restarts when the host is rebooted.

    sudo systemctl enable postfix
  4. After installation, run the postconf command to review the Postfix configuration. Confirm that a 127.0.0.1 example.com entry exists in the /etc/hosts file (see the example commands that follow these steps).
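For example, the following commands offer a quick way to confirm that Postfix is running and that the /etc/hosts entry is present. Adjust the hostname if you used a domain other than example.com:

    sudo systemctl status postfix
    postconf mail_version
    grep "example.com" /etc/hosts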

Debian Distribution
  1. Install Postfix.

    sudo apt-get install postfix
  2. During installation, you will be prompted to select an SMTP server type. Select the “Internet Site” option, which allows emails to be sent outside the server.

  3. For testing purposes and/or a local-only configuration, use the suggested name (example.com) when prompted for a fully qualified domain name (FQDN), and click Ok.

  4. After installation, run the postconf command to review the Postfix configuration. Confirm that a 127.0.0.1 example.com entry exists in the /etc/hosts file. You can then send a test message, as shown below.
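With Postfix running, you can optionally send a test message from the command line before configuring Grafana. The recipient address below is a placeholder, and the log path assumes a default Debian installation:

    echo "Postfix test message" | sendmail -v recipient@example.com
    tail -n 20 /var/log/mail.log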

Configure Email Alerts

Before continuing, ensure that your SMTP server is configured to send outbound messages to your desired recipients.

  1. Update the [smtp] section of the /etc/grafana/grafana.ini Grafana configuration file with the following values (an example section is shown after these steps):

    1. enabled = true

    2. host = <SMTP server hostname>:<port>

    3. user = The username that Grafana uses to authenticate with the SMTP server

    4. password = That user's password

    5. from_address = The email address from which alert emails are sent

    6. from_name = The name of the sender

  2. Restart the Grafana server.

    sudo systemctl restart grafana-server
  3. In Grafana, add a new contact point and specify:

    1. Integration: Email

    2. Addresses: A list of email addresses to send the alerts to

    3. (Optional) Configure additional settings under Optional Email settings.
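For reference, a completed [smtp] section from step 1 might look like the following. The hostname, port, and credentials are placeholders; replace them with the values for your own SMTP server:

    [smtp]
    enabled = true
    host = smtp.example.com:587
    user = alerts@example.com
    password = <password for alerts@example.com>
    from_address = alerts@example.com
    from_name = Grafana Alerts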

PagerDuty

  1. In PagerDuty, navigate to Services and either create a new service or modify an existing one. Add Events API v2 as an integration when creating or modifying the service.

  2. View the Events API v2 integration and copy the Integration Key (for example, 1v970cee7a8c4c01d0e407b31e098a59).

  3. In Grafana, add a new contact point and specify:

    1. Integration: PagerDuty

    2. Integration Key: Paste the integration key from the previous step

    3. (Optional) Configure additional settings under Optional PagerDuty settings, such as severity, class, etc.
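Optionally, you can confirm that the integration key is valid before relying on Grafana by sending a test event directly to the PagerDuty Events API v2. The routing_key below is a placeholder for your integration key:

    curl -X POST https://events.pagerduty.com/v2/enqueue \
      -H "Content-Type: application/json" \
      -d '{
            "routing_key": "<integration key>",
            "event_action": "trigger",
            "payload": {
              "summary": "Test alert from Grafana alerting setup",
              "source": "grafana-host",
              "severity": "info"
            }
          }'

A successful request returns a "success" status and opens a test incident on the corresponding PagerDuty service, which you can then resolve.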

Slack

To message a specific Slack workspace channel when an alert is triggered:

  1. Go to Slack API: Applications and create a new app.

    1. Select From Scratch.

    2. Specify the workspace to message.

  2. Select Incoming Webhooks and enable Activate Incoming Webhooks.

  3. Click Add new Webhook to Workspace and select a channel to message. Create the Slack channel if it does not exist. Copy the URL of the created Webhook.

  4. In Grafana, add a new contact point and specify:

    1. Integration: Slack

    2. Webhook URL: Paste the URL of the Webhook from the previous step

    3. (Optional) Configure additional settings under Optional Slack settings, such as username, mention users, etc.

  5. Test the integration setup and click the Save contact point button.
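You can also verify the webhook outside of Grafana by posting a test message to it directly. Replace the URL below with the webhook URL you copied:

    curl -X POST -H "Content-Type: application/json" \
      -d '{"text": "Test alert notification"}' \
      https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX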

Configure Alert Rules

The following rules are supported by default.

Each rule is listed with its trigger condition and default thresholds, what it measures, and what it could mean.

CPU Utilization

  • Condition and default threshold: Core CPU utilization is greater than 90% for 5 minutes
  • What it measures: How long at least one CPU core has been running over the specified maximum threshold
  • What it could mean: A workload is beginning to slow due to a lack of available CPU cycles

Memory Utilization

  • Condition and default threshold: Memory utilization is greater than 90% for 5 minutes
  • What it measures: How long the cluster's memory has been allocated over the specified maximum threshold
  • What it could mean: The cluster is using more memory than expected

Persistent Cache (Disk) Utilization

  • Condition and default threshold: Persistent cache utilization is greater than 90% for 5 minutes
  • What it measures: How long the disk has been allocated over the specified threshold
  • What it could mean: The cluster is using more disk space than expected and warrants further investigation

Partition Availability

  • Condition and default threshold: Partitions are unavailable for 10 minutes
  • What it measures: Whether both primary and secondary partitions are available
  • What it could mean: There is an outage impacting your cluster, as both primary and secondary partitions are offline

Import Alert Rules from SingleStore

  1. Download and unzip the alert_rules.zip file.

    Note: Install jq, a command-line JSON processor, before proceeding to the next step.

  2. Run the following on the Linux command line to import the alert rules into Grafana.

    GRAFANA_HOST=<Hostname of the Grafana instance (default=localhost)> \
    GRAFANA_PORT=<Port of the Grafana instance (default=3000)> \
    GRAFANA_USER=<Grafana user (default=admin)> \
    GRAFANA_PASSWORD=<Grafana user's password (default=admin)> \
    ./provision_alert_rules.sh
  3. View the alert rules in Grafana.

  4. (Optional) To customize the imported alert rules, click the edit button and:

    1. Modify the threshold value to change when the alert is triggered.

    2. Edit the evaluation period (the for field) to update the length of time that the alert condition must hold continuously before a notification is sent. For example, if this parameter is set to five minutes, a notification is not sent when the alert is in a "Firing" state for only two minutes.

    3. Add custom labels and other metadata.

Note: A minimum version of SingleStore may be required on the Source cluster for some alerts to function. The required SingleStore version is listed in the details of the alert rule. If this requirement is not met, the alert may reflect a status of "No data".
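If you prefer to verify the import from the command line rather than the Grafana UI, the Grafana Alerting provisioning API can list the alert rules by title. This optional check assumes the same host, port, and credentials used with the import script:

    curl -s -u admin:admin http://localhost:3000/api/v1/provisioning/alert-rules | jq '.[].title'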

Create Custom Alert Rules

You may create custom rules by following the Create Grafana-managed alert rules instructions and specifying the monitoring data source.

You may also duplicate one of the imported alerts and modify the query and/or any of the other parameters.

Configure Notification Policies

Notification policies control which alert rules are connected to which contact points as well as other alerting parameters. Notification policies use labels to match alert rules.

To connect all alert rules to a single contact point:

  1. Modify the default policy and set the default contact point to the desired contact point.

  2. To receive alert instances grouped by Source cluster, add "cluster" to the Group By list: select the Group By field, type the word cluster, and press Enter.

  3. (Optional) Modify the Timing options to change how repetitive alerts are handled.

To set a specific subset of alerts to a different contact point:

  1. Click New nested policy and set:

    1. label: <label>, value: <value>

      Ensure that all required alerts have this label and value.

      The imported alert rules have the label type set to the value system.

    2. Contact point: The desired contact point.

    3. (Optional) Select Override general timings to change how repetitive alerts are handled for the subset.
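To review the resulting routing tree without clicking through the UI, you can optionally query the notification policy tree from the Grafana Alerting provisioning API (default host, port, and credentials assumed):

    curl -s -u admin:admin http://localhost:3000/api/v1/provisioning/policies | jq .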

Last modified: July 22, 2024
