Configure Alerts
Overview
Note
Because alerts depend on cluster monitoring, configure monitoring before configuring alerts in Grafana.
Working alongside cluster monitoring, this feature proactively alerts you to the changing state of a cluster.
Pre-configured alerts are included; they can be configured and triggered per cluster.
These pre-configured alerts, most of which can be customized, have the following components:
- A trigger condition (or alert rule) defines what is being monitored.
- Each alert has configurable parameters that, taken together, constitute a threshold.
- When an alert rule's threshold is reached, an alert is sent to the list of configured contact points.
Prerequisites
- A SingleStore 7.3 or later Metrics cluster configured to monitor one or more Source clusters via Configure Monitoring.
- A Grafana instance with the monitoring data source configured via Connect Data Visualization Tools.
- Grafana version 9.5.6 or later. Note that this is a later version than the one required for monitoring. Grafana can be updated by installing this version or a later version and restarting the Grafana server.
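The version requirement above can be checked with a small comparison helper. This is a minimal sketch, assuming GNU `sort -V` is available; `version_at_least` is a hypothetical helper name, and the version string passed to it is hard-coded here, so substitute the version your Grafana instance reports.

```shell
#!/usr/bin/env bash
# Sketch: compare a Grafana version string against the 9.5.6 minimum.
# Assumes GNU coreutils `sort -V` (version sort) is available.

version_at_least() {
  # Succeeds when version $1 is greater than or equal to version $2.
  local have="$1" need="$2"
  # The smaller of the two versions sorts first; it must be the minimum.
  [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)" = "$need" ]
}

# Example with a hard-coded version string (an assumption for illustration).
if version_at_least "9.5.6" "9.5.6"; then
  echo "Grafana version is sufficient for alerting"
else
  echo "Grafana must be upgraded to 9.5.6 or later"
fi
```

Version sort is used instead of a plain string comparison so that a release such as 9.5.10 is treated as newer than 9.5.6.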
Configure Contact Points
Contact points define the connections to the external services that deliver alerts, such as email, PagerDuty, Slack, and more.
Configure the SMTP Server (Optional)
Note
If you already have an SMTP server configured, you can skip this example.
While the Postfix SMTP server is used in this example, email alerts can be configured with any SMTP server.
Red Hat Distribution
- Install Postfix.
    sudo yum install postfix
- Start Postfix.
    systemctl start postfix
- Enable the Postfix service to ensure that it will restart when the host is rebooted.
    systemctl enable postfix
- After installation, run the postconf command to see the Postfix configuration. Confirm that the 127.0.0.1 example.com entry exists in the /etc/hosts file.
Debian Distribution
- Install Postfix.
    sudo apt-get install postfix
- During installation, you will be prompted to select an SMTP server type. Select the “Internet Site” option, which allows emails to be sent outside the server.
- For testing purposes and/or a local-only configuration, use the suggested name (example.com) when prompted for a fully qualified domain name (FQDN), and click Ok.
- After installation, run the postconf command to see the Postfix configuration. Confirm that the 127.0.0.1 example.com entry exists in the /etc/hosts file.
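On either distribution, the /etc/hosts check in the last step can be scripted. The following is a sketch only: `hosts_has_entry` is a hypothetical helper, and it assumes the usual hosts-file layout of an IP address followed by one or more names per line.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a hosts file maps an IP address to a given name.

hosts_has_entry() {
  # $1 = hosts file, $2 = IP address, $3 = hostname to look for
  awk -v ip="$2" -v name="$3" '
    /^[[:space:]]*#/ { next }                 # skip comment lines
    $1 == ip {
      for (i = 2; i <= NF; i++) if ($i == name) found = 1
    }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

if hosts_has_entry /etc/hosts 127.0.0.1 example.com; then
  echo "127.0.0.1 example.com is present in /etc/hosts"
else
  echo "127.0.0.1 example.com is missing from /etc/hosts"
fi
```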
Configure Email Alerts
Before continuing, ensure that your SMTP server is configured to send outbound messages to your desired recipients.
- Update the [smtp] section of the /etc/grafana/grafana.ini Grafana configuration file with the following values:
    - enabled = true
    - host = <SMTP server hostname>:<port>
    - user = The user that the SMTP server uses to authenticate the client that is sending the email
    - password = The above user’s password
    - from_address = The email address from which the email is sent
    - from_name = The name of the sender
- Restart the Grafana server.
    sudo systemctl restart grafana-server
- In Grafana, add a new contact point and specify:
    - Integration: Email
    - Addresses: A list of addresses to send the alerts to
    - (Optional) Configure additional settings under Optional Email settings.
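To make the [smtp] settings concrete, the sketch below prints an example section to standard output. `render_smtp_section` is a hypothetical helper, and the host, port, user, and addresses are placeholder assumptions, not values from this guide; the password is left as a placeholder on purpose.

```shell
#!/usr/bin/env bash
# Sketch: print an example [smtp] section for grafana.ini.
# All argument values used below are placeholders for illustration.

render_smtp_section() {
  # $1 = SMTP host, $2 = port, $3 = user, $4 = from address, $5 = from name
  cat <<EOF
[smtp]
enabled = true
host = $1:$2
user = $3
password = <password>
from_address = $4
from_name = $5
EOF
}

render_smtp_section "localhost" "25" "grafana" "alerts@example.com" "Grafana Alerts"
```

Review the printed section and merge it into /etc/grafana/grafana.ini by hand rather than appending it blindly, since the file may already contain an [smtp] section.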
PagerDuty
- In PagerDuty, navigate to Services and either create a new service or modify an existing one. Add Events API v2 as an integration when creating or modifying the service.
- View the Events API v2 integration and copy the Integration Key (for example, 1v970cee7a8c4c01d0e407b31e098a59).
- In Grafana, add a new contact point and specify:
    - Integration: PagerDuty
    - Integration Key: Paste the integration key from the previous step
    - (Optional) Configure additional settings under Optional PagerDuty settings, such as severity and class.
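Before wiring the key into Grafana, it can help to confirm that the integration key works by sending a test event directly to the PagerDuty Events API v2. This is a hedged sketch: the helper names are hypothetical, the JSON is built with `printf` and therefore assumes the arguments contain no quotes or backslashes, and the send function is defined but not called here.

```shell
#!/usr/bin/env bash
# Sketch: build (and optionally send) a PagerDuty Events API v2 test event.

pagerduty_test_event() {
  # $1 = integration key, $2 = summary, $3 = source
  # Assumes the arguments contain no characters that need JSON escaping.
  printf '{"routing_key":"%s","event_action":"trigger","payload":{"summary":"%s","source":"%s","severity":"info"}}\n' \
    "$1" "$2" "$3"
}

send_pagerduty_test_event() {
  # Posts the event; requires curl and outbound network access.
  pagerduty_test_event "$@" |
    curl --silent --show-error --fail \
      --header 'Content-Type: application/json' \
      --data @- \
      https://events.pagerduty.com/v2/enqueue
}

# Print the payload only; <integration-key> is a placeholder.
pagerduty_test_event "<integration-key>" "Grafana alert test" "grafana"
```

Calling `send_pagerduty_test_event` with a real integration key in place of the placeholder should open a low-severity incident on the service, which can then be resolved manually.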
Slack
To message a specific Slack workspace channel when an alert is triggered:
- Go to Slack API: Applications and create a new app.
    - Select From Scratch.
    - Specify the workspace to message.
- Select Incoming Webhooks and enable Activate Incoming Webhooks.
- Click Add new Webhook to Workspace and select a channel to message. Create the Slack channel if it does not exist. Copy the URL of the created Webhook.
- In Grafana, add a new contact point and specify:
    - Integration: Slack
    - Webhook URL: Paste the URL of the Webhook from the previous step
    - (Optional) Configure additional settings under Optional Slack settings, such as username and mention users.
- Test the integration setup and click the Save contact point button.
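The webhook can also be exercised outside Grafana by posting a plain text message to it. The sketch below assumes the webhook URL is supplied in a `SLACK_WEBHOOK_URL` environment variable (a name chosen here for illustration); the helper names are hypothetical, the message is assumed to contain no characters that need JSON escaping, and the send function is defined but not called.

```shell
#!/usr/bin/env bash
# Sketch: build (and optionally send) a Slack incoming-webhook test message.

slack_test_message() {
  # $1 = message text; assumed to need no JSON escaping.
  printf '{"text":"%s"}\n' "$1"
}

send_slack_test_message() {
  # Posts the message to the URL in SLACK_WEBHOOK_URL (an assumed variable).
  slack_test_message "$1" |
    curl --silent --show-error --fail \
      --header 'Content-Type: application/json' \
      --data @- \
      "$SLACK_WEBHOOK_URL"
}

# Print the payload only.
slack_test_message "Grafana alert test"
```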
Configure Alert Rules
The following rules are supported by default.
| Trigger Condition | Condition | Threshold | What it measures | What it could mean |
|---|---|---|---|---|
| CPU Utilization | Core CPU utilization is greater than | 90% | How long at least one CPU core has been running over the specified maximum threshold | A workload is beginning to slow due to lack of available CPU cycles |
| | For this number of minutes | 5 | | |
| Memory Utilization | Memory utilization is greater than | 90% | How long the cluster's memory has been allocated over the specified maximum threshold | The cluster is using more memory than expected |
| | For this number of minutes | 5 | | |
| Persistent Cache (Disk) Utilization | Persistent cache is greater than | 90% | How long the disk has been allocated over the specified threshold | The cluster is using more disk space than expected and warrants further investigation |
| | For this number of minutes | 5 | | |
| Partition Availability | Partitions are unavailable for this number of minutes | 10 | Whether both primary and secondary partitions are available | There is an outage impacting your cluster as both primary and secondary partitions are offline |
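Each utilization rule combines a threshold with a duration. The sketch below models that combination in a simplified way: it assumes one integer utilization sample per minute and reports an alert once the samples have stayed above the threshold for the required number of consecutive minutes. `alert_fires` is a hypothetical helper and not how Grafana evaluates rules internally.

```shell
#!/usr/bin/env bash
# Sketch: simplified "greater than X% for N minutes" evaluation.
# Assumes one integer sample per minute.

alert_fires() {
  # $1 = threshold (percent), $2 = required consecutive minutes,
  # remaining arguments = per-minute utilization samples.
  local threshold="$1" required="$2" streak=0 sample
  shift 2
  for sample in "$@"; do
    if [ "$sample" -gt "$threshold" ]; then
      streak=$((streak + 1))
      if [ "$streak" -ge "$required" ]; then
        return 0
      fi
    else
      streak=0   # any dip to or below the threshold resets the count
    fi
  done
  return 1
}

# Five consecutive minutes above 90%: the alert fires.
if alert_fires 90 5 92 95 97 93 91; then echo "alert fires"; else echo "no alert"; fi

# A dip to 80% in the third minute resets the count: no alert.
if alert_fires 90 5 92 95 80 93 91; then echo "alert fires"; else echo "no alert"; fi
```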
Import Alert Rules from SingleStore
- Download and unzip the alert_rules.zip file.
  Note: Install jq, a command-line JSON processor, before proceeding to the next step.
- Run the following on the Linux command line to import the alert rules into Grafana.
    GRAFANA_HOST=<Hostname of the Grafana instance (default=localhost)> \
    GRAFANA_PORT=<Port of the Grafana instance (default=3000)> \
    GRAFANA_USER=<Grafana user (default=admin)> \
    GRAFANA_PASSWORD=<Grafana user's password (default=admin)> \
    ./provision_alert_rules.sh
- View the alert rules in Grafana.
- (Optional) To customize the imported alert rules, click the edit button and:
    - Modify the threshold value to change when the alert is triggered.
    - Edit the "for" field of the evaluation period to change how long the alert condition must hold before a notification is sent. The condition must hold continuously for the specified period. For example, a notification will not be sent if the condition has held for two minutes and this parameter is set to five minutes.
    - Add custom labels and other metadata.
Note: A minimum version of SingleStore may be required on the Source cluster for some alerts to function.
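A short preflight check can catch common problems before the import script is run. This is a sketch under stated assumptions: the helper names are hypothetical, the `http` scheme is assumed, the defaults mirror the ones listed in the command above, and the reachability check uses Grafana's `/api/health` endpoint. Only `grafana_url` is called here; the other functions are defined for use on the host where the script was unzipped.

```shell
#!/usr/bin/env bash
# Sketch: preflight checks before running ./provision_alert_rules.sh.

grafana_url() {
  # Applies the documented defaults when the variables are unset or empty.
  # The http scheme is an assumption.
  printf 'http://%s:%s\n' "${GRAFANA_HOST:-localhost}" "${GRAFANA_PORT:-3000}"
}

preflight() {
  # jq is required by the import step.
  command -v jq >/dev/null 2>&1 || { echo "jq is not installed" >&2; return 1; }
  # The script is expected in the current directory after unzipping.
  [ -x ./provision_alert_rules.sh ] || {
    echo "provision_alert_rules.sh not found or not executable" >&2
    return 1
  }
}

grafana_reachable() {
  # Succeeds when the Grafana health endpoint answers.
  curl --silent --fail "$(grafana_url)/api/health" >/dev/null
}

echo "Grafana URL: $(grafana_url)"
```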
Create Custom Alert Rules
You may create custom rules by following the Create Grafana-managed alert rules instructions and specifying the monitoring data source.
You may also duplicate one of the imported alerts and modify the query and/or any of the other parameters.
Configure Notification Policies
Notification policies control which alert rules are connected to which contact points as well as other alerting parameters.
To connect all alert rules to a single contact point:
- Modify the default policy and set the default contact point to the desired contact point.
- To receive alert instances by Source cluster, add “cluster” to the Group By list: select the Group By field, type cluster, and press Enter.
- (Optional) Modify the Timing options to change how repetitive alerts are handled.
To route a specific subset of alerts to a different contact point:
- Click New nested policy and set:
    - label: <label>, value: <value>. Ensure that all required alerts have this label and value. The imported alert rules have the type:system label-value set.
    - Contact point: The desired contact point.
    - (Optional) Select Override general timings to change how repetitive alerts are handled for the subset.
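The routing behavior described above can be pictured with a simplified model: an alert whose labels match the nested policy goes to that policy's contact point, and everything else falls through to the default. The sketch handles a single nested policy with a single label matcher; `route_alert` and the contact point names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: simplified notification-policy routing with one nested policy.

route_alert() {
  # $1 = nested policy matcher as "label=value"
  # $2 = nested policy contact point
  # $3 = default contact point
  # remaining arguments = the alert's labels as "label=value"
  local matcher="$1" nested="$2" default="$3" label
  shift 3
  for label in "$@"; do
    if [ "$label" = "$matcher" ]; then
      printf '%s\n' "$nested"
      return 0
    fi
  done
  printf '%s\n' "$default"
}

# An imported rule carries type=system and matches the nested policy.
route_alert "type=system" "pagerduty-oncall" "email-team" "type=system" "cluster=prod-1"

# A custom rule without that label falls through to the default policy.
route_alert "type=system" "pagerduty-oncall" "email-team" "type=custom" "cluster=prod-1"
```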
Last modified: July 22, 2024