Alerts
Note
SingleStoreDB Cloud alerts is a preview feature.
Working alongside historical monitoring, this feature alerts you to changes in the state of a workspace, so that you can remediate issues before they adversely affect the workspace and the applications that depend on it.
Alerts can be enabled or disabled, and are disabled by default.
Pre-configured alerts are included and can be applied at either of the following levels:
Individual workspace(s), where alerts are triggered for the specified workspace(s)
Workspace group, where alerts are triggered for all workspaces in the specified workspace group
These pre-configured alerts, most of which can be customized, have the following components:
A trigger condition (or alert type) refers to what’s being monitored.
Each trigger condition has a single condition that can have up to three severity levels enabled: Critical, Warn(ing), and Info.
Each severity level has configurable parameters that, when taken together, constitute a threshold.
When a severity level's threshold is reached, an email alert is sent to the list of subscribers.
The one exception is the Availability alert, which has a preset threshold that cannot be customized. As SingleStore continuously monitors the SingleStoreDB Cloud infrastructure, availability alerts are first routed to, and evaluated by, SingleStore. If the issue cannot be readily resolved, an Availability alert is sent to the customer. While this alert may arrive outside of the expected notification window, please know that SingleStore is already working to resolve the issue.
The following table summarizes the available pre-configured alerts, what they measure, what an alert could mean, and the preset thresholds.
Trigger Condition | Condition | Critical | Warn | Info | What it measures | What it could mean |
CPU Utilization | Core CPU utilization is greater than | 95% | 90% | 80% | How long at least one CPU core has been running over the specified maximum threshold | A workload is beginning to slow due to lack of available CPU cycles |
CPU Utilization | For this number of minutes | 5 | 5 | 10 | | |
Memory Utilization | Memory utilization is greater than | 95% | 90% | 80% | How long the workspace’s memory has been allocated over the specified maximum threshold | The workspace is using more memory than expected |
Memory Utilization | For this number of minutes | 5 | 5 | 10 | | |
Persistent Cache (Disk) Utilization | Persistent cache is greater than | 95% | 90% | 80% | How long the disk has been allocated over the specified threshold | The workspace is using more disk space than expected and warrants further investigation |
Persistent Cache (Disk) Utilization | For this number of minutes | 5 | 5 | 10 | | |
Availability | Infrastructure is unavailable | N/A | N/A | N/A | Whether a workspace is available | A workspace may be either offline, or online but unreachable |
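To illustrate how a utilization value and a duration combine into a threshold, the following Python sketch evaluates a series of readings against the preset values in the table. It is not SingleStore's implementation: the function name `evaluate_severity`, the one-reading-per-minute sampling, and the rule that every reading in the window must exceed the percentage are all assumptions made for this example.

```python
from typing import Optional, Sequence

# Preset thresholds from the table: (severity, percent, minutes).
# Ordered from most to least severe so that the first match wins.
PRESET_THRESHOLDS = [
    ("Critical", 95.0, 5),
    ("Warn", 90.0, 5),
    ("Info", 80.0, 10),
]


def evaluate_severity(
    samples: Sequence[float],
    thresholds=PRESET_THRESHOLDS,
) -> Optional[str]:
    """Return the most severe level whose threshold is met, or None.

    Assumption: `samples` holds one utilization reading (in percent)
    per minute, oldest first.
    """
    for severity, percent, minutes in thresholds:
        window = samples[-minutes:]
        # The threshold is met only if there are enough readings and
        # every one of them is strictly greater than the percentage.
        if len(window) == minutes and all(s > percent for s in window):
            return severity
    return None
```

For example, `evaluate_severity([96, 97, 98, 96, 99])` returns `"Critical"`, while ten consecutive readings of 85 return `"Info"` and only nine such readings return `None`.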
Alerting Frequency
The following table specifies the alerting frequency for each severity level.
Severity Level | Alerting Frequency |
Critical | 1 hour |
Warn(ing) | 6 hours |
Info | 24 hours |
For example, if you receive a critical CPU Utilization alert for a given workspace, you will receive the next critical CPU Utilization alert an hour later. During this one-hour interval, all other critical CPU Utilization alerts will be suppressed for this workspace.
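The suppression behavior in this example can be sketched in Python as follows. This is an illustration only: the `AlertThrottle` class and its `should_send` method are hypothetical names, and tracking each severity level independently for every workspace and trigger condition is an assumption, not documented behavior.

```python
from datetime import datetime, timedelta

# Alerting frequency per severity level, taken from the table above.
ALERT_INTERVALS = {
    "Critical": timedelta(hours=1),
    "Warn": timedelta(hours=6),
    "Info": timedelta(hours=24),
}


class AlertThrottle:
    """Hypothetical helper that suppresses repeat alerts within an interval."""

    def __init__(self) -> None:
        # Maps (workspace, trigger, severity) to the time of the last sent alert.
        self._last_sent = {}

    def should_send(
        self, workspace: str, trigger: str, severity: str, now: datetime
    ) -> bool:
        key = (workspace, trigger, severity)
        last = self._last_sent.get(key)
        if last is not None and now - last < ALERT_INTERVALS[severity]:
            return False  # Still inside the interval: suppress.
        self._last_sent[key] = now
        return True
```

With this sketch, a critical CPU Utilization alert for a workspace at 12:00 is sent, an identical alert at 12:30 is suppressed, and the next one at 13:00 is sent again. Alerts for a different workspace are unaffected.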