Configure SingleStore DB Monitoring
Notice
You may continue to use historical monitoring, SingleStore’s former monitoring solution, if you are currently unable to migrate to SingleStore’s native monitoring solution.
Note, however, that historical monitoring has been deprecated in favor of SingleStore’s native monitoring solution and may no longer function in future versions of SingleStore DB.
SingleStore’s native monitoring solution is designed to capture and reveal SingleStore DB cluster events over time. By analyzing this event data, you can identify trends and, if necessary, take action to remediate issues.
Terminology
Throughout this guide, the cluster that is being monitored will be referred to as the “Source” cluster, and the cluster that stores the monitoring data will be referred to as the “Metrics” cluster. The database that stores the monitoring data will be referred to as the `metrics` database.
SingleStore Native Monitoring Solution High-Level Architecture

In SingleStore’s native monitoring solution, the Metrics cluster uses a SingleStore pipeline to pull data from the exporter process on the Source cluster and stores it in a database named `metrics`. Note that the `metrics` database can reside either within the Source cluster itself or within a dedicated cluster.
When this event data is analyzed through the associated Grafana dashboards, trends can be identified and, if necessary, actions taken to remediate issues.
The provided Grafana dashboards include:
Dashboard | Description |
---|---|
Active session history | Aggregated resource consumption by activity and activity type |
Activity history | Resource consumption by a specific activity over time |
Detailed cluster view | A “bird’s-eye view” of a single SingleStore DB cluster |
Information schema view | Provides a view into |
Memory usage | Granular breakdown of memory use for a host |
SingleStore DB status and variables view | Collected status variables from each host in the cluster |
Node breakout | System metrics from each host in the cluster |
Node drilldown | System metrics from each host in the cluster, with the ability to focus on a specific metric subsystem |
Prerequisites
Notice
These instructions have been developed for SingleStore DB clusters that have been installed and deployed via `.rpm` or `.deb` packages as a `sudo` user.

If your cluster was deployed via tarball as a non-`sudo` user, change to the directory (`cd`) in which `singlestoredb-toolbox` was untarred, and run all `sdb-admin` commands as `./sdb-admin`.

Note that the Grafana instructions require a user with `sudo` access to install and configure the associated Grafana components.
- A SingleStore DB 7.3 or later cluster to monitor (the Source cluster).
- Optional: A separate SingleStore DB 7.3 or later cluster used to collect monitoring data (the Metrics cluster). This cluster can be the same as, or separate from, the Source cluster. If a separate cluster, the Metrics cluster:
  - Must be open to local (internal) network traffic only, and not open to the Internet.
  - Must only contain monitoring data from one or more Source clusters.
  - Should have two aggregator nodes and two leaf nodes, each with 2 TB disks, and with high availability (HA) enabled (SingleStore recommended).
- Clusters are managed with SingleStore DB Toolbox 1.9.3 or later.
- A Grafana instance (the latest version of Grafana) that can access the Metrics cluster.
SingleStore DB Toolbox is recommended for managing the clusters, as automation during setup is provided through `sdb-admin` commands. While monitoring can be enabled through a series of SQL commands, the preferred method is to use SingleStore DB Toolbox.
Port Configuration
Default Port | Used by | Invoked by |
---|---|---|
3000 | Grafana | User browser |
3306 | SingleStore DB | Grafana |
8080 | SingleStore DB Studio | User browser |
9104 | memsql-exporter | SingleStore pipelines |
Enable the Exporter Process
Notice
In most cases, both SingleStore DB Toolbox and SQL methods are provided to perform a given task. While either method may be used, the selected method must then be used throughout the remainder of the guide.
The `memsql-exporter` process (or simply “the exporter”) collects data about a running cluster. The user that starts the exporter (other than the SingleStore DB `root` user) must have the following permissions at a minimum:
```sql
GRANT CLUSTER ON *.* TO <user>;
GRANT SHOW METADATA ON *.* TO <user>;
GRANT SELECT ON *.* TO <user>;
GRANT PROCESS ON *.* TO <user>;
```
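For example, a dedicated exporter user could be created and granted these permissions. This is a hedged sketch, not from the original guide: the `monitor_user` name and the password are placeholders.

```sql
-- Hypothetical dedicated exporter user; substitute your own name and password.
CREATE USER 'monitor_user'@'%' IDENTIFIED BY '<secure-password>';
GRANT CLUSTER ON *.* TO 'monitor_user'@'%';
GRANT SHOW METADATA ON *.* TO 'monitor_user'@'%';
GRANT SELECT ON *.* TO 'monitor_user'@'%';
GRANT PROCESS ON *.* TO 'monitor_user'@'%';
```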
Stop the Existing Exporter Process
If you are currently using SingleStore’s former monitoring solution and are migrating to SingleStore’s native monitoring solution, stop the existing SingleStore DB exporter process by running the following commands on your Source cluster’s Master Aggregator host. Otherwise, skip to Start the New Exporter Process.
Note: Perform the following steps on the Source cluster Master Aggregator.
1. Obtain the SingleStore DB exporter process PID.

   ```shell
   ps aux | grep memsql_exporter
   ```

2. Kill the SingleStore DB exporter process.

   ```shell
   kill <PID>
   ```
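The two steps above can also be sketched as a single command. This is a hedged example, not from the original guide: it assumes `pgrep` (procps) is available and that any process whose command line matches `memsql_exporter` is the exporter you intend to stop.

```shell
# Find the exporter PID by matching its full command line, then stop it.
# Prints a notice instead of failing when no exporter is running.
PID=$(pgrep -f memsql_exporter || true)
if [ -n "$PID" ]; then
  kill $PID
else
  echo "no memsql_exporter process found"
fi
```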
Start the New Exporter Process
Note: Perform the following steps on the Source cluster Master Aggregator.
SingleStore DB Toolbox
Run the following command to start the exporter. This command starts the exporter using the SingleStore DB `root` user and the default port (`9104`). The exporter username and password are used to access the Source cluster. Refer to sdb-admin configure-monitoring for more information.
```shell
sdb-admin configure-monitoring \
  --exporter-user root \
  --exporter-password <secure-password>
```
SQL
You may start the exporter in the SingleStore DB engine by passing a user, port, and password to the exporter. The following example shows the exporter being started with the SingleStore DB `root` user:
```sql
SET GLOBAL exporter_user = root;
SET GLOBAL exporter_password = '<secure-password>';
SET GLOBAL exporter_port = 9104;
```
Note that the exporter process is restarted when the `exporter_port` variable is set to any non-zero value, including the current value. `exporter_port` can be set by running `SET GLOBAL exporter_port = <port>;` or `sdb-admin configure-monitoring --exporter-port <port>`.
How to Stop the Exporter Process
You may stop the exporter at any time by performing either of the following commands on the Source cluster Master Aggregator.
SingleStore DB Toolbox
sdb-admin configure-monitoring --stop-exporter
SQL
Use an engine variable to stop the exporter process by setting the port to `0`.

```sql
SET GLOBAL exporter_port = 0;
```
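As a quick sanity check (standard engine-variable syntax, not a step in the original guide), you can read the variable back; a value of `0` indicates the exporter is stopped:

```sql
-- Expect 0 after stopping the exporter.
SELECT @@exporter_port;
```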
Configure the metrics Database
The `metrics` database can reside either in the Source cluster, in a dedicated Metrics cluster, or in a remote SingleStore DB cluster that the Metrics cluster can access.
Note that the user that creates the `metrics` database may differ from the one that started the exporter. At minimum, this user must be able to create tables, databases, and pipelines.
SingleStore DB Toolbox
The following command creates a database named `metrics` with a 10-day retention period for the data, and also creates the associated `metrics` and `blobs` pipelines. These pipelines are responsible for extracting data from the exporter and storing it in the `metrics` database.
The `exporter-host` value must be the Master Aggregator host in the Source cluster. The `user` and `password` are the credentials for the Metrics cluster. The `retention-period` option defines the number of days that data is retained before being deleted. Refer to sdb-admin start-monitoring for more information.
Please note the following:

- The database that is created to store monitoring data is named `metrics` by default. You may change the database name by updating the value of the `--database-name` option.
- The database user is `root` by default. When creating the `metrics` database with a database user other than `root`, the `SUPER` privilege must be granted to this user. For example, for a `dbmon` user: `GRANT SUPER ON *.* TO 'dbmon'@'%';`
- The IP address provided in the `exporter-host` argument must be resolvable by all hosts in the cluster. If the cluster you are monitoring has hosts that contain leaf nodes (as is common in a multi-host vs. single-host deployment), do not use `localhost`. Instead, provide the explicit IP address or hostname that the other hosts in the cluster can resolve.
- This command must be run on the Metrics cluster.
```shell
sdb-admin start-monitoring \
  --database-name metrics \
  --exporter-host <exporter-hostname-or-IP-address> \
  --user root \
  --password <secure-password> \
  --retention-period 10
```
Notice
For Metrics clusters running SingleStore DB 7.3.12 and earlier that are managed by Toolbox 1.11.7 and earlier, when a new Source cluster is monitored, the specified `--retention-period` will apply to all existing Source clusters.
For Metrics clusters running SingleStore DB 7.3.13 and later that are managed with Toolbox 1.11.8 and later, a unique `--retention-period` can be specified for each Source cluster. Therefore, when a new Source cluster is monitored, the specified `--retention-period` will only apply to that Source cluster.
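Once `start-monitoring` completes, a quick sanity check on the Metrics cluster is to confirm that both pipelines exist and are running. This is a sketch using standard SingleStore commands, not a step from the original guide; adjust the database name if you changed it from `metrics`:

```sql
USE metrics;
-- Both the metrics and blobs pipelines should report a Running state.
SHOW PIPELINES;
```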
metrics Database DDL
Alternatively, you may run the following SQL scripts to configure the database.
Please note the following:

- The database that is created to store monitoring data is named `metrics` by default. You must edit these SQL statements manually to change the database name.
- The database user is `root` by default. When creating the `metrics` database with a database user other than `root`, the `SUPER` privilege must be granted to this user. For example, for a `dbmon` user: `GRANT SUPER ON *.* TO 'dbmon'@'%';`
- These commands must be run on the Master Aggregator of the Metrics cluster.
Create the `metrics` database and associated tables by copying, pasting, and executing each set of SQL statements below. This can be done via the SQL Editor in SingleStore DB Studio, or your favored SQL client.
These SQL statements are also available in a downloadable metrics-database-ddl_73.zip file from SingleStore.
```sql
CREATE DATABASE IF NOT EXISTS metrics;
USE metrics;

CREATE TABLE IF NOT EXISTS `metrics` (
  `labels` JSON,
  `name` TINYBLOB NOT NULL,
  `memsql_tags` JSON COLLATE utf8_bin,
  `cluster` AS `memsql_tags`::$cluster PERSISTED TINYBLOB,
  `host` AS `labels`::$host PERSISTED TINYBLOB,
  `port` AS `labels`::$port PERSISTED SMALLINT(6),
  `role` AS `labels`::$role PERSISTED TINYBLOB,
  `extractor` AS substring_index(`name`, '_', 1) PERSISTED TINYBLOB,
  `subsystem` AS substring_index(substring_index(`name`, '_', 2), '_', -1) PERSISTED TINYBLOB,
  `job` AS `memsql_tags`::$push_job PERSISTED TINYBLOB,
  `value` DOUBLE NOT NULL,
  `intval` AS floor(`value`) PERSISTED BIGINT(20),
  `time_sec` BIGINT(20) NOT NULL,
  KEY `name` (`cluster`,`extractor`,`subsystem`,`host`,`role`,`name`,`time_sec`) USING CLUSTERED COLUMNSTORE,
  SHARD KEY()
);

CREATE TABLE IF NOT EXISTS act_samples (
  `cluster` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `memsql_tags` JSON NOT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `ACTIVITY_TYPE` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `ACTIVITY_NAME` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `AGGREGATOR_ACTIVITY_NAME` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `DATABASE_NAME` varchar(512) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `NODE_ID` smallint(6) DEFAULT NULL,
  `PARTITION_ID` smallint(6) DEFAULT NULL,
  `CPU_TIME_MS` bigint(4) DEFAULT NULL,
  `CPU_WAIT_TIME_MS` bigint(4) DEFAULT NULL,
  `ELAPSED_TIME_MS` bigint(4) DEFAULT NULL,
  `LOCK_TIME_MS` bigint(4) DEFAULT NULL,
  `NETWORK_TIME_MS` bigint(4) DEFAULT NULL,
  `DISK_TIME_MS` bigint(4) DEFAULT NULL,
  `DISK_B` bigint(4) DEFAULT NULL,
  `DISK_LOGICAL_READ_B` bigint(4) DEFAULT NULL,
  `DISK_LOGICAL_WRITE_B` bigint(4) DEFAULT NULL,
  `DISK_PHYSICAL_READ_B` bigint(4) DEFAULT NULL,
  `DISK_PHYSICAL_WRITE_B` bigint(4) DEFAULT NULL,
  `LOCK_ROW_TIME_MS` bigint(4) DEFAULT NULL,
  `LOG_FLUSH_TIME_MS` bigint(4) DEFAULT NULL,
  `LOG_BUFFER_TIME_MS` bigint(4) DEFAULT NULL,
  `LOG_BUFFER_WRITE_B` bigint(4) DEFAULT NULL,
  `NETWORK_B` bigint(4) DEFAULT NULL,
  `NETWORK_LOGICAL_RECV_B` bigint(4) DEFAULT NULL,
  `NETWORK_LOGICAL_SEND_B` bigint(4) DEFAULT NULL,
  `MEMORY_BS` bigint(4) DEFAULT NULL,
  `MEMORY_MAJOR_FAULTS` bigint(4) DEFAULT NULL,
  `RUN_COUNT` bigint(4) NULL DEFAULT '0',
  `SUCCESS_COUNT` bigint(4) NULL DEFAULT '0',
  `FAILURE_COUNT` bigint(4) NULL DEFAULT '0',
  KEY (`cluster`,`ACTIVITY_TYPE`,`ACTIVITY_NAME`,`AGGREGATOR_ACTIVITY_NAME`,`DATABASE_NAME`,`NODE_ID`,`PARTITION_ID`,ts) USING CLUSTERED COLUMNSTORE,
  SHARD KEY()
);

CREATE TABLE IF NOT EXISTS mv_queries (
  `cluster` varchar(512) NOT NULL,
  `memsql_tags` JSON NOT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `ACTIVITY_NAME` varchar(512),
  `QUERY_TEXT` varchar(8192),
  `PLAN_WARNINGS` varchar(8192),
  KEY (`cluster`, `ACTIVITY_NAME`) USING CLUSTERED COLUMNSTORE,
  SHARD KEY()
);

CREATE TABLE IF NOT EXISTS cluster_info (
  `cluster` varchar(512) NOT NULL,
  `memsql_tags` JSON NOT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `status` JSON NOT NULL,
  KEY (cluster, ts) USING CLUSTERED COLUMNSTORE,
  SHARD KEY()
);

CREATE TABLE IF NOT EXISTS db_status (
  `memsql_tags` JSON NOT NULL,
  `keys` JSON NOT NULL,
  `values` JSON NOT NULL,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `cluster` AS `memsql_tags`::$cluster PERSISTED TINYBLOB,
  database_name AS `keys`::$database_name PERSISTED TINYBLOB,
  `role` AS `keys`::$role PERSISTED tinyblob,
  num_partitions AS `values`::$num_partitions PERSISTED SMALLINT(6),
  `online` AS `values`::$online PERSISTED SMALLINT(6),
  `offline` AS `values`::$offline PERSISTED SMALLINT(6),
  replicating AS `values`::$replicating PERSISTED SMALLINT(6),
  recovering AS `values`::$recovering PERSISTED SMALLINT(6),
  pending AS `values`::$pending PERSISTED SMALLINT(6),
  transition AS `values`::$transition PERSISTED SMALLINT(6),
  unrecoverable AS `values`::$unrecoverable PERSISTED SMALLINT(6),
  ref_db_state AS `values`::$ref_db_state PERSISTED TINYBLOB,
  summary AS `values`::$summary PERSISTED TINYBLOB,
  KEY (cluster, database_name, ts) USING CLUSTERED COLUMNSTORE,
  SHARD KEY()
);

CREATE TABLE IF NOT EXISTS mv_events (
  `memsql_tags` JSON NOT NULL,
  `keys` JSON NOT NULL,
  `values` JSON NOT NULL,
  ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  event_ts TIMESTAMP,
  `cluster` AS `memsql_tags`::$cluster PERSISTED TINYBLOB,
  `event_type` AS `keys`::$event_type PERSISTED TINYBLOB,
  `origin_node_id` AS `keys`::$origin_node_id PERSISTED SMALLINT,
  `ip_addr` AS `values`::$ip_addr PERSISTED TINYBLOB,
  `port` AS `values`::$port PERSISTED SMALLINT,
  `type` AS `values`::$type PERSISTED TINYBLOB,
  `severity` AS `keys`::$severity PERSISTED TINYBLOB,
  `details` AS `values`::$details PERSISTED TINYBLOB,
  KEY (cluster, origin_node_id, event_ts) USING CLUSTERED COLUMNSTORE,
  SHARD KEY()
);

CREATE TABLE IF NOT EXISTS mv_nodes (
  `memsql_tags` JSON NOT NULL,
  `keys` JSON NOT NULL,
  `values` JSON NOT NULL,
  `ts` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `cluster` AS `memsql_tags`::$cluster PERSISTED TINYBLOB,
  `node_id` AS `keys`::$node_id PERSISTED SMALLINT,
  `ip_addr` AS `keys`::$ip_addr PERSISTED TINYBLOB,
  `port` AS `keys`::$port PERSISTED SMALLINT,
  `role` AS `keys`::$role PERSISTED TINYBLOB,
  `external_host` AS `keys`::$external_host PERSISTED TINYBLOB,
  `external_port` AS `keys`::$external_port PERSISTED SMALLINT,
  `state` AS `values`::$state PERSISTED TINYBLOB,
  `availability_group` AS `values`::$availability_group PERSISTED SMALLINT,
  `num_cpus` AS `values`::$num_cpus PERSISTED BIGINT,
  `max_memory_mb` AS `values`::$max_memory_mb PERSISTED BIGINT,
  `max_table_memory_mb` AS `values`::$max_table_memory_mb PERSISTED BIGINT,
  `memory_used_mb` AS `values`::$memory_used_mb PERSISTED BIGINT,
  `table_memory_used_mb` AS `values`::$table_memory_used_mb PERSISTED BIGINT,
  `total_data_disk_mb` AS `values`::$total_data_disk_mb PERSISTED BIGINT,
  `available_data_disk_mb` AS `values`::$available_data_disk_mb PERSISTED BIGINT,
  `uptime` AS `values`::$uptime PERSISTED BIGINT,
  `version` AS `values`::$version PERSISTED TINYBLOB,
  KEY(`cluster`, `node_id`, `ts`) USING CLUSTERED COLUMNSTORE,
  SHARD KEY()
);

delimiter //
CREATE OR REPLACE PROCEDURE purge_metrics() AS
BEGIN
  DELETE FROM metrics WHERE TIMESTAMPDIFF(MINUTE, FROM_UNIXTIME(time_sec), NOW()) > @@monitoring_retention_period;
END //
delimiter ;

delimiter //
CREATE OR REPLACE PROCEDURE purge_blobs() AS
BEGIN
  DELETE FROM act_samples WHERE TIMESTAMPDIFF(MINUTE, ts, NOW()) > @@monitoring_retention_period;
  DELETE FROM mv_events WHERE TIMESTAMPDIFF(MINUTE, ts, NOW()) > @@monitoring_retention_period;
  DELETE FROM mv_queries WHERE TIMESTAMPDIFF(MINUTE, ts, NOW()) > @@monitoring_retention_period;
  DELETE FROM cluster_info WHERE TIMESTAMPDIFF(MINUTE, ts, NOW()) > @@monitoring_retention_period;
  DELETE FROM db_status WHERE TIMESTAMPDIFF(MINUTE, ts, NOW()) > @@monitoring_retention_period;
  DELETE FROM mv_nodes WHERE TIMESTAMPDIFF(MINUTE, ts, NOW()) > @@monitoring_retention_period;
END //
delimiter ;

delimiter //
CREATE OR REPLACE PROCEDURE load_metrics (metrics_pipe query(`labels` JSON, `name` TINYBLOB, `memsql_tags` JSON COLLATE utf8_bin, `value` DOUBLE NOT NULL, `time_sec` BIGINT(20) NOT NULL)) AS
BEGIN
  INSERT INTO metrics ( `labels`, `name`, `memsql_tags`, `value`, `time_sec` )
  SELECT `labels`, `name`, `memsql_tags`, `value`, `time_sec` FROM metrics_pipe;
  CALL purge_metrics();
END //
delimiter ;

delimiter //
CREATE OR REPLACE PROCEDURE load_blobs (blobs_pipe query(type text, `keys` JSON, `vals` JSON, time_sec bigint, `memsql_tags` JSON NULL)) AS
BEGIN
  INSERT INTO act_samples (
    `cluster`, `memsql_tags`, `ts`, `ACTIVITY_TYPE`, `ACTIVITY_NAME`, `AGGREGATOR_ACTIVITY_NAME`, `NODE_ID`, `PARTITION_ID`, `DATABASE_NAME`,
    `CPU_TIME_MS`, `CPU_WAIT_TIME_MS`, `ELAPSED_TIME_MS`, `LOCK_TIME_MS`, `NETWORK_TIME_MS`, `DISK_TIME_MS`, `DISK_B`, `DISK_LOGICAL_READ_B`,
    `DISK_LOGICAL_WRITE_B`, `DISK_PHYSICAL_READ_B`, `DISK_PHYSICAL_WRITE_B`, `LOCK_ROW_TIME_MS`, `LOG_FLUSH_TIME_MS`, `LOG_BUFFER_TIME_MS`,
    `LOG_BUFFER_WRITE_B`, `NETWORK_B`, `NETWORK_LOGICAL_RECV_B`, `NETWORK_LOGICAL_SEND_B`, `MEMORY_BS`, `MEMORY_MAJOR_FAULTS`,
    `RUN_COUNT`, `SUCCESS_COUNT`, `FAILURE_COUNT`
  )
  SELECT
    `memsql_tags`::$cluster, `memsql_tags`, FROM_UNIXTIME(time_sec), `keys`::$activity_type, `keys`::$activity_name, `keys`::$aggregator_activity_name,
    `keys`::node_id, `keys`::partition_id, `keys`::$db_name, `vals`::cpu_time_ms, `vals`::cpu_wait_time_ms, `vals`::elapsed_time_ms, `vals`::lock_time_ms,
    `vals`::network_time_ms, `vals`::disk_time_ms, `vals`::disk_b, `vals`::disk_logical_read_b, `vals`::disk_logical_write_b, `vals`::disk_physical_read_b,
    `vals`::disk_physical_write_b, `vals`::lock_row_time_ms, `vals`::log_flush_time_ms, `vals`::log_buffer_time_ms, `vals`::log_buffer_write_b,
    `vals`::network_b, `vals`::network_logical_recv_b, `vals`::network_logical_send_b, `vals`::memory_bs, `vals`::memory_major_faults,
    `vals`::run_count, `vals`::success_count, `vals`::failure_count
  FROM blobs_pipe
  WHERE type = 'activity';

  INSERT INTO mv_events ( `memsql_tags`, `keys`, `values`, `ts`, `event_ts` )
  SELECT `memsql_tags`, `keys`, `vals`, FROM_UNIXTIME(time_sec), FROM_UNIXTIME(`keys`::$event_time)
  FROM blobs_pipe b
  LEFT JOIN (SELECT DISTINCT cluster, origin_node_id, event_ts FROM mv_events) k
    ON b.`memsql_tags`::$cluster = k.cluster
    AND b.`keys`::$origin_node_id = k.origin_node_id
    AND b.`keys`::$event_time = k.event_ts
  WHERE type = 'event' AND k.cluster IS NULL AND k.origin_node_id IS NULL;

  INSERT INTO mv_queries ( `cluster`, `ACTIVITY_NAME`, `memsql_tags`, `ts`, `QUERY_TEXT`, `PLAN_WARNINGS` )
  SELECT `memsql_tags`::$cluster, `keys`::$activity_name, `memsql_tags`, FROM_UNIXTIME(MAX(time_sec)), `vals`::$query_text, `vals`::$plan_warnings
  FROM blobs_pipe b
  LEFT JOIN (SELECT DISTINCT cluster, ACTIVITY_NAME FROM mv_queries) k
    ON b.`memsql_tags`::$cluster = k.cluster
    AND b.`keys`::$activity_name = k.ACTIVITY_NAME
  WHERE type = 'query' AND k.cluster IS NULL AND k.ACTIVITY_NAME IS NULL
  GROUP BY 1, 2;

  INSERT INTO cluster_info ( `cluster`, `memsql_tags`, `ts`, `status` )
  SELECT `memsql_tags`::$cluster, `memsql_tags`, FROM_UNIXTIME(time_sec), `vals` FROM blobs_pipe WHERE type = 'cluster';

  INSERT INTO db_status ( `memsql_tags`, `keys`, `values`, `ts` )
  SELECT `memsql_tags`, `keys`, `vals`, FROM_UNIXTIME(time_sec) FROM blobs_pipe WHERE type = 'db_state';

  INSERT INTO mv_nodes ( `memsql_tags`, `keys`, `values`, `ts` )
  SELECT `memsql_tags`, `keys`, `vals`, FROM_UNIXTIME(time_sec) FROM blobs_pipe WHERE type = 'node';

  CALL purge_blobs();
END //
DELIMITER ;

-- For Grafana
DELIMITER //
CREATE OR REPLACE FUNCTION `trim_metric`(input varchar(512) CHARACTER SET utf8 COLLATE utf8_general_ci NULL, prefix varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL)
RETURNS varchar(512) CHARACTER SET utf8 COLLATE utf8_general_ci NULL AS
BEGIN
  RETURN CASE
    WHEN input IS NOT NULL AND prefix IS NOT NULL AND locate(prefix, input, 1) > 0
      THEN UPPER(substr(input, locate(prefix, input, 1) + length(prefix), length(input) - length(prefix)))
    ELSE input
  END;
END //
DELIMITER ;

DELIMITER //
CREATE OR REPLACE FUNCTION `trim_host`(input varchar(512) CHARACTER SET utf8 COLLATE utf8_general_ci NULL, c int default 1)
RETURNS varchar(512) CHARACTER SET utf8 COLLATE utf8_general_ci NULL AS
BEGIN
  RETURN CASE
    WHEN input RLIKE '([[:digit:]]{1,3}\\.){3}[[:digit:]]{1,3}' THEN input
    ELSE substring_index(input, '.', c)
  END;
END //
DELIMITER ;

-- helper table for integers 1..100
CREATE REFERENCE TABLE IF NOT EXISTS ints ( f int key );
INSERT INTO ints values
  (1), (2), (3), (4), (5), (6), (7), (8), (9), (10),
  (11), (12), (13), (14), (15), (16), (17), (18), (19), (20),
  (21), (22), (23), (24), (25), (26), (27), (28), (29), (30),
  (31), (32), (33), (34), (35), (36), (37), (38), (39), (40),
  (41), (42), (43), (44), (45), (46), (47), (48), (49), (50),
  (51), (52), (53), (54), (55), (56), (57), (58), (59), (60),
  (61), (62), (63), (64), (65), (66), (67), (68), (69), (70),
  (71), (72), (73), (74), (75), (76), (77), (78), (79), (80),
  (81), (82), (83), (84), (85), (86), (87), (88), (89), (90),
  (91), (92), (93), (94), (95), (96), (97), (98), (99), (100);
```
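Once the pipelines (created in the next section) are loading data, the `metrics` table can be queried directly. A hedged example, not from the original guide: the metric name below is illustrative only; substitute any `name` value actually present in your data.

```sql
-- Five most recent samples of one metric, per host.
SELECT host, FROM_UNIXTIME(time_sec) AS ts, value
FROM metrics.metrics
WHERE name = 'memsql_total_alloc_b'  -- hypothetical metric name
ORDER BY time_sec DESC
LIMIT 5;
```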
Pipelines DDL
Run the following in the SQL Editor in SingleStore DB Studio, or your favored SQL client.

Note: You must edit `exporter-host` and `port` in the following SQL statements to align with where your exporter process resides.

- The `exporter-host` is typically the host of your Source cluster’s Master Aggregator that’s running the exporter, and must include `http://`.
- The default port for the endpoint is `9104`.

Note: The IP address provided in the `exporter-host` argument must be resolvable by all hosts in the cluster. If the cluster you are monitoring has hosts that contain leaf nodes (as is common in a multi-host vs. single-host deployment), do not use `localhost`. Instead, provide the explicit IP address or hostname that the other hosts in the cluster can resolve.
The `metrics` pipeline:

```sql
CREATE OR REPLACE PIPELINE `metrics` AS
LOAD DATA prometheus_exporter "<exporter-host:port>/cluster-metrics"
CONFIG '{"is_memsql_internal":true, "monitoring_version": "7.3"}'
INTO PROCEDURE `load_metrics` FORMAT JSON;

START PIPELINE IF NOT RUNNING metrics;
```
The `blobs` pipeline:

```sql
CREATE OR REPLACE PIPELINE `blobs` AS
LOAD DATA prometheus_exporter "<exporter-host:port>/samples"
CONFIG '{"is_memsql_internal":true, "download_type":"samples", "monitoring_version": "7.3"}'
INTO PROCEDURE `load_blobs` FORMAT JSON;

START PIPELINE IF NOT RUNNING blobs;
```
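If dashboard data does not appear after starting the pipelines, the standard `information_schema` pipeline views can help narrow things down. This is a sketch, not a step from the original guide; adjust the database name if you changed it from `metrics`:

```sql
-- Pipeline state for the metrics database, and any recent ingest errors.
SELECT PIPELINE_NAME, STATE
FROM information_schema.PIPELINES
WHERE DATABASE_NAME = 'metrics';

SELECT *
FROM information_schema.PIPELINES_ERRORS;
```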
How to Stop the Monitoring Process
You may stop the monitoring process at any time by performing either of the following commands.
Note: Be sure to substitute the database name and/or the associated database user if you've changed them from the defaults of the `metrics` database and the `root` user.
SingleStore DB Toolbox
```shell
sdb-admin stop-monitoring \
  --exporter-host <exporter-hostname-or-IP-address> \
  --user root \
  --password <secure-password>
```
SQL
You may manually stop the pipelines created earlier.
```sql
STOP PIPELINE metrics;
STOP PIPELINE blobs;
```
Connect Data Visualization Tools
After configuring the `metrics` database, you can connect SingleStore Studio and data visualization tools such as Grafana to your cluster’s monitoring data. SingleStore Studio surfaces basic workload monitoring data, whereas Grafana provides additional dashboards for advanced analytics and troubleshooting.
Note
If you monitor multiple Source clusters, name each Source cluster so that it can be identified in the visualization tool. The Source cluster name can be set using the `cluster_name` sync variable. (For more information, see Sync Variables Lists.) To perform this action, connect to the Master Aggregator of the Source cluster and run the following SQL command:

```sql
SET GLOBAL cluster_name = 'source_cluster_name';
```

Alternatively, you can run the following Toolbox command on a host where Toolbox is set up to manage the Source cluster:

```shell
sdb-admin update-config --key cluster_name --value 'source_cluster_name'
```
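After setting the name, you can read the sync variable back to confirm it was applied (standard engine-variable syntax, not a step from the original guide):

```sql
-- Returns the Source cluster name currently in effect.
SELECT @@cluster_name;
```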
SingleStore Studio
As of SingleStore DB 7.3 and SingleStore Studio 3.1.0, your SingleStore Studio instance can be configured to surface specific monitoring data of your cluster. Learn how to connect Studio to your cluster's monitoring data.
Grafana
Note: If you already have Grafana installed, you may skip to Add the Grafana Monitoring Data Source.
Install Grafana
If a Grafana instance is not already available, install the latest version of Grafana on the Master Aggregator host of the Metrics cluster.
Add the Grafana Monitoring Data Source
Notice
For those SingleStore DB instances that don’t run database queries as `root`, this requires both admin privileges and `SELECT` and `EXECUTE` grants on the `metrics` database as per GRANT.
For example:

```sql
GRANT SELECT, EXECUTE ON metrics.* TO '<user>'@'%';
```

where `<user>` is the user other than `root` under which these queries will be run.
1. Connect to Grafana at `http://<grafana-hostname>:3000` and log in using `admin` as the username and `admin` as the password.

2. Add a “monitoring” MySQL data source with the following settings. The SingleStore DB data source port is `3306` by default.

   - Data source name: `monitoring`
   - Data source type: `mysql`
   - Data source host: `<metrics-cluster-master-aggregator-host>`
   - Data source port: `3306`
   - Database name: `metrics`
   - User: `root`
   - Password: `<secure-password-or-blank-if-none>`

   Alternatively, edit the `sample.yaml` file in the `/etc/grafana/provisioning/datasources` directory and add the following content, substituting the values in angle brackets with your SingleStore DB configuration.

   ```yaml
   # config file version
   apiVersion: 1

   datasources:
     - name: monitoring
       type: mysql
       url: <metrics-cluster-master-aggregator-host>:<port>
       database: metrics
       user: root
       password: <secure-password-or-blank-if-none>
       isDefault: true
       version: 1
   ```

3. Restart the Grafana server.

   ```shell
   sudo systemctl restart grafana-server
   ```
Connect Grafana Dashboards
Notice
You may need Editor privileges to create new Grafana dashboards.
1. Download the cluster monitoring dashboards from SingleStore.

2. Import the dashboards into Grafana. To import a dashboard, click the `+` icon in the side menu, and then click Import. From here you can upload a dashboard JSON file, paste a Grafana.com dashboard URL, or paste dashboard JSON text directly into the text area.

   Alternatively, edit the `sample.yaml` file in the `/etc/grafana/provisioning/dashboards/` directory and add the following content.

   ```yaml
   # config file version
   apiVersion: 1

   providers:
     - name: 'default'
       orgId: 1
       folder: ''
       folderUid: ''
       type: file
       options:
         path: /var/lib/grafana/dashboards
   ```

   Next, copy the dashboard JSON files to the `/var/lib/grafana/dashboards` directory and ensure that the dashboards are owned by `grafana:grafana` (via `chown grafana:grafana *`).

3. Restart the Grafana server.

   ```shell
   sudo systemctl restart grafana-server
   ```
View Monitoring Data from Multiple Clusters (Optional)
You can view monitoring data from multiple Source clusters in a single Grafana instance using the following steps:

1. Ensure that all cluster monitoring data is stored in a single database (on the Metrics cluster). This allows a single Grafana data source to be configured.
2. On each Source cluster, Enable the Exporter Process.
3. On the Metrics cluster, run sdb-admin start-monitoring for each Source cluster and store the monitoring data in the same database.
4. Name each Source cluster to allow Grafana to identify the cluster. For more information, see Note.
5. Use the cluster tab in the Grafana UI to toggle between monitoring data from each Source cluster.
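Because every Source cluster writes into the same database tagged with its `cluster_name`, a quick way to sketch a check of which clusters are reporting is to query the `cluster_info` table created earlier (an illustrative query, not from the original guide):

```sql
-- List the distinct Source clusters present in the monitoring data.
SELECT DISTINCT cluster FROM metrics.cluster_info;
```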
Next Steps
View the Dashboards
When all cluster monitoring components are installed, configured, and running, the Grafana dashboards can be used to monitor SingleStore DB cluster health over time.
Identify Trends
Each dashboard provides insights that can be used to identify trends that may require intervention, including:
Active Session History
Chart Name | What it shows | When to use it |
---|---|---|
Active Session History | The activities running on the cluster and their respective resource usage (including CPU, memory, and network) | To view queries that are currently running, and that have been run, on the cluster. To view current and past session wait events to identify databases and activities that consume considerable resources, including why queries are running slowly and where a cluster is resource-constrained |
Activity History
Chart Name | What it shows | When to use it |
---|---|---|
Execution History | The number of times a given query shape was executed over time | To view query statistics over time. To determine the number of times a given query has been run. To identify if the performance and resource usage of a single activity has regressed over time |
Average Time | A view of the average time spent waiting for resources for a given query over time (milliseconds) | To identify and compare the history of the time spent on query resource usage to understand if it’s performing similarly to, or differently from, previous executions |
Average Resource Use | The average resource use for a given query including disk bytes, memory bytes, memory bytes/second, and network bytes/second | To identify and compare the history of query resource usage to understand if it’s using resources similar to, or different than, previous executions |
Detailed Cluster View
Chart Name | What it shows | When to use it |
---|---|---|
Database CPU Breakdown | The CPU cycles spent by each activity, grouped by database | To identify which databases incur the most CPU usage. Note: A blank database indicates system activity, which is not related to a user database. |
Query Rate | The number of reads/writes per second of the queries running on the system | To understand typical (“normal”) cluster activity to benchmark workloads and their query rates, and to identify anomalies in the read/write workload |
Rows Read or Written | The number of rows read/written | To understand typical (“normal”) cluster activity to benchmark workload read/write row counts, and to identify anomalies in the number of rows read/written |
SysInfo CPU | The percentage of the host’s CPU that is being used | To understand CPU usage and host resource usage in general, or for a given workload. To identify if any non-SingleStore DB activity is affecting a host’s CPU |
SysInfo Memory | The percent of the host’s memory that is being used | To understand host memory usage for a given workload over time. To identify if any non-SingleStore DB activity is affecting a host’s memory |
SysInfo Network | The network bytes sent and received | To understand network usage for a given workload and identify bottlenecks. To identify if any non-SingleStore DB activity is affecting a host’s network |
| A summary of host memory usage | To view cross-sections of memory usage within the cluster and identify anomalous memory use |
Workload Management Queries | The queries running and their states if affected by workload management | To understand the current cluster load and identify high-workload issues that are affecting the cluster’s ability to process queries |
Cluster Events | The count and output of the warnings and errors on the cluster (as per mv_events) | To recognize cluster health issues by reviewing the number of events. To drill into cluster events to identify and understand what the issues are |
Memory Usage
Chart Name | What it shows | When to use it |
---|---|---|
Used vs. Total Limit | The memory in use compared to the total memory available (megabytes) | To perform capacity planning for memory, and to identify if the cluster is not performing optimally due to a shortage of memory |
Query Memory vs. Total Limit | The query memory in use compared to the total memory available (megabytes) | To perform capacity planning for workloads, and to identify if workloads in general, or workload spikes in particular, are putting the cluster at risk of running out of memory |
Data Memory Used vs. Total Limit | The data memory in use versus the total memory available (megabytes) | To perform capacity planning for data memory, and to identify if given write workloads are putting the cluster at risk of running out of memory |
Internal Memory Allocators vs. Limit | The memory used by SingleStore DB memory allocators (megabytes) | To identify why memory allocations have increased, or are anomalously large, when there are no other indicators of increased memory use (such as workload or data), and to discover where memory is allocated (table, query, etc.) |
Detailed Breakout of Memory Allocators vs. Limit | The memory used by extended SingleStore DB memory allocators (megabytes) | To identify if any memory allocations have increased, or are anomalously large, when there are no other indicators of increased memory use, and to discover where memory is allocated (table, query, etc.) |
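To cross-check these charts against live values, the allocator counters can also be queried directly on the cluster. A sketch, hedged in that the exact status variable names vary by SingleStore DB version:

```sql
-- Total server memory on each node
SHOW STATUS EXTENDED LIKE 'Total_server_memory';

-- Per-allocator memory counters (names beginning with "Alloc")
SHOW STATUS EXTENDED LIKE 'Alloc%';
```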
SingleStore DB Status & Change
Chart Name | What it shows | When to use it |
---|---|---|
Show Status and Change (two charts) | The values and changes in SHOW STATUS EXTENDED (sizing units based on the variable) | To identify if any anomalous changes have occurred to SingleStore DB status variables |
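The underlying data for these charts comes from `SHOW STATUS EXTENDED`, which can also be run directly against the cluster. For example, with a `LIKE` pattern to narrow the output:

```sql
-- All extended status variables; add a LIKE pattern to narrow the output
SHOW STATUS EXTENDED;
SHOW STATUS EXTENDED LIKE 'Threads%';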
SingleStore DB Variables & Change
Chart Name | What it shows | When to use it |
---|---|---|
Show Variables and Change (two charts) | The values and changes to SingleStore DB variables (sizing units depending on the variable) | To view changes to Engine Variables over time to identify if any anomalous changes have occurred |
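Similarly, current engine variable values can be inspected directly for comparison with the dashboard. For example:

```sql
-- Engine variables whose names start with "max"
SHOW VARIABLES LIKE 'max%';
```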
Information Schema View
Chart Name | What it shows | When to use it |
---|---|---|
Table Statistics | The row counts for tables across schemas | To identify anomalies in table sizes in general, and workloads in particular |
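Row counts like those in the Table Statistics chart can also be pulled ad hoc. A sketch assuming the `information_schema.table_statistics` view of recent SingleStore DB versions (column names may vary by version):

```sql
-- Approximate row counts per table, largest first
SELECT database_name, table_name, SUM(rows) AS total_rows
FROM information_schema.table_statistics
GROUP BY database_name, table_name
ORDER BY total_rows DESC;
```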
Node Metrics Breakout
Chart Name | What it shows | When to use it |
---|---|---|
CPU Utilization | The System Info CPU utilization (percent) | To view a host’s CPU utilization and hardware health to identify if processes outside of SingleStore DB could be affecting them |
Filesystem | The filesystem usage (bytes) | To view host-level filesystem usage and identify if processes outside of SingleStore DB could be affecting it |
Network Rate | The System Info network rate (bytes) | To view host-level network usage and identify if processes outside of SingleStore DB could be affecting it |
Memory Bytes | The System Info memory usage (bytes) | To identify host-level and |
Node Metrics Drilldown
Chart Name | What it shows | When to use it |
---|---|---|
Node Metrics Drilldown | The | To determine if the exporter process is running efficiently and/or to identify lags in data collection |
Troubleshoot Your Monitoring Setup
Pipelines
Check the Monitoring Tables for Data
Connect to the database.

Run the following SQL. The default database name is `metrics`. If your database name is different from the default name, replace `metrics` with your database name.

```sql
USE metrics;
SELECT * FROM metrics LIMIT 10;
```

Optionally, run `SELECT * FROM` on each of the monitoring tables. If these queries return an empty set, review the pipelines error tables using the next step.
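For instance, a quick way to confirm that the monitoring tables are receiving rows (this assumes the default `metrics` database and its standard `metrics` and `blobs` tables):

```sql
USE metrics;
SELECT COUNT(*) FROM metrics;
SELECT COUNT(*) FROM blobs;
```

Non-zero, growing counts indicate that the pipelines are collecting data.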
Review the monitoring pipelines.

```sql
SHOW PIPELINES;
```

If a monitoring pipeline (with a name resembling `*_metrics` or `*_blobs`) is in a state other than `running`, start the pipeline.

```sql
START PIPELINE <pipeline-name>;
```

Check the `information_schema.pipelines_errors` table for errors.

```sql
SELECT * FROM information_schema.pipelines_errors;
```
Resolve Pipeline Errors
If you receive a `Cannot extract data for the pipeline` error in the `information_schema.pipelines_errors` table, perform the following steps.
Confirm that port `9104` is accessible from all hosts in the cluster. This is the default port used for monitoring. To test this, run the following command at the Linux command line and review the output.

```shell
curl http://<endpoint>:9104/cluster-metrics
```

For example:

```shell
curl http://192.168.1.100:9104/cluster-metrics
```
If the hostname of the Master Aggregator is `localhost`, and a pipeline was created using `localhost`, recreate the pipeline using the Master Aggregator host’s IP address. For example:

`metrics` pipeline:

```sql
CREATE OR REPLACE PIPELINE `metrics` AS
LOAD DATA prometheus_exporter "http://<host-ip-address>:9104/cluster-metrics"
CONFIG '{"is_memsql_internal":true}'
INTO PROCEDURE `load_metrics` FORMAT JSON;

START PIPELINE IF NOT RUNNING metrics;
```

`blobs` pipeline:

```sql
CREATE OR REPLACE PIPELINE `blobs` AS
LOAD DATA prometheus_exporter "http://<host-ip-address>:9104/samples"
CONFIG '{"is_memsql_internal":true, "download_type":"samples"}'
INTO PROCEDURE `load_blobs` FORMAT JSON;

START PIPELINE IF NOT RUNNING blobs;
```