View the Dashboards
When all cluster monitoring components are installed, configured, and running, the Grafana dashboards can be used to monitor SingleStoreDB cluster health over time.
Each dashboard provides insights that can be used to identify trends that may require intervention, including:
Active Session History
Chart Name | What it shows | When to use it |
---|---|---|
Active Session History | The activities running on the cluster and their respective resource usage (including CPU, memory, and network) | To view queries that are currently running, and that have been run, on the cluster; To view current and past session wait events to identify databases and activities that consume considerable resources, including: Why queries are running slowly; Where a cluster is resource-constrained |
Activity History
Chart Name | What it shows | When to use it |
---|---|---|
Execution History | The number of times a given query shape was executed over time | To view query statistics over time; To determine the number of times a given query has been run; To identify if the performance and resource usage of a single activity has regressed over time |
Average Time | The average time spent waiting for resources for a given query over time (milliseconds) | To compare the time a query spends waiting for resources against its history to understand if it’s performing similarly to, or differently than, previous executions |
Average Resource Use | The average resource use for a given query including disk bytes, memory bytes, memory bytes/second, and network bytes/second | To compare a query’s resource usage against its history to understand if its usage is similar to, or different than, previous executions |
Detailed Cluster View
Chart Name | What it shows | When to use it |
---|---|---|
Database CPU Breakdown | The CPU cycles spent by each activity, grouped by database | To identify which databases incur the most CPU usage. Note: A blank database indicates system activity, which is not related to a user database. |
Query Rate | The number of reads/writes per second of the queries running on the system | To understand typical (“normal”) cluster activity in order to: Benchmark workloads and their query rates; Identify anomalies in the read/write workload |
Rows Read or Written | The number of rows read/written | To understand typical (“normal”) cluster activity in order to: Benchmark workload read/write row counts; Identify anomalies in the number of rows read/written |
SysInfo CPU | The percentage of the host’s CPU that is being used | To understand CPU usage and host resource usage in general, or for a given workload; To identify if any non-SingleStoreDB activity is affecting a host’s CPU |
SysInfo Memory | The percentage of the host’s memory that is being used | To understand host memory usage for a given workload over time; To identify if any non-SingleStoreDB activity is affecting a host’s memory |
SysInfo Network | The network bytes sent and received | To understand network usage for a given workload and identify bottlenecks; To identify if any non-SingleStoreDB activity is affecting a host’s network |
| A summary of host memory usage | To view cross-sections of memory usage within the cluster and identify anomalous memory use |
Workload Management Queries | The queries running and their states if affected by workload management | To understand the current cluster load and identify high-workload issues that are affecting the cluster’s ability to process queries |
Cluster Events | The count and output of the warnings and errors on the cluster (as per mv_events) | To recognize cluster health issues by reviewing the number of events; To drill into cluster events to identify and understand what the issues are |
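As an illustration of how a "normal" baseline can be used to spot anomalies in a chart such as Query Rate, the following is a minimal Python sketch. It is not part of the dashboards: the function name `find_anomalies`, the sample values, and the three-standard-deviation threshold are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_anomalies(samples, baseline, threshold=3.0):
    """Return (index, value) pairs in `samples` that deviate from the
    baseline mean by more than `threshold` standard deviations.

    Hypothetical helper: the dashboards do not expose this function.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is an anomaly.
        return [(i, v) for i, v in enumerate(samples) if v != mu]
    return [
        (i, v)
        for i, v in enumerate(samples)
        if abs(v - mu) / sigma > threshold
    ]


# Made-up queries-per-second readings taken during a typical period.
baseline_qps = [100, 104, 98, 101, 97, 103, 99, 102]
# Made-up recent readings; the third one is a spike.
recent_qps = [101, 99, 180, 100]

anomalies = find_anomalies(recent_qps, baseline_qps)
print(anomalies)
```

The same comparison applies to the Rows Read or Written chart: collect readings during a representative period, then check later readings against that baseline.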
Memory Usage
Chart Name | What it shows | When to use it |
---|---|---|
Used vs. Total Limit | The memory in use compared to the total memory available (megabytes) | To perform capacity planning for memory; To identify if the cluster is not performing optimally due to a shortage of memory |
Query Memory vs. Total Limit | The query memory in use compared to the total memory available (megabytes) | To perform capacity planning for workloads; To identify if workloads in general, or workload spikes in particular, are putting the cluster at risk of running out of memory |
Data Memory Used vs. Total Limit | The data memory in use versus the total memory available (megabytes) | To perform capacity planning for data memory; To identify if given write workloads are putting the cluster at risk of running out of memory |
Internal Memory Allocators vs. Limit | The memory used by SingleStoreDB memory allocators (megabytes) | To identify why memory allocations have increased, or are anomalously large, when there are no other indicators of increased memory use, such as workload or data; To discover where memory is allocated (table, query, etc.) |
Detailed Breakout of Memory Allocators vs. Limit | The memory used by extended SingleStoreDB memory allocators (megabytes) | To identify if any memory allocations have increased, or are anomalously large, when there are no other indicators of increased memory use; To discover where memory is allocated (table, query, etc.) |
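The capacity-planning use of the "vs. Total Limit" charts comes down to simple arithmetic on two numbers, the memory in use and the limit. The sketch below shows that arithmetic in Python. The function names, the 85% warning level, and the megabyte values are assumptions for illustration only, not values taken from SingleStoreDB.

```python
def memory_headroom(used_mb, limit_mb):
    """Return the percentage of the limit in use and the megabytes still free."""
    if limit_mb <= 0:
        raise ValueError("limit_mb must be positive")
    return {
        "used_pct": 100.0 * used_mb / limit_mb,
        "free_mb": limit_mb - used_mb,
    }


def at_risk(used_mb, limit_mb, warn_pct=85.0):
    """True when usage is at or above the (assumed) warning percentage."""
    return memory_headroom(used_mb, limit_mb)["used_pct"] >= warn_pct


# Made-up readings: 28,800 MB in use against a 32,000 MB limit.
headroom = memory_headroom(28800, 32000)
print(headroom, at_risk(28800, 32000))
```

A suitable warning level depends on the workload; spiky workloads generally need more headroom than steady ones.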
SingleStoreDB Status & Change
Chart Name | What it shows | When to use it |
---|---|---|
Show Status and Change (two charts) | The values and changes in SHOW STATUS EXTENDED (sizing units based on the variable) | To identify if any anomalous changes have occurred to SingleStoreDB status variables |
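To make the idea of "values and changes" concrete, the sketch below compares two snapshots of status values and reports only the variables that changed. It is a hypothetical Python illustration: the variable names are placeholders rather than real SHOW STATUS EXTENDED output, and the dashboards may compute changes differently.

```python
def status_changes(previous, current):
    """Return {name: delta} for numeric variables whose value changed
    between two snapshots. Variables missing from either snapshot are skipped.
    """
    changes = {}
    for name, value in current.items():
        if name in previous and value != previous[name]:
            changes[name] = value - previous[name]
    return changes


# Placeholder variable names and values, not real status output.
snapshot_before = {"example_counter_a": 1200, "example_gauge_b": 16, "example_counter_c": 5}
snapshot_after = {"example_counter_a": 1850, "example_gauge_b": 16, "example_counter_c": 4}

deltas = status_changes(snapshot_before, snapshot_after)
print(deltas)
```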
SingleStoreDB Variables & Change
Chart Name | What it shows | When to use it |
---|---|---|
Show Variables and Change (two charts) | The values and changes to SingleStoreDB variables (sizing units depending on the variable) | To view changes to Engine Variables over time to identify if any anomalous changes have occurred |
Information Schema View
Chart Name | What it shows | When to use it |
---|---|---|
Table Statistics | The row counts for tables across schemas | To identify anomalies in table sizes in general, and workloads in particular |
Node Metrics Breakout
Chart Name | What it shows | When to use it |
---|---|---|
CPU Utilization | The System Info CPU utilization (percent) | To view a host’s CPU utilization and hardware health to identify if processes outside of SingleStoreDB could be affecting them |
Filesystem | The filesystem usage (bytes) | To view host-level filesystem usage and identify if processes outside of SingleStoreDB could be affecting it |
Network Rate | The System Info network rate (bytes) | To view host-level network usage and identify if processes outside of SingleStoreDB could be affecting it |
Memory Bytes | The System Info memory usage (bytes) | To identify host-level and |
Node Metrics Drilldown
Chart Name | What it shows | When to use it |
---|---|---|
Node Metrics Drilldown | The | To determine if the exporter process is running efficiently and/or to identify lags in data collection |