Troubleshooting
Introduction
This guide can be used to troubleshoot system, node, or general query performance issues. Each step includes a Next Steps section at the end, which recommends what to do next based on the information uncovered in that step.
Collecting and Checking the Cluster Report
- Using sdb-report collect, collect a report and write it to a tar file on the host you execute the command from.
    sdb-report collect
    Toolbox will perform the following actions:
      · Execute 74 collectors
    ✓ Collected report for host 127.0.0.1
    Report written to report-2021-01-08T021531.tar.gz
- Use sdb-report check to check the output for issues. Look at all FAIL output, such as this example:
    sdb-report check --report-path your_report_path.tar.gz
    ✘ transparentHugepage ........................... [FAIL]
    FAIL /sys/kernel/mm/transparent_hugepage/enabled is [always] on 172.17.0.2
    NOTE https://docs.memsql.com/memsql-report-redir/transparent-hugepage
    ✘ leavesNotOnline ........................ [FAIL]
    FAIL leaf node on host 127.0.0.1 and port 3308 is offline
    Some checks failed: 1 FAIL, 1 PASS, 1 UNAVAILABLE
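When the check output is long, it can help to save it to a file and pull out only the FAIL detail lines. The following is a minimal sketch, not part of the toolbox: the function name list_failures and the file name check_output.txt are hypothetical, and it assumes the detail lines begin with the word FAIL, as in the example output above, and that the saved output contains no terminal color codes.

```shell
#!/bin/sh
# list_failures FILE
# Prints the lines of FILE that begin with "FAIL" (optionally indented).
# Summary lines such as "✘ checkName .... [FAIL]" are skipped because
# they do not start with the word FAIL.
# Assumption: FILE holds saved `sdb-report check` output in the format
# shown in the example above.
list_failures() {
    # `|| true` keeps the exit status at 0 when nothing matches,
    # so a clean report does not abort a calling script.
    grep -E '^[[:space:]]*FAIL[[:space:]]' "$1" || true
}

# Hypothetical usage (file names are placeholders):
#   sdb-report check --report-path your_report_path.tar.gz > check_output.txt
#   list_failures check_output.txt
```

An empty result means no FAIL detail lines were found in the saved output; it does not replace reading the full report.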
Next Steps
- If the cluster report does not uncover any FAIL output, move on to Step 1, and be sure to take note of any WARNINGS in the check --report output for future
In this section
- 1. Identifying Expensive Queries
- 2. Investigating Expensive Queries
- 3. Are any queries waiting on something?
- 4. Are key resources being overused?
- 5. Are there other ongoing operational activities?
- 6. Checking Node, Partition, and Overall Database Health
- 7. Checking Partition Number and Data Skew
- 8. Checking for System Changes and Log Hints
Last modified: July 29, 2024