calibrate

Description

Run ‘calibrate’ to measure cluster performance.

The --password flag is optional and specifies the SingleStore root password. To connect as a user other than root, combine it with the --user flag and supply that user’s password. The MEMSQL_PASSWORD environment variable is a safer way to set the password.

Wrap the password in single quotes (') so that the shell does not interpret any special characters it contains. For example: sdb-report calibrate <args> --password '<<fooismypassword'
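As a rough sketch of the environment-variable route, the helper below reads the password from a file and exports it as MEMSQL_PASSWORD, so the password does not appear on the command line. The load_password function and the file path are hypothetical names used only for illustration. The sdb-report line is commented out because it needs a running cluster.

```shell
# load_password FILE: export MEMSQL_PASSWORD from the first line of FILE.
# Hypothetical helper; FILE should be readable only by its owner.
load_password() {
    if [ ! -r "$1" ]; then
        return 1
    fi
    # IFS= and -r keep leading whitespace and backslashes intact,
    # so special characters in the password are taken literally.
    IFS= read -r MEMSQL_PASSWORD < "$1"
    export MEMSQL_PASSWORD
}

# Example (the path is an assumption):
# load_password "$HOME/.calibrate_password" && sdb-report calibrate --user usr
```

Because the value is read from a file rather than typed as an argument, it needs no shell quoting at all.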

A calibrate database is created with rowstore and columnstore tables that are populated with data. A number of queries are run against this data to measure cluster performance.

This tool requires a dataset, which is loaded into tables. By default, the dataset is downloaded from the web, unpacked, and deleted at the end of the run. However, if your cluster cannot access the internet or already has the dataset, use the --data-path flag, which takes an absolute path to the dataset on the Master Aggregator node.

You can download the dataset from an Amazon S3 bucket.

The dataset download uses 1.5 GB of disk space, and unpacking the data takes up another 12 GB.
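A quick pre-flight check can confirm that about 13.5 GB is free before a run. The sketch below is only an illustration: the function names are hypothetical, 1 GB is assumed to mean 1024 * 1024 KB, and which directory to check depends on where the dataset is downloaded and unpacked on your Master Aggregator.

```shell
# Space needed by a default run, from the figures above:
# 1.5 GB (download) + 12 GB (unpacked) = 13.5 GB.
# Assumption: 1 GB is treated as 1024 * 1024 KB.
REQUIRED_KB=$(( (15 + 120) * 1024 * 1024 / 10 ))

# available_kb DIR: print the free space on DIR's filesystem in 1 KB blocks.
available_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_room KB: succeed if KB covers the download plus the unpacked data.
has_room() {
    [ "$1" -ge "$REQUIRED_KB" ]
}

# Example: check the current directory before starting a run.
if has_room "$(available_kb .)"; then
    echo "enough space for the calibrate dataset"
else
    echo "need at least ${REQUIRED_KB} KB free" >&2
fi
```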

The --partition-ratio flag is used to test clusters under different concurrency conditions, such as four CPU cores per partition. This flag specifies the ratio of CPU cores to partitions and defaults to 2:1.
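To make the ratio format concrete, the sketch below validates a CORES:PARTITIONS string and works out what a ratio implies for a given core count. Both function names are hypothetical. The upper limit of 16 comes from the flag's help text, the lower limit of 1 is an assumption, and partitions_for is plain arithmetic on the ratio rather than a description of how sdb-report sizes the database.

```shell
# valid_ratio RATIO: succeed if RATIO looks like CORES:PARTITIONS with
# both numbers between 1 and 16 (the lower bound is an assumption).
valid_ratio() {
    case "$1" in
        *[!0-9:]* | *:*:* | :* | *: ) return 1 ;;
        *:* ) ;;
        * ) return 1 ;;
    esac
    cores=${1%%:*}
    parts=${1##*:}
    [ "$cores" -ge 1 ] && [ "$cores" -le 16 ] &&
        [ "$parts" -ge 1 ] && [ "$parts" -le 16 ]
}

# partitions_for CORES RATIO: partitions implied by RATIO for CORES cores.
partitions_for() {
    echo $(( $1 * ${2##*:} / ${2%%:*} ))
}

# Example: a host with 16 cores at four CPU cores per partition.
if valid_ratio 4:1; then
    echo "4:1 on 16 cores implies $(partitions_for 16 4:1) partitions"
fi
```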

The workload consists of the following steps:

  • Download the dataset

  • Create the calibrate database and set up the required variables

  • Load data into tables

  • Run calibration queries

  • Retrieve results of the run and save them in a file on the Master Aggregator node

Workload duration is divided into three main categories:

  • FAST: under 12 minutes

  • INTERMEDIATE: between 12 and 16 minutes

  • SLOW: over 16 minutes
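The categories above map onto a simple threshold check. The sketch below is illustrative only: classify_duration is a hypothetical helper, not part of sdb-report, and it assumes that runs of exactly 12 or 16 minutes count as INTERMEDIATE, which the list does not specify.

```shell
# classify_duration SECONDS: print the category for a workload duration.
# Assumption: runs of exactly 12 or 16 minutes count as INTERMEDIATE.
classify_duration() {
    if [ "$1" -lt 720 ]; then        # under 12 minutes
        echo FAST
    elif [ "$1" -le 960 ]; then      # 12 to 16 minutes
        echo INTERMEDIATE
    else                             # over 16 minutes
        echo SLOW
    fi
}

# Example: a run that took 10 minutes.
classify_duration 600
```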

If the cluster gets a SLOW or INTERMEDIATE result, the tool also displays data about the run in the terminal, in addition to saving it to the results file.

A simple example that runs the calibrate workload with the root user of the cluster and downloads the dataset from the internet:

sdb-report calibrate

Run the calibrate workload with the user usr and password pass, using a dataset at /home/admin/dataset.tar.gz on the Master Aggregator node:

sdb-report calibrate -u usr -p pass --data-path /home/admin/dataset.tar.gz

Run calibrate to test a non-default partition ratio and retain the calibrate database after the run:

sdb-report calibrate --partition-ratio 4:1 --retain-database

The execution time of the following queries is measured:

Load data queries

+--------------+------------------------------------------------------------------------------------------------------+
| NAME | SQL |
+--------------+------------------------------------------------------------------------------------------------------+
| Load Queries | LOAD /*! calibrate_load_rs_1 */ DATA INFILE '{{CALIBRATE_LOAD_RS_1}}' INTO TABLE disttable2 FIELDS |
| | TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_cs_2 */ DATA INFILE '{{CALIBRATE_LOAD_CS_2}}' INTO TABLE disttable2_cs |
| | FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_rs_3 */ DATA INFILE '{{CALIBRATE_LOAD_RS_3}}' INTO TABLE foreignstring440k |
| | FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_cs_4 */ DATA INFILE '{{CALIBRATE_LOAD_CS_4}}' INTO TABLE |
| | foreignstring440k_cs FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_rs_5 */ DATA INFILE '{{CALIBRATE_LOAD_RS_5}}' INTO TABLE primarystring2 |
| | FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
| | |
| | LOAD /*! calibrate_load_cs_6 */ DATA INFILE '{{CALIBRATE_LOAD_CS_6}}' INTO TABLE primarystring2_cs |
| | FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n'; |
+--------------+------------------------------------------------------------------------------------------------------+

Calibration queries

+--------------------------------+------------------------------------------------------------------------------------------------------+
| NAME | SQL |
+--------------------------------+------------------------------------------------------------------------------------------------------+
| Reshuffle Joins | SELECT /*! calibrate_reshuff_rs_q1 */ COUNT(1) FROM ( SELECT b.* FROM disttable2 a STRAIGHT_JOIN |
| | primarystring2 b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_reshuff_rs_cs_q2 */ COUNT(1) FROM ( SELECT b.* FROM disttable2 a STRAIGHT_JOIN |
| | primarystring2_cs b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_reshuff_rs_cs_q3 */ COUNT(1) FROM ( SELECT b.* FROM disttable2_cs a |
| | STRAIGHT_JOIN primarystring2 b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_reshuff_cs_q4 */ COUNT(1) FROM ( SELECT b.* FROM disttable2_cs a |
| | STRAIGHT_JOIN primarystring2_cs b ON a.a=b.a) A; |
| | |
| Broadcast Joins | SELECT /*! calibrate_bcast_rs_q1 */ COUNT(1) FROM ( SELECT b.* FROM disttable2 a STRAIGHT_JOIN |
| | WITH(broadcast_left=true) primarystring2 b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_bcast_rs_cs_q2 */ COUNT(1) FROM ( SELECT b.* FROM disttable2 a STRAIGHT_JOIN |
| | WITH(broadcast_left=true) primarystring2_cs b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_bcast_rs_cs_q3 */ COUNT(1) FROM ( SELECT b.* FROM disttable2_cs a |
| | STRAIGHT_JOIN WITH(broadcast_left=true) primarystring2 b ON a.a=b.a) A; |
| | |
| | SELECT /*! calibrate_bcast_cs_q4 */ COUNT(1) FROM ( SELECT b.* FROM disttable2_cs a |
| | STRAIGHT_JOIN WITH(broadcast_left=true) primarystring2_cs b ON a.a=b.a) A; |
| | |
| Local Group By | SELECT /*! calibrate_l_gby_rs_q1 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k.a),primarystring2.a FROM foreignstring440k STRAIGHT_JOIN primarystring2 |
| | WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k.a GROUP BY |
| | primarystring2.a) A; |
| | |
| | SELECT /*! calibrate_l_gby_rs_cs_q2 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k_cs.a),primarystring2.a FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2 WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k_cs.a GROUP |
| | BY primarystring2.a) A; |
| | |
| | SELECT /*! calibrate_l_gby_rs_cs_q3 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k.a),primarystring2_cs.a FROM foreignstring440k STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k.a |
| | GROUP BY primarystring2_cs.a) A; |
| | |
| | SELECT /*! calibrate_l_gby_cs_q4 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k_cs.a),primarystring2_cs.a FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k_cs.a |
| | GROUP BY primarystring2_cs.a) A; |
| | |
| Local Joins | SELECT /*! calibrate_lj_rs_q1 */ COUNT(1) FROM foreignstring440k STRAIGHT_JOIN primarystring2 |
| | WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k.a; |
| | |
| | SELECT /*! calibrate_lj_rs_cs_q2 */ COUNT(1) FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2 WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k_cs.a; |
| | |
| | SELECT /*! calibrate_lj_rs_cs_q3 */ COUNT(1) FROM foreignstring440k STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k.a; |
| | |
| | SELECT /*! calibrate_lj_cs_q4 */ COUNT(1) FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k_cs.a; |
| | |
| Single Part Join and Filter | SELECT /*! calibrate_spj_rs_q1 */ COUNT(1) FROM foreignstring440k STRAIGHT_JOIN primarystring2 |
| | ON primarystring2.a=foreignstring440k.a AND primarystring2.a = 'A1000027'; |
| | |
| | SELECT /*! calibrate_spj_rs_cs_q2 */ COUNT(1) FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2 ON primarystring2.a=foreignstring440k_cs.a AND primarystring2.a = 'A1000027'; |
| | |
| | SELECT /*! calibrate_spj_rs_cs_q3 */ COUNT(1) FROM foreignstring440k STRAIGHT_JOIN |
| | primarystring2_cs ON primarystring2_cs.a=foreignstring440k.a AND primarystring2_cs.a = 'A1000027'; |
| | |
| | SELECT /*! calibrate_spj_cs_q4 */ COUNT(1) FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2_cs ON primarystring2_cs.a=foreignstring440k_cs.a AND primarystring2_cs.a = |
| | 'A1000027'; |
| | |
| Reshuffle and Distributed | SELECT /*! calibrate_reshuff_dgby_rs_q1 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| Group By | COUNT(b.a) FROM disttable2 a STRAIGHT_JOIN primarystring2 b ON a.a=b.a GROUP BY a.b) A; |
| | |
| | SELECT /*! calibrate_reshuff_dgby_rs_cs_q2 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(b.a) FROM disttable2_cs a STRAIGHT_JOIN primarystring2 b ON a.a=b.a GROUP BY a.b) A; |
| | |
| | SELECT /*! calibrate_reshuff_dgby_rs_cs_q3 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(b.a) FROM disttable2 a STRAIGHT_JOIN primarystring2_cs b ON a.a=b.a GROUP BY a.b) A; |
| | |
| | SELECT /*! calibrate_reshuff_dgby_cs_q4 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(b.a) FROM disttable2_cs a STRAIGHT_JOIN primarystring2_cs b ON a.a=b.a GROUP BY a.b) A; |
| | |
| Local Join and Distributed | SELECT /*! calibrate_lj_dgby_rs_q1 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| Group By | COUNT(foreignstring440k.a),primarystring2.id FROM foreignstring440k STRAIGHT_JOIN primarystring2 |
| | WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k.a GROUP BY |
| | primarystring2.id) A; |
| | |
| | SELECT /*! calibrate_lj_dgby_rs_cs_q2 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k_cs.a),primarystring2.id FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2 WITH(table_convert_subSELECT=true) ON primarystring2.a=foreignstring440k_cs.a GROUP |
| | BY primarystring2.id) A; |
| | |
| | SELECT /*! calibrate_lj_dgby_rs_cs_q3 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k.a),primarystring2_cs.id FROM foreignstring440k STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k.a |
| | GROUP BY primarystring2_cs.id) A; |
| | |
| | SELECT /*! calibrate_lj_dgby_cs_q4 */ WITH(leaf_pushdown=true) COUNT(1) FROM (SELECT |
| | COUNT(foreignstring440k_cs.a),primarystring2_cs.id FROM foreignstring440k_cs STRAIGHT_JOIN |
| | primarystring2_cs WITH(table_convert_subSELECT=true) ON primarystring2_cs.a=foreignstring440k_cs.a |
| | GROUP BY primarystring2_cs.id) A; |
+--------------------------------+------------------------------------------------------------------------------------------------------+

Usage

Usage:
  sdb-report calibrate [flags]

For flags that can accept multiple values (indicated by VALUES after the name of the flag),
separate each value with a comma.

Flags:
      --data-path string         The absolute path to the folder on the Master Aggregator that contains the calibration datasets
  -h, --help                     Help for calibrate
      --partition-ratio RATIO    The ratio of CPU cores to database partitions (CPU cores:database partitions) where both values must be less than or equal to 16 (default 2:1)
  -p, --password STRING          The database user's password for connecting to SingleStore. If a password is specified on the command line, it must not contain an unescaped '$' character as it will be replaced by the shell
      --retain-database          Retain the ‘calibrate’ database after the calibration process completes
      --temp-dir ABSOLUTE_PATH   The directory on the Master Aggregator in which to unpack the dataset (ADVANCED)
  -u, --user string              The database user for connecting to SingleStore (default "root")

Global Flags:
      --backup-cache FILE_PATH                File path for the backup cache
      --cache-file FILE_PATH                  File path for the Toolbox node cache
  -c, --config FILE_PATH                      File path for the Toolbox configuration
      --disable-colors                        Disable color output in console, which some terminal sessions/environments may have difficulty with
      --disable-spinner                       Disable the progress spinner, which some terminal sessions/environments may have issues with
  -j, --json                                  Enable JSON output
      --parallelism POSITIVE_INTEGER          Maximum number of operations to run in parallel
      --runtime-dir DIRECTORY_PATH            Where to store Toolbox runtime data
      --ssh-control-persist SECONDS           Enable SSH ControlPersist and set it to the specified duration in seconds
      --ssh-max-sessions POSITIVE_INTEGER     Maximum number of SSH sessions to open per host, must be at least 3
      --ssh-strict-host-key-checking          Enable strict host key checking for SSH connections
      --ssh-user-known-hosts-file FILE_PATH   Path to the user known_hosts file for SSH connections. If not set, /dev/null will be used
      --state-file FILE_PATH                  Toolbox state file path
  -v, --verbosity count                       Increase logging verbosity: valid values are 1, 2, 3. Usage -v=count or --verbosity=count
  -y, --yes                                   Enable non-interactive mode and assume the user would like to move forward with the proposed actions by default

Remarks

This command is interactive unless you use either the --yes or the --json flag to override interactive behavior.

Last modified: October 6, 2023
