Statistics and Sampling Tasks
On this page
Disabling and Enabling Automatic Statistics
You can disable or enable autostats on each table.
You can disable autostats on an existing table by running:
ALTER TABLE table_name DISABLE AUTOSTATS;
or
ALTER TABLE table_name AUTOSTATS_ENABLED = OFF;
and enable autostats by running
ALTER TABLE table_name ENABLE AUTOSTATS;
or
ALTER TABLE table_name AUTOSTATS_ENABLED = ON;
You can choose whether autostats are enabled or disabled when creating a table by adding the option:
AUTOSTATS_ENABLED=<TRUE|FALSE>
at the end of the CREATE TABLE statement, such as:
CREATE TABLE t (a INT, b INT, SORT KEY (a)) AUTOSTATS_ENABLED=FALSE;
When AUTOSTATS_ is set to TRUE, autostats categories are set to the following default settings
-
For rowstore tables,
AUTOSTATS_CARDINALITY_MODE = PERIODICAUTOSTATS_HISTOGRAM_MODE = CREATE -
For columnstore tables,
AUTOSTATS_CARDINALITY_MODE = INCREMENTALAUTOSTATS_HISTOGRAM_MODE = CREATEAUTOSTATS_SAMPLING = ON
SingleStore allows you to independently control how each category of statistics is automatically gathered.
AUTOSTATS_CARDINALITY_MODE = [ INCREMENTAL | PERIODIC | OFF ]AUTOSTATS_HISTOGRAM_MODE = [ CREATE | UPDATE | OFF ]AUTOSTATS_SAMPLING = [ ON | OFF ]
These autostats settings can also be specified or modified through the ALTER TABLE and CREATE TABLE commands.CREATE TABLE statement.
ALTER TABLE table_name AUTOSTATS_CARDINALITY_MODE = PERIODIC;
CREATE TABLE t1 (a INT, b INT, SORT KEY(a))AUTOSTATS_CARDINALITY_MODE = INCREMENTAL,AUTOSTATS_HISTOGRAM_MODE = CREATE,AUTOSTATS_SAMPLING= ON;
Warning
Each type of autostats setting may not work for all types of tables.
Additionally, you can change whether autostats is enabled or disabled by default on future CREATE TABLE statements by setting the global variable default_ to TRUE or FALSE on the master aggregator.TRUE.
You can also modify the default settings for autostats, individually, by setting the following global variables:
default_autostats_columnstore_cardinality_mode = [ INCREMENTAL | PERIODIC | OFF ]default_autostats_rowstore_cardinality_mode = [ PERIODIC | OFF ]default_autostats_histogram_mode = [ CREATE | UPDATE | OFF ]default_autostats_columnstore_sampling = [ ON | OFF ]
Note
Limitations associated with using autostats:
-
Enabling
AUTOSTATS_(either mode besidesHISTOGRAM_ MODE OFF) requires enablingAUTOSTATS_(either mode besidesCARDINALITY_ MODE OFF). -
INCREMENTALcardinality mode is only supported on columnstore tables. -
AUTOSTATS_is not allowed for rowstore tables.SAMPLING On columnstore tables, it controls whether the columnstore sample is built and used for estimation.
You can see whether autostats is enabled or disabled on a table by running SHOW CREATE TABLE.
Autostats Support upon Upgrading
Any table that is upgraded to a newer version of SingleStore retains its autostats settings.ALTER TABLE <table_ on each table.
Below is a simple script you can use to enable autostats on all tables (written in bash to minimize dependencies).
Save this script in a file called enable-autostats.:
#!/bin/bash# Enables autostats on all the columnstore tables which currently have autostats disabled.## Run this bash script with the mysql client arguments you use to connect to the SingleStore deployment, like:# bash enable-autostats.sh -h<host> -P<port> -u<user> -p<password>## Use the --dry-run option to only print the list of tables to be altered, without actually running the ALTER commands.set -eclient_args=(mysql --batch --skip-column-names)dry_run=falsewhile [[ $# -gt 0 ]]docase "$1" in--dry-run)dry_run=trueecho "NOTE: This is a dry run. The SQL statements are printed for tables where AUTOSTATS are disabled, but will not be executed.";;*)client_args+=("$1");;esacshiftdone# List of table names (fully qualified and quoted)tables=$(${client_args[@]} -e "select concat(table_schema, '.\`', replace(table_name, '\`', '\`\`'), '\`') from information_schema.tables where table_type = 'BASE TABLE'")echo "$tables" | while read -r tabledoshow_create_table=$(${client_args[@]} -e "show create table $table")is_columnstore=$(${client_args[@]} -e "select 1 from information_schema.tables where concat(table_schema,'.\`',table_name,'\`')='${table}' and storage_type='COLUMNSTORE';")alter_stmt=""if echo "$show_create_table" | grep -q "AUTOSTATS_CARDINALITY_MODE=OFF"thenif echo "$show_create_table" | grep -q "COLUMNSTORE"thenalter_stmt="${alter_stmt}AUTOSTATS_CARDINALITY_MODE=INCREMENTAL"elsealter_stmt="${alter_stmt}AUTOSTATS_CARDINALITY_MODE=PERIODIC"fifiif echo "$show_create_table" | grep -q "AUTOSTATS_HISTOGRAM_MODE=OFF"thenif [[ "$alter_stmt" != "" ]]thenalter_stmt="${alter_stmt},"fialter_stmt="${alter_stmt}AUTOSTATS_HISTOGRAM_MODE=CREATE"fiif echo "$show_create_table" | grep -q "AUTOSTATS_SAMPLING=OFF"thenif [[ "${is_columnstore}" == 1 ]]thenif [[ "$alter_stmt" != "" ]]thenalter_stmt="${alter_stmt},"fifialter_stmt="${alter_stmt}AUTOSTATS_SAMPLING=ON"fiif [[ "$alter_stmt" != "" ]]thenalter_stmt="ALTER TABLE $table $alter_stmt"echo "AUTOSTATS disabled on table: $table"echo "SQL to enable AUTOSTATS: $alter_stmt"if [[ "$dry_run" == false ]]thenecho "Enabling AUTOSTATS on table: $table"${client_args[@]} -e "$alter_stmt"fifidone
Run this bash script with the mysql client arguments you use to connect to the master aggregator, like:
bash enable-autostats.sh -h<host> -P<port> -u<user> -p<password>
Possible reasons to disable autostats include:
-
Incremental autostats require a small memory overhead per column to maintain the statistics.
If you have a large number of columns across all the columnstore tables in your database, this memory overhead may be significant. In this case you can modify your table to use periodic autostats, as the memory overhead is significantly lower and it will still keep the statistics up to date. -
In
CREATEmode (whendefault_is set toautostats_ histogram_ mode CREATE), automatic histograms may end up creating histograms on unwanted columns.As each histogram has a small memory overhead, you may choose to switch the histogram mode to UPDATEso that all existing histograms will be kept up to date, but no new histograms will be automatically created.You can still create any histograms manually using the ANALYZEcommand. -
Histograms cannot be deleted in
CREATEmode.Deleting a histogram in this mode would have limited impact, as the same histogram may be created automatically in the future. To choose between specific columns for creating histograms, a table must be in UPDATEmode. -
Incremental autostats require a small performance cost on writes to a table.
You may wish to consider disabling autostats if you find this write performance cost to be significant on your workload, can achieve better overall performance with manual statistics, and are willing to handle the operational requirements of managing statistics manually. -
Periodic autostats uses a small fixed amount of resources in the background.
If the resources used negatively affect your workload, you may find it beneficial to disable autostats -
If you want to “freeze” the statistics, so that the query optimizer continues to use a fixed set of statistics regardless of how the data is changing, you can do so by disabling automatic statistics and using
ANALYZEto manage statistics.
Statistics Population after Enabling Autostats
Immediately after enabling autostats on a table where it was previously disabled, the engine will start collecting automatic statistics in the background until it catches up with the current table data.
This background statistics collection uses limited resources to avoid large impacts on the performance of other workloads.
Troubleshooting Autostats
Viewing Cardinality Statistics
You can view the cardinality statistics last collected by querying the information_ table.
DESCRIBE INFORMATION_SCHEMA.OPTIMIZER_STATISTICS;
+-----------------------+--------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------+--------------+------+------+---------+-------+
| DATABASE_NAME | varchar(512) | NO | | NULL | |
| TABLE_NAME | varchar(512) | NO | | NULL | |
| COLUMN_NAME | varchar(512) | NO | | NULL | |
| ROW_COUNT | bigint(21) | YES | | NULL | |
| NULL_COUNT | bigint(21) | YES | | NULL | |
| CARDINALITY | bigint(21) | YES | | NULL | |
| AUTOSTATS_ENABLED | tinyint(1) | YES | | NULL | |
| AUTOSTATS_OUTDATED | tinyint(1) | YES | | NULL | |
| AUTOSTATS_ROW_COUNT | bigint(21) | YES | | NULL | |
| AUTOSTATS_NULL_COUNT | bigint(21) | YES | | NULL | |
| AUTOSTATS_CARDINALITY | bigint(21) | YES | | NULL | |
| ADVANCED_HISTOGRAMS | bigint(21) | YES | | NULL | |
| LEGACY_HISTOGRAMS | bigint(21) | YES | | NULL | |
| RANGE_STATS | bigint(21) | YES | | NULL | |
| SAMPLE_SIZE | bigint(21) | YES | | NULL | |
| LAST_UPDATED | datetime | YES | | NULL | |
+-----------------------+--------------+------+------+---------+-------+See the OPTIMIZER_information_ table.
Viewing Histogram Statistics
When collected, histograms can be queried through the information_ table:
DESCRIBE INFORMATION_SCHEMA.ADVANCED_HISTOGRAMS;
+-------------------------+---------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+---------------+------+------+---------+-------+
| DATABASE_NAME | varchar(512) | NO | | NULL | |
| TABLE_NAME | varchar(512) | NO | | NULL | |
| COLUMN_NAME | varchar(512) | NO | | NULL | |
| TYPE | varchar(255) | NO | | NULL | |
| VERSION | bigint(21) | NO | | NULL | |
| BUCKET_COUNT | bigint(21) | NO | | NULL | |
| BUCKET_INDEX | bigint(21) | NO | | NULL | |
| RANGE_MIN | varchar(8192) | YES | | NULL | |
| RANGE_MAX | varchar(8192) | YES | | NULL | |
| UNIQUE_COUNT | double | YES | | NULL | |
| CARDINALITY | double | YES | | NULL | |
| UNIQUE_COUNT_CUMULATIVE | double | YES | | NULL | |
| CARDINALITY_CUMULATIVE | double | YES | | NULL | |
+-------------------------+---------------+------+------+---------+-------+See the ADVANCED_information_ table.
Out-of-Date Statistics
If there is too large a difference between the data on which the engine has gathered statistics and the actual table data, such as immediately after autostats is enabled, autostats will be considered out-of-date.ANALYZE was run on the table, if any.EXPLAIN will show a warning that autostats is out of date.
You can accelerate the collection of autostats by running OPTIMIZE TABLE:
OPTIMIZE TABLE <table_name> FLUSH;OPTIMIZE TABLE <table_name>;
After statistics have been updated, you may run ANALYZE TABLE to trigger plan invalidation and re-optimization of any query plans that were based on stale statistics - see Plan Invalidation.
Identifying Missing Statistics Warnings in EXPLAIN
When you run EXPLAIN on a query, SingleStore will show information about any missing statistics that the optimizer could take advantage of, and the ANALYZE commands that you can run to collect those statistics.
When autostats are disabled, you should normally follow the recommendations in these warnings.
Note
As discussed above, collecting column statistics is strongly recommended, and collecting histograms is recommended in most situations.
For example, for the SELECT statement below, the EXPLAIN output shows that the optimizer is missing column statistics on both tables t and t2 as well as histograms on t..
CREATE TABLE t (id BIGINT PRIMARY KEY, x DOUBLE);CREATE TABLE t2 (id BIGINT PRIMARY KEY, x DOUBLE);EXPLAIN SELECT * FROM t, t2 WHERE t.id = t2.id AND t.x > 1;
+--------------------------------------------------------------------------------------------------+
| EXPLAIN |
+--------------------------------------------------------------------------------------------------+
| WARNING: Missing statistics on some tables, consider running the following commands: |
| ANALYZE TABLE db.`t2`; |
| ANALYZE TABLE db.`t` COLUMNS `x` ENABLE; |
| |
| Project [t.id, t.id2, t.x, t2.id, t2.x] |
| Gather partitions:all |
| Project [t.id, t.id2, t.x, t2.id, t2.x] |
| NestedLoopJoin |
| |---IndexSeek db.t2, PRIMARY KEY (id) scan:[id = t.id] |
| Filter [t.x > 1] |
| TableScan db.t, PRIMARY KEY (id) |
+--------------------------------------------------------------------------------------------------+Identifying Missing Histograms in the Plancache
In addition, you can view information about any recently run queries which were missing histograms, which you can use to help identify which columns you may need to enable autostats on, or collect histograms on.
The view MV_information_ and information_.
If the AUTOSTATS_ engine variable is set to CREATE, a histogram is created if the USAGE_ is 2 or greater on the information_ view.
SELECT * FROM information_schema.MV_PROSPECTIVE_HISTOGRAMS;
+---------------+------------+-------------+-------------+
| DATABASE_NAME | TABLE_NAME | COLUMN_NAME | USAGE_COUNT |
+---------------+------------+-------------+-------------+
| db | t | c | 3 |
| db | t | b | 3 |
| db | t | a | 3 |
+---------------+------------+-------------+-------------+The view information_ shows precisely which queries in the plancache could have made use of the histograms.MV_ and MV_ to learn more about the queries.
SELECT * FROM information_schema.MV_QUERY_PROSPECTIVE_HISTOGRAMS;
+-----------------------------------+---------------+------------+-------------+-------------+
| ACTIVITY_NAME | DATABASE_NAME | TABLE_NAME | COLUMN_NAME | USAGE_COUNT |
+-----------------------------------+---------------+------------+-------------+-------------+
| Select_symbols_t_349f04a075ae9982 | db | t | a | 1 |
| Select_symbols_t_bf93eac63fe6a2e8 | db | t | a | 1 |
| Select_symbols_t_bf93eac63fe6a2e8 | db | t | c | 1 |
| Select_symbols_t_056d53b2265b58ce | db | t | a | 1 |
| Select_symbols_t_4222228538392d61 | db | t | b | 1 |
| Select_symbols_t_ecb971f633697055 | db | t | b | 1 |
| Select_symbols_t_ecb971f633697055 | db | t | c | 1 |
| Select_symbols_t_d0692ff8d63f22e8 | db | t | b | 1 |
| Select_symbols_t_d0692ff8d63f22e8 | db | t | c | 1 |
+-----------------------------------+---------------+------------+-------------+-------------+Both of these views display columns which were missing histograms at the time the queries were compiled.
Exporting and Importing Statistics
It is also possible to export and import statistics.
ANALYZE TABLE table_name INTO OUTFILE '</path/file.json>';
Save column and histograms recorded on table_ into the file /path/file. in a JSON representation.data directory of the SingleStore instance.
Note that this exports the current statistics; it does not collect new statistics.ANALYZE.
ANALYZE TABLE table_name INFILE '/path/file.json';
Import range and column statistics for table_
This command is disallowed if autostats is enabled on the table.ANALYZE . to import the statistics, and re-enable autostats.
You can also export and import configuration parameters (including session and global variables) used by the optimizer via a .
ANALYZE OPTIMIZER_STATE INTO OUTFILE '</path/file.json>';ANALYZE OPTIMIZER_STATE INFILE '</path/file.json>';
INTO OUTFILE stores the configuration parameters into the specified file.
INFILE specifies a file from which to import configuration parameters.
Note
Using INFILE to set the configuration will update global variables.
Statistics Lifetime
Statistics are tied to a base table, and are persisted through ALTER TABLE or by autostats.DROP TABLE or ANALYZE TABLE t DROP.
Intermittent Statistic Collection
You can configure settings for periodic statistics collection by setting the engine variables background_ and background_.
The following scenario demonstrates the periodic statistics collection process for autostats-enabled tables.
-
The user creates a new database
DB1and tableDB1.with autostats enabled and the following default settings:T1 -
background_= 60 (trigger collection every 60 seconds)statistics_ collection_ interval -
background_= 0.statistics_ collection_ threshold 5 (trigger collection when the ratio of changes to the number of rows hits 0. 5)
-
-
The user inserts 5 rows into the
DB1.table.T1 The mv_table reports a total of 5 changes across the partitions forrow_ change_ counts DB1. -
Within 60 seconds, a background thread wakes up and scans the change counts.
The row count is saved as 5, and after the stats collection, the last_field in theanalyzed_ total_ changes mv_table is set to thecollected_ row_ change_ counts total_field, both of which will sum to 5 across partitions.changes -
The user inserts 3 new rows.
The row change counts now sum to 8. -
Next time the background thread wakes, it will see 3 changes and observe the last collected row count (5).
As 3/5 exceeds the 0. 5 ratio, the collection is triggered again.
Last modified: July 2, 2025