# Statistics and Sampling Tasks

## Disabling and Enabling Automatic Statistics

You can disable or enable autostats on each table. If you disable autostats on a table, statistics must be collected with the [ANALYZE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/operational-commands/analyze.md) command. If autostats is re-enabled, automatic statistics collection will resume (see the next section for details). SingleStore recommends leaving autostats enabled in most situations.

You can disable autostats on an existing table by running:

```sql
ALTER TABLE table_name DISABLE AUTOSTATS;
```

or

```sql
ALTER TABLE table_name AUTOSTATS_ENABLED = OFF;
```

and enable autostats by running

```sql
ALTER TABLE table_name ENABLE AUTOSTATS;
```

or

```sql
ALTER TABLE table_name AUTOSTATS_ENABLED = ON;
```

You can choose whether autostats are enabled or disabled when creating a table by adding the option:

```sql
AUTOSTATS_ENABLED=<TRUE|FALSE>
```

at the end of the `CREATE TABLE` statement, such as:

```sql
CREATE TABLE t (a INT, b INT, SORT KEY (a)) AUTOSTATS_ENABLED=FALSE;
```

When `AUTOSTATS_ENABLED` is set to `TRUE`, autostats categories are set to the following default settings

* For rowstore tables,
  ```sql
  AUTOSTATS_CARDINALITY_MODE = PERIODIC
  AUTOSTATS_HISTOGRAM_MODE = CREATE
  ```
* For columnstore tables,
  ```sql
  AUTOSTATS_CARDINALITY_MODE = INCREMENTAL
  AUTOSTATS_HISTOGRAM_MODE = CREATE
  AUTOSTATS_SAMPLING = ON
  ```

SingleStore allows you to independently control how each category of statistics is automatically gathered.

```sql
AUTOSTATS_CARDINALITY_MODE = [ INCREMENTAL | PERIODIC | OFF ]
AUTOSTATS_HISTOGRAM_MODE = [ CREATE | UPDATE | OFF ]
AUTOSTATS_SAMPLING = [ ON | OFF ]
```

These autostats settings can also be specified or modified through the `ALTER TABLE` and `CREATE TABLE` commands. Multiple options can be combined in a single `CREATE TABLE` statement. For example,

```sql
ALTER TABLE table_name AUTOSTATS_CARDINALITY_MODE = PERIODIC;
```

```sql
CREATE TABLE t1 (a INT, b INT, SORT KEY(a))
AUTOSTATS_CARDINALITY_MODE = INCREMENTAL,
AUTOSTATS_HISTOGRAM_MODE = CREATE,
AUTOSTATS_SAMPLING= ON;
```

> **⚠️ Warning**: Each type of autostats setting may not work for all types of tables. For example, you cannot have incremental autostats on a rowstore table.

Additionally, you can change whether autostats is enabled or disabled by default on future `CREATE TABLE` statements by setting the global variable `default_autostats_enabled` to `TRUE` or `FALSE` on the master aggregator. The default is `TRUE`.

You can also modify the default settings for autostats, individually, by setting the following global variables:

```sql
default_autostats_columnstore_cardinality_mode = [ INCREMENTAL | PERIODIC | OFF ]
default_autostats_rowstore_cardinality_mode = [ PERIODIC | OFF ]
default_autostats_histogram_mode = [ CREATE | UPDATE | OFF ]
default_autostats_columnstore_sampling = [ ON | OFF ]
```

> **📝 Note**: Limitations associated with using autostats:- Enabling `AUTOSTATS_HISTOGRAM_MODE` (either mode besides `OFF`) requires enabling `AUTOSTATS_CARDINALITY_MODE` (either mode besides `OFF`).
> - `INCREMENTAL` cardinality mode is only supported on columnstore tables.
> -  `AUTOSTATS_SAMPLING` is not allowed for rowstore tables. On columnstore tables, it controls whether the columnstore sample is built and used for estimation.

You can see whether autostats is enabled or disabled on a table by running `SHOW CREATE TABLE`.

## Autostats Support upon Upgrading

Any table that is upgraded to a newer version of SingleStore retains its autostats settings. If the previous SingleStore version did not have autostats support, then the upgraded tables will have autostats disabled upon upgrade. Also, if a table had autostats enabled prior to upgrade, the same autostats will be enabled on the table post upgrade. To enable autostats on existing columnstore tables created prior to the upgrade, which SingleStore recommends in most cases, run `ALTER TABLE <table_name> ENABLE AUTOSTATS` on each table. This will begin the process of collecting automatic statistics in the background.

Below is a simple script you can use to enable autostats on all tables (written in bash to minimize dependencies).

Save this script in a file called `enable-autostats.sh`:

```shell
#!/bin/bash

# Enables autostats on all the columnstore tables which currently have autostats disabled.
#
# Run this bash script with the singlestore client arguments you use to connect to the SingleStore deployment, like:
# bash enable-autostats.sh -h<host> -P<port> -u<user> -p<password>
#
# Use the --dry-run option to only print the list of tables to be altered, without actually running the ALTER commands.

set -e

client_args=(singlestore --batch --skip-column-names)
dry_run=false

while [[ $# -gt 0 ]]
do
    case "$1" in
        --dry-run)
            dry_run=true
            echo "NOTE: This is a dry run. The SQL statements are printed for tables where AUTOSTATS are disabled, but will not be executed."
            ;;
        *)
            client_args+=("$1")
            ;;
    esac
    shift
done

# List of table names (fully qualified and quoted)
tables=$(${client_args[@]} -e "select concat(table_schema, '.\`', replace(table_name, '\`', '\`\`'), '\`') from information_schema.tables where table_type = 'BASE TABLE'")

echo "$tables" | while read -r table
do
    show_create_table=$(${client_args[@]} -e "show create table $table")
    is_columnstore=$(${client_args[@]} -e "select 1 from information_schema.tables where concat(table_schema,'.\`',table_name,'\`')='${table}' and storage_type='COLUMNSTORE';")
    alter_stmt=""
    if echo "$show_create_table" | grep -q "AUTOSTATS_CARDINALITY_MODE=OFF"
    then
        if echo "$show_create_table" | grep -q "COLUMNSTORE"
        then
            alter_stmt="${alter_stmt}AUTOSTATS_CARDINALITY_MODE=INCREMENTAL"
        else
            alter_stmt="${alter_stmt}AUTOSTATS_CARDINALITY_MODE=PERIODIC"
        fi
    fi
    if echo "$show_create_table" | grep -q "AUTOSTATS_HISTOGRAM_MODE=OFF"
    then
        if [[ "$alter_stmt" != "" ]]
        then
            alter_stmt="${alter_stmt},"
        fi
        alter_stmt="${alter_stmt}AUTOSTATS_HISTOGRAM_MODE=CREATE"
    fi
    if echo "$show_create_table" | grep -q "AUTOSTATS_SAMPLING=OFF"
    then
        if [[ "${is_columnstore}" == 1 ]]
        then
            if [[ "$alter_stmt" != "" ]]
            then
                alter_stmt="${alter_stmt},"
            fi
        fi
        alter_stmt="${alter_stmt}AUTOSTATS_SAMPLING=ON"
    fi
    if [[ "$alter_stmt" != "" ]]
    then
        alter_stmt="ALTER TABLE $table $alter_stmt"
        echo "AUTOSTATS disabled on table: $table"
        echo "SQL to enable AUTOSTATS:  $alter_stmt"
        if [[ "$dry_run" == false ]]
        then
            echo "Enabling AUTOSTATS on table: $table"
            ${client_args[@]} -e "$alter_stmt"
        fi
    fi
done
```

Run this bash script with the `singlestore` client arguments you use to connect to the master aggregator, like:

```shell
bash enable-autostats.sh -h<host> -P<port> -u<user> -p<password>
```

## Possible reasons to disable autostats include:

* Incremental autostats require a small memory overhead per column to maintain the statistics. If you have a large number of columns across all the columnstore tables in your database, this memory overhead may be significant. In this case you can modify your table to use periodic autostats, as the memory overhead is significantly lower and it will still keep the statistics up to date.
* In `CREATE` mode (when `default_autostats_histogram_mode` is set to `CREATE`), automatic histograms may end up creating histograms on unwanted columns. As each histogram has a small memory overhead, you may choose to switch the histogram mode to `UPDATE` so that all existing histograms will be kept up to date, but no new histograms will be automatically created. You can still create any histograms manually using the `ANALYZE` command.
* Histograms cannot be deleted in `CREATE` mode. Deleting a histogram in this mode would have limited impact, as the same histogram may be created automatically in the future. To choose between specific columns for creating histograms, a table must be in `UPDATE` mode.
* Incremental autostats require a small performance cost on writes to a table. You may wish to consider disabling autostats if you find this write performance cost to be significant on your workload, can achieve better overall performance with manual statistics, and are willing to handle the operational requirements of managing statistics manually.
* Periodic autostats uses a small fixed amount of resources in the background. If the resources used negatively affect your workload, you may find it beneficial to disable autostats
* If you want to “freeze” the statistics, so that the query optimizer continues to use a fixed set of statistics regardless of how the data is changing, you can do so by disabling automatic statistics and using `ANALYZE` to manage statistics.

## Statistics Population after Enabling Autostats

Immediately after enabling autostats on a table where it was previously disabled, the engine will start collecting automatic statistics in the background until it catches up with the current table data. During this time, the query optimizer will use the previous manually-gathered statistics if available, until the autostats are sufficiently up-to-date - see the next section for details.

This background statistics collection uses limited resources to avoid large impacts on the performance of other workloads. Therefore, collecting automatic statistics may take a long time for large datasets, but this only affects the first time statistics are populated after autostats is enabled on a table where it was previously disabled

## Troubleshooting Autostats

## Viewing Cardinality Statistics

You can view the cardinality statistics last collected by querying the `information_schema.OPTIMIZER_STATISTICS` table.

```sql
DESCRIBE INFORMATION_SCHEMA.OPTIMIZER_STATISTICS;

```

```output

+-----------------------+--------------+------+------+---------+-------+
| Field                 | Type         | Null | Key  | Default | Extra |
+-----------------------+--------------+------+------+---------+-------+
| DATABASE_NAME         | varchar(512) | NO   |      | NULL    |       |
| TABLE_NAME            | varchar(512) | NO   |      | NULL    |       |
| COLUMN_NAME           | varchar(512) | NO   |      | NULL    |       |
| ROW_COUNT             | bigint(21)   | YES  |      | NULL    |       |
| NULL_COUNT            | bigint(21)   | YES  |      | NULL    |       |
| CARDINALITY           | bigint(21)   | YES  |      | NULL    |       |
| AUTOSTATS_ENABLED     | tinyint(1)   | YES  |      | NULL    |       |
| AUTOSTATS_OUTDATED    | tinyint(1)   | YES  |      | NULL    |       |
| AUTOSTATS_ROW_COUNT   | bigint(21)   | YES  |      | NULL    |       |
| AUTOSTATS_NULL_COUNT  | bigint(21)   | YES  |      | NULL    |       |
| AUTOSTATS_CARDINALITY | bigint(21)   | YES  |      | NULL    |       |
| ADVANCED_HISTOGRAMS   | bigint(21)   | YES  |      | NULL    |       |
| LEGACY_HISTOGRAMS     | bigint(21)   | YES  |      | NULL    |       |
| RANGE_STATS           | bigint(21)   | YES  |      | NULL    |       |
| SAMPLE_SIZE           | bigint(21)   | YES  |      | NULL    |       |
| LAST_UPDATED          | datetime     | YES  |      | NULL    |       |
+-----------------------+--------------+------+------+---------+-------+
```

See the [OPTIMIZER\_STATISTICS](https://docs.singlestore.com/db/v9.1/reference/information-schema-reference/query-performance-workload-management-and-statistics/optimizer-statistics.md) page for detailed explanation of how to interpret the `information_schema.OPTIMIZER_STATISTICS` table.

## Viewing Histogram Statistics

When collected, histograms can be queried through the `information_schema.ADVANCED_HISTOGRAMS` table:

```sql
DESCRIBE INFORMATION_SCHEMA.ADVANCED_HISTOGRAMS;

```

```output

+-------------------------+---------------+------+------+---------+-------+
| Field                   | Type          | Null | Key  | Default | Extra |
+-------------------------+---------------+------+------+---------+-------+
| DATABASE_NAME           | varchar(512)  | NO   |      | NULL    |       |
| TABLE_NAME              | varchar(512)  | NO   |      | NULL    |       |
| COLUMN_NAME             | varchar(512)  | NO   |      | NULL    |       |
| TYPE                    | varchar(255)  | NO   |      | NULL    |       |
| VERSION                 | bigint(21)    | NO   |      | NULL    |       |
| BUCKET_COUNT            | bigint(21)    | NO   |      | NULL    |       |
| BUCKET_INDEX            | bigint(21)    | NO   |      | NULL    |       |
| RANGE_MIN               | varchar(8192) | YES  |      | NULL    |       |
| RANGE_MAX               | varchar(8192) | YES  |      | NULL    |       |
| UNIQUE_COUNT            | double        | YES  |      | NULL    |       |
| CARDINALITY             | double        | YES  |      | NULL    |       |
| UNIQUE_COUNT_CUMULATIVE | double        | YES  |      | NULL    |       |
| CARDINALITY_CUMULATIVE  | double        | YES  |      | NULL    |       |
+-------------------------+---------------+------+------+---------+-------+
```

See the [ADVANCED\_HISTOGRAMS](https://docs.singlestore.com/db/v9.1/reference/information-schema-reference/query-performance-workload-management-and-statistics/advanced-histograms.md) page for detailed explanation of how to interpret the `information_schema.ADVANCED_HISTOGRAMS` table.

## Out-of-Date Statistics

If there is too large a difference between the data on which the engine has gathered statistics and the actual table data, such as immediately after autostats is enabled, autostats will be considered out-of-date. As soon as the difference becomes small enough, autostats will be considered up-to-date. While autostats are out-of-date, the query optimizer uses the statistics saved the last time `ANALYZE` was run on the table, if any. `EXPLAIN` will show a warning that autostats is out of date. If autostats are missing, sampling will be used to get any needed estimates.

You can accelerate the collection of autostats by running `OPTIMIZE TABLE`:

```sql
OPTIMIZE TABLE <table_name> FLUSH;
OPTIMIZE TABLE <table_name>;
```

After statistics have been updated, you may run `ANALYZE TABLE` to trigger plan invalidation and re-optimization of any query plans that were based on stale statistics - see [Plan Invalidation](https://docs.singlestore.com/db/v9.1/query-data/query-tuning/statistics-and-sampling/statistics-and-sampling-concepts/#section-idm4578590413484832788452287954.md).

## Identifying Missing Statistics Warnings in EXPLAIN

When you run `EXPLAIN` on a query, SingleStore will show information about any missing statistics that the optimizer could take advantage of, and the `ANALYZE` commands that you can run to collect those statistics. With autostats enabled, these messages will not appear or will go away on their own quickly, unless autostats is disabled for the needed columns. So if you see these messages, you typically should double check that autostats is enabled for all tables and columns.

When autostats are disabled, you should normally follow the recommendations in these warnings. The missing statistics listed here are statistics that are not present that the query optimizer would be able to use if collected.

> **📝 Note**: As discussed above, collecting column statistics is strongly recommended, and collecting histograms is recommended in most situations. See [Histograms vs. Sampling](https://docs.singlestore.com/db/v9.1/query-data/query-tuning/statistics-and-sampling/statistics-and-sampling-concepts/#section-idm4556660363819232788448162409.md) for more details.

For example, for the `SELECT` statement below, the `EXPLAIN` output shows that the optimizer is missing column statistics on both tables `t` and `t2` as well as histograms on `t.x`.

```sql
CREATE TABLE t (id BIGINT PRIMARY KEY, x DOUBLE);
CREATE TABLE t2 (id BIGINT PRIMARY KEY, x DOUBLE);
EXPLAIN SELECT * FROM t, t2 WHERE t.id = t2.id AND t.x > 1;

```

```output

+--------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                          |
+--------------------------------------------------------------------------------------------------+
| WARNING: Missing statistics on some tables, consider running the following commands:             |
|     ANALYZE TABLE db.`t2`;                                                                       |
|     ANALYZE TABLE db.`t` COLUMNS `x` ENABLE;                                                     |
|                                                                                                  |
| Project [t.id, t.id2, t.x, t2.id, t2.x]                                                          |
| Gather partitions:all                                                                            |
| Project [t.id, t.id2, t.x, t2.id, t2.x]                                                          |
| NestedLoopJoin                                                                                   |
| |---IndexSeek db.t2, PRIMARY KEY (id) scan:[id = t.id]                                           |
| Filter [t.x > 1]                                                                                 |
| TableScan db.t, PRIMARY KEY (id)                                                                 |
+--------------------------------------------------------------------------------------------------+
```

## Identifying Missing Histograms in the Plancache

In addition, you can view information about any recently run queries which were missing histograms, which you can use to help identify which columns you may need to enable autostats on, or collect histograms on. Again, it’s highly recommended to use autostats, and if you are, then you will either not see missing histogram information, or such information can be disregarded because soon autostats will gather histograms.

The view [MV\_PROSPECTIVE\_HISTOGRAMS](https://docs.singlestore.com/db/v9.1/reference/information-schema-reference/query-performance-workload-management-and-statistics/mv-prospective-histograms.md) shows columns which did not have histograms and were used in queries where a histogram could have been used. This view counts over all queries in the in-memory plancache (see [Understanding the Plancache](https://docs.singlestore.com/db/v9.1/query-data/advanced-query-topics/code-generation/#UUID-0a745a7a-c630-4066-7532-d3ded280d626.md)), i.e. those queries shown in `information_schema.PLANCACHE` and `information_schema.MV_QUERIES`.

If the `AUTOSTATS_HISTOGRAM_MODE` engine variable is set to `CREATE`, a histogram is created if the `USAGE_COUNT` is 2 or greater on the `information_schema.MV_PROSPECTIVE_HISTOGRAMS` view.

```sql
SELECT * FROM information_schema.MV_PROSPECTIVE_HISTOGRAMS;

```

```output

+---------------+------------+-------------+-------------+
| DATABASE_NAME | TABLE_NAME | COLUMN_NAME | USAGE_COUNT |
+---------------+------------+-------------+-------------+
| db            | t          | c           |           3 |
| db            | t          | b           |           3 |
| db            | t          | a           |           3 |
+---------------+------------+-------------+-------------+
```

The view `information_schema.MV_QUERY_PROSPECTIVE_HISTOGRAMS` shows precisely which queries in the plancache could have made use of the histograms. You may join this to views such as `MV_QUERIES` and `MV_ACTIVITIES` to learn more about the queries.

```sql
SELECT * FROM information_schema.MV_QUERY_PROSPECTIVE_HISTOGRAMS;

```

```output

+-----------------------------------+---------------+------------+-------------+-------------+
| ACTIVITY_NAME                     | DATABASE_NAME | TABLE_NAME | COLUMN_NAME | USAGE_COUNT |
+-----------------------------------+---------------+------------+-------------+-------------+
| Select_symbols_t_349f04a075ae9982 | db            | t          | a           |           1 |
| Select_symbols_t_bf93eac63fe6a2e8 | db            | t          | a           |           1 |
| Select_symbols_t_bf93eac63fe6a2e8 | db            | t          | c           |           1 |
| Select_symbols_t_056d53b2265b58ce | db            | t          | a           |           1 |
| Select_symbols_t_4222228538392d61 | db            | t          | b           |           1 |
| Select_symbols_t_ecb971f633697055 | db            | t          | b           |           1 |
| Select_symbols_t_ecb971f633697055 | db            | t          | c           |           1 |
| Select_symbols_t_d0692ff8d63f22e8 | db            | t          | b           |           1 |
| Select_symbols_t_d0692ff8d63f22e8 | db            | t          | c           |           1 |
+-----------------------------------+---------------+------------+-------------+-------------+
```

Both of these views display columns which were missing histograms at the time the queries were compiled. If you later collect histograms, or histograms are collected automatically, the views are not updated unless/until the query is recompiled.

## Exporting and Importing Statistics

It is also possible to export and import statistics. This can be useful for testing purposes - for example, you can copy statistics from one SingleStore cluster to another, in order to emulate the optimizer’s behavior on a test system that does not have the same data.

```sql
ANALYZE TABLE table_name INTO OUTFILE '</path/file.json>';
```

Save column and histograms recorded on `table_name` into the file `/path/file.json` in a JSON representation. You can use absolute or relative paths; relative paths are rooted in the `data` directory of the SingleStore instance.

Note that this exports the current statistics; it does not collect new statistics. It uses whatever set of statistics would currently be used by for query optimization - for column statistics on autostats-enabled tables, this would be the automatic statistics if they are enabled and up-to-date, and otherwise the statistics last saved by `ANALYZE`.

```sql
ANALYZE TABLE table_name INFILE '/path/file.json';
```

Import range and column statistics for table\_name from the file /path/file.json. You can use absolute or relative paths; relative paths are rooted in the data directory of the SingleStore instance.

This command is disallowed if autostats is enabled on the table. If you wish to use imported column statistics, you should disable autostats and then import the statistics. If you wish to use imported histograms only on an autostats-enabled table, you can temporarily disable autostats, run `ANALYZE ... INFILE` to import the statistics, and re-enable autostats.

You can also export and import configuration parameters (including session and global variables) used by the optimizer via a .json file. Session variables are exported from or imported into the current session, while global variables are saved and set globally.

```sql
ANALYZE OPTIMIZER_STATE INTO OUTFILE '</path/file.json>';
ANALYZE OPTIMIZER_STATE INFILE '</path/file.json>';
```

`INTO OUTFILE` stores the configuration parameters into the specified file.

`INFILE` specifies a file from which to import configuration parameters.

> **📝 Note**: Using `INFILE` to set the configuration will update global variables. This can impact other SingleStore sessions.

## Statistics Lifetime

Statistics are tied to a base table, and are persisted through `ALTER TABLE` or by autostats. Statistics are also persisted through server restarts. Statistics are deleted upon running `DROP TABLE` or `ANALYZE TABLE t DROP`.

## Intermittent Statistic Collection

You can configure settings for periodic statistics collection by setting the engine variables `background_statistics_collection_interval` and `background_statistics_collection_threshold`.

The following scenario demonstrates the periodic statistics collection process for autostats-enabled tables.

1. The user creates a new database `DB1` and table `DB1.T1` with autostats enabled and the following default settings:

   * `background_statistics_collection_interval` = 60 (trigger collection every 60 seconds)
   * `background_statistics_collection_threshold` = 0.5 (trigger collection when the ratio of changes to the number of rows hits 0.5)

2. The user inserts 5 rows into the `DB1.T1` table. The `mv_row_change_counts` table reports a total of 5 changes across the partitions for `DB1`.

3. Within 60 seconds, a background thread wakes up and scans the change counts. The row count is saved as 5, and after the stats collection, the `last_analyzed_total_changes` field in the `mv_collected_row_change_counts` table is set to the `total_changes` field, both of which will sum to 5 across partitions.

4. The user inserts 3 new rows. The row change counts now sum to 8.

5. Next time the background thread wakes, it will see 3 changes and observe the last collected row count (5). As 3/5 exceeds the 0.5 ratio, the collection is triggered again.

***

Modified at: May 14, 2026

Source: [/db/v9.1/query-data/query-tuning/statistics-and-sampling/statistics-and-sampling-tasks/](https://docs.singlestore.com/db/v9.1/query-data/query-tuning/statistics-and-sampling/statistics-and-sampling-tasks/)

(An index of the documentation is available at /llms.txt)
