# Cross Query Stats and Sampling Cache

The cross query stats cache is an in‑memory cache on each aggregator node that stores rowcount statistics and sampling results from previously executed queries. It is used to:

* Reuse runtime rowcount statistics across related queries, improving cardinality estimates and enabling better plans for common subqueries in future executions.
* Cache sampling query results so that the optimizer can avoid rerunning the same sampling queries, reducing optimization time and resource usage.

Each cache entry is identified by a 64‑bit hash derived from the query or subquery shape. The cached value is the corresponding rowcount plus metadata used for cache management and observability.

This feature is available in SingleStore 9.1 and is disabled by default.

## Write Path: Collecting Runtime Stats

The cross query stats cache can store runtime rowcount statistics and sampling results from executed queries. The `cross_query_stats_collection_strategy` global variable controls which statistics are written into the cache and supports the following values:

* **Off**: No cross query stats are stored in the cross query stats cache.
* **SamplingOnly**: Only results from sampling queries are collected.
* **SamplingAndMisEstimated**: Sampling results and significantly misestimated feedback rowcounts are stored, saving memory while still correcting bad estimates.
* **Full**: All legitimate runtime rowcounts that can be used for future optimization are stored.
* **Auto**: Reserved for future flexibility; currently behaves the same as `Off` and is the default value.

SingleStore heuristically decides which profiled rowcounts are legitimate to store, and exposes the tracked entries in query profiles using the `fr_track_table_id` and `fr_track_hash` fields. Refer to [Observability and Profiling](https://docs.singlestore.com/db/v9.1/query-data/query-tuning/cross-query-stats-and-sampling-cache/#section-id235514321376178.md) for more information.

## Read Path: Using Cached Stats During Optimization

The optimizer can read the cross query stats cache in two ways:

* Use cached feedback rowcounts when estimating cardinalities during optimization.
* Use cached sampling results instead of issuing new sampling queries.

These behaviors are controlled by the following session variables:

* `enable_cross_query_stats_for_sampling`: Use the cache when the optimizer would otherwise issue a sampling query.
* `optimizer_use_cross_query_feedback_stats`: Use runtime feedback rowcounts from the cache during cardinality estimation.

Both variables are ENUMs with values `Off`, `On`, and `Auto`, and default to `Auto` (currently equivalent to `Off`). The `optimizer_use_cross_query_feedback_stats` variable also affects the plan cache query hash, so changing it can cause new plans to be generated.

When `enable_cross_query_stats_for_sampling` is enabled, the optimizer reuses cached sampling results instead of issuing new sampling queries. If multiple sessions need the same sampling result concurrently, only one executes the sampling query while others wait briefly for the result. If the wait exceeds the timeout, waiting sessions issue their own sampling queries.

Cached sampling results expire based on `cross_query_stats_sampling_valid_time_min` and `cross_query_stats_expiration_minutes`. Expired or invalidated entries (via `DROP SAMPLE FROM cross_query_stats_cache`) trigger new sampling queries.

When `optimizer_use_cross_query_feedback_stats` is enabled and a matching cached rowcount is found, that rowcount takes priority over other estimation sources, and is propagated through later phases of optimization.

## Cache Management, TTL, and Memory Limits

The cross query stats cache is a TTL‑based in‑memory cache with garbage collection behavior similar to the in‑memory plan cache.

The following global variables control expiry and memory usage:

| Global Variable                             | Description and Default Value                                                                                                                  |
| ------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| `cross_query_stats_expiration_minutes`      | A cache entry must be reused (written or read) at least once within this interval, otherwise it is evicted.Default: 12 × 60 minutes (12 hours) |
| `cross_query_stats_max_mb`                  | Soft memory limit (in megabytes) for the entire cache. Once this limit is reached, new entries are no longer inserted.Default: 1024 MB (1 GB)  |
| `cross_query_stats_sampling_valid_time_min` | Maximum time window (in minutes) since last update during which sampling entries remain valid for reuse.Default: 5 minutes                     |

You can also configure whether to store query text for sampling entries using the global variable `enable_cross_query_stats_store_query_text`. Refer to [List of Engine Variables](https://docs.singlestore.com/db/v9.1/reference/configuration-reference/engine-variables/list-of-engine-variables.md) for more information.

Cache entries can also be cleared manually. Refer to [SHOW CROSS\_QUERY\_STATS\_CACHE STATUS](https://docs.singlestore.com/db/v9.1/reference/sql-reference/show-commands/show-cross-query-stats-cache-status.md) and [DROP ... FROM CROSS\_QUERY\_STATS\_CACHE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-definition-language-ddl/drop-from-cross-query-stats-cache.md) commands for more information.

## Observability and Profiling

Cross query stats are primarily observable through query profiles. For executors whose rowcounts are collected into the cache, profiles include the following additional fields.

* `fr_track_table_id`: Table identifier used by feedback reoptimization.
* `fr_track_hash:` Hash derived from the query/subquery shape; this hash matches the `HASH` column in the cache tables.

In profiles, `est_rows_source` indicates the source of estimated rowcounts.

* `CROSS_QUERY_STATS`: Stats originate from the cross query stats cache.
* `CROSS_QUERY_CACHED_SAMPLING`: Sampling results are reused from the cache.

Additional observability is provided by the information schema table, its managed view, and the `SHOW CROSS_QUERY_STATS_CACHE STATUS` command. Refer to `INFORMATION_SCHEMA.CROSS_QUERY_STATS_CACHE` and `INFORMATION_SCHEMA.MV_CROSS_QUERY_STATS_CACHE` for more information.

## Interactions with Auto Profiling and Feedback Reoptimization

The cross query stats cache is closely related to existing profiling and reoptimization features:

* With auto profile smart mode, `enable_auto_profile` defaults to `TRUE` and `auto_profile_type` defaults to `SMART`. Under this mode, SingleStore automatically profiles runtime rowcounts for queries with distributed operations.When `cross_query_stats_collection_strategy` is enabled, the system reuses these profiled stats to populate the cache, without changing auto profile behavior itself.
* (Auto) Feedback Reoptimization and cross query stats both aim to improve plans using runtime rowcounts:

  * Feedback reoptimization reoptimizes a query using stats from its own execution.
  * Cross query stats cache allows a query to leverage stats from other queries that share common subpatterns.

A more general hash algorithm for arbitrary subquery structures is integrated into feedback reoptimization, removing a previous limitation in SingleStore 9.0 versions that restricted it to queries with at most 63 tables.

## Examples

## Example 1: Enable Collection and Inspect Cache Entries

The following steps show how to enable cross query stats collection, run a query, and inspect the resulting cache entries.

```sql
-- Enable collection of all legitimate runtime rowcounts
SET GLOBAL cross_query_stats_collection_strategy = 'Full';

-- Run a representative query
PROFILE SELECT COUNT(*) FROM t WHERE created_at >= '2025-01-01';

-- Inspect cache entries on the connected aggregator
SELECT * FROM information_schema.cross_query_stats_cache;

-- View aggregate cache statistics
SHOW cross_query_stats_cache STATUS;
```

In this scenario:

* The `PROFILE` query populates entries in the cross query stats cache (subject to the heuristics described earlier).
* `INFORMATION_SCHEMA.CROSS_QUERY_STATS_CACHE` exposes individual entries, including `HASH`, `ROWCOUNT_INITIALIZED`, `ROWCOUNT`, and `HIT_COUNT`.
* `SHOW cross_query_stats_cache STATUS` returns totals such as:

  * Number of cache entries
  * Total cache hits
  * Total memory in use (in bytes).

## Example 2: Reuse Sampling Results and Reset Sampling Entries

This example enables reuse of sampling results, then shows how to drop only sampling-related entries without clearing the entire cache.

```sql
-- Allow sampling queries to use cached results
SET SESSION enable_cross_query_stats_for_sampling = 'On';

-- Run a query that requires sampling
EXPLAIN SELECT * FROM t WHERE some_predicate;

-- Later, reset only sampling-related cache entries
DROP SAMPLE FROM cross_query_stats_cache;

-- Optionally, check status after the drop
SHOW cross_query_stats_cache STATUS;
```

In this scenario:

* Turning on `enable_cross_query_stats_for_sampling` lets sampling queries check the cache first and reuse stored rowcounts when valid.
* `DROP SAMPLE FROM cross_query_stats_cache` invalidates sampling-related entries (for example, after data or workload changes), while preserving other feedback rowcount entries.

***

Modified at: April 30, 2026

Source: [/db/v9.1/query-data/query-tuning/cross-query-stats-and-sampling-cache/](https://docs.singlestore.com/db/v9.1/query-data/query-tuning/cross-query-stats-and-sampling-cache/)

(An index of the documentation is available at /llms.txt)
