Important

The SingleStore 9.1 release candidate (RC) gives you the opportunity to preview, evaluate, and provide feedback on new and upcoming features prior to their general availability. In the interim, SingleStore 9.0 is recommended for production workloads, which can later be upgraded to SingleStore 9.1.

Cross Query Stats and Sampling Cache

Reuse runtime rowcount statistics across related queries, improving cardinality estimates and enabling better plans for common subqueries in future executions.
Cache sampling query results so that the optimizer can avoid rerunning the same sampling queries, reducing optimization time and resource usage.

Each cache entry is identified by a 64‑bit hash derived from the query or subquery shape. The cached value is the corresponding rowcount plus metadata used for cache management and observability.

This feature is available in SingleStore 9.1 and is disabled by default.

Write Path: Collecting Runtime Stats

The cross query stats cache can store runtime rowcount statistics and sampling results from executed queries. The cross_query_stats_collection_strategy global variable controls which statistics are written into the cache and supports the following values:

Off: No cross query stats are stored in the cross query stats cache.
SamplingOnly: Only results from sampling queries are collected.
SamplingAndMisEstimated: Sampling results and significantly misestimated feedback rowcounts are stored, saving memory while still correcting bad estimates.
Full: All legitimate runtime rowcounts that can be used for future optimization are stored.
Auto: Reserved for future flexibility; currently behaves the same as Off and is the default value.

SingleStore heuristically decides which profiled rowcounts are legitimate to store, and exposes the tracked entries in query profiles using the fr_track_table_id and fr_track_hash fields. Refer to Observability and Profiling for more information.

Read Path: Using Cached Stats During Optimization

The optimizer can read the cross query stats cache in two ways:

Use cached feedback rowcounts when estimating cardinalities during optimization.
Use cached sampling results instead of issuing new sampling queries.

These behaviors are controlled by the following session variables:

enable_cross_query_stats_for_sampling: Use the cache when the optimizer would otherwise issue a sampling query.
optimizer_use_cross_query_feedback_stats: Use runtime feedback rowcounts from the cache during cardinality estimation.

Both variables are ENUMs with values Off, On, and Auto, and default to Auto (currently equivalent to Off). The optimizer_use_cross_query_feedback_stats variable also affects the plan cache query hash, so changing it can cause new plans to be generated.

When enable_cross_query_stats_for_sampling is enabled, the optimizer reuses cached sampling results instead of issuing new sampling queries. If multiple sessions need the same sampling result concurrently, only one executes the sampling query while others wait briefly for the result. If the wait exceeds the timeout, waiting sessions issue their own sampling queries.

Cached sampling results expire based on cross_query_stats_sampling_valid_time_min and cross_query_stats_expiration_minutes. Expired or invalidated entries (via DROP SAMPLE FROM cross_query_stats_cache) trigger new sampling queries.

When optimizer_use_cross_query_feedback_stats is enabled and a matching cached rowcount is found, that rowcount takes priority over other estimation sources, and is propagated through later phases of optimization.

Cache Management, TTL, and Memory Limits

The cross query stats cache is a TTL‑based in‑memory cache with garbage collection behavior similar to the in‑memory plan cache.

The following global variables control expiry and memory usage:

Global Variable	Description and Default Value
`cross_query_stats_expiration_minutes`	A cache entry must be reused (written or read) at least once within this interval, otherwise it is evicted. Default: 12 × 60 minutes (12 hours)
`cross_query_stats_max_mb`	Soft memory limit (in megabytes) for the entire cache. Once this limit is reached, new entries are no longer inserted. Default: 1024 MB (1 GB)
`cross_query_stats_sampling_valid_time_min`	Maximum time window (in minutes) since last update during which sampling entries remain valid for reuse. Default: 5 minutes

Global Variable

Description and Default Value

cross_query_stats_expiration_minutes

A cache entry must be reused (written or read) at least once within this interval, otherwise it is evicted.

Default: 12 × 60 minutes (12 hours)

cross_query_stats_max_mb

Soft memory limit (in megabytes) for the entire cache. Once this limit is reached, new entries are no longer inserted.

Default: 1024 MB (1 GB)

cross_query_stats_sampling_valid_time_min

Maximum time window (in minutes) since last update during which sampling entries remain valid for reuse.

Default: 5 minutes

You can also configure whether to store query text for sampling entries using the global variable enable_cross_query_stats_store_query_text. Refer to List of Engine Variables for more information.

Cache entries can also be cleared manually. Refer to SHOW CROSS_QUERY_STATS_CACHE STATUS and DROP ... FROM CROSS_QUERY_STATS_CACHE commands for more information.

Observability and Profiling

Cross query stats are primarily observable through query profiles. For executors whose rowcounts are collected into the cache, profiles include the following additional fields.

fr_track_table_id: Table identifier used by feedback reoptimization.
fr_track_hash: Hash derived from the query/subquery shape; this hash matches the HASH column in the cache tables.

In profiles, est_rows_source indicates the source of estimated rowcounts.

CROSS_QUERY_STATS: Stats originate from the cross query stats cache.
CROSS_QUERY_CACHED_SAMPLING: Sampling results are reused from the cache.

Additional observability is provided by the information schema table, its managed view, and the SHOW CROSS_QUERY_STATS_CACHE STATUS command. Refer to INFORMATION_SCHEMA.CROSS_QUERY_STATS_CACHE and INFORMATION_SCHEMA.MV_CROSS_QUERY_STATS_CACHE for more information.

Interactions with Auto Profiling and Feedback Reoptimization

The cross query stats cache is closely related to existing profiling and reoptimization features:

With auto profile smart mode, enable_auto_profile defaults to TRUE and auto_profile_type defaults to SMART. Under this mode, SingleStore automatically profiles runtime rowcounts for queries with distributed operations.When cross_query_stats_collection_strategy is enabled, the system reuses these profiled stats to populate the cache, without changing auto profile behavior itself.
(Auto) Feedback Reoptimization and cross query stats both aim to improve plans using runtime rowcounts:
- Feedback reoptimization reoptimizes a query using stats from its own execution.
- Cross query stats cache allows a query to leverage stats from other queries that share common subpatterns.

A more general hash algorithm for arbitrary subquery structures is integrated into feedback reoptimization, removing a previous limitation in SingleStore 9.0 versions that restricted it to queries with at most 63 tables.

Examples

Example 1: Enable Collection and Inspect Cache Entries

The following steps show how to enable cross query stats collection, run a query, and inspect the resulting cache entries.

SQL

-- Enable collection of all legitimate runtime rowcounts
SET GLOBAL cross_query_stats_collection_strategy = 'Full';

-- Run a representative query
PROFILE SELECT COUNT(*) FROM t WHERE created_at >= '2025-01-01';

-- Inspect cache entries on the connected aggregator
SELECT * FROM information_schema.cross_query_stats_cache;

-- View aggregate cache statistics
SHOW cross_query_stats_cache STATUS;

In this scenario:

The PROFILE query populates entries in the cross query stats cache (subject to the heuristics described earlier).
INFORMATION_SCHEMA.CROSS_QUERY_STATS_CACHE exposes individual entries, including HASH, ROWCOUNT_INITIALIZED, ROWCOUNT, and HIT_COUNT.
SHOW cross_query_stats_cache STATUS returns totals such as:
- Number of cache entries
- Total cache hits
- Total memory in use (in bytes).

Example 2: Reuse Sampling Results and Reset Sampling Entries

This example enables reuse of sampling results, then shows how to drop only sampling-related entries without clearing the entire cache.

SQL

-- Allow sampling queries to use cached results
SET SESSION enable_cross_query_stats_for_sampling = 'On';

-- Run a query that requires sampling
EXPLAIN SELECT * FROM t WHERE some_predicate;

-- Later, reset only sampling-related cache entries
DROP SAMPLE FROM cross_query_stats_cache;

-- Optionally, check status after the drop
SHOW cross_query_stats_cache STATUS;