8.9 Release Notes

To deploy a SingleStore 8.9 cluster, refer to the Deploy SingleStore Guide.
To upgrade a self-managed install to this release, follow this guide.
To make a backup of a database in this release or to restore a database backup to this release, follow this guide.
New deployments of SingleStore 8.9 require a 64-bit version of RHEL/AlmaLinux 7 or later, or Debian 8 or later, with kernel 3.10 or later and glibc 2.17 or later. Refer to the System Requirements and Recommendations for additional information.

Release Highlights

Note

This is the complete list of new features and fixes in SingleStore engine version 8.9.

Full-Text Search - Analyzers and Tokenizers

Full-text search using SingleStore's VERSION 2 full-text index has been enhanced with support for custom analyzers and tokenizers. With this enhancement, SingleStore full-text indexes can be created to support languages other than English, text that contains emails and URLs, custom whitespace processing, and more. The n_gram tokenizer, which breaks words into n-grams (sequences of n adjacent symbols), is included. The full set of Apache Lucene analyzers and tokenizers is supported. Refer to Full Text VERSION 2 Custom Analyzers for more information.

Full-Text Search - Enhanced BM25 Scoring

Full-text search using SingleStore's VERSION 2 full-text index has updated BM25 scoring functionality.

The BM25 function has been enhanced with support for boolean and boost queries, phrase and proximity search queries, and queries over multiple columns. Refer to BM25 for more information and examples.

A new function, BM25_GLOBAL, has been added to provide BM25 scoring across all partitions. With this new function, all rows in a table are scored together; collection and term statistics are calculated for a table, ensuring accurate scores relative to all rows in a table. The BM25_GLOBAL function augments the existing BM25 and MATCH functions, and it is more accurate and more expensive than both of these functions. Refer to BM25 for more information on the BM25_GLOBAL function.
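As a rough sketch of how the two scoring functions might be compared, the queries below score the same search term with BM25 and with BM25_GLOBAL. The articles table, its columns, and the index name are hypothetical, and the call shape (table name followed by a 'column:term' expression) is an assumption here; confirm the exact signatures in the BM25 reference before relying on them.

```sql
-- Hypothetical table for illustration only; not part of the release notes.
CREATE TABLE articles (
  id INT,
  title VARCHAR(200),
  body TEXT,
  SORT KEY (id),
  FULLTEXT USING VERSION 2 art_ft_index (title, body)
);

-- Scoring with BM25 (assumed call shape: table name, then 'column:term').
SELECT id, title, BM25(articles, 'body:database') AS score
FROM articles
WHERE MATCH (TABLE articles) AGAINST ('body:database')
ORDER BY score DESC
LIMIT 10;

-- Scoring across all partitions: statistics are calculated for the whole table.
-- Assumed to take the same arguments as BM25.
SELECT id, title, BM25_GLOBAL(articles, 'body:database') AS score
FROM articles
WHERE MATCH (TABLE articles) AGAINST ('body:database')
ORDER BY score DESC
LIMIT 10;
```

Because BM25_GLOBAL is described above as more accurate but more expensive, a reasonable pattern is to use it only where scores must be comparable across all rows in the table.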

Iceberg Continuous Ingest

Added support for continuous ingest of data from Iceberg tables. Upsert and append-only workloads are supported. In addition, manual upserts with the CREATE OR REPLACE command are supported. Refer to Iceberg Ingest for more information.

Iceberg - New Catalogs

Added support for Snowflake, REST, JDBC, Hive, and Polaris Catalogs for Iceberg Ingest using pipelines. Refer to Iceberg Ingest for more information.

Enhanced Disk Spilling

Added disk spilling for RIGHT and FULL OUTER JOIN. Refer to Disk Spilling for more information.

Writable Views

Writable views allow users to run UPDATE, INSERT, and DELETE queries on views. To enable writable views, set the enable_writable_views global variable to 1. Query the information_schema.VIEWS view to inspect if a view can be updated. Refer to CREATE VIEW for more information.
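A minimal sketch of the workflow described above follows. The orders table and open_orders view are hypothetical, and the IS_UPDATABLE column is assumed from the standard information_schema.VIEWS layout; check CREATE VIEW for the exact rules on which views are writable.

```sql
-- Enable writable views before creating the view (variable named in the notes above).
SET GLOBAL enable_writable_views = 1;

-- Hypothetical base table and view, for illustration only.
CREATE TABLE orders (
  id INT PRIMARY KEY,
  status VARCHAR(20),
  amount DECIMAL(10,2)
);

CREATE VIEW open_orders AS
  SELECT id, status, amount FROM orders WHERE status = 'open';

-- DML issued against the view rather than the base table.
INSERT INTO open_orders (id, status, amount) VALUES (1, 'open', 25.00);
UPDATE open_orders SET amount = 30.00 WHERE id = 1;
DELETE FROM open_orders WHERE id = 1;

-- Inspect whether the view can be updated.
-- IS_UPDATABLE is an assumption based on the standard VIEWS columns.
SELECT TABLE_NAME, IS_UPDATABLE
FROM information_schema.VIEWS
WHERE TABLE_NAME = 'open_orders';
```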

Other Improvements and Fixes

Vector Index on Nullable Column

Vector Indexes can be created on columns that are nullable. Prior to this improvement, vector indexes could only be created on columns that were declared NOT NULL. With this improvement, a user can insert a row containing text and a NULL vector value into a table with a vector index. The user can subsequently obtain a vector embedding for the text and update the row with that vector value. The updated value will be added to the vector index.
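The insert-then-update flow described above might look like the following sketch. The docs table, the four-dimensional vector, and the IVF_FLAT index options are illustrative assumptions; refer to Vector Indexing for the supported index types and options.

```sql
-- Hypothetical table: the embedding column is nullable (no NOT NULL).
CREATE TABLE docs (
  id INT PRIMARY KEY,
  body TEXT,
  embedding VECTOR(4),
  VECTOR INDEX idx_embedding (embedding)
    INDEX_OPTIONS '{"index_type":"IVF_FLAT"}'
);

-- Step 1: insert the text before an embedding exists.
INSERT INTO docs (id, body, embedding)
VALUES (1, 'SingleStore 8.9 release notes', NULL);

-- Step 2: once an embedding has been computed elsewhere, update the row.
-- The values below are placeholders, not a real embedding.
UPDATE docs
SET embedding = '[0.1, 0.2, 0.3, 0.4]'
WHERE id = 1;
```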

Vector Index Memory Tracking

The memory used by vector indexes can be tracked using the alloc_vector_index metric which is now available in SHOW STATUS EXTENDED. Refer to Vector Indexing and Tuning Vector Indexes and Queries for more information.

Vector Index Merger

Added the vector index merger which combines per-segment vector indexes to create a cross-segment vector index and improve the performance of vector search queries. Refer to Vector Indexing for more information.

High Availability for the Master Aggregator

Note: This is a Preview feature and will be part of an upcoming add-on pricing package.

High availability (HA) for the Master Aggregator (MA) node enhances the reliability of mission-critical workloads. This feature automatically elects and promotes a new MA if the primary MA fails, ensuring the continuous operation of a cluster. Refer to Multi-Datacenter Failover for more information.

Other Performance Enhancements

Enhancement: Added sub-segment elimination for flexible parallelism. Refer to Flexible Parallelism for more information.
Enhancement: Performance of full-text search, specifically throughput performance, has been significantly improved.
Enhancement: Improved performance of VECTOR type user-defined variables (UDVs). These variables no longer require using extra typecast to BLOB data type.
Enhancement: Improved the performance of LOAD DATA queries that include the CHARACTER SET clause.
Enhancement: Improved performance of the CREATE PROJECTION command by skipping unique key checks.
Enhancement: Improved the performance of REPLACE query into a columnstore table when table-level locking is triggered.
Enhancement: Improved the performance of CREATE {TABLE | TABLES} AS INFER PIPELINE queries.
Enhancement: Optimized full-text queries that use ORDER BY ... LIMIT on a full-text score and that optionally filter on the same full-text clause.
Enhancement: Significantly improved the performance (~20x) of certain JSON-based SQL queries, when JSON objects contain arrays of sub-objects. This optimization reduces the need to normalize the data into multiple tables to achieve high analytics performance.

Queries that expand JSON arrays (with or without any aggregations) and/or perform the following operations benefit from this optimization:
- Group by a field outside the array (in the GROUP BY clause)
- Filter on the fields in the array
Enhancement: Optimized parsing of JSON computed columns. This optimization improves INSERT, pipeline, and load data performance by parsing JSON data once for multiple JSON computed columns. Refer to JSON Computed Column Optimization for more information.

New Information Schema Views and Columns

Enhancement: Added the ACTIVE_METADATA_VERSIONS information schema view that provides visibility into active metadata transaction read versions. (8.9.31)
Enhancement: Added the MV_GC_EVENTS information schema view that provides visibility into garbage collection passes. This view is gated behind the preview feature and engine variable enable_gc_events. (8.9.31)
Enhancement: Added the TABLE_NAME column to the LOAD_DATA_ERRORS information schema view. TABLE_NAME is the name of the table associated with the error.
Enhancement: Added the NODE_ID column to the MV_RECOVERY_STATUS information schema view that specifies the ID of the node from which the database is being recovered.
Enhancement: Added the MV_BOTTOMLESS_API_EVENTS_SUMMARY information schema view that contains a summary of remote API calls made from the engine. Refer to MV_BOTTOMLESS_API_EVENTS_SUMMARY for more information.
Enhancement: Added the following columns (metrics) to the MV_BOTTOMLESS_STATUS_EXTENDED and MV_BOTTOMLESS_SUMMARY information schema views:
- IS_BOTTLE_SERVICE_UP: Indicates whether the bottle service is up.
- BOTTLE_SERVICE_UPTIME_DOWNTIME_SECS: Specifies the minimum uptime when the bottle service is up and the maximum downtime when the bottle service is down.

New Commands and Functions

Enhancement: Introduced new JSON_BUILD_ARRAY function. (8.9.5)
New feature: Added the FULLTEXT SERVICE STOP command. This command stops the full-text V2 service running on any node connected to the aggregator on which the command is run. Refer to FULLTEXT SERVICE STOP for more information.
New feature: Added the following Identifier Generation Functions:
- UUID_TO_BIN
- BIN_TO_UUID
- IS_UUID
New feature: Added a SHOW FULLTEXT SERVICE METRICS command that displays the diagnostic metrics for the JLucene full-text search in JSON format. Refer to SHOW FULLTEXT SERVICE METRICS for more information.
New feature: Added a SHOW CDC EXTRACTOR POOL command that displays information about the CDC-in pipelines. Refer to SHOW CDC EXTRACTOR POOL for more information.
New feature: Added a new JSON_MERGE_PATCH function that merges two JSON objects into a single JSON object. Refer to JSON_MERGE_PATCH for more information.
New feature: Added support for Lateral Join. Lateral join allows a subquery in the FROM clause of a SQL query to reference another table in that same FROM clause, which can simplify query syntax. Refer to Lateral Join for more information.
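To illustrate the lateral join entry above, the sketch below returns each customer's most recent order by letting the subquery reference the customers table from the same FROM clause. The customers and orders tables and their columns are hypothetical, and the comma-style LATERAL form is assumed from standard SQL; check Lateral Join for the join forms SingleStore accepts.

```sql
-- Hypothetical tables: customers(id, name) and orders(id, customer_id, amount, created_at).
SELECT c.id, c.name, recent.order_id, recent.amount
FROM customers AS c,
     LATERAL (
       -- The subquery references c, another table in the same FROM clause.
       SELECT o.id AS order_id, o.amount
       FROM orders AS o
       WHERE o.customer_id = c.id
       ORDER BY o.created_at DESC
       LIMIT 1
     ) AS recent;
```

Without a lateral join, the same result typically needs a window function or a correlated subquery per selected column.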

New or Modified Engine Variables

Refer to List of Engine Variables for information on each of the following engine variables.

Enhancement: Added an engine variable optimizer_disable_subselect_to_join_cte_preprocess which can be used to disable the subselect to join rewrite for CTE preprocessing before inlining or materialization. (8.9.16)

Enhancement: Added an engine variable optimizer_enable_merge_unioned_queries_rewrite which can be used to enable or disable the rewrite that merges union queries. This variable is ON by default. (8.9.15)
Enhancement: Added an engine variable disk_plan_gc_pause_seconds_on_startup that disables disk plan garbage collection on startup. This variable prevents hot plans from being unintentionally disk garbage collected. (8.9.11)
Enhancement: Added a new engine variable scheduler_slow_loop_seconds that specifies the threshold for triggering the verbose logging of scheduler thread timing. (8.9.8)
Enhancement: Added a new engine variable scheduler_slow_ready_queue_seconds that specifies the threshold for triggering logging of slow ready queue draining. (8.9.8)
Enhancement: Introduced pipelines_iceberg_data_workers_heap_size global engine variable to control memory used by iceberg data processing. (8.9.5)
Enhancement: Added a new engine variable sync_partitions_timeout_sec that specifies the timeout (in seconds) to synchronize the cluster metadata across the cluster.
Enhancement: Added a new engine variable disconnect_client_on_invalid_connection_state. If enabled, client connections are closed when their state becomes invalid.
Enhancement: Added a new engine variable synchronize_reference_timeout_ms that specifies the time (in seconds) long running queries wait for reference databases to synchronize on commit in the cluster.
Enhancement: Added a new engine variable assume_udfs_deterministic that controls behavior where SingleStore does extra work to avoid issues caused by UDFs that return different values when called repeatedly (i.e., are non-deterministic).
Enhancement: Added a new engine variable max_autostats_update_workers to tune the maximum number of background autostats update workers.
Enhancement: Added a new engine variable enable_writable_views that enables creation of writable views. Refer to CREATE VIEW for more information.
Enhancement: Added a new engine variable recovery_concurrency that controls the replay and database initialization concurrency during database recovery.
Enhancement: Updated the minimum value of json_document_max_children engine variable to 1 (from 128 previously).
Enhancement: Disabled the optimize_json_computed_column engine variable by default.
Enhancement: Added a new engine variable enable_block_level_stats_collection that controls the collection of block-level statistics for sub-segment elimination for flexible parallelism.
Enhancement: Added a new engine variable enable_block_stats_use_in_query that controls whether the block-level statistics are read and used during scan as part of sub-segment elimination for flexible parallelism.
Enhancement: Added a new engine variable pipelines_iceberg_heap_size to control heap size specifically for Iceberg pipelines.
Enhancement: Added json_collation global variable to control collation of JSON. The value of json_collation can be either utf8_bin or utf8mb4_bin.
Enhancement: Added a method to throttle upload ingest when blob cache has low evictability and running out of disk space is imminent. This is controlled via the following two new engine variables:
- bottomless_upload_throttle_hard_limit_cache_usability: The usability (free space + evictable space) of blob cache below which all columnstore ingest is throttled.
- bottomless_upload_throttle_soft_limit_cache_usability: The usability (free space + evictable space) of blob cache below which some columnstore ingest is throttled.
Enhancement: Added a new optimizer_not_null_filter_derivation engine variable that controls a new filter derivation rewrite.
Enhancement: Added a new engine variable external_functions_service_buffer_mb that sets the maximum size (in MB) of the memory-mapped region used to communicate between the engine and collocated services.
Bugfix: During the upgrade to SingleStore 8.9, if the value of fts2_max_connections is equal to 100000, it is set to 32.
Enhancement: Added a new engine variable early_snapshot_timeout_seconds that specifies the time (in seconds) to wait after an ALTER or TRUNCATE command before taking an early snapshot.

Miscellaneous

Updates

Enhancement: Added a knob to disable cardinality estimates on JSON/BSON columns during analyze. (8.9.34)
Bugfix: Fixed query performance issue caused by heuristic estimate selectivity being overwritten. (8.9.34)
Bugfix: Updated third party libraries to fix security vulnerabilities. (8.9.34)
Bugfix: Fixed an issue where monitoring pipelines would sometimes fail to produce data for large source clusters. (8.9.34)
Bugfix: Fixed row count estimation for union subqueries containing materialized common table expressions (MCTEs). (8.9.34)

Bugfix: Updated third party libraries to fix security vulnerabilities. (8.9.33)

Enhancement: Added support for fully enclosed optimization on nullable columns and segment elimination on IS NULL filter. (8.9.31)
Bugfix: Fixed incorrect results in an ORDER BY <sort key> LIMIT query, caused by an issue where rows with null values in the sort key may be skipped. (8.9.31)

Bugfix: Fixed results for NOT IN subselects when the left input table has NULL values. (8.9.29)

Enhancement: Added Lucene logs to the cluster report. (8.9.28)
Bugfix: Fixed a bug in lateral joins. (8.9.28)

Bugfix: Fixed the predicate pushdown logic for materialized common table expressions. (8.9.27)
Bugfix: Fixed an issue where periodic autostat was disabled when it should not have been. (8.9.27)

Bugfix: Fixed a crash in the lock-free hashtable when the engine is under high memory pressure. An out-of-memory error is now reported instead. (8.9.26)
Enhancement: Users with database visibility can now query the *_BOTTOMLESS_* information schema views in the respective database. CLUSTER permissions are no longer enforced for querying these views. (8.9.25)
Enhancement: Improved the error message for parameter count mismatch during query parsing. (8.9.25)
Enhancement: Disabled segment elimination for IN clauses for which the left-hand expression is not a table column. (8.9.25)
Bugfix: Fixed a leak in a rare scenario. (8.9.25)
Bugfix: Fixed the ER_BAD_TABLE_ERROR error on UPDATE or DELETE queries with a specific shape. (8.9.25)
Bugfix: Fixed a crash in a shallow copy of a table with a pending ALTER operation. (8.9.25)
Bugfix: Fixed a bug in CTE query rewrites. (8.9.25)
Bugfix: Fixed a potential replay failure on a table that uses full-text version 2. (8.9.25)
Bugfix: Fixed an issue where IN-list factorization did not work with newline characters around the IN-list. (8.9.25)

Enhancement: Extended a rewrite to allow merging derived tables with JSON functions. (8.9.24)
Enhancement: Changed the default blob cache size to 75% of the total disk for small disk config. (8.9.24)
Enhancement: Enabled rewriting correlated subselects that depend on more than one outer table with conditions other than equality. (8.9.24)
Enhancement: Prevented the destination table from being replaced by projection on update or delete. (8.9.24)
Enhancement: Renamed GIN index to Multi-Value Hash Index. The GIN keyword is deprecated. (8.9.24)
Bugfix: Fixed a bug that caused filter-aware optimizations to not be applied above MCTE when involved in hash joins. (8.9.24)
Bugfix: Fixed a bug that occurred when CTEs are rewritten. (8.9.24)

Enhancement: Added more metrics to the MV_CLOUD_PER_COMPUTE_REMOTE_STATS and MV_CLOUD_PER_STORAGE_REMOTE_STATS views for tracking of retention log chunks and snapshots. (8.9.22)
Enhancement: Added support for stopwords token_filter for custom analyzers for FTS VERSION 2. (8.9.22)
Bugfix: Fixed invalid optree from rewrite on union queries matching a very specific shape. (8.9.22)

Enhancement: Added estimates for query memory usage based on static row count and row size estimates to improve Workload Manager (WM) memory management. (8.9.21)
Enhancement: Added support to the GIN Index for BSON for expressions, built-in functions, user-defined variables (UDVs), and typecasting. User-defined functions (UDFs) and non-deterministic built-in functions, such as rand(), remain unsupported. (8.9.21)
Bugfix: Fixed an issue that could lead to undefined behavior when a node failed concurrently with the computation of the replication distribution tree. (8.9.21)
Bugfix: Fixed a NULL value inline bug by adding typecast operations above the referencing field. (8.9.21)
Bugfix: Fixed an issue where a node that had not been upgraded could trigger undefined behavior if the node's processlist was queried while the node was processing an internal RPC. (8.9.21)

Enhancement: Added a KILL option for the DETACH DATABASE command. (8.9.20)

Enhancement: Improved optimization speed of parameterized IN-lists by limiting traversal depth. (8.9.19)
Enhancement: Added support for delimited batch sets in external functions. (8.9.19)
Enhancement: High Availability for the Master Aggregator (HA for MA) is now enabled exclusively for Enterprise edition. (8.9.19)
Enhancement: OPTIMIZE TABLE ... INDEX is now more responsive to KILL QUERY statements. (8.9.19)
Bugfix: Fixed a bug that caused DROP DATABASE to hang in a rare condition. (8.9.19)
Bugfix: Fixed an issue that returned wrong results in pipelines in a rare race condition. (8.9.19)
Bugfix: Fixed an issue that caused a crash when a query referenced too many tables. (8.9.19)
Bugfix: Changed options for loading libpam.so. (8.9.19)
Bugfix: Fixed a race condition that caused DETACH DATABASE to hang. (8.9.19)
Bugfix: Fixed a buffer overflow issue that occurred while de-parameterizing a multi-column IN-list filter. (8.9.19)
Bugfix: Fixed an issue to prevent LRU eviction from stalling when several table modules are stale. (8.9.19)
Bugfix: Fixed a bug in JSON_TO_ARRAY() join pushdown optimization that occurred when the JSON value being extracted from table_col is a JSON array. (8.9.19)

Enhancement: General improvements and optimizations for CHECK BOTTOMLESS CHECKSUM command. (8.9.18)
Enhancement: Added support for JSON arrays in pipelines loading JSON files. Each JSON record in the array is loaded as a separate row. (8.9.18)
Bugfix: Fixed ALTER RESOURCE POOL to respect queue depth constraints. (8.9.18)
Bugfix: Fixed a bug that produced incorrect results when performing a hash join with a condition that compares mismatched types where one of the types is BSON. (8.9.18)
Bugfix: Fixed a crash caused by libgcc btree bug. (8.9.18)
Bugfix: Reduced contention on a global lock controlling modules, affecting code loading and unwinding of exceptions. (8.9.18)
Bugfix: Removed garbage collection from cleaning up arrangements when such garbage collection is already in process. (8.9.18)

Bugfix: Fixed an issue impacting creating or altering a resource pool with a CPU percentage limitation after an MA failover. (8.9.17)

Bugfix: Fixed CTE subselect to join preprocessing in the INTERSECT/EXCEPT CTE case. (8.9.16)
Bugfix: Reserved memory space for segment_id in the pseudo column's MemoryFile. (8.9.15)
Bugfix: Fixed a bug that occurred when a query used I0 as a column name. (8.9.15)
Bugfix: Fixed a crash that was due to an issue with the rewrite that merges union queries. (8.9.15)

Enhancement: Added support for the LIMIT clause in prepared statements. (8.9.14)
Enhancement: Added an option to run queries that failed asynchronous compilation, allowing users to introspect the compilation results. (8.9.14)
Bugfix: Fixed an issue to prevent a crash when JSON_TO_ARRAY optimization is run against non-nullable JSON columns. (8.9.14)

Enhancement: Added enhanced query tracing in Query_completion event traces. Refer to Query History for details. (8.9.13)
Enhancement: Added the ability for a user to change the login password when the variable password_expiration_mode is set to LIMITED_ACCESS. Now the user is allowed to login even after the password expires but can only execute commands that update the password such as ALTER USER or SET PASSWORD. (8.9.13)
Enhancement: Added a password expiry warning message that is raised each time the user executes a query. This warning starts appearing 14 days before the password actually expires. (8.9.13)
Enhancement: Added an audit logging mode for only the root user. It is a startup-only variable similar to other audit logging configuration variables and should be set along with the auditlog_level. (8.9.13)
Bugfix: Fixed a bug that caused a node to crash when the variable ignore_foreign_keys is ON and the ALTER TABLE command is used to add a foreign key. (8.9.13)
Bugfix: Changed the output projection DDL to CREATE PROJECTION instead of CREATE TABLE _$_table_name_$ _ in debug profile. (8.9.13)
Bugfix: Fixed an issue where DDL statements that were concurrent with PROMOTE AGGREGATOR TO MASTER did not get forwarded to the new Master Aggregator.
Bugfix: Fixed a bug to prevent a crash when using FTS v2 with CTE. (8.9.13)

Bugfix: Fixed a bug that caused incorrect results when running INFER PIPELINE CSV. (8.9.12)
Bugfix: Fixed a rare deadlock occurring when DETACH DATABASE is run. (8.9.12)

Enhancement: Added an internal allocator Alloc_connection_context which tracks certain per-connection allocations that were previously tracked using the standard allocator. (8.9.11)
Bugfix: Fixed a crash that occurred when aggregator functions are used with VECTOR type data inside other built-ins. (8.9.11)
Bugfix: Fixed the userDictionary parameter for the Korean full-text tokenizer. (8.9.11)

Enhancement: Added a vector index cache to limit the amount of memory used by vector indexes. Refer to Vector Indexing for details. (8.9.10)
Enhancement: Extended the VECTOR_SUM aggregate function to support the VECTOR data type. (8.9.10)
Enhancement: Computed column definitions now support the SPLIT function. (8.9.10)
Enhancement: The DROP and ALTER TABLE commands no longer have to wait for the plan garbage collector. (8.9.10)
Bugfix: Fixed a crash occurring when a table-valued function (TVF) column is used in a WHERE clause without a wrapping JSON_EXTRACT function in a TABLE(JSON_TO_ARRAY()) join. (8.9.10)
Bugfix: Fixed a bug that caused CDC-in pipelines to fail while inferring the table schema, with the "Failed to allocate slot in the extractor pool" error. (8.9.10)
Bugfix: Fixed a deadlock between ALTER and failover in a rare race condition. (8.9.10)
Bugfix: Fixed a critical issue where clusters were failing to connect to AWS remote storage due to CURL request timeouts. (8.9.10)
Bugfix: Fixed a bug where a CDC pipeline gets stuck while waiting for data. (8.9.10)

Bugfix: Fixed support for plan pinning in IN list factorization. (8.9.9)
Bugfix: Fixed a bug that prevented proper error handling on socket timeout. (8.9.9)
Bugfix: Fixed a bug in JSON_ARRAY_CONTAINS_STRING query rewrite for query shapes with the LIMIT clause. (8.9.9)

Enhancement: Improved pipeline error clearing. Additional errors are cleared when CLEAR PIPELINES ERRORS is run or when the pipelines_errors_retention_minutes limit is reached. (8.9.8)
Bugfix: Fixed network connectivity performance issues impacting BYOC clusters communicating with the remote storage. (8.9.8)
Bugfix: Fixed a bug in lateral join to preserve projection field aliases for lateral join subselects. (8.9.8)

Enhancement: Improved the behavior of garbage collection for plancache. (8.9.7)
Bugfix: Fixed a bug caused by using a (cross-segment) vector index (Vector Indexing) with null vectors. Users on prior 8.9 versions will need to recreate (drop and add) vector indexes for this fix to be effective as the cross-segment indexes are built in the background. (8.9.7)

Bugfix: Fixed a data-dependent crash condition occurring in certain TABLE(JSON_TO_ARRAY(... joins introduced in version 8.9. (8.9.6)
Enhancement: Added new nori (Korean) analyzer customizations for Full-Text Search V2. (8.9.5)
Enhancement: Added support for updated Standard and Enterprise licenses. (8.9.5)
Enhancement: Added logging for LRU compiled unit eviction. (8.9.5)
Enhancement: Introduced support for placeholders for partition ID and timestamp in the SELECT INTO ... file name command. (8.9.5)
Bugfix: Fixed a bug causing accumulation of .rem files on disk. (8.9.5)
Bugfix: Fixed out of memory (OOM) errors and extra memory usage in Iceberg ingest. (8.9.5)
Bugfix: Fixed a small memory bug in columnstore scans that use the JSON_TO_ARRAY join optimization. (8.9.5)

Bugfix: Fixed a crash in JSON_EXTRACT_STRING. (8.9.4)
Bugfix: Fixed a bug with updates and asserts using JSON_ARRAY_CONTAINS_<type> predicates. (8.9.4)
Bugfix: Fixed bottomless upload throttling criteria. (8.9.4)
Bugfix: Blocked creation of temporary table as a shallow copy. (8.9.4)
Bugfix: Resolved an issue where an in-development subsystem can leak files on disk. (8.9.4)

Enhancement: Relaxed dependency on partition count of leaf nodes for leaf plans. (8.9.3)
Enhancement: Added Korean language analyzer for full-text search V2. (8.9.3)
Enhancement: Added support to infer CSV files with a single column when the file contains no field terminators in any record. (8.9.3)
Bugfix: Fixed handling of heartbeat messages in the MongoDB® extractor in debug mode. (8.9.3)
Bugfix: Fixed display of default BSON and string values with null-terminators in the information_schema. (8.9.3)
Bugfix: Removed trailing dot for decimal column types that have scale equal to 0. (8.9.3)
Bugfix: Fixed IN-list index matching for columnstore tables when a query has multiple IN-list predicates. (8.9.3)
Bugfix: Fixed a bug that caused an invalid optree error after the JsonArrayContainsToTableBuiltin rewrite. (8.9.3)

Bugfix: Fixed a crash that occurred during spilling when executing a query with a large number of GROUP BY columns. (8.9.2)
Bugfix: Fixed performance regression in Vector Search when using DOT_PRODUCT metric. (8.9.2)
Bugfix: Added an optional 'swap_time' argument to UUID_TO_BIN and BIN_TO_UUID functions. (8.9.2)
Bugfix: Allowed cached table memory to be freed for empty tables in replica databases. (8.9.2)
Bugfix: Fixed a lateral join parsing bug. (8.9.2)

8.9

Enhancement: Added the DETERMINISTIC clause to the CREATE FUNCTION (UDF) command that instructs the query optimizer to assume that the created function is deterministic. Refer to CREATE FUNCTION (UDF) for more information.
Enhancement: Added support for the IGNORE <n> LINES clause to the INFER PIPELINE command for CSV files. Refer to Schema and Pipeline Inference for more information.
Enhancement: Added a query rewrite for correlated sub-selects that allows many additional types of queries with correlated sub-selects to now run successfully.
Enhancement: Added support for using the SKIP ALL ERRORS clause during creation of Kafka pipelines for ingesting JSON formatted data.
Enhancement: Added support for SKIP ALL ERRORS and SKIP PARSER ERRORS clauses during creation of Kafka pipelines for ingesting Avro formatted data.
Enhancement: Added the ability to load Kafka properties and headers with the get_kafka_pipeline_prop("<property>") function.
Enhancement: Added the ability to override the pipelines_max_offsets_per_batch_partition global variable for each Kafka pipeline using the MAX_OFFSETS_PER_BATCH_PARTITION pipeline variable in CREATE PIPELINE and ALTER PIPELINE commands.
New feature: Added support for the following parameters in the CONFIG clause of CREATE PIPELINE AS ... LOAD DATA S3 statement:
- file_compression: Decompresses files with the specified extensions.
- file_time_threshold: Only ingest files modified after the specified timestamp.
Refer to S3 Configurations for more information.
Enhancement: Added ability to re-optimize a query multiple times. Refer to Query Tuning for more information.
Enhancement: Added the ability to use connection links to load Avro and Parquet formatted data stored in an AWS S3 bucket.
Enhancement: Enabled auto PROFILE for INSERT...SELECT and REPLACE...SELECT query shapes.
New feature: Added the ENABLE_OVERWRITE clause in the SELECT ... INTO S3 and SELECT ... INTO LINK statements that enables the overwriting of existing files. Refer to SELECT … INTO S3 for more information.
Enhancement: Updated PROFILE to show number_of_blocks_tested_for_block_elim and number_of_blocks_eliminated_for_block_elim for sub-segment elimination for flexible parallelism.
Enhancement: Updated the JSON_EXTRACT_<type> functions to accept a JSON document as the only argument. With this enhancement, it is possible to extract from JSON documents with a string, numeric, boolean, or NULL value as the root of the document.
Enhancement: Updated the simplified syntax for the JSON_MATCH_ANY() function, to allow specifying MATCH_ELEMENTS by appending a * to the end of the keypath.
Enhancement: Added the ability to load CSV and JSON files from an Amazon S3 bucket using a LOAD DATA query.
Enhancement: Enhanced the TABLE() function to support DISTINCT.
Enhancement: Improved recovery time for tables with incremental autostats by recovering statistics from disk instead of rebuilding them from scratch.
Bugfix: Fixed an issue that caused a deadlock between the DROP TABLE and AGGREGATOR SYNC AUTO_INCREMENT queries.
Enhancement: Improved the logic for selecting vectors close to vector threshold in vector range search.
Bugfix: Fixed an issue where attaching a leaf node to the Master Aggregator (MA) failed if the MA was still starting up.
Enhancement: Added support for BM25 partition-scoped scoring for phrase and proximity search queries.
Enhancement: Improved columnstore hash index performance on low cardinality column.
Enhancement: Improved the Debezium DDL statements parser.
Enhancement: Added row count estimate for joins in the output of EXPLAIN and PROFILE queries.
Enhancement: Added support for LIMIT and OFFSET clauses in non-equality WHERE conditions in subselects.
Enhancement: Added support for LATERAL joins for table-valued functions (TVFs).
Enhancement: SYNC DURABILITY is always enabled for reference databases. Reference tables in user databases that use async durability may notice a decrease in performance for DDL and DML statements.
Enhancement: Added support for HIGHLIGHT ... AGAINST as a computed column expression when creating a new table.
Enhancement: Added support for a query rewrite that enables hash joins when two tables are joined using the JSON_ARRAY_CONTAINS_<type> function.
Enhancement: Added support for sub-select to join rewrites for correlated subselects and nested scalar subselects.
Enhancement: Added support for null-accepting projections in scalar subselect queries.
Enhancement: Added support for LATERAL join subselects to reference any level of outer tables.
Enhancement: Changed the default collation to utf8mb4_general_ci and default character set to utf8mb4 for TEXT and ENUM type columns for CSV, JSON, AVRO, and parquet formats in INFER PIPELINE AS LOAD DATA statements.
Bugfix: Fixed an issue that caused dangling compute sessions after a failed ALTER DATABASE command.
Bugfix: Fixed an issue where concurrent ALTER DATABASE and REBALANCE operations create unlimited storage partitions with wrong compute ID.
Bugfix: Fixed a race condition between the transaction log garbage collection and database transition-to-master that could result in unrecoverable partitions.
Enhancement: Updated the distributed OpenSSL license file to 1.0.2zj.
Enhancement: Added support for multi-column IN list predicate in the WHERE clause of a query.
Enhancement: Function mapped IN-lists now share the same signature in the plancache for the same set of built-in functions.
Bugfix: Removed soft lock on the CHARACTER SET clause and added a warning to indicate invalid character set value in the LOAD DATA statement.
Bugfix: Database names can no longer end with big numbers, such as db_<big_number>, to avoid conflicts with internal databases used in replication.
Enhancement: Each node in the cluster now validates the availability of bottle service every minute and records any consecutive failures in the LMV_EVENTS information schema view.
Enhancement: Added more information to the out-of-memory (OOM) errors.
Bugfix: Fixed an issue that occurred while parsing manifest files (associated with backup or restore operations) having more than 4096 characters.
Bugfix: Fixed an issue that caused duplication of storage blobs that have not been repaired yet for ongoing repair operations.
Bugfix: Fixed an issue in MySQL CDC-in pipelines where some MySQL tables with names containing _ were not being replicated.
Bugfix: Fixed a race condition that caused shutdown to wait on an idle async compile manager thread.
Bugfix: Fixed an issue where a repair operation got stuck while converting milestones.
Bugfix: Fixed a synchronization issue where the database was not immediately available after some clustering operations.
Enhancement: Added the estimated number of partitions returned by the query optimizer for cost-based join order optimization to the debug profile.
Enhancement: Added support for VECTOR type in REDUCE built-in function.
Enhancement: Improved the lockdown message when changing collation or character set related engine variables within utf8mb4 character set.
Bugfix: Fixed an issue with the PROMOTE AGGREGATOR ... MASTER command where a restart at the end of the command, followed by manually finishing the promote operation, caused reprovisioning of the old Master Aggregator.
Bugfix: Fixed an issue that caused an empty network prefetch queue.
Bugfix: Fixed a distributed deadlock caused by a query blocking reference database reprovisioning in one part of the deadlock cycle.
Enhancement: Added the following headers to Data API responses:
- Cache-control: no-store
- Strict-Transport-Security: max-age=31536000
- X-Content-Type-Options: nosniff
Bugfix: Fixed an issue with counting common table expressions (CTEs) references in a query.
Bugfix: Fixed a memory corruption issue caused by a rare race condition involving MV_ACTIVE_TRANSACTIONS information schema view.
Enhancement: The sharding planner now recognizes non-union style single partitioned derived tables.
Enhancement: Disabled pipeline batches sample by default in memsql_exporter.
Enhancement: Improved message for an error where a global variable cannot be set because of a table sharded on a computed column.
Enhancement: Improved connection stability for MongoDB® CDC-in pipelines.
Enhancement: Upgraded the librdkafka library to version 2.4.0-3.
Bugfix: Fixed an issue where the Master Aggregator (MA) temporarily stopped behaving as the MA after being restarted.
Enhancement: Improved the performance of cluster operations through distributed plancache.
Bugfix: Fixed a bug with incorrect privilege checks.
Enhancement: Optimized row locking for internal transactions.
Enhancement: Improved performance of REVOKE in case of errors.
Bugfix: Fixed an engine crash caused when the Master Aggregator received a REMOVE AGGREGATOR query with its own <host>:<port>.
Enhancement: Reduced the chances of reference database reprovisioning in universal storage if a snapshot is taken concurrently with Master Aggregator shutdown.
Enhancement: TO_JSON() and JSON_BUILD_OBJECT() built-in functions now convert VECTOR type arguments to a JSON array instead of a JSON string.
Enhancement: Added support for ZSTD compressed Kafka topics to Kafka pipelines.
Bugfix: Fixed an issue where ClampTimestamp spammed the tracelog.
Enhancement: Limit DR connection attempts to the primary node when it is failing connections, avoiding a substantial increase in TIME_WAIT sockets.
Enhancement: Improved error messages related to reprovisioning.
Enhancement: Implemented rotation and deletion of webproxy socket logs.
Enhancement: Ingest Kafka headers into the SingleStore table if they are included in the Kafka message.
Enhancement: Lockdown hints for common table expressions (CTEs).
Enhancement: Locked select and row count hints for multi-table views.
Bugfix: Fixed an issue where slow snapshots blocked clustering operations.
Bugfix: Fixed an issue where unrecoverable reference databases did not auto-heal for a prolonged period of time and were blocked, for example, by alter operations.
Enhancement: Improved the error message for Correlated subselect that cannot be transformed and does not match on shard keys errors.
Bugfix: Fixed data loss in sync durability in a rare race condition.
Enhancement: Improved processing of queries with redundant (superfluous) EXISTS subselects.
Bugfix: Fixed the reference count of committed blobs on upgrade.
Enhancement: Added the ON NODE <node_id> clause to the SHOW PROFILE command, which forwards the command to another aggregator.
Enhancement: Improved enforcement of internal_columnstore_max_uncompressed_blob_size engine variable.
Enhancement: Added a new clause ENSURE_PARTITION_SAFETY to the REMOVE {LEAF | LEAVES} command that prevents a leaf (or leaves) from being removed if it contains the last online instance of a partition.
Enhancement: The memsql_exporter now collects additional fields from the mv_activities_extended_cumulative information schema view.
Enhancement: Optimized full-text queries that contain an ORDER BY ... LIMIT ... over a full-text score and optionally filter on the same full-text clause.
Enhancement: Added support for numeric range queries when doing full-text search against JSON fields. Refer to numeric range queries for more information.
Enhancement: Set collation utf8_bin for the JSON_TO_ARRAY builtin in cases where the input has utf8 character set, and set collation utf8mb4_bin in cases where the input has utf8mb4 character set.
Enhancement: If a primary key is defined for a table, the DELETE change records for OBSERVE queries now populate the primary key instead of the internal ID.
Enhancement: Added information on remote API calls and bottle service reliability metrics to MV_BOTTOMLESS_STATUS_EXTENDED information schema view.
Enhancement: Added metrics to track the availability of unlimited storage and the bottle service to MV_BOTTOMLESS_SUMMARY information schema view.
Enhancement: Updated the syntax of JSON_MATCH_ANY() to allow specifying MATCH_ELEMENTS by appending a * at the end of the keypath.
Enhancement: Additional binlog position updates for CDC-in pipelines in the extractor pool queue.
Enhancement: Information schema queries are no longer case-insensitive with respect to database names.
Enhancement: Allow rewrite for joining tables on the JSON_ARRAY_CONTAINS_<type> function to support multiple predicates of that form on different columns.
Bugfix: Fixed a snapshot not found error.
Bugfix: Fixed an issue where the result of a seek operation was not handled correctly.
Enhancement: Detect scalar subselect requirement at runtime rather than rewrite time for certain cases.
Bugfix: Fixed a bug in full-text search query compilation.
New Feature: High availability for the Master Aggregator (HA for MA) has been introduced. This enhances the reliability and failover capabilities, allowing mission-critical workloads to remain highly available. Refer to Multi-Datacenter Failover for more information.
Enhancement: Enabled LOAD DATA queries to use the specified CHARACTER SET to ingest string type fields.
Enhancement: Improved performance of queries that use the RAND() builtin or constant expressions involving PARTITION_ID.
Enhancement: Added support for a query rewrite when UNION is used to merge queries that reference the same tables and have mutually exclusive filters.
Bugfix: Fixed a rare issue where an ALTER operation and two-phase commit (2PC) caused index corruption.
Bugfix: Fixed an issue that caused a crash when the global collation or character set was set to a value with the prefix "utf8mb3".
Bugfix: Fixed an issue that returned incorrect results when JSON_MATCH_ANY() and BSON_MATCH_ANY() queries contained nested JSON_EXTRACT_<type>(), JSON_MATCH_ANY(), BSON_EXTRACT_<type>(), or BSON_MATCH_ANY() functions.
Bugfix: Fixed a rare undefined behavior issue caused by the SHOW CLUSTER STATUS command.
Bugfix: Fixed an issue caused when multi-part names were used in common table expressions (CTEs) within UNION.
Bugfix: Fixed an issue that caused DROP DATABASE and some other cluster management commands to wait indefinitely for resources.
Bugfix: Fixed a rare issue that caused an autostats file leak.
Bugfix: Fixed a rare issue that blocked kill query operations.
Bugfix: Fixed CVE-2024-45772 and BDSA-2024-0720 security vulnerabilities.
Bugfix: Fixed an error in BSON_MATCH_ANY() pushdown.
Bugfix: Fixed the result of ORDER BY <sort_key> LIMIT queries on tables with a sort key column defined in descending order.
Bugfix: Fixed an issue that caused uneven time distribution in the CDC extractor pool.
Bugfix: Fixed an issue that delayed restart of nodes with multiple reference databases.
Bugfix: Fixed an issue where some point-in-time-recovery (PITR) operations failed when started from a restored snapshot.
Enhancement: Upgraded Apache Lucene search engine library to 10.0.0.
Bugfix: Fixed a bug that caused a query to fail in a specific situation when using SEARCH OPTIONS with vector index search.
Bugfix: Fixed an edge case so that common table expressions (CTEs) are now recognized in subselect to join rewrites.
Bugfix: Fixed a bug that led to a Master Aggregator crash when using dynamic resource pools.
Enhancement: Added support for Iceberg v2 merge-on-read ingest mode.
Bugfix: Fixed ingest streaming logic for large Iceberg v2 tables.
