8.5 Release Notes

Release Highlights

Note

This is the complete list of new features and fixes in engine version 8.5 and its maintenance patches. For a list of all new features and bug fixes added in maintenance releases only, see the maintenance release changelog. If you are upgrading from an existing 8.5 version, see the changelog for a list of new features and fixes added to 8.5 by date.

A major feature introduced in this release is support for Vector Indexing. Vector indexing allows approximate-nearest-neighbor (ANN) search to improve performance of queries to find the top-K closest matches to a query vector. Supported index types include inverted file (IVF), hierarchical navigable small world (HNSW) and variants of both that use product quantization (PQ) to reduce memory usage. ANN search enables semantic text search, retrieval-augmented generation, and image search on larger-scale data sets more economically and with faster response time than was previously possible. It is an important tool to enable generative AI applications over larger data sets and with higher concurrency.

Query Plan Pinning

SingleStore now offers query plan pinning. Plan pinning ensures the optimizer will always pick a specific plan for a given query. Plan pinning allows users to select the most effective query plans for every query in their workload.

Query plan pinning can be used to prevent query plan regression after automatic upgrades. It can be used to avoid changes to query plans after statistics change or global settings change that affect query plans. It can also be used to avoid changes to query plans when a cluster is resized.

Extensions

Extensions in SingleStore allow you to combine user-defined objects, such as UDFs or UDAFs, into a packaged archive (the extension) and then create, manage, and deploy these objects and other resources using a single command. Extensions support both Wasm-based and PSQL functions. For more information, refer to Extensions.

Trace Events and Query History

Added the ability to trace query completions as events, which is the initial installment of the larger event tracing framework. The Query History feature relies on query event tracing, and can be used to display query trace events over time. The Query History feature can therefore be used to troubleshoot and optimize query performance, including, but not limited to, tracing and recording expensive queries, resolving unexpected slowdowns, and viewing and optimizing workloads in real time. Refer to Query History for more information.

Trace Log Rotation

SingleStore now has a built-in trace log rotation function. This is separate from and mutually exclusive of any manual log rotation implementation already in place, so if you want to use the new function and already have a solution in place using logrotate or similar, you will need to disable that before enabling this new rotation feature.

The built-in trace log rotation feature can be configured to rotate logs based on their size, based on time, or both. You can also, optionally, configure it to automatically delete logs after a specified number of days. The default, when the feature is enabled, is to never delete logs.

The new trace log rotation feature is disabled by default. See Rotating Trace Log to Manage its Size for information on how to enable and configure the new rotation feature.

Improved Memory Management for Resource Pools

Added the QUERY_MEMORY_PERCENTAGE option for resource pools, which restricts memory usage in the pool on a per-individual query basis. This in contrast to MEMORY_PERCENTAGE which restricts usage based on total memory used within the current pool.

For example, when creating or altering a resource pool, setting MEMORY_PERCENTAGE to 60% and QUERY_MEMORY_PERCENTAGE to 50% would configure the system so that all queries running within the specified resource pool should together use a maximum of 60% of system memory, and any single query running within the pool should use, at most, 50% of system memory. Example syntax:

CREATE RESOURCE POOL rpoolmain WITH
MEMORY_PERCENTAGE = 60,
QUERY_MEMORY_PERCENTAGE = 50,
SOFT_CPU_LIMIT_PERCENTAGE = 65,
MAX_CONCURRENCY = 40;

Load Data Updates

SingleStore now supports loading data using the Change Data Capture (CDC) pipelines from the following data sources: MongoDB® and MySQL. Refer to Replicate MongoDB® Collections to SingleStore or Load Data from MySQL for information on loading data from the respective data source.

Other Improvements and Fixes

Other Performance Enhancements

  • SingleStore now supports creating shallow copies of tables. The WITH SHALLOW COPY feature copies an existing table and creates a new table that will have the same structure as the original table. The data is not physically copied to the new table, but referenced against the original table.

  • SingleStore now supports sorted scan query plan operators for queries containing ORDER BY/LIMIT clauses when utilizing flexible parallelism. Before this enhancement, there could be performance regressions for this query shape using flexible parallelism.

  • Improved performance when completing large sets of security operations (creating a lot of groups/users/roles, etc.).

  • Added the ability to use named argument notation when calling a PSQL SP or function. Can reduce total lines of code and make code more readable.

  • Added reduction of memory pre-allocation during columnstore JSON reads.

  • Added ability to check if all leaf node partitions are available, before processing new batches.

  • Addressed a table resolution issue for embedded recursive Common Table Expressions (CTEs).

Query Optimization Enhancements:

  • Added support for Row Count and Selectivity hints in views.

  • Added new join logic to recognize when a non-reference table is being joined exclusively to reference tables and then gather the non-reference table to avoid duplicating work across every partition.

  • Remove redundant aggregation functions and GROUP BY statements.

  • JSON expressions are properly pushed down.

  • Allow columnstore optimization for JSON_MATCH_ANY with JSON_EXTRACT in predicate.

  • Added support for hash joins on null-accepting expressions in the ON condition of outer joins.

  • Automatically rewrite A=B OR (A IS NULL AND B IS NULL) to null safe equal (A<=>B) so that many important optimizations (e.g. shard key joins, hash joins) will work.

  • Perform a subselect to join rewrite in an UPDATE statement when there are multiple columns in the subselect.

  • Removed some query shape lockdowns.

  • Added support for flipping join order for full outer join.

  • Improved performance by not executing query optimization procedures for read queries during the process of persistent plan cache lookup. This optimization strategy has resulted in improved lookup performance, leading to faster data retrieval operations.

  • Added support for statistics on correlations between columns in cases where highly correlated filters are used.

  • The Data API now supports HTTPS for connections where ssl_key is encrypted with ssl_key_passphrase.

  • Removed parametrization of LIMIT 0 and LIMIT 1 to unlock more rewrites, especially for subselects.

  • Modified computed column matching to accurately evaluate JSON expressions containing equality and non-safe equality comparisons.

  • Fixed an issue where filtering with a JSON_EXTRACT_<type> function performs inconsistently.

  • The query optimizer now considers more LEFT JOIN elimination cases.

New Information Schema Views and Columns

  • Added a new view, correlated_column_statistics, to provide metadata on correlated columns.

  • Added a new view, RESOURCE_POOL_PRIVILEGES, to provide information about resource pool grants and privileges.

  • Added the following new columns to MV_BACKUP_HISTORY:

    • error_code: Error code for failed backups.

    • error_message: Error message for failed backups.

  • Added the following new columns to MV_SYSINFO_DISK:

    • read_operations_cumulative_per_device: Number of read operations performed by the device since start up.

    • write_operations_cumulative_per_device: Number of write operations performed by the device since start up.

    • device_name: Name of the device to which the values in read_operations_cumulative_per_device and write_operations_cumulative_per_device are associated.

  • Added the following to support trace events that are used by the Query History feature:

    • MV_TRACE_EVENTS: A snapshot of all trace events, the size of which is dictated by the trace_events_queue_size variable

    • MV_TRACE_EVENTS_STATUS: A view that reflects the status of current trace events

    • LMV_TRACE_EVENTS: A snapshot of each node's trace events

  • Added the following new column to ADVANCED_HISTOGRAMS, L/MV_QUERY_PROSPECTIVE_HISTOGRAMS, and L/MV_PROSPECTIVE_HISTOGRAMS:

    • JSON_KEY: an entry for each (column, json_key) pair. For non-json columns JSON_KEY is NULL.

  • Added the blob_cache_miss_b and blob_cache_wait_time_ms columns to the following information schema views: information_schema.plancache, information_schema.MV_ACTIVITIES, information_schema.MV_ACTIVITIES_CUMULATIVE, information_schema.MV_TASKS, information_schema.MV_FINISHED_TASKS, and information_schema.MV_QUERY_ACTIVITIES_EXTENDED_CUMULATIVE. (8.5.2)

  • Bugfix: Updated the information_schema.USERS view to reflect the account status for locked users.

New Commands and Functions

  • Added support for the REGEXP_MATCH() function. This function returns a JSON array of matching substring(s) within the first match of a regular expression pattern to a string.

  • Added support for CUBE and ROLLUP grouping operations to Wasm-based user-defined aggregate functions (UDAFs) in SingleStore. For more information, refer to CREATE AGGREGATE.

  • Added the following to support trace events that are used by the Query History feature:

    • CREATE EVENT TRACE to create a trace event

    • DROP EVENT to drop a trace event

  • Added ability to use DELETE on identical keys with the ON DUPLICATE KEY clause. This is in addition to existing "upsert" support with ON DUPLICATE KEY UPDATE. This allows new scenarios such as the ability to manage streaming aggregation with INSERT … ON DUPLICATE KEY UPDATE … ELSE DELETE ….

  • INFER PIPELINE for MongoDC CDC-in now generates tables with BSON column types. (8.5.3)

New or Modified Engine Variables

  • Enhancement: The new engine variable parametrize_json_keys allows JSON keys to be parametrized and plans can be reused. (8.5.7)

  • Enhancement. Added an engine variable, innodb_lock_wait_timeout . This variable exists for backwards compatibility with MySQL and is non-operational in SingleStore Helios. (8.5.6)

  • The engine variable backup_multipart_upload_concurrency maximum value has been increased to 15. (8.5.4)

  • The new engine variable max_table_memory_room_mb sets the maximum amount of memory required when creating or attaching a database. Configuring this engine variable allows more control over whether a detached database can be reattached. (8.5.4)

  • Added enable_idle_table_optimizations and enable_idle_table_eviction, which are used to reduce table memory overhead for idle tables on a cluster. The feature is enabled by default on all new and existing clusters. The variable enable_idle_table_optimizations needs to be set at runtime and requires a restart for changes to take effect. It can be set to OFF or ON (default). The other related variable, enable_idle_table_eviction, can be set during a session (though a very small amount of overhead will remain until the server is restarted) and can be set to Full, SkipListsOnly, and Off. It defaults to SkipListsOnly, which means SingleStore will only evict skiplist indexes for idle tables on the cluster. Full means it will evict skiplists and table modules, and Off means no eviction.

  • Added the bottomless_experimental_blobstore_mode engine variable, which when enabled, completes additional verification of persisted files immediately after upload. This mode is experimental and may reduce upload speed. Please use with caution.

  • Added privilege_caches_update_mode, which can be used to address some performance issues that occur when performing large sets of security operations (creating a lot of groups/users/roles, etc.).

  • Added the optimizer_use_average_rowsize engine variable which can be used to now allow row size estimations in query optimization costing.

  • The query_parallelism engine variable (which was deprecated in 8.1) now is non-functional. To modify Flexible Parallelism settings, use query_parallelism_per_leaf_core instead.

  • Added the use_user_provided_index_types_in_show engine variable which controls what will be displayed via the DESCRIBE <table_name> or SHOW COLUMNS syntaxes for backward compatibility.

  • Added the throttle_replay_behind_tail_mb engine variable which controls how far the system allows a child aggregator to lag behind replicating the transaction log of it's master aggregator before throttling is applied to the master aggregator.

  • Added the ability to use the ANALYZE command with JSON keys to create histograms when the new engine variable enable_json_statistics is enabled. The engine variable enable_json_statistics is disabled by default.

  • Added trace_events_queue_size to capture trace events, the first of which is query event tracing. This engine variable is enabled by default (set to a value of 16 MB, where the value must be provided in bytes). Refer to Query History for more information.

  • Added the optimizer_disable_transitive_predicates engine variable which disables predicate transitivity on query rewrites if set to TRUE. This engine variable defaults to FALSE.

  • Added a new engine variable pipelines_cdc_java_heap_size to specify the JVM heap size limit for CDC-in pipelines.

Miscellaneous

  • Bugfix: Fixed an issue with nested extracts in the JSON_MATCH_ANY() statement.

  • Bugfix: Fixed an issue with backup restore when the backup has a corrupted GV timestamp in the about snapshot record.

  • Bugfix: Fixed an issue that causes CREATE TABLES AS INFER PIPELINE to fail with an Out Of Memory error.

  • Bugfix: Fixed an issue with exporter not working with Public-Key Cryptography Standards #8(PKCS#8) pem format.

  • Bugfix: Fixed an issue with potential crashes when specifying a vector index in index hint.

  • Bugfix: Fixed a crash that occurred when trying to grant access to the table with no context database in per-privilege mode.

  • Bugfix: Fixed a memory issue, when calling a Table-Valued Function (TVF) that has arguments with default values.

  • Bugfix: Fixed an issue with the RG pool selector function return type check.

  • Enhancement: information_schema.mv_connection_attributes now shows tls_version and tls_cipher for connections where SSL is enabled.

  • Enhancement: Added ability for predicate pushdown with NOW() and user defined functions (UDF). (8.5.6)

  • Enhancement.: Reduced amount of memory used by unlimited storage file listing, including point-in-time recovery (PITR). (8.5.6)

  • Enhancement. Optimized performance when using VECTOR built-ins with the VECTOR datatype. (8.5.6)

  • Bugfix: Fixed an issue where a backup would inadvertently carry metadata about the retention period of the backed-up unlimited storage database. (8.5.6)

  • Bugfix: Fixed the change to aggregator activity queries in memsql_exporter not being applied properly. (8.5.6)

  • Enhancement: Added ability to memsql_exporter to always return all sub-activities of an aggregator activity. (8.5.5)

  • Enhancement: Improved ability to configure Kafka extract/egress by allowing additional options. (8.5.5)

  • Enhancement:Improved performance for certain query shapes used by SingleStore Kai. (8.5.5)

  • Enhancement: Introduced a new VECTOR datatype that is recommended over the BLOB datatype for vector operations and use with ANN indexes. The VECTOR type improves code readability, error checking, and reduces total lines of code.

  • Enhancement: Added support for DDL forwarding for CLEAR BACKUP_HISTORY.CLEAR BACKUP_HISTORY works on the DML endpoint now.

  • Enhancement: Improved retry logic for connection reset: write errors.

  • Enhancement: SingleStore now natively supports the BSON data type.

  • Enhancement: Added support for collection.exclude.list, database.include.list, and database.exclude.list parameters to the CONFIG/CREDENTIAL clause of the CREATE AGGREGATOR PIPELINE ... AS LOAD DATA MONGODB statement. Refer to Replicate MongoDB® Collections to SingleStore for more information.

  • Enhancement: Added support for Approximate Nearest Neighbor (ANN) vector search using inverted file (IVF) and hierarchical navigable small world (HNSW) indexes, and variants of them based on product quantization (PQ). Enables support of larger-scale semantic search and generative AI applications.

  • Enhancement: Increased name length limit to 256 characters for  tables, views, table columns, view columns, procedures, functions, and aliases.

  • Enhancement: Added the ability to truncate plancache file names if they exceed the operating system's specified limit (255 bytes).

  • Bugfix: Fixed the query-events endpoint link on the memsql_exporter landing page. (8.5.5)

  • Bugfix: Fixed an issue where a segmentation fault appeared after terminating a query on the remote node. (8.5.5)

  • Bugfix: Fixed an issue with crashes occurring when using certain JSON_EXTRACTs in the predicate expression of JSON_MATCH_ANY(). (8.5.5)

  • Bugfix: Fixed an issue where the number of rows affected with SELECT INTO object storage was erroneously reported as 0 rows. (8.5.5)

  • Bugfix: Both SHOW PIPELINES and SELECT * FROM information_schema.pipelines now show consistent pipeline state information across master and child aggregators.

  • Bugfix: Specific pipeline built-ins like pipeline_source_file() and pipeline_batch_Id() should not be used in UPSERT clause when creating a pipeline.

  • Bugfix: Improved the ability to terminate expressions containing JSON built-ins.

  • Enhancement: Added error-handling details for pipelines, including state, error, and performance-related metrics through monitoring solutions.

  • Enhancement: MemSQL Procedural SQL (MPSQL) has been renamed to simply Procedural SQL (PSQL). The name change will only show in some SHOW command output and information schema views. For example, SHOW FUNCTIONS output changed.

  • Enhancement: Added support for %ROWTYPE and %TYPE for use in declaring scalar type variables and parameters. Employing these abbreviations in Procedural SQL (PSQL) can lead to a reduction in the required lines of code.

  • Enhancement: Introduced blob cache profiling metrics for columnstore tables on unlimited storage databases. Refer to the PROFILE page for more information on what blob cache metrics are available.

  • Enhancement: Added infix operators for dot_product (<*>) and euclidean_distance (<->).

  • Enhancement: Added ability to delay retry attempts for pipeline retries.

  • Bugfix: Updated the output for the Key_name and Index_type columns in the SHOW INDEX, SHOW INDEXES, and SHOW KEYS commands for primary keys on columnstore tables. Refer to the SHOW INDEX, SHOW INDEXES, SHOW KEYS page for more information.

  • Bugfix: Improved the error message displayed when trying to create a primary key on an existing table.

  • Bugfix: Improved the error message displayed when a GRANT command fails due to missing permissions. The error message will now show the missing permissions:

    GRANT SELECT, UPDATE, DELETE, EXECUTE ON *.* TO test2;
    ERROR 1045 (28000): Current user is missing UPDATE, DELETE permission(s) for this GRANT
  • Bugfix: Fixed a case when UNIX_TIMESTAMP() was incorrectly returning 999999999.999999 for DATETIME datatypes with precision.

  • Bugfix: Fixed a potential issue that could cause a crash when SELECT INTO OUTFILE is run on a leaf with no context database.

  • Bugfix: Fixed an issue where a user could be erroneously marked as deleted.

  • Bugfix: Fixed an issue where a child aggregator could crash if it ran out of memory during query forwarding.

  • Bugfix: Fixed an issue with blob cache LRU2 eviction that could occur when a query fetches a blob, evict it, and fetches it again.

  • Bugfix: Fixed an issue that could cause information for a blob to be missing from an Information Schema table.

  • Bugfix: Disk Spilling now takes the resource pool settings into consideration.

  • Enhancement: Auto user creation is deprecated and the NO_AUTO_CREATE_USER variable is enabled by default.

  • Bugfix: Fixed an erroneous access denied issue to views selecting from shard tables with computed columns.

  • Bugfix: Fixed a rare issue where the incorrect timezone could be used in logging.

  • Bugfix: Fixed an issue where using user-defined variables inside ORDER/GROUP BY statements could cause a crash.

  • Enhancement: Added column name to error messages when invalid date/time is inserted into a column.

  • Enhancement: Added BSON columnstore functionality over Parquet storage.

  • Enhancement: Added support for SELECT ... INTO KAFKA using OAUTH credentials.

  • Bugfix: Prevent the ability to create a Kafka Pipeline using Parquet Format.

  • Bugfix: Queued Time is now excluded from the cost estimate for workload management leaf memory.

  • Bugfix: Specific error messages are now logged for GCS subprocess failures.

  • Bugfix: JSON_KEY escape characters were not working as expected.

  • Bugfix: ALTER PIPELINE setting max_partitions_per_batch to use a default 0 value is now allowed.

  • Enhancement: Improved performance for JSON_EXTRACT_<type> built-ins in ORDER BY clauses. (8.5.2)

  • Bugfix: Fixed an issue where INSERT...SELECT queries with a partition_id() filter generating an error. (8.5.2)

  • Bugfix: Fixed an issue with memory crashing when using REGEXP_MATCH, JSON_INCLUDE_MASK, or JSON_EXCLUDE_MASK built-ins. (8.5.2)

  • Enhancement: Improved performance by optimizing joins on TABLE(JSON_TO_ARRAY()) queries. (8.5.2)

  • Enhancement: Now suppressing a harmless traceSuspiciousClockUpdate trace message during recovery. (8.5.2)

  • Enhancement: Added the ability for users to kill queries on a remote node without the user having PROCESS privileges. (8.5.2)

  • Bugfix: Fixed an allocation issue that caused poor performance on high load insertion queries. (8.5.2)

  • Enhancement: Enabled support for external UDFs used in INSERT statements with multiple VALUE clauses. (8.5.2)

  • Enhancement: Added BSON fundamentals and column type support for SingleStore Kai. (8.5.2)

  • Bugfix: Fixed a bug that could result in unrecoverable databases if the database had 1024 or more tables. (8.5.3)

  • Bugfix: Fixed an optimization out-of-memory issue cause by operators generated for wide user tables. (8.5.4)

  • Bugfix: Fixed a crash that occurs in rare scenarios involving ALTER TABLE and failovers. (8.5.4)

  • Bugfix: Fixed ineffective search options that change in subsequent vector search queries. (8.5.4)

  • Bugfix: Resolved an issue related to FROM_BASE64 and TO_BASE64 builtins when processing large string inputs thereby preventing potential errors in reading communication packets. (8.5.4)

  • Bugfix: Fixed the code involved in backups to improve download error messaging. (8.5.4)

  • Bugfix: Fixed an bug that could cause scans using non-unique indexes which could return incorrect results or cause crashes. (8.5.4)

Changes in Patch Releases Since 8.1GA

Several features have been added since the GA release of 8.1 (8.1.2). Particularly notable ones include Tracelog Rotation, renaming MPSQL to PSQL, and the 8.1 data_conversion_compatibility_level; however, there are others. Please refer to the 8.1 Release Notes or maintenance release change log for additional details.

In this section

Last modified: February 20, 2024

Was this article helpful?