8.5 Release Notes
On this page
Note
-
To deploy a SingleStore 8.
5 cluster, refer to the Deploy SingleStore Guide. -
To make a backup of a database in this release or to restore a database backup to this release, follow this guide.
-
The data_
conversion_ engine variable now defaults tocompatibility_ level '8.
in new installations.0' This results in stricter data type conversions. The value is not changed when upgrading to version 8. 0. This new data_
setting additionally flags invalid string-to-number conversion inconversion_ compatibility_ level INSERT
statements,PIPELINES
,LOAD DATA
commands, and type casts.Applications will likely see more compatibility issues flagged when run against installations with
data_
set toconversion_ compatibility_ level '8.
than when run with a lower compatibility level.0' -
New deployments of SingleStore 8.
5 require a 64-bit version of RHEL/AlmaLinux 7 or later, or Debian 8 or later, with kernel 3. 10 or later and glibc
2.17 or later. Refer to the System Requirements and Recommendations for additional information.
Release Highlights
Note
This is the complete list of new features and fixes in engine version 8.
Indexed Approximate-Nearest-Neighbor Search
A major feature introduced in this release is support for Vector Indexing.
Query Plan Pinning
SingleStore now offers query plan pinning.
Query plan pinning can be used to prevent query plan regression after automatic upgrades.
Extensions
Extensions in SingleStore allow you to combine user-defined objects, such as UDFs or UDAFs, into a packaged archive (the extension) and then create, manage, and deploy these objects and other resources using a single command.
Trace Events and Query History
Added the ability to trace query completions as events, which is the initial installment of the larger event tracing framework.
Trace Log Rotation
SingleStore now has a built-in trace log rotation function.logrotate
or similar, you will need to disable that before enabling this new rotation feature.
The built-in trace log rotation feature can be configured to rotate logs based on their size, based on time, or both.
The new trace log rotation feature is disabled by default.
Improved Memory Management for Resource Pools
Added the QUERY_
option for resource pools, which restricts memory usage in the pool on a per-individual query basis.MEMORY_
which restricts usage based on total memory used within the current pool.
For example, when creating or altering a resource pool, setting MEMORY_
to 60%
and QUERY_
to 50%
would configure the system so that all queries running within the specified resource pool should together use a maximum of 60% of system memory, and any single query running within the pool should use, at most, 50% of system memory.
CREATE RESOURCE POOL rpoolmain WITHMEMORY_PERCENTAGE = 60,QUERY_MEMORY_PERCENTAGE = 50,SOFT_CPU_LIMIT_PERCENTAGE = 65,MAX_CONCURRENCY = 40;
Load Data Updates
SingleStore now supports loading data using the Change Data Capture (CDC) pipelines from the following data sources: MongoDB® and MySQL.
Projections
New Feature: Added ability to create a Projection which is a copy of some or all columns in a table, and may be sharded and sorted differently than the primary table.ORDER BY
/LIMIT
, GROUP BY
, count(DISTINCT.
, DISTINCT
, joins, and row lookups on secondary keys.
Other Improvements and Fixes
Other Performance Enhancements
-
SingleStore now supports creating shallow copies of tables.
The WITH SHALLOW COPY
feature copies an existing table and creates a new table that will have the same structure as the original table.The data is not physically copied to the new table, but referenced against the original table. -
SingleStore now supports sorted scan query plan operators for queries containing
ORDER BY/LIMIT
clauses when utilizing flexible parallelism.Before this enhancement, there could be performance regressions for this query shape using flexible parallelism. -
Improved performance when completing large sets of security operations (creating a lot of groups/users/roles, etc.
). -
Added the ability to use named argument notation when calling a PSQL SP or function.
Can reduce total lines of code and make code more readable. -
Added reduction of memory pre-allocation during columnstore JSON reads.
-
Added ability to check if all leaf node partitions are available, before processing new batches.
-
Addressed a table resolution issue for embedded recursive Common Table Expressions (CTEs).
Query Optimization Enhancements:
-
Added support for Row Count and Selectivity hints in views.
-
Added new join logic to recognize when a non-reference table is being joined exclusively to reference tables and then gather the non-reference table to avoid duplicating work across every partition.
-
Remove redundant aggregation functions and
GROUP BY
statements. -
JSON expressions are properly pushed down.
-
Allow columnstore optimization for
JSON_
withMATCH_ ANY JSON_
in predicate.EXTRACT -
Added support for hash joins on null-accepting expressions in the
ON
condition of outer joins. -
Automatically rewrite
A=B OR (A IS NULL AND B IS NULL)
to null safe equal(A<=>B)
so that many important optimizations (e.g. shard key joins, hash joins) will work. -
Perform a subselect to join rewrite in an
UPDATE
statement when there are multiple columns in the subselect. -
Removed some query shape lockdowns.
-
Added support for flipping join order for full outer join.
-
Improved performance by not executing query optimization procedures for read queries during the process of persistent plan cache lookup.
This optimization strategy has resulted in improved lookup performance, leading to faster data retrieval operations. -
Added support for statistics on correlations between columns in cases where highly correlated filters are used.
-
The Data API now supports HTTPS for connections where
ssl_
is encrypted withkey ssl_
.key_ passphrase -
Removed parametrization of
LIMIT 0
andLIMIT 1
to unlock more rewrites, especially for subselects. -
Modified computed column matching to accurately evaluate JSON expressions containing equality and non-safe equality comparisons.
-
Fixed an issue where filtering with a
JSON_
function performs inconsistently.EXTRACT_ <type> -
The query optimizer now considers more
LEFT JOIN
elimination cases.
New Information Schema Views and Columns
-
Enhancement: Added the
TABLE_
column to theNAME LOAD_
information schema view.DATA_ ERRORS TABLE_
is the name of the table associated with the error.NAME (8. 5. 38) -
Enhancement: Fixed the
KILLED_
counter inQUERIES MV_
.RESOURCE_ POOL_ STATUS (8. 5. 33) -
Enhancement: Add bottle service reliability metrics to
MV_
.BOTTOMLESS_ STATUS_ EXTENDED (8. 5. 32) -
New Feature: New Projections-related information schema table: PROJECTIONS.
(8. 5. 14) -
Enhancement: Added the
usage_
information schema table, which shows the state of the usage telemetry feature and also an example of a telemetry report.telemetry_ status (8. 5. 10) -
Enhancement: Added the
blob_
andcache_ miss_ b blob_
columns to the following information schema views:cache_ wait_ time_ ms information_
,schema. plancache information_
,schema. MV_ ACTIVITIES information_
,schema. MV_ ACTIVITIES_ CUMULATIVE information_
,schema. MV_ TASKS information_
, andschema. MV_ FINISHED_ TASKS information_
.schema. MV_ QUERY_ ACTIVITIES_ EXTENDED_ CUMULATIVE (8. 5. 2) -
Added a new view,
correlated_
, to provide metadata on correlated columns.column_ statistics -
Added a new view,
RESOURCE_
, to provide information about resource pool grants and privileges.POOL_ PRIVILEGES -
Added the following new columns to
MV_
:BACKUP_ HISTORY -
error_
: Error code for failed backups.code -
error_
: Error message for failed backups.message
-
-
Added the following new columns to
MV_
:SYSINFO_ DISK -
read_
: Number of read operations performed by the device since start up.operations_ cumulative_ per_ device -
write_
: Number of write operations performed by the device since start up.operations_ cumulative_ per_ device -
device_
: Name of the device to which the values inname read_
andoperations_ cumulative_ per_ device write_
are associated.operations_ cumulative_ per_ device
-
-
Added the following to support trace events that are used by the Query History feature:
-
MV_
: A snapshot of all trace events, the size of which is dictated by theTRACE_ EVENTS trace_
variableevents_ queue_ size -
MV_
: A view that reflects the status of current trace eventsTRACE_ EVENTS_ STATUS -
LMV_
: A snapshot of each node's trace eventsTRACE_ EVENTS
-
-
Added the following new column to
ADVANCED_
,HISTOGRAMS L/MV_
, andQUERY_ PROSPECTIVE_ HISTOGRAMS L/MV_
:PROSPECTIVE_ HISTOGRAMS -
JSON_
: an entry for each (column, json_KEY key) pair. For non-json columns JSON_
is NULL.KEY
-
-
Bugfix: Updated the
information_
view to reflect the account status for locked users.schema. USERS
New Commands and Functions
-
Enhancement: Added the
EXPORT TELEMETRY
SQL command, used to write usage telemetry data to a local file.(8. 5. 10) -
Enhancement:
INFER PIPELINE
for MongoDC CDC-in now generates tables with BSON column types.(8. 5. 3) -
Added support for the
REGEXP_
function.MATCH() This function returns a JSON array of matching substring(s) within the first match of a regular expression pattern to a string. -
Added support for
CUBE
andROLLUP
grouping operations to Wasm-based user-defined aggregate functions (UDAFs) in SingleStore.For more information, refer to CREATE AGGREGATE. -
Added the following to support trace events that are used by the Query History feature:
-
CREATE EVENT TRACE
to create a trace event -
DROP EVENT
to drop a trace event
-
-
Added ability to use
DELETE
on identical keys with theON DUPLICATE KEY
clause.This is in addition to existing "upsert" support with ON DUPLICATE KEY UPDATE
.This allows new scenarios such as the ability to manage streaming aggregation with INSERT … ON DUPLICATE KEY UPDATE … ELSE DELETE ….
New or Modified Engine Variables
-
Enhancement: Added a new engine variable
scheduler_
that specifies the threshold for triggering the verbose logging of scheduler thread timing.slow_ threshold_ seconds (8. 5. 39) -
Enhancement: Added a new engine variable
scheduler_
that specifies the threshold for triggering logging of slow ready queue draining.slow_ ready_ queue_ seconds (8. 5. 39) -
Enhancement: Added a new
max_
global variable that sets the maximum number of merger threads that the autostats updater can run on.autostats_ update_ workers (8. 5. 33) -
Enhancement: Added a new filter derivation rewrite controlled by the
optimizer_
engine variable.not_ null_ filter_ derivation (8. 5. 33) -
Enhancement: Added a new engine variable
costing_
which sets the maximum amount of tables we allow in full DQO estimate search when filling table row estimates for distributed join hints.max_ djo_ tables (8. 5. 16) -
Enhancement: Made the following engine variable settable in SingleStore Helios (8.
5. 15): multi_
insert_ tuple_ count, columnstore_ row_ value_ table_ lock_ threshold, internal_ columnstore_ max_ uncompressed_ blob_ size, internal_ columnstore_ max_ uncompressed_ rowindex_ blob_ size, columnstore_ validate_ blob_ before_ merge, default_ max_ histogram_ buckets, enable_ varbuffer_ dictionary_ compression, enable_ alias_ space_ trim, skip_ segelim_ with_ inlist_ threshold, default_ autostats_ histogram_ mode, default_ autostats_ columnstore_ cardinality_ mode, default_ autostats_ rowstore_ cardinality_ mode, default_ autostats_ columnstore_ sampling, experimental_ disable_ multipartition_ parallel_ read, internal_ enable_ parallel_ query_ throttling, enable_ spilling, spilling_ node_ memory_ threshold_ ratio, spilling_ query_ operator_ memory_ threshold, regexp_ compile_ mem_ mb, selectivity_ estimation_ min_ sample_ size, repartition_ minimum_ buffer_ size, use_ seekable_ json, json_ document_ max_ children, json_ document_ max_ leaves, json_ document_ sparse_ children_ check_ threshold, json_ document_ sparse_ children_ check_ ratio, json_ document_ absolute_ sparse_ key_ check_ ratio, json_ document_ page_ data_ soft_ limit, ignore_ insert_ into_ computed_ column, max_ subselect_ aggregator_ rowcount, leaf_ pushdown_ enable_ rowcount, reshuffle_ group_ by_ base_ cost, max_ broadcast_ tree_ rowcount, enable_ broadcast_ left_ join, display_ full_ estimation_ stats, force_ bushy_ joins, force_ heuristic_ rewrites, force_ table_ pushdown, force_ bloom_ filters, explain_ joinplan_ costs, statistics_ warnings, optimizer_ warnings, optimize_ constants, optimize_ huge_ expressions, optimize_ expressions_ larger_ than, optimize_ stmt_ threshold, quadratic_ rewrite_ size_ limit, batch_ external_ functions, batch_ external_ functions_ size, optimizer_ enable_ json_ text_ matching, optimizer_ enable_ orderby_ limit_ self_ join, distributed_ optimizer_ broadcast_ mult, distributed_ optimizer_ old_ selectivity_ table_ threshold, distributed_ optimizer_ selectivity_ fallback_ threshold, distributed_ optimizer_ verbose, old_ local_ join_ optimizer, optimizer_ hash_ join_ cost, optimizer_ merge_ join_ cost, optimizer_ nested_ join_ cost, optimizer_ disable_ right_ join, interpreter_ mode_ sampling_ threshold, hash_ groupby_ segment_ distinct_ values_ threshold, sampling_ estimates_ for_ complex_ filters, enable_ histogram_ to_ unwrap_ literals, estimate_ zero_ rows_ when_ sampling_ data_ is_ missing, exclude_ scalar_ subselects_ from_ filters, leaf_ pushdown_ default, distributed_ optimizer_ nodes, optimizer_ num_ partitions, enable_ binary_ protocol, enable_ multipartition_ queries, enable_ local_ shuffle_ group_ by, enable_ skiplist_ sampling_