7.3 Release Notes
Note
To deploy a SingleStore 7.
To make a backup of a database in this release or to restore a database backup to this release, follow this guide.
7.3 Release Highlights
The SingleStore 7.
System of Record
-
Added three information schema views, mv_aggregated_replication_status, mv_replication_status, and lmv_replication_status, to monitor the progress of a DR replication. Users can now view the aggregated replication status of each database (including partition-level details) and the replication links between the primary and secondary clusters to check for replication lag and view statistics related to the lag.
-
Added a new column, type, to the Backup History Table. This column shows the type of backup and can be accessed by querying the information_schema.MV_BACKUP_HISTORY table.
-
Tables that define a unique key using UNENFORCED now have an INDEX_TYPE of NONE in the information schema. The INDEX_TYPE was listed as BTREE in previous versions.
Storage
-
Implemented forwarding of data definition language (DDL) commands from child to master aggregator.
Previously, these commands could only be run on a master aggregator. See Cluster Management Commands for more information about how to enable this feature.
-
Database-level DDL and clustering operations are now allowed to run in parallel across databases.
-
Added a new command, REBALANCE ALL DATABASES, which rebalances the partitions on all databases in the cluster.
-
Added the FULL option to REBALANCE PARTITIONS, which takes effect when the number of partitions in the database is not divisible by the number of leaves. The extra partitions are placed on the leaves containing the fewest partitions.
-
Added a new command, PROMOTE AGGREGATOR … TO MASTER, for all use cases that require promotion of a child aggregator to master, aside from permanent loss of the master aggregator.
-
Added the information schema view MV_BACKUP_STATUS for monitoring backup progress.
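The placement rule described for the FULL option of REBALANCE PARTITIONS (leftover partitions go to the leaves holding the fewest) can be illustrated with a small Python sketch. This is only a model of the rule as stated in these notes, not SingleStore's implementation; the function name, the leaf names, and the tie-breaking by leaf name are all assumptions made for illustration.

```python
import heapq

def place_extra_partitions(extra, leaf_counts):
    """Place `extra` leftover partitions, each on the leaf with the fewest so far.

    `leaf_counts` maps a (hypothetical) leaf name to its current partition count.
    Ties are broken by leaf name, an arbitrary choice made for this sketch.
    """
    counts = dict(leaf_counts)
    # Min-heap keyed on partition count, so the emptiest leaf is popped first.
    heap = [(count, leaf) for leaf, count in counts.items()]
    heapq.heapify(heap)
    for _ in range(extra):
        count, leaf = heapq.heappop(heap)
        counts[leaf] = count + 1
        heapq.heappush(heap, (count + 1, leaf))
    return counts

# 8 partitions over 3 leaves: 2 on each leaf, plus 8 % 3 = 2 leftovers.
base = {"leaf_a": 2, "leaf_b": 2, "leaf_c": 2}
print(place_extra_partitions(8 % 3, base))
```

After placement, no leaf holds more than one partition above any other, which is the balance the FULL option aims for when the partition count does not divide evenly.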
Universal Storage
-
Added support for INSERT … ON DUPLICATE KEY UPDATE, INSERT … IGNORE, and REPLACE on columnstore tables.
-
Added the columnstore as default feature, which allows you to create a columnstore table using standard CREATE TABLE syntax.
-
Added support for the LOAD DATA ... [REPLACE | IGNORE | SKIP { ALL | CONSTRAINT | DUPLICATE KEY } ERRORS] semantics for ingesting data into columnstore tables with unique keys. These semantics allow duplicate keys to be handled without returning an error to the client application. See example 10 in LOAD DATA.
-
Added support for upserts on columnstore tables using Pipelines.
Query Optimization
-
Improved optimization of queries with large numbers of tables being joined.
Join optimization is now significantly faster and adaptively handles very large join sizes to provide good execution plans while keeping query optimization time low. The engine variable distributed_optimizer_max_join_size is now deprecated and replaced by the new variables distributed_optimizer_unrestricted_search_threshold, distributed_optimizer_min_join_size_run_initial_heuristics, and singlebox_optimizer_cost_based_threshold. See their descriptions in the Sync Variables Lists for further information on configuring these variables.
-
Added a new engine variable, profile_for_debug, which can be used to enable collection of additional data with PROFILE. This data can be displayed using SHOW PROFILE JSON and is useful for troubleshooting query optimizer issues. For more information, see PROFILE.
-
Improved selectivity estimation by using sampling and histograms together, when both are available.
This improvement only applies when cardinality_estimation_level is set to 7.3 or higher. By default, cardinality_estimation_level is set to 7.1.
-
Decreased query optimization cost of lookups of query plans from the on-disk plancache.
Query Execution
-
Decreased the in-memory size of query plans by up to 80%.
-
Implemented optimizations for system information schema queries, resulting in significant performance increases, particularly for tables such as index_statistics, column_statistics, and columnar_segments.
-
Added support for EXPLAIN and PROFILE queries in stored procedures.
Usability and Programmability
-
Added a new aggregate function, APPROX_PERCENTILE, that calculates the approximate percentile and is about 10 times faster than the PERCENTILE_DISC and PERCENTILE_CONT functions.
-
The USING clause of query text is no longer captured as part of audit logging. It is now included in the output of SHOW PROCESSLIST in a new column titled RPC Info.
-
Added a new JSON function, JSON_AGG, that aggregates values as a JSON array.
Added a new variable, json_agg_max_len, which sets the maximum string length, in bytes, that JSON_AGG can return. For more information, see the Non-Sync Variables List.
-
The ALTER permission is no longer required for ANALYZE (SELECT and either ALTER or INSERT are required).
-
Added support for defining user-defined variables via SELECT INTO @varname.
Ingest
-
Added support for publishing data to Google Cloud Storage (GCS) via SELECT … INTO GCS.
-
Added support for specifying the chunk size while uploading data to an Amazon S3 bucket via SELECT … INTO S3 to enable output of very large files.
-
SingleStore Pipelines now supports new Avro schema evolution capabilities.
The hostname or IP address of the schema registry can be specified when a pipeline is created, and any changes to the schema are visible to the pipeline. If fields are added to the Avro schema, the pipeline can be modified without stopping it or losing any offsets.
-
Added a new option, max_retries_per_batch_partition, for Filesystem Pipeline Syntax and ALTER PIPELINE. When set, this determines the number of retries that will be attempted for writing batch partition data to the destination table. Specifying fewer retries can help conserve resources when there is a large amount of data to load; conversely, more retries may suit smaller tables where performance is less of a concern.
-
Added a new option, process_zero_byte_files, to the CONFIG clause of CREATE PIPELINE for Filesystem pipelines. Enabling this option ensures that zero-byte files are processed; by default, they are skipped.
Small Fixes and Otherwise Uncategorized Items
-
Improved the performance of columnstore batch deletes on certain column encodings.
-
Reduced memory usage for columnstore updates and deletes.
-
More concurrent operations are now allowed on the database level.
-
Added SORT KEY(...) as an alias for KEY(...) USING CLUSTERED COLUMNSTORE.
-
Added a new variable, columnstore_row_value_table_lock_threshold, that sets the threshold at which multiple inserts to a columnstore table with unique keys will switch from row value locking to table locking. For more information, see the Sync Variables Lists.
-
Added two engine variables, internal_columnstore_validate_blob_after_write and internal_columnstore_validate_blob_before_read, that control verification of the checksum of a blob immediately after it is created and before it is read, respectively.
-
Fixed a bug where some materialized common table expressions (CTEs) with cross database joins would not run because the table holding the CTE result set was assigned to the wrong database.
-
Fixed an issue where JSON_EXTRACT_STRING() was not respecting json_extract_string_collation when extracting strings from columnstore tables.
-
Fixed an issue that caused RESTORE DATABASE FROM filesystem to result in an unbalanced distribution of partitions among nodes in the restored database.
-
Previously, running ALTER VIEW on a schema-bound view could cause recovery to fail to create a new schema-bound view referring to the altered view.
This action has been disallowed.
-
Changed how SingleStore looks for files when configuring SSL. SingleStore expects an absolute path (/path/to/files). If you specify a relative path (./path/to/files), SingleStore first looks for the path relative to the location of the memsql.cnf file. If that fails, SingleStore looks for the path relative to the current working directory. If neither of those paths works, the operation fails.
-
JSON_EXTRACT_BIGINT now returns a SQL NULL value when the value of a key is JSON-NULL or otherwise undefined. Previously, this function returned 0 (not null) when using columnstore.
-
Tables created with syntax such as CREATE TABLE … AS SELECT will no longer inherit AUTO_INCREMENT behavior. For example, if CREATE TABLE table_1 AS SELECT * FROM table_2 is used to create table_1 where table_2 has an AUTO_INCREMENT column, it will be created as a non-auto-increment column in table_1.
-
SingleStore now supports aliases for column names in a PIVOT.
-
information_schema.MV_COLUMNSTORE_FILES now includes files showing global secondary index disk usage.
-
information_schema.MV_BLOCKED_QUERIES now shows queries on columnstore tables that are blocked.
-
Added the connection link feature, which stores connection details (credentials and configurations) for supported data providers such as S3, Azure, GCS, HDFS, and Kafka.
Permitted users can run commands such as BACKUP, RESTORE, CREATE PIPELINE, and SELECT … INTO without specifying the connection details.
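The SSL file lookup order described in the list above (an absolute path is used as given; a relative path is tried against the directory of memsql.cnf first and then against the current working directory) can be sketched in Python. This is an illustrative model of the documented order only, not SingleStore code; the function name and the cnf_dir and cwd parameters are hypothetical.

```python
import os

def resolve_ssl_path(path, cnf_dir, cwd=None):
    """Return the first existing candidate for an SSL file path, or raise.

    `cnf_dir` stands in for the directory that holds memsql.cnf (an assumption
    of this sketch); `cwd` defaults to the process working directory.
    """
    if os.path.isabs(path):
        # Absolute paths are used exactly as given.
        candidates = [path]
    else:
        if cwd is None:
            cwd = os.getcwd()
        # Relative paths: try the memsql.cnf directory first, then the cwd.
        candidates = [os.path.join(cnf_dir, path), os.path.join(cwd, path)]
    for candidate in candidates:
        if os.path.exists(candidate):
            return os.path.normpath(candidate)
    # Neither location worked, so the operation fails.
    raise FileNotFoundError(path)
```

Trying the configuration directory before the working directory means the result does not depend on where the server process happened to be started, as long as the files sit next to memsql.cnf.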
Last modified: July 29, 2024