CREATE TABLE
On this page
Creates a new table.
As of version 7.
Syntax
CREATE [ROWSTORE] [REFERENCE | TEMPORARY | GLOBAL TEMPORARY] TABLE [IF NOT EXISTS] <table_name>(<create_definition>,...)[<table_options>][[AS] SELECT ...]CREATE TABLE [IF NOT EXISTS] new_tbl_name{ LIKE original_tbl_name | (LIKE original_tbl_name) }<create_definition>:<column_name> { <column_definition> | AS <computed_column_definition> }| [CONSTRAINT [symbol]] PRIMARY KEY [<index_type>] (<index_column_name>,...)[<index_option>] ...| { INDEX | KEY } [<index_name>] [<index_type>] (<index_col_name>,...)[<index_option>] ...| [CONSTRAINT [symbol]] UNIQUE [INDEX | KEY][<index_name>] [<index_type>] (<index_column_name>,...)[<index_option>] ...| [CONSTRAINT [symbol]] SHARD KEY [<index_type>] (<index_column_name>,...)[<index_option>] ...| SORT KEY (<index_column_name>,... [DESC])| FULLTEXT[<index_name>] (<index_column_name>,...)<column_definition>:<data_type> [NOT NULL | NULL] [DEFAULT <default_value>] [ON UPDATE <update_value>][AUTO_INCREMENT] [UNIQUE [KEY] | [PRIMARY] KEY] [SPARSE] [SERIES TIMESTAMP]<computed_column_definition>:computed_column_expression PERSISTED data_type<data_type>:BIT[(<length>)]| TINYINT[(<length>)] [UNSIGNED]| SMALLINT[(<length>)] [UNSIGNED]| INT[(<length>)] [UNSIGNED]| INTEGER[(<length>)] [UNSIGNED]| BIGINT[(<length>)] [UNSIGNED]| REAL[(<length>,<decimals>)] [UNSIGNED]| DOUBLE[(<length>,<decimals>)] [UNSIGNED]| DECIMAL[(<length>[,<decimals>])] [UNSIGNED]| NUMERIC[(<length>[,<decimals>])] [UNSIGNED]| DATETIME| DATETIME(6)| TIMESTAMP| TIMESTAMP(6)| DATE| TIME| CHAR[(<length>)][CHARACTER SET <character_set_name>] [COLLATE <collation_name>]| VARCHAR(<length>)[CHARACTER SET <character_set_name>] [COLLATE <collation_name>]| TINYBLOB| BLOB| MEDIUMBLOB| LONGBLOB| TINYTEXT [BINARY]| TEXT [BINARY]| MEDIUMTEXT [BINARY]| LONGTEXT [BINARY]| ENUM(<value1>,<value2>,<value3>,...)| SET(<value1>,<value2>,<value3>,...)| JSON [COLLATE <collation_name>]| GEOGRAPHY| GEOGRAPHYPOINT<index_column_name>:<column_name> [(<length>)] [ASC | DESC]<index_type>:| USING { BTREE | HASH }<index_option>:KEY_BLOCK_SIZE [=] <value>| <index_type>| COMMENT '<string>'| BUCKET_COUNT [=] <value>| WITH (<index_kv_options>)| UNENFORCED [RELY | NORELY]<index_kv_options>:<index_kv_option> [, <index_kv_option>] ...<index_kv_option>:RESOLUTION = <value>| COLUMNSTORE_SEGMENT_ROWS = <value>| COLUMNSTORE_FLUSH_BYTES = <value><table_options>:<table_option> [[,] <table_option>] ...<table_option>:AUTO_INCREMENT [=] <value>| COMMENT [=] '<string>'| AUTOSTATS_ENABLED = { TRUE | FALSE }| AUTOSTATS_CARDINALITY_MODE = {INCREMENTAL|PERIODIC|OFF}| AUTOSTATS_HISTOGRAM_MODE = {CREATE|UPDATE|OFF}| AUTOSTATS_SAMPLING = {ON|OFF}| COMPRESSION = SPARSE
Remarks
Note
Unless CREATE ROWSTORE TABLE .
or SORT KEY()
are specified, the value of the default_
engine variable determines the type of table (columnstore
or rowstore
) that is created.
When default_
is set to columnstore
, you can create a columnstore table using standard CREATE TABLE syntax.
default_
is set to columnstore
by default.
The setting of default_
applies to temporary tables.GLOBAL TEMPORARY
tables, if default_CREATE ROWSTORE GLOBAL TEMPORARY TABLE
.GLOBAL TEMPORARY
is not supported on columnstore tables.
-
For more information about the data types listed above, and for an explanation of
UNSIGNED
, refer to the Data Types topic. -
The
SET
data type restricts the values that can be inserted for a table column.Only the set of strings that are listed for a column at the time of table creation can be inserted. -
<table_
is the name of the table to create in the SingleStore database.name> -
The following note applies when the engine variable
table_
is set toname_ case_ sensitivity OFF
: After you create a table, you cannot create another table having the same table name with a different case.Refer to the Database Object Case Sensitivity topic for more information. -
CREATE TABLE
is slower in SingleStore than in MySQL.See Code Generation for more information. -
The
KEY
syntax is equivalent to usingINDEX
syntax when used inCREATE TABLE
.The convention is to use the KEY
syntax.INDEX
syntax is generally used when creating an index on an existing table.See CREATE INDEX for more information. -
The
BTREE
index type creates a skip list index in SingleStore.This index has very similar characteristics to a BTREE index. -
If you do not want to specify a column (or columns) to sort on, or do not care about the sort order for your data, you can specify an empty key (e.
g. SORT KEY()
). -
The
SORT KEY()
order can be specified as ascending (SORT KEY(index_
) or descending (column_ name) SORT KEY(index_
).column_ name DESC) SingleStore does not support scanning a SORT KEY()
in reverse order to its sort order:CREATE TABLE ct_sort (col1 int, SORT KEY(col1 DESC));EXPLAIN SELECT * FROM ct_sort ORDER BY col1 DESC;+-------------------------------------------------------------------------------------------+ | EXPLAIN | +-------------------------------------------------------------------------------------------+ | Project [remote_0.col1] | TopSort limit:[@@SESSION.`sql_select_limit`] [remote_0.col1 DESC] | | Gather partitions:all alias:remote_0 parallelism_level:sub_partition | | Project [t1.col1] | | Top limit:[?] | | ColumnStoreFilter [<after per-thread scan begin> AND <before per-thread scan end>] | | OrderedColumnStoreScan test1.t1, SORT KEY col1 (col1 DESC) table_type:sharded_columnstore | +-------------------------------------------------------------------------------------------+
EXPLAIN SELECT * FROM ct_sort ORDER BY col1;+------------------------------------------------------------------------------------+ | EXPLAIN | +------------------------------------------------------------------------------------+ | Project [remote_0.col1] | | TopSort limit:[@@SESSION.`sql_select_limit`] [remote_0.col1] | | Gather partitions:all alias:remote_0 parallelism_level:segment | | Project [t1.col1] | | TopSort limit:[?] [t1.col1] | | ColumnStoreScan test1.t1, SORT KEY col1 (col1 DESC) table_type:sharded_columnstore | +------------------------------------------------------------------------------------+
-
SORT KEY()
is not allowed when usingCREATE ROWSTORE TABLE .
.. . -
KEY() USING CLUSTERED COLUMNSTORE
is a legacy syntax that is equivalent toSORT KEY()
.SingleStore recommends using SORT KEY()
. -
BUCKET_
is specific to theCOUNT HASH
index type.It controls the bucket count of the hash table. It applies to rowstore hash indexes only and does not effect columnstore hash indexes. -
The
UNENFORCED
index option can be used on aUNIQUE
constraint to specify that the unique constraint is unenforced.See Unenforced Unique Constraints. -
RESOLUTION
is specific to index on geospatial columns.See Working with Geospatial Features for more information. -
COLUMNSTORE_
,SEGMENT_ ROWS COLUMNSTORE_
controls configuration variables specific to columnstore tables.FLUSH_ BYTES See Advanced Columnstore Configuration Options) for more information. -
SingleStore supports binary,
utf8
, andutf8mb4
character sets.The COLLATE
clause in theCREATE TABLE
statement can be used to override the server character set and collation, which are used as default values for the table.For more information, see Character Set and Collation Override.
-
AUTOSTATS_
controls if automatic statistics should be collected on this table.ENABLED There are three categories of autostats - AUTOSTATS_
,CARDINALITY_ MODE AUTOSTATS_
, andHISTOGRAM_ MODE AUTOSTATS_
.SAMPLING SingleStore allows you to independently control how each category of statistics is automatically gathered. Multiple autostats settings can be combined in a single CREATE TABLE
statement.See Automatic Statistics for more information. -
This command can be run on the master aggregator node, or a child aggregator node (see Node Requirements for SingleStore Commands ).
-
This command causes implicit commits.
Refer to COMMIT for more information. -
<computed_
defines the value of a computed column using other columns in the table, constants, built-in functions, operators, and combinations thereof.column_ expression> For more information see Persistent Computed Columns. -
Temporary tables, created with the
TEMPORARY
option, will be deleted when the client session terminates.For ODBC/JDBC, this is when the connection closes. For interactive client sessions, it is when the user terminates the client program. -
Global temporary tables, created with the
GLOBAL TEMPORARY
option, exist beyond the duration of a client session.If failover occurs, the global temporary tables lose data and enter an errored state; they need to be dropped and recreated. This command can be run only on the master aggregator. See Global Temporary Tables for details. -
The
SERIES TIMESTAMP
clause specifies a table column as the default column defining time order for implicit use by time series functions.This setting can be specified only for a single table column. The column can be one of the following data types: DATE
,TIME
,DATETIME
,DATETIME(6)
,TIMESTAMP
orTIMESTAMP(6)
.SingleStore recommends to use either of the DATETIME
orDATETIME(6)
types instead of one of theTIMESTAMP
types because the automatic update behavior ofTIMESTAMP
is subject to change.See Timestamp Behavior for details. -
The
SERIES TIMESTAMP
clause does not affect the data type of a table column, rather it specifies the behavior of the column in the time-series-specific functions likeFIRST()
,LAST()
, andTIME_
.BUCKET() -
Keyless sharding distributes data across partitions uniformly at random but with the limitation that it does not allow single partition queries or local joins since rows are not assigned to specific partitions.
Keyless sharding is the default for tables that do not have primary key or explicit shard key. You can explicitly declare a table as keyless sharded by specifying a shard key with an empty list of columns in the SHARD KEY()
constraint in the table definition. -
Refer to the Permission Matrix for the required permission.
MySQL Compatibility
SingleStore’s syntax differs from MySQL mainly in the data types and storage it supports, and some specific index hints.
-
KEY_
: value is currently ignored.BLOCK_ SIZE [=] <value>
DEFAULT Behavior
If DEFAULT <default_
is specified in <column_
, and no value is inserted in the column, then <default_
will be placed in the column during an INSERT
operation.
ON UPDATE Behavior
If ON UPDATE <update_
is specified in <column_
, and if any other column is updated but the specified column is not explicitly updated, then update_
will be placed in the column during an UPDATE
operation.TIMESTAMP
, TIMESTAMP(6)
, DATETIME
, or DATETIME(6)
, then you can update <update_
to one of the following values: CURRENT_
, CURRENT_
, NOW()
, or NOW(6)
.
ON UPDATE
can be used with these TIMESTAMP/DATETIME[(6)]
types only, and you can only use one of the time functions as the argument.
Example(s)
CREATE TEMPORARY
CREATE TEMPORARY
or CREATE ROWSTORE TEMPORARY
(if default_
is set to columnstore, you must use the latter syntax) creates a table that will be deleted when the client session terminates.
CREATE TEMPORARY TABLE IF NOT EXISTS ct_temp_1 (id INT AUTO_INCREMENT PRIMARY KEY, a INT, b INT, SHARD KEY(id));CREATE ROWSTORE TEMPORARY TABLE IF NOT EXISTS ct_temp_1 (id INT AUTO_INCREMENT PRIMARY KEY, a INT, b INT, SHARD KEY(id));
CREATE ROWSTORE GLOBAL TEMPORARY
CREATE ROWSTORE GLOBAL TEMPORARY
creates a table that exists beyond the duration of a client session.
CREATE ROWSTORE GLOBAL TEMPORARY TABLE IF NOT EXISTS ct_temp_2 (id INT AUTO_INCREMENT PRIMARY KEY, a INT, b INT, SHARD KEY(id));
CREATE TABLE with an AUTO_ INCREMENT Column
AUTO_
can be used to automatically generate a unique value for new rows.AUTO_
field is DEFAULT
, NULL
, or 0
, SingleStore will automatically assign a value.AUTO_
only guarantees that automatically-generated values are unique.
-
are consecutive or sequential
-
are monotonically increasing
-
start from any particular value
-
are distinct from explicitly-set values
If you explicitly set a value in an INSERT
or UPDATE
statement, it may collide with past or future automatically-generated values.AUTO_
column.AGGREGATOR SYNC AUTO_
is run by the end user (which will reset the auto-increment counter to a value higher than 10).AUTO_
value reaches 10 eventually on its own as additional rows are added to the table.
CREATE TABLE ct_auto (c1 INT AUTO_INCREMENT PRIMARY KEY);INSERT INTO ct_auto (c1) VALUES (10);
A table can have only one AUTO_
column.AUTO_
column must be included in an index (not necessarily a PRIMARY
or UNIQUE
key, a regular key is also allowed).
If syntax such as CREATE TABLE table_
is used to create table_
where table_
has an AUTO_
column, it will be created as a non-auto-increment column in table_
.
See LAST_AUTO_
behavior.
If the AUTO_
behavior described here does not satisfy your requirements you can create your own sequence generator using LAST_
.
Warning
Restarting an aggregator, such as during upgrades or host machine maintenance, will introduce a large gap between any AUTO_
values inserted before the restart and any values inserted after.
These jumps are because each aggregator defines and manages its own range of values to start incrementing from to prevent collisions in a table.AUTO_
values increases to the next 1,000,000 after each restart (e.
AUTO_ INCREMENT in Sharded Tables
On a sharded (distributed) table, AUTO_
can only be used on a BIGINT
column (as they usually use the entire 64 bits).AUTO_
values and uses those values when new rows are added to a table.AUTO_
values in sharded tables are assigned using the high 14 bits to encode the aggregator ID and the bottom 50 bits for a per-aggregator unique value.AUTO_
values, the automatically-generated values from inserts across multiple aggregators are only unique, never sequential.
Here is an example to illustrate how AUTO_
values are generated across aggregators in a cluster as new rows are inserted into table ct_
:
SELECT * FROM ct_tb ORDER BY b;
+-------------------+------+------------+
| a | b | c |
+-------------------+------+------------+
| 1 | 1 | from MA |
| 2 | 2 | from MA |
| 3 | 3 | from MA |
| 4 | 4 | from MA |
| 5 | 5 | from MA |
| 13510798882111489 | 6 | from CA 96 |
| 13510798882111490 | 7 | from CA 96 |
| 13510798882111491 | 8 | from CA 96 |
| 13510798882111492 | 9 | from CA 96 |
| 13510798882111493 | 10 | from CA 96 |
| 14636698788954113 | 11 | from CA 20 |
| 14636698788954114 | 12 | from CA 20 |
| 14636698788954115 | 13 | from CA 20 |
| 14636698788954116 | 14 | from CA 20 |
| 14636698788954117 | 15 | from CA 20 |
| 6 | 16 | from MA |
| 15762598695796737 | 17 | from CA 17 |
| 13510798882111494 | 18 | from CA 96 |
| 7 | 19 | from MA |
| 14636698788954118 | 20 | from CA 20 |
+-------------------+------+------------+
As shown in the example above, automatically-generated AUTO_
values can differ depending on which aggregator you run the inserts on.
AUTO_ INCREMENT in Reference Tables
The AUTO_
value for a reference table is tracked by the master aggregator.AUTO_
value will always be greater than any value previously seen in this column.INSERT
or UPDATE
statement will not create a collision with future automatically generated values.
The next example shows some queries using AUTO_
fields on reference tables.
CREATE REFERENCE TABLE ct_ref_1(id INT AUTO_INCREMENT PRIMARY KEY);INSERT INTO ct_ref_1 VALUES();INSERT INTO ct_ref_1 VALUES(5);INSERT INTO ct_ref_1 VALUES();
SELECT id FROM ct_ref_1 ORDER BY id;
+----+
| id |
+----+
| 1 |
| 5 |
| 6 |
+----+
UPDATE ct_ref_1 SET id = 9 WHERE id = 5;INSERT INTO ct_ref_1();
SELECT id FROM ct_ref_1 ORDER BY id;
+----+
| id |
+----+
| 1 |
| 6 |
| 9 |
| 10 |
+----+
DELETE FROM ct_ref_1;INSERT INTO ct_ref_1 VALUES();
SELECT id FROM ct_ref_1 ORDER BY id;
+----+
| id |
+----+
| 11 |
+----+
Setting AUTO_ INCREMENT Starting Values
It is possible to override the starting AUTO_
value for reference tables by setting the AUTO_
option on a CREATE TABLE
statement.
The following example shows how to set the AUTO_
start value during table creation:
CREATE REFERENCE TABLE ct_ref_2 (id int AUTO_INCREMENT PRIMARY KEY) AUTO_INCREMENT = 7;INSERT INTO ct_ref_2 VALUES (), ();
SELECT * FROM ct_ref_2;
+----+
| id |
+----+
| 7 |
| 8 |
+----+
This syntax has no effect on sharded tables.
AUTO_ INCREMENT During Replication
When replicating data between clusters, the secondary cluster has all the replicated AUTO_
values from the primary cluster.AUTO_
values on every aggregator.
CREATE TABLE IF NOT EXISTS my_MemSQL_table (id INT PRIMARY KEY AUTO_INCREMENT, v VARCHAR(10) NOT NULL);
CREATE TABLE AS SELECT
CREATE TABLE AS SELECT
(also referred to as CREATE TABLE .
) can create one table from results of a SELECT
query.
Here is the basic syntax.
CREATE [ROWSTORE] [REFERENCE | TEMPORARY | GLOBAL TEMPORARY] TABLE [IF NOT EXISTS] <table_name_2>(column_name(s), [SHARD KEY(column_name)] | [SORT KEY(column_name)] | [KEY(column_name)]AS SELECT [*] | [column_name(s)] FROM table_name_1;
Here is an example of a CREATE TABLE AS SELECT
command with a shard key, sort key and an index:
CREATE TABLE ctas_table (a BIGINT, b BIGINT, SHARD KEY(a), SORT KEY(b), KEY(a)) AS SELECT * FROM orig_table;
The table will include a column for each column of the SELECT
query.
CREATE TABLE table_1 (PRIMARY KEY (a, b)) AS SELECT * FROM table_2;CREATE TABLE table_1 (SORT KEY (a, b)) AS SELECT * FROM table_2;CREATE TABLE table_1 (a int, b int) AS SELECT c, d FROM table_2;CREATE TABLE table_1 (b AS a+1 PERSISTED int) AS SELECT a FROM table_2;
In the case that the original table (table_
in the above examples) has an AUTO_
column, it will be created as a non-auto-increment column in the new table (table_
).
CREATE TABLE AS SELECT to Extract Data from One Column in an Existing Table
Extract time column from an event table to build a times table.
CREATE TABLE events (type VARCHAR(256), time TIMESTAMP);INSERT INTO events VALUES('WRITE', NOW());
CREATE TABLE times (id INT AUTO_INCREMENT KEY, time TIMESTAMP) AS SELECT time FROM events;
SELECT * FROM times;
+----+---------------------+
| id | time |
+----+---------------------+
| 1 | 2023-06-21 15:57:35 |
+----+---------------------+
CREATE TABLE AS SELECT to Extract Distinct Values from an Existing Table
SELECT * FROM courses ORDER BY course_code, section_number;
+-------------+----------------+-----------------+
| course_code | section_number | number_students |
+-------------+----------------+-----------------+
| CS-101 | 1 | 20 |
| CS-101 | 2 | 16 |
| CS-101 | 3 | 22 |
| CS-101 | 4 | 25 |
| CS-101 | 5 | 22 |
| CS-150 | 1 | 10 |
| CS-150 | 2 | 16 |
| CS-150 | 3 | 11 |
| CS-150 | 4 | 17 |
| CS-150 | 5 | 9 |
| CS-201 | 1 | 14 |
| CS-201 | 2 | 17 |
| CS-301 | 1 | 7 |
| CS-301 | 2 | 10 |
+-------------+----------------+-----------------+
CREATE TABLE IF NOT EXISTS distinct_courses (PRIMARY KEY(course_code))AS SELECT DISTINCT(course_code) FROM courses;
SELECT * FROM distinct_courses ORDER by course_code;
+-------------+
| course_code |
+-------------+
| CS-101 |
| CS-150 |
| CS-201 |
| CS-301 |
+-------------+
CREATE TABLE USING HASH
The USING HASH
clause creates a hash index in a table.
If a rowstore or columnstore table is being created, the following applies:
-
You can create single-column or multi-column hash indexes.
-
When you create a unique single-column hash index, the shard key can contain only one column and that column must be the same column that you have created the index on.
When you create a unique multi-column hash index, the shard key must be a subset of the columns that you have created the index on. -
You can create multiple single-column hash indexes on a reference table.
If a columnstore table is being created, the following applies:
-
You can create at most one unique hash index.
You can create multiple multi-column hash indexes. -
You cannot create a unique hash index on a
FLOAT
,REAL
, orDOUBLE
column.
CREATE TABLE with Multiple Hash Indexes
The following example creates a columnstore table with three hash indexes.
CREATE TABLE articles_3 (id INT UNSIGNED,month int UNSIGNED,year int UNSIGNED,title VARCHAR(200),body TEXT,SHARD KEY(title),SORT KEY (id),KEY (id) USING HASH,UNIQUE KEY (title) USING HASH,KEY (month,year) USING HASH);
The query SELECT * FROM articles WHERE title = 'Interesting title here';
uses title
’s hash index since the query uses an equality predicate.
The query SELECT * FROM articles WHERE year > 2010 AND month > 5;
does not use the hash index on month
and year
since the query does not use an equality predicate.
See the ColumnstoreFilter in the Query Plan Operations topic for an example EXPLAIN
plan for a columnstore query that uses a hash index.
See Highly Selective Joins for an example of a columnstore query with a join that uses a hash index.
CREATE TABLE with One Hash Index Containing Multiple Columns
KEY(<column 1 name>,<column 2 name>, .
is equivalent to KEY(<column 1 name>,<column 2 name>,.
.
CREATE TABLE ct_hash_1(a INT, b INT, c INT, KEY(a,b));
is equivalent to:
CREATE TABLE ct_hash_1(a INT, b INT, SORT KEY(), KEY(a,b) USING HASH);
A query against t
with an equality filter on a
, an equality filter on b
, or equality filters on both a
and b
could benefit from KEY(a,b) USING HASH
.
Depending on the cardinality, the performance of the query may be worse than the performance of the same query, where t
is a rowstore table and KEY(a,b)
is defined on that table.
CREATE TABLE with FULLTEXT Columns
SingleStore supports full-text search across text columns in a columnstore table using the FULLTEXT
index type.CREATE TABLE
statement and only on the text types CHAR
, VARCHAR
, TEXT
, and LONGTEXT
Data Types
Warning
A FULLTEXT
index cannot be dropped or altered after the table is created, and if the table is dropped, the index is deleted automatically.
If you query a column c
that is part of a multi-column FULLTEXT
index, where the query uses a FULLTEXT MATCH
on c
, the index on c
will be applied.
This differs from a multi-column non-FULLTEXT
index, where behavior is as follows: if you query column c
that is part of index i
, where the query uses an equality filter on c
, the index on c
will only be applied if c
is the leftmost column in i
.
Any column that is part of a FULLTEXT
index can be queried, even if it is not the leftmost.FULLTEXT
columns are done using the SELECT .
syntax.
CREATE TABLE with FULLTEXT Index on Two Columns
This example creates a FULLTEXT
index for both the title column and the body column.MATCH <column_
, and the index on the column would be applied.
CREATE TABLE articles_1 (id INT UNSIGNED,year int UNSIGNED,title VARCHAR(200),body TEXT,SORT KEY (id),FULLTEXT (title,body));
Errors
These are the possible errors you may encounter when using FULLTEXT
.
Error |
Error String |
---|---|
Invalid Type specified for column |
Invalid type specified for |
Specifying |
|
Specifying the same column multiple times |
Column may only be specified once in a |
Specifying a column that is not defined on the table |
Column not defined |
Specifying |
Only column store tables may have a |
CREATE TABLE with COMPRESSION = SPARSE
SingleStore supports sparse data compression for rowstore tables.
Columns that use sparse data compression only store non-NULL
data values.
Sparse compression has the following limitations:
-
The
SPARSE
clause cannot be used for key columns.However, if a rowstore table uses sparse data compression using the COMPRESSION = SPARSE
clause, then the key columns are stored in-row. -
The
SPARSE
clause cannot be used for columns where the non-NULL
size of the column is greater than 15 bytes.
Refer to the Data Types topic for details.
CREATE TABLE with COMPRESSION = SPARSE All Sparse Columns
The following example demonstrates the COMPRESSION = SPARSE
clause.
CREATE ROWSTORE TABLE transaction_1(id BIGINT NOT NULL,explanation VARCHAR(70),shares DECIMAL(18, 2),share_price DECIMAL(18, 2),total_amount as shares * share_price PERSISTED DECIMAL(18,2),transaction_date DATE,dividend_exdate DATE,misc_expenses DECIMAL(18, 2),country_abbreviation CHAR(6),correction_date DATE,settlement_date DATE) COMPRESSION = SPARSE;
CREATE TABLE with COMPRESSION = SPARSE Selected Sparse Columns
The following example demonstrates the SPARSE
clause.
CREATE ROWSTORE TABLE transaction_2(id BIGINT NOT NULL,explanation VARCHAR(70) SPARSE,shares DECIMAL(18, 2) SPARSE,share_price DECIMAL(18, 2),total_amount as shares * share_price PERSISTED DECIMAL(18,2),transaction_date DATE,dividend_exdate DATE SPARSE,misc_expenses DECIMAL(18, 2) SPARSE,country_abbreviation CHAR(6),correction_date DATE SPARSE,settlement_date DATE SPARSE);
Listing Whether Columns use Sparse Compression
The following query lists the columns in the transaction
table that was created in Example 2.
SELECT column_name, is_sparse FROM information_schema.columnsWHERE table_name = 'transaction_2';
+----------------------+-----------+
| column_name | is_sparse |
+----------------------+-----------+
| id | NO |
| explanation | YES |
| shares | YES |
| share_price | NO |
| total_amount | NO |
| transaction_date | NO |
| dividend_exdate | YES |
| misc_expenses | YES |
| country_abbreviation | NO |
| correction_date | YES |
| settlement_date | YES |
+----------------------+-----------+
CREATE TABLE with COMPRESSION = SPARSE Use Case
Sparse rowstore compression works best on a wide table with more than half NULL
values.NULL
values in the table does not contribute to the amount of memory used.
For example, consider this wide table t
having three-hundred columns:
CREATE ROWSTORE TABLE ct_sparse (c1 double,c2 double,…c300 double) COMPRESSION = SPARSE;
In SingleStore 7.t
was loaded with 1.NULL
.t
, run the following command:
SELECT table_name, SUM(memory_use) memory_usage FROM information_schema.table_statisticsWHERE table_name = 'ct_sparse' GROUP BY table_name;
+-------------+--------------+
| table_name | memory_usage |
+-------------+--------------+
| ct_sparse | 1.23 |
+-------------+--------------+
The following table lists the memory usage of table t
, with and without sparse compression:
Compression Setting |
Memory Use |
Savings (Percent) |
---|---|---|
NONE |
2. |
NA |
SPARSE |
1. |
53% |
For this wide table with two-thirds NULL
values, you can store more than twice the data in the same amount of RAM.
Last modified: August 19, 2024