SingleStore Managed Service

String Types

Fixed-Length String

There are three fixed-size string types in SingleStore DB. The space consumed is directly proportional to their length definitions. Values of type CHAR require 4 x LENGTH bytes of memory, as SingleStore DB supports the UTF8 character set including the 4-byte character encoding. Values of type BINARY require LENGTH bytes of memory. char(0) not null and binary(0) not null still allocate 1 byte.

Data Type

Size

Size (Not Null)

CHAR(LENGTH)

4*LENGTH + 1 bytes

4*LENGTH bytes

BINARY(LENGTH)

LENGTH + 1 bytes

LENGTH bytes

A column declared as CHAR() stores values as per the defined length. When CHAR values are stored, they are right-padded with spaces to the specified length but when the CHAR values are retrieved, trailing spaces are removed.

CREATE TABLE t1 (c1 CHAR(3));
INSERT INTO t1 VALUES ('  ');
INSERT INTO t1 VALUES ('ab ');
INSERT INTO t1 VALUES ('abcd');
SELECT c1, LENGTH (c1) FROM t1;
****
+------+-------------+
| c1   | LENGTH (c1) |
+------+-------------+
| abc  |           3 |
| ab   |           2 |
|      |           0 |
+------+-------------+

Variable-Length Strings

There are several variable-length string types. The data for these variable types is stored in a memory allocation designed for them, and pointers to its locations in memory are stored in the main table structure. This means that the space consumed by a particular value is the byte length of the string, plus an 8-byte overhead for the pointer, and a 4-byte overhead for storing the length. (There is no additional 4-byte overhead if the field is nullable.) One exception is values less than 7 bytes long. Those small values are stored directly in the pointer area.

The three main types are VARCHAR, VARBINARY, and LONGTEXT. While they all operate similarly for most English-language alphanumeric strings, there are several important differences in their behavior. The length limit of the VARCHAR is measured in Unicode characters, not in bytes. VARBINARY fields store data verbatim, regardless of whether they are valid Unicode strings. Sorting and comparing a VARBINARY uses the exact binary representation.

To reduce fragmentation, allocations for variable sized data use sizes from a fixed list in multiples of eight (16, 32, 40, etc.). Variable data of other lengths is rounded up to the next nearest, so a ten (10) byte string would allocate sixteen (16) bytes.

Data Type

Max Length

VARCHAR

21,845 characters

VARBINARY

system RAM

LONGTEXT

4 GB

Notice

There are several variants of the BLOB and TEXT types, listed below. Aside from their maximum lengths, there is no practical difference between these types in SingleStore DB, including performance or memory efficiency. They are implemented solely for backwards compatibility with MySQL syntax.

Name

Max Length

Description

LONGBLOB

4 GB

VARBINARY

MEDIUMBLOB

16 MB

VARBINARY

BLOB

65,535 bytes

VARBINARY

TINYBLOB

255 bytes

VARBINARY

MEDIUMTEXT

16 MB

LONGTEXT

TEXT

65,535 bytes

LONGTEXT

TINYTEXT

255 bytes

LONGTEXT

Comparison operators treat trailing spaces differently when used with variable length TEXT type columns such as VARCHAR(), TEXT(), MEDIUMTEXT(), and TINYTEXT(). This behavior does not apply to BLOB type columns. Comparison operators ignore the trailing spaces at the end of the string being compared, except for the LIKE operator. The LIKE operator does not ignore the trailing spaces and considers them while comparing the strings.

The following example demonstrates how comparison operators treat trailing spaces for a VARCHAR() column. Consider a table Employee as follows:

DESCRIBE Employee;
****
+----------+-------------+------+------+---------+-------+
| Field    | Type        | Null | Key  | Default | Extra |
+----------+-------------+------+------+---------+-------+
| Location | varchar(10) | YES  |      | NULL    |       |
+----------+-------------+------+------+---------+-------+

Insert a few values in the Location column, with and without trailing spaces:

INSERT INTO Employee VALUES ('BERLIN '),('BERLIN  '),('BERLIN'),('BERLIN ');

SELECT LENGTH(Location),CONCAT('[', Location, ']') FROM Employee;
****
+------------------+----------------------------+
| LENGTH(Location) | CONCAT('[', Location, ']') |
+------------------+----------------------------+
|                6 | [Berlin]                   |
|                8 | [Berlin  ]                 |
|                7 | [Berlin ]                  |
|                7 | [Berlin ]                  |
+------------------+----------------------------+

The following SELECT statements ignore the trailing spaces and return the same output:

SELECT LENGTH(Location),CONCAT('[', Location, ']') FROM Employee WHERE Location = 'Berlin ';
SELECT LENGTH(Location),CONCAT('[', Location, ']') FROM Employee WHERE Location = 'Berlin';
SELECT LENGTH(Location),CONCAT('[', Location, ']') FROM Employee WHERE Location = 'Berlin  ';
****
+------------------+----------------------------+
| LENGTH(Location) | CONCAT('[', Location, ']') |
+------------------+----------------------------+
|                6 | [Berlin]                   |
|                8 | [Berlin  ]                 |
|                7 | [Berlin ]                  |
|                7 | [Berlin ]                  |
+------------------+----------------------------+

An exception is the LIKE comparison operator, where trailing spaces are significant while comparison.

SELECT LENGTH(Location), CONCAT('[', Location, ']') FROM Employee WHERE Location LIKE 'Berlin';
****
+------------------+----------------------------+
| LENGTH(Location) | CONCAT('[', Location, ']') |
+------------------+----------------------------+
|                6 | [Berlin]                   |
+------------------+----------------------------+