SingleStore DB

String Types

Fixed-Length String

There are three fixed-size string types in SingleStore. The space consumed is directly proportional to their length definitions. Values of type CHAR require 4 x LENGTH bytes of memory, as SingleStore supports the UTF8 character set including the 4-byte character encoding. Values of type BINARY require LENGTH bytes of memory. char(0) not null and binary(0) not null still allocate 1 byte.

Data Type

Size

Size (Not Null)

CHAR(LENGTH)

4*LENGTH + 1 bytes

4*LENGTH bytes

BINARY(LENGTH)

LENGTH + 1 bytes

LENGTH bytes

A column declared as CHAR() stores values as per the defined length. When CHAR values are stored, they are right-padded with spaces to the specified length but when the CHAR values are retrieved, trailing spaces are removed.

CREATE TABLE t1 (c1 CHAR(3));
INSERT INTO t1 VALUES ('  ');
INSERT INTO t1 VALUES ('ab ');
INSERT INTO t1 VALUES ('abcd');
SELECT c1, LENGTH (c1) FROM t1;
****
+------+-------------+
| c1   | LENGTH (c1) |
+------+-------------+
| abc  |           3 |
| ab   |           2 |
|      |           0 |
+------+-------------+

Variable-Length Strings

There are several variable-length string types. The data for this datatype is stored in memory for rowstore tables and the main table structure holds pointers to the memory locations.

The total space consumed by a particular string value is:

Notice

One exception is values less than 7 bytes long. Those small values are stored directly in the pointer area.

  • The byte length of the string.

  • An 8-byte overhead for the pointer.

  • A 4-byte overhead for storing the length. There is no 4-byte overhead if the field is nullable.

The three main types are VARCHAR, VARBINARY, and LONGTEXT. While they all operate the same for most English-language alphanumeric strings, there are several important differences in their behavior. The length limit of the VARCHAR is measured in Unicode characters, not in bytes. VARBINARY fields store data verbatim, regardless of whether they are valid Unicode strings. Sorting and comparing a VARBINARY uses the exact binary representation.

To reduce memory fragmentation, memory for variable-sized data is allocated from a fixed list in multiples of eight (16, 32, 40, etc.), but not every multiple of eight. Variable data of other lengths are rounded up to the next nearest multiple of eight in the list. For example, a ten (10) byte string would be allocated sixteen (16) bytes.

Data Type

Max Length

VARCHAR

21,845 characters

VARBINARY

System RAM

LONGTEXT

4 GB

Notice

There are several variants of the TEXT type, listed below. Aside from their maximum lengths, there is no practical difference between these types in SingleStore, including performance or memory efficiency. They are implemented solely for backward compatibility with MySQL syntax.

Name

Max Length

Description

MEDIUMTEXT

16 MB

LONGTEXT

TEXT

65,535 bytes

LONGTEXT

TINYTEXT

255 bytes

LONGTEXT

Comparison operators do not treat trailing spaces differently when used with variable length TEXT type columns such as VARCHAR(), TEXT(), MEDIUMTEXT(), and TINYTEXT(). Comparison operators ignore the trailing spaces at the end of the string being compared, except for the LIKE operator. The LIKE operator does not ignore the trailing spaces and considers them while comparing the strings.

The following example demonstrates how comparison operators treat trailing spaces for a VARCHAR() column. Consider a table Employee as follows:

DESCRIBE Employee;
****
+----------+-------------+------+------+---------+-------+
| Field    | Type        | Null | Key  | Default | Extra |
+----------+-------------+------+------+---------+-------+
| Location | varchar(10) | YES  |      | NULL    |       |
+----------+-------------+------+------+---------+-------+

Insert a few values in the Location column, with and without trailing spaces:

INSERT INTO Employee VALUES ('BERLIN '),('BERLIN  '),('BERLIN'),('BERLIN ');

SELECT LENGTH(Location),CONCAT('[', Location, ']') FROM Employee;
****
+------------------+----------------------------+
| LENGTH(Location) | CONCAT('[', Location, ']') |
+------------------+----------------------------+
|                6 | [Berlin]                   |
|                8 | [Berlin  ]                 |
|                7 | [Berlin ]                  |
|                7 | [Berlin ]                  |
+------------------+----------------------------+

The following SELECT statements ignore the trailing spaces and return the same output:

SELECT LENGTH(Location),CONCAT('[', Location, ']') FROM Employee WHERE Location = 'Berlin ';
SELECT LENGTH(Location),CONCAT('[', Location, ']') FROM Employee WHERE Location = 'Berlin';
SELECT LENGTH(Location),CONCAT('[', Location, ']') FROM Employee WHERE Location = 'Berlin  ';
****
+------------------+----------------------------+
| LENGTH(Location) | CONCAT('[', Location, ']') |
+------------------+----------------------------+
|                6 | [Berlin]                   |
|                8 | [Berlin  ]                 |
|                7 | [Berlin ]                  |
|                7 | [Berlin ]                  |
+------------------+----------------------------+

An exception is the LIKE comparison operator, where trailing spaces are significant while comparison.

SELECT LENGTH(Location), CONCAT('[', Location, ']') FROM Employee WHERE Location LIKE 'Berlin';
****
+------------------+----------------------------+
| LENGTH(Location) | CONCAT('[', Location, ']') |
+------------------+----------------------------+
|                6 | [Berlin]                   |
+------------------+----------------------------+