Character Sets Supported
SingleStore supports a variety of character sets in the Unicode standard and their associated collations. Use the SHOW CHARACTER SET command to list them.
SHOW CHARACTER SET;
+---------+-----------------------+--------------------+--------+
| Charset | Description           | Default collation  | Maxlen |
+---------+-----------------------+--------------------+--------+
| utf8mb4 | UTF-8 Unicode         | utf8mb4_general_ci |      4 |
| utf8    | UTF-8 Unicode         | utf8_general_ci    |      3 |
| binary  | Binary pseudo charset | binary             |      1 |
+---------+-----------------------+--------------------+--------+
Alternatively, retrieve the supported character sets from INFORMATION_SCHEMA.CHARACTER_SETS with a SELECT statement, optionally filtered with WHERE and LIKE clauses.
SELECT * FROM INFORMATION_SCHEMA.CHARACTER_SETS WHERE CHARACTER_SET_NAME = 'utf8mb4';
+--------------------+----------------------+---------------+--------+
| CHARACTER_SET_NAME | DEFAULT_COLLATE_NAME | DESCRIPTION   | MAXLEN |
+--------------------+----------------------+---------------+--------+
| utf8mb4            | utf8mb4_general_ci   | UTF-8 Unicode |      4 |
+--------------------+----------------------+---------------+--------+
Character Sets Supported by Features
binary
A character set used for encoding binary strings. Its default collation is binary.
Important
The binary character set is a universal feature that is supported across most applicable database schema objects and commands.
utf8
An alias for utf8mb3, a Unicode character set that encodes each character using 1 to 3 bytes. utf8_general_ci is the default collation assigned to this character set.
Important
The utf8 character set is a universal feature that is supported across most applicable database schema objects and commands.
utf8mb4
A Unicode character set that encodes each character using 1 to 4 bytes. utf8mb4_general_ci is the default collation assigned to this character set.
utf8mb4 is supported for specific database schema objects and commands that are discussed in the following sections.
Data Types
The following data types can store utf8mb4 Unicode characters.
- JSON
- CHAR
- VARCHAR
- LONGTEXT, MEDIUMTEXT, TEXT, TINYTEXT
- ENUM
- SET
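As an illustration of the data types listed above, the following sketch declares columns that store utf8mb4 data. The table name messages_demo and its columns are hypothetical, and the column-level CHARACTER SET and COLLATE clauses are assumed to follow the MySQL-compatible syntax; check the CREATE TABLE reference for your version before relying on it.

```sql
-- Hypothetical table for illustration; the names are not from the SingleStore documentation.
CREATE TABLE messages_demo (
  id INT,
  author VARCHAR(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  body TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci
);

-- The emoji is encoded in 4 bytes, which is beyond the 3-byte limit of utf8.
INSERT INTO messages_demo (id, author, body) VALUES (1, 'Ana', 'Hello world!🙂');

SELECT id, author, body FROM messages_demo;
```

The client connection is also assumed to use utf8mb4, so that the 4-byte character reaches the server unchanged.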
String Functions
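String Functions can be used with strings in the utf8mb4 character set. For example, LENGTH returns the length of a string in bytes, so the 4-byte emoji in the following string adds 4 to the result.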
SELECT LENGTH('Hello world!🙂');
+----------------------------+
| LENGTH('Hello world!🙂')   |
+----------------------------+
|                         16 |
+----------------------------+
JSON Functions
JSON Functions can be used with JSON columns and string arguments with the utf8 and utf8mb4 character sets and the utf8_ and utf8mb4_ collations.
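A minimal sketch of a JSON function applied to a string argument that contains a 4-byte character. JSON_EXTRACT_STRING is used here as one representative JSON function; the JSON literal and the key name greeting are made up for illustration, and the client connection is assumed to use utf8mb4.

```sql
-- Extract a string value that contains a 4-byte utf8mb4 character.
-- The JSON literal and key name are illustrative only.
SELECT JSON_EXTRACT_STRING('{"greeting": "Hello world!🙂"}', 'greeting') AS greeting;
```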
Procedural Extensions
In procedural extensions such as stored procedures and user-defined functions, parameters and variables can hold utf8mb4 Unicode characters.
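The following sketch shows one way a user-defined function might accept and return utf8mb4 text. The function name greet_demo, its parameter, and its local variable are hypothetical, and the use of a COLLATE modifier on the parameter, variable, and return type is an assumption about the CREATE FUNCTION syntax; verify it against the reference for your version.

```sql
DELIMITER //
-- Hypothetical UDF: the parameter, the local variable, and the return value
-- all carry a utf8mb4 collation so that 4-byte characters pass through.
CREATE OR REPLACE FUNCTION greet_demo(person VARCHAR(64) COLLATE utf8mb4_general_ci)
RETURNS VARCHAR(128) COLLATE utf8mb4_general_ci AS
DECLARE
  greeting VARCHAR(128) COLLATE utf8mb4_general_ci;
BEGIN
  greeting = CONCAT('Hello, ', person, ' 🙂');
  RETURN greeting;
END //
DELIMITER ;

-- Example call with an illustrative argument.
SELECT greet_demo('Ana');
```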
Pipelines
Pipelines can ingest and process data with the utf8mb4 character set from the supported data sources.
LOAD DATA
The LOAD DATA statement can import files in any supported character set, including utf8mb4, into SingleStore.
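A minimal sketch of such an import, assuming a comma-separated file encoded in UTF-8 that may contain 4-byte characters. The table load_demo and the file path are hypothetical; the CHARACTER SET clause is placed after INTO TABLE, following the MySQL-compatible LOAD DATA grammar.

```sql
-- Hypothetical target table and file path, for illustration only.
CREATE TABLE load_demo (
  id INT,
  body TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci
);

-- CHARACTER SET tells LOAD DATA how the bytes in the file are encoded.
LOAD DATA INFILE '/tmp/messages_utf8mb4.csv'
INTO TABLE load_demo
CHARACTER SET utf8mb4
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```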