Character Sets Supported
SingleStore supports a variety of character sets in the Unicode standard and their associated collations. To list the supported character sets, run the SHOW CHARACTER SET command.
SHOW CHARACTER SET;
+---------+-----------------------+--------------------+--------+
| Charset | Description           | Default collation  | Maxlen |
+---------+-----------------------+--------------------+--------+
| utf8mb4 | UTF-8 Unicode         | utf8mb4_general_ci |      4 |
| utf8    | UTF-8 Unicode         | utf8_general_ci    |      3 |
| binary  | Binary pseudo charset | binary             |      1 |
+---------+-----------------------+--------------------+--------+
Alternatively, you can retrieve the supported character sets from the INFORMATION_SCHEMA.CHARACTER_SETS view by using a SELECT statement with optional LIKE and WHERE clauses.
SELECT * FROM INFORMATION_SCHEMA.CHARACTER_SETS WHERE CHARACTER_SET_NAME = 'utf8mb4';
+--------------------+----------------------+---------------+--------+
| CHARACTER_SET_NAME | DEFAULT_COLLATE_NAME | DESCRIPTION   | MAXLEN |
+--------------------+----------------------+---------------+--------+
| utf8mb4            | utf8mb4_general_ci   | UTF-8 Unicode |      4 |
+--------------------+----------------------+---------------+--------+
Character Sets Supported by SingleStore Features
binary
A character set used for encoding binary strings. It uses binary as the default collation.
Important
The binary character set is a universal feature that is supported across most applicable database schema objects and commands.
utf8
An alias for utf8mb3, which is a Unicode character set that supports encoding of characters using 1 to 3 bytes per character. utf8_general_ci is the default collation assigned to this character set.
Important
The utf8 character set is a universal feature that is supported across most applicable database schema objects and commands.
utf8mb4
A Unicode character set that supports encoding of characters using 1 to 4 bytes per character. utf8mb4_general_ci is the default collation assigned to this character set.
utf8mb4 is supported for specific database schema objects and commands that are discussed in the following sections.
Data Types
The following data types allow you to store utf8mb4 Unicode characters.
- JSON
- CHAR
- VARCHAR
- LONGTEXT, MEDIUMTEXT, TEXT, TINYTEXT
- ENUM
- SET
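As a minimal sketch of how such columns might be declared, the statements below create a hypothetical table (the table and column names are illustrative assumptions) and apply the utf8mb4_general_ci collation to a VARCHAR and a TEXT column, using the same column-level COLLATE syntax that this page uses for JSON columns. The final query compares the character count with the byte count, which differ when a value contains multibyte characters.

```sql
-- Hypothetical table; the names are illustrative only.
CREATE TABLE reviews (
  id INT,
  author VARCHAR(40) COLLATE utf8mb4_general_ci,
  body TEXT COLLATE utf8mb4_general_ci
);

-- Store a string that contains a 4-byte character.
INSERT INTO reviews VALUES (1, 'Ana', 'Great event 🙂');

-- CHAR_LENGTH counts characters, LENGTH counts bytes,
-- so the byte count is larger when an emoji is present.
SELECT body, CHAR_LENGTH(body) AS chars, LENGTH(body) AS bytes
FROM reviews
WHERE id = 1;
```

The client connection also has to send the text as utf8mb4 for the emoji to arrive intact; how that is configured depends on the client and is not covered here.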
String Functions
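String functions can be used with strings in the utf8mb4 character set. The following example applies LENGTH to a string that contains a utf8mb4 character. LENGTH returns the length in bytes, so the 12 single-byte characters plus the 4-byte emoji give 16.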
SELECT LENGTH('Hello world!🙂');
+----------------------------+
| LENGTH('Hello world!🙂')   |
+----------------------------+
|                         16 |
+----------------------------+
JSON Functions
JSON functions can be used with JSON columns and string arguments in the utf8mb4 character set. The following example uses JSON_AGG on a JSON column that supports the utf8mb4 character set.
CREATE TABLE events (name VARCHAR(20), registrations INT, comments JSON COLLATE utf8mb4_general_ci);
INSERT INTO events VALUES ("Swimming",50,'{"Registration closed":"✅"}'), ("Biking",28,'{"Registration is open":"⏸"}'), ("Powerlifting",22,'{"Registration is open":"⏸"}');
SELECT JSON_AGG(comments) FROM events;
+-----------------------------------------------------------------------------------------------+
| JSON_AGG(comments)                                                                            |
+-----------------------------------------------------------------------------------------------+
| [{"Registration is open":"⏸"},{"Registration closed":"✅"},{"Registration is open":"⏸"}]      |
+-----------------------------------------------------------------------------------------------+
Procedural Extensions
In procedural extensions such as stored procedures and user-defined functions, you can use parameters and variables with utf8mb4 Unicode characters.
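The following is a rough sketch of a user-defined function whose parameter, local variable, and return value all carry a utf8mb4 collation. The function name tag_status and its logic are hypothetical, and the sketch assumes that the COLLATE modifier is accepted on the parameter, variable, and return types; check the CREATE FUNCTION reference for your SingleStore version before relying on it.

```sql
DELIMITER //
-- Hypothetical UDF: appends a check mark to the label it receives.
CREATE OR REPLACE FUNCTION tag_status(label VARCHAR(40) COLLATE utf8mb4_general_ci)
RETURNS VARCHAR(60) COLLATE utf8mb4_general_ci AS
DECLARE
  -- Local variable that holds a utf8mb4 string.
  tagged VARCHAR(60) COLLATE utf8mb4_general_ci = CONCAT(label, ' ✅');
BEGIN
  RETURN tagged;
END //
DELIMITER ;

SELECT tag_status('Registration closed');
```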
SingleStore Pipelines
SingleStore Pipelines can ingest and process data with the utf8mb4 character set from the supported data sources.
LOAD DATA
The LOAD DATA statement allows you to import files with any supported character set, including utf8mb4, into SingleStore.
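As an illustration, the sketch below loads a comma-separated file into a table and names the file's encoding with the CHARACTER SET clause. The table name, column names, and file path are hypothetical, and the delimiters are assumptions about the file's layout.

```sql
-- Hypothetical target table; the note column stores utf8mb4 text.
CREATE TABLE event_notes (
  name VARCHAR(20),
  note TEXT COLLATE utf8mb4_general_ci
);

-- Hypothetical file path; CHARACTER SET declares how the file is encoded.
LOAD DATA INFILE '/tmp/event_notes.csv'
INTO TABLE event_notes
CHARACTER SET utf8mb4
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```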
Last modified: October 16, 2025