Collations Supported
Each character set supported in SingleStore can have multiple collations, with one default collation per character set. To view the available collations, use the SHOW COLLATION command. Alternatively, use a SELECT statement with optional LIKE and WHERE clauses to yield specific results from the COLLATIONS view.
SELECT * FROM INFORMATION_SCHEMA.COLLATIONS WHERE CHARACTER_SET_NAME = 'utf8mb4' LIMIT 5;
+----------------------+--------------------+-----+------------+-------------+---------+
| COLLATION_NAME       | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN |
+----------------------+--------------------+-----+------------+-------------+---------+
| utf8mb4_general_ci   | utf8mb4            |  45 | Yes        | Yes         |       1 |
| utf8mb4_bin          | utf8mb4            |  46 |            | Yes         |       1 |
| utf8mb4_unicode_ci   | utf8mb4            | 224 |            | Yes         |       8 |
| utf8mb4_icelandic_ci | utf8mb4            | 225 |            | Yes         |       8 |
| utf8mb4_latvian_ci   | utf8mb4            | 226 |            | Yes         |       8 |
+----------------------+--------------------+-----+------------+-------------+---------+
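As a minimal sketch of the filtering described above, the statements below narrow the collation list with LIKE and WHERE. The utf8mb4% pattern is only an illustrative choice, and the column names are the ones shown in the COLLATIONS output above.

```sql
-- List only the collations whose names start with utf8mb4.
SHOW COLLATION LIKE 'utf8mb4%';

-- Return the default collation of utf8mb4 from the COLLATIONS view.
SELECT COLLATION_NAME, CHARACTER_SET_NAME
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE CHARACTER_SET_NAME = 'utf8mb4' AND IS_DEFAULT = 'Yes';
```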
To view the mapping between collations and character sets, issue the following command.
SELECT * FROM INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY;
When applying a character set, the default collation of the character set is automatically assigned unless specified otherwise. The default collation can be identified by the Yes value in the IS_DEFAULT field. It can also be determined from the CHARACTER_ view, or from the results of the SHOW COLLATION or the SHOW CHARACTER SET command.
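The following sketch illustrates the default assignment, assuming that column definitions accept CHARACTER SET and COLLATE clauses. The table name collation_demo and its column names are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE collation_demo (
  id INT,
  -- No COLLATE clause: the default collation of utf8mb4 is expected to apply.
  name_default VARCHAR(50) CHARACTER SET utf8mb4,
  -- Explicit COLLATE clause: overrides the character set default.
  name_binary VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
);

-- Inspect the collation assigned to each column.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'collation_demo';
```

If the defaults behave as described, name_default should report the default collation of utf8mb4, while name_binary should report utf8mb4_bin.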
By default, SingleStore uses the utf8mb4 character set along with its default collation utf8mb4_general_ci. utf8mb4_ collation. To view the current settings, use @@ in a SELECT for one of the following collation variables: collation_, collation_, and collation_.
SELECT @@character_set_server;
+------------------------+
| @@character_set_server |
+------------------------+
| utf8mb4                |
+------------------------+
SELECT @@collation_server;
+--------------------+
| @@collation_server |
+--------------------+
| utf8mb4_general_ci |
+--------------------+
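As an alternative sketch, assuming the SHOW VARIABLES command with a LIKE pattern is available in the deployment, the collation variables can be listed in one statement instead of being queried one at a time.

```sql
-- List every variable whose name starts with "collation".
SHOW VARIABLES LIKE 'collation%';

-- The same approach for the character set variables.
SHOW VARIABLES LIKE 'character_set%';
```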
Collation Naming Conventions
The collation names are prefixed with their associated character set name, typically followed by one or more suffixes representing the collation properties. Language-specific collations with the utf8_ prefix contain the language as a suffix and then the _ci suffix to indicate case-insensitivity.
Similarly, there are suffixes to indicate case-sensitivity (_cs), accent-insensitivity (_ai), and accent-sensitivity (_as). The *_bin collation is case-sensitive. For collation names that do not specify _ai or _as, _ci in the name also implies _ai and _cs also implies _as. For example, utf8mb4_general_ci is explicitly case-insensitive and implicitly accent-insensitive (See also Collation Demonstration). The exception is the binary collation, which has no suffixes.
The _bin suffix indicates binary, which is different from the binary collation. The binary collation is the default and the only collation of the binary character set. Collations with the _bin suffix define comparisons based on numeric character code values, which differ from byte values for multibyte characters.
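The effect of the suffixes can be sketched with two columns that differ only in collation. This is an illustration under stated assumptions, not a definitive test: the table name suffix_demo and its columns are hypothetical, and the comparison is assumed to follow the collation of the column being compared.

```sql
-- Hypothetical table: the same text stored under two collations.
CREATE TABLE suffix_demo (
  value_ci VARCHAR(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci,
  value_bin VARCHAR(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
);

INSERT INTO suffix_demo VALUES ('Apple', 'Apple'), ('apple', 'apple');

-- Case-insensitive column: differences in letter case are ignored.
SELECT value_ci FROM suffix_demo WHERE value_ci = 'apple';

-- Binary-suffix column: comparison is by character code, so case matters.
SELECT value_bin FROM suffix_demo WHERE value_bin = 'apple';
```

Under the conventions described above, the first query is expected to match both rows and the second only the lowercase row.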
Last modified: June 5, 2024