# Character Set and Collation Override

Although the server character set and collation are applied cluster-wide, you can override the server configurations in procedural extensions, tables, columns, and expressions within each database. You can override the cluster-wide character set and collation when using the `LOAD DATA` statement.

## Table-Level Override

The `COLLATE` clause in the `CREATE TABLE` statement allows you to override the server collation and hence override the server character set.

```sql
CREATE [ROWSTORE] TABLE <table_name> (column_name {<column_definition>}) 
COLLATE <collation_name>;

```

**Examples**

In the following example, the `results` table uses `utf8mb4_general_ci` collation overriding the server collation of `utf8_general_ci`.

```sql
SET collation_server = utf8_general_ci;
 
CREATE TABLE results (student_id INT, grade CHAR(5), pass VARCHAR(5)) COLLATE utf8mb4_general_ci;
 
INSERT results VALUES(1, "B+", "👍");
 
SELECT * FROM results;

```

```output

+------------+-------+------+
| student_id | grade | pass |
+------------+-------+------+
|          1 | B+    | 👍   |
+------------+-------+------+
1 row in set (0.04 sec)
```

In the following example, a table introduced within a stored procedure uses `utf8` collation overriding the server collation of `utf8mb4_general_ci`.

```sql
SET collation_server='utf8mb4_general_ci';

 
DELIMITER //
CREATE OR REPLACE PROCEDURE sample(id INT, city VARCHAR(30)) 
AS
/* A variable uses utf8 overriding the server character set*/
DECLARE
	location TEXT = UCASE(city);
/* A table uses utf8 overriding the server character set and collation */
BEGIN
	CREATE TABLE customers(id INT, city VARCHAR (30) COLLATE utf8_general_ci);
	INSERT INTO customers VALUES(id, location);
END //
DELIMITER ;
 
CALL sample(1, "london 🗺");
 
SELECT * FROM customers;

```

```output

+------+---------+
| id   | city    |
+------+---------+
|    1 | LONDON  |
+------+---------+
1 row in set (0.02 sec)
```

To view the character set used by a table, issue the following command.

```sql
SHOW CREATE TABLE <table_name>;

```

The table character set and collation are used as default values for the columns in the table. You can, however, override the table settings at the column level. For more information, see [Column-level Override](https://docs.singlestore.com/#section-idm4632509205440032485684403804.md).

> **❗ Important**: Table character set and collation values cannot be specified using the `ALTER TABLE` statement.

## Column-Level Override

You can override the character set and collation for table columns using the `COLLATE` clause in the column definition of the `CREATE TABLE` statement. For example, you can specify the character set and the collation for `VARCHAR` and `JSON` columns using the following syntax.

```sql
CREATE [ROWSTORE] TABLE <table_name> (id INT, 
<column_name> VARCHAR(<length>) COLLATE <collation_name>, 
<column_name> JSON COLLATE <collation_name>);

```

**Examples**

In the following example, the `comments.response` column uses `utf8mb4_general_ci` overriding the table collation of `utf8_general_ci`.

```sql
CREATE TABLE response (message VARCHAR (20), 
comments JSON COLLATE utf8mb4_general_ci) COLLATE utf8_general_ci;
 
INSERT response VALUES ('Available 👍','{"Good to meet next week":"🙂"}'), 
('Cannot say 😐','{"Prefer to meet this week":"🙁"}'), ('Unavailable 👎','{"Planned time off":"🙁"}');

SELECT * FROM response ORDER BY message;

```

```output

+----------------+-------------------------------------+
| message        | comments                            |
+----------------+-------------------------------------+
| Available      | {"Good to meet next week":"🙂"}      |
| Cannot say     | {"Prefer to meet this week":"🙁"}    |
| Unavailable    | {"Planned time off":"🙁"}            |
+----------------+-------------------------------------+

```

> **❗ Important**: If no collation is specified in the column definition, `JSON` columns use `utf8mb4_bin` by default regardless of the value set for the [collation engine variable](https://docs.singlestore.com/db/v9.1/reference/sql-reference/character-encoding/specifying-character-set-and-collation-for-clusters/#section-idm454211260567043248567042014.md). The default character set is `utf8mb4`.

In the following example, the `Status.delivery` column uses `utf8mb4_general_ci`, overriding the server collation of `utf8_bin`.

```sql
SET collation_server = utf8_bin;

CREATE TABLE delivery (ID INT, Status ENUM('NOT_DISPATCHED','DELIVERED ✅','IN_TRANSIT') COLLATE utf8mb4_general_ci);
 
INSERT INTO delivery VALUES(567,1);

INSERT INTO delivery VALUES(472,'DELIVERED ✅');

INSERT INTO delivery(ID) VALUES(323);

SELECT * FROM delivery ORDER BY ID ASC;

```

```output

+------+----------------+
| ID   | Status         |
+------+----------------+
| 472  | DELIVERED ✅   |
| 323  | NULL           |
| 567  | NOT_DISPATCHED |
+------+----------------+
```

To view the character sets used by columns in a table, run the following command.

```sql
SHOW CREATE TABLE <table_name>;
```

## Modifying the Collation of Columns

In rowstores, you can additionally use the `ALTER TABLE` statement to modify the collation, and in turn, modify the character set of existing columns.

```sql
CREATE ROWSTORE TABLE posts(comments VARCHAR(50) COLLATE utf8_general_ci);
 
ALTER TABLE posts MODIFY comments VARCHAR(50) COLLATE utf8mb4_general_ci;

```

> **❗ Important**: - In columnstores, the collation of existing columns cannot be modified.
> - In rowstores, when the character set of existing columns is modified from `utf8mb4` to `utf8`, the column entries will be truncated to the first string of valid `utf8` Unicode characters.

## Expression-Level Override

The following example uses the collation clause with the cast (:>) operator to control the GROUP BY clause based on case sensitivity.

Collate `utf8_bin` compares strings based solely on their Unicode code point values. Collate `utf8_general_ci` applies Unicode normalization using language-specific rules and compares strings for case-insensitively.

```sql
CREATE TABLE grades (student_id INT, Lastname varchar(25), 
grade CHAR(5)) COLLATE utf8mb4_general_ci;

INSERT grades values(751, "williams", "A"),
("184", "Williams", "B"),
("938", "Browning", "D"),
("260", "Anderson", "C");

SELECT * FROM grades;

```

```output

+--------------------------------+
| student_id | Lastname  | Grade |
+--------------------------------+
| 184        | Williams  | B     |
+--------------------------------+
| 938        | Browning  | D     |
+--------------------------------+
| 260        | Anderson  | C     |
+--------------------------------+
| 751        | williams  | A     |
+--------------------------------+ 

```

When using `utf8_bin` the results will display all last names from the table regardless of capitalization.

```sql
SELECT lastname :> text COLLATE utf8_bin FROM grades GROUP BY 1;


```

```output

+---------------------------------+
| lastname:>text collate utf8_bin |
|+--------------------------------+
| Williams                        |
+---------------------------------+
| Browning                        |
+---------------------------------+
| Anderson                        |
+---------------------------------+
| williams                        |
+---------------------------------+

```

When using `utf8_general_ci` the results will display only uppercase last names from the table.

```sql
SELECT lastname :> text COLLATE utf8_general_ci FROM grades GROUP BY 1;


```

```output

+-----------------------------------------+
| lastname:>text collate utf8_general_ci  |
|+----------------------------------------+
| Williams                                |
+-----------------------------------------+
| Browning                                |
+-----------------------------------------+
| Anderson                                |
+-----------------------------------------+
```

When using expressions with `JSON` columns, use the `:>` operator to specify a collation for different expressions in a query.

```sql
:> text COLLATE <colation_name>

```

The following example uses a binary collation (`utf8_bin`) for the first expression and a case-insensitive collation (`utf8_general_ci`) for the second expression.

```sql
SELECT * FROM sets
WHERE sets.json_field::$x :> text COLLATE utf8_bin = "string1"
AND sets.json_field::$y :> text COLLATE utf8_general_ci = "string2";
```

> **❗ Important**: If the collation is not specified for expressions, the column character set and collation are used, which may rely on the table settings or the server settings.

## Setting Collation for String Literals

You can set the collation for string literals explicitly.

```sql
SELECT "My string" COLLATE utf8mb4_unicode_ci;
```

## Procedural Extension-Level Override

Parameters and variables in [Procedural Extensions](https://docs.singlestore.com/db/v9.1/developer-resources/procedural-extensions.md) (stored procedures and user-defined functions) use the server character set and collation by default. The `COLLATE` clause in the `CREATE PROCEDURE` and the `CREATE FUNCTION` statements allows you to override the server collation, and in turn, override the server character set for parameters and variables.

Use the following syntax to override the collation for parameters and variables within stored procedures.

```sql
CREATE PROCEDURE procedure_name ( [parameter_name data_type COLLATE collation_name], [...n] ) 
...

```

```sql
CREATE PROCEDURE procedure_name ( [parameter_list] ) 
... 
[DECLARE variable_name type_expression COLLATE collation_name [= value_expression ] ; [... ;] [ ...n ] 
...

```

Similarly, you can use the following syntax to override the collation for parameters and variables in user-defined functions.

```sql
CREATE FUNCTION function_name(id INT, product VARCHAR(20) COLLATE utf8_general_ci);
…

```

```sql
CREATE FUNCTION single_parameter_example(a INT) AS
  DECLARE
    num INT = 0;
    str VARCHAR(255) COLLATE utf8mb4_general_ci;
  ...
```

**Examples**

In the following stored procedure, a parameter uses `utf8` overriding the server character set of `utf8mb4`.

```sql
SET collation_server='utf8mb4_general_ci';
 
/* A parameter uses utf8 overriding the server character set */
DELIMITER //
CREATE OR REPLACE PROCEDURE sample(id INT, city VARCHAR(30) COLLATE utf8_general_ci) 
AS
DECLARE
	location TEXT = UCASE(city);
BEGIN
	CREATE TABLE customers(id INT, city VARCHAR(30) COLLATE utf8_general_ci);
	INSERT INTO customers VALUES(id, location);
END //
DELIMITER ;

 
CALL sample(1, "london 🗺");
 
SELECT * FROM customers;

```

```output

+------+--------+
| id   | city   |
+------+--------+
|    1 | LONDON |
+------+--------+
1 row in set (0.02 sec)
```

In the following user-defined function, a variable uses `utf8` overriding the server character set of `utf8mb4`.

```sql
SET collation_server='utf8mb4_general_ci';

DELIMITER //
 
CREATE OR REPLACE FUNCTION getVersion() RETURNS VARCHAR(255) AS
DECLARE
  var1 VARCHAR(255);
  var2 VARCHAR(255);
  var3 VARCHAR(10) COLLATE utf8_general_ci;
  result VARCHAR(255);
BEGIN
  var1 = @@memsql_version;
  var2 = @@version_compile_os;
  var3 = "👾";
  result = CONCAT("OS: ", var2, "; Version: ", var1, " ", var3);
RETURN result;
END //
 
DELIMITER ;
 
SELECT getVersion() AS Info;

```

```output

+----------------------------+
| Info                       |
+----------------------------+
| OS: Linux; Version: 7.5.1  |
+----------------------------+
1 row in set (0.01 sec)
```

> **❗ Important**: The server character set and collation can be overridden for tables and columns introduced within procedural extensions. For more information, see [Table-Level Override](https://docs.singlestore.com/#section-idm4550248527876832485681353744.md) and [Column-Level Override](https://docs.singlestore.com/#section-idm4632509205440032485684403804.md).

## LOAD DATA

The [LOAD DATA](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/load-data.md) statement allows you to specify a character set for the files to be imported. Use the following syntax.

```sql
LOAD DATA [LOCAL] INFILE '<file_name>'> INTO TABLE '<table_name>' CHARACTER SET <character_set_name>;

```

***

Modified at: November 14, 2025

Source: [/db/v9.1/reference/sql-reference/character-encoding/character-set-and-collation-override/](https://docs.singlestore.com/db/v9.1/reference/sql-reference/character-encoding/character-set-and-collation-override/)

(An index of the documentation is available at /llms.txt)
