SPLIT

Splits a string by the specified separator into an array of substrings and then returns the array.

Syntax

SPLIT (string_to_split [, separator_text])
for binary objects:
SPLIT (blob_to_split , separator_blob) returns ARRAY(blob)

Arguments

  • string_to_split: any string value or a column name. This argument specifies the string to split by the specified separator text.

  • separator_text: any string value. This argument specifies the delimiter that separates the string.

Return Type

Array with string values.

Remarks

  • If the separator is not specified, any amount of white space in the string (blank space, tab, newline, carriage return) serves as the separator.

  • If the first argument is a binary object, the second argument is required and must be binary.

  • The SPLIT() function returns a non-scalar value (array).

  • For string output, create a UDF (user defined function) to convert the output array into a string.

Example

The following example demonstrates the usage of SPLIT() function through UDFs. The UDFs capture the result of the SPLIT() function and convert the array output to a scalar value.

DELIMITER //
CREATE OR REPLACE FUNCTION array_as_string(a ARRAY(VARCHAR(255)) NULL)
RETURNS VARCHAR(255) AS
DECLARE
result VARCHAR(255);
BEGIN
IF isnull(a) THEN
result = "NULL";
ELSE
result = "Values: [";
FOR i IN 0 .. LENGTH(a) - 1 LOOP
IF i < LENGTH(a) - 1 THEN
result = CONCAT(result, a[i], ", ");
ELSE
result = CONCAT(result, a[i], "");
END IF;
END LOOP;
END IF;
RETURN CONCAT(result, "]");
END //
-- Regular string split with delimiter (does not trim whitespaces)
CREATE OR REPLACE FUNCTION string_split(str VARCHAR(255) NULL, delim VARCHAR(255) NULL)
RETURNS varchar(255) AS
DECLARE splitArr array(varchar(255)) NULL;
result varchar(255);
BEGIN
splitArr = SPLIT(str, delim);
result = array_as_string(splitArr);
return result;
END//
-- Unlimited whitespace delimited splitting (trims multiple whitespaces)
CREATE OR REPLACE FUNCTION string_split_ws(str VARCHAR(255) NULL)
RETURNS varchar(255) AS
DECLARE
splitArr array(varchar(255)) NULL;
result varchar(255);
BEGIN
splitArr = SPLIT(str);
result = array_as_string(splitArr);
RETURN result;
END//
DELIMITER ;

The SPLIT() function splits the string str and the result is stored in the splitArr array. The array_as_string() function then normalizes the result (splitArr) from an array to a scalar value.

Using the UDF string_split defined above, the following example uses the string and as the separator.

SELECT string_split("Thomas and Mary and Stan", "and");
+----------------------------------+
| RESULT                           |
+----------------------------------+
| Values: [Thomas ,  Mary ,  Stan] |
+----------------------------------+
1 row in set (0.36 sec)

The following example splits the string using the string_split_ws function and whitespace as delimiters.

SELECT string_split_ws("Thomas and Mary
Stan and Clark
Aaron and Drew
Simon and Darryl" ) as RESULT;
+-------------------------------------------------------------------------------------+
| RESULT                                                                              |
+-------------------------------------------------------------------------------------+
| Values: [Thomas, and, Mary, Stan, and, Clark, Aaron, and, Drew, Simon, and, Darryl] |
+-------------------------------------------------------------------------------------+
1 row in set (0.39 sec)

The following example shows how to use SPLIT to extract a list element.

SELECT (SPLIT('1,2,3',','):>ARRAY(text))[1];
+--------------------------------------+
| (split('1,2,3',','):>array(text))[1] |
+--------------------------------------+
| 2                                    |
+--------------------------------------+

Caution

Implicit Collation

When character_set_server is set to utf8, string literals with characters using 4-byte encoding are implicitly assigned binary collation and processed as a sequence of bytes rather than characters. This implicit conversion to binary collation causes string functions to return unexpected results. To avoid using implicit binary collation, either use explicit type casting or use database columns defined with the utf8mb4 character set.

Last modified: February 27, 2023

Was this article helpful?