SPLIT
Splits a string by the specified separator into an array of substrings and then returns the array.
Syntax
SPLIT (string_to_split [, separator_text]) for binary objects: SPLIT (blob_to_split , separator_blob) returns ARRAY(blob)
Arguments
string_to_split
: any string value or a column name. This argument specifies the string to split by the specified separator text.separator_text
: any string value. This argument specifies the delimiter that separates the string.
Return Type
Array with string values.
Remarks
If the separator is not specified, any amount of white space in the string (blank space, tab, newline, carriage return) serves as the separator.
If the first argument is a binary object, the second argument is required and must be binary.
The
SPLIT()
function returns a non-scalar value (array).For string output, create a UDF (user defined function) to convert the output array into a string.
Example
The following example demonstrates the usage of SPLIT()
function through UDFs. The UDFs capture the result of the SPLIT()
function and convert the array output to a scalar value.
DELIMITER // CREATE OR REPLACE FUNCTION array_as_string(a ARRAY(VARCHAR(255)) NULL) RETURNS VARCHAR(255) AS DECLARE result VARCHAR(255); BEGIN IF isnull(a) THEN result = "NULL"; ELSE result = "Values: ["; FOR i IN 0 .. LENGTH(a) - 1 LOOP IF i < LENGTH(a) - 1 THEN result = CONCAT(result, a[i], ", "); ELSE result = CONCAT(result, a[i], ""); END IF; END LOOP; END IF; RETURN CONCAT(result, "]"); END // -- Regular string split with delimiter (does not trim whitespaces) CREATE OR REPLACE FUNCTION string_split(str VARCHAR(255) NULL, delim VARCHAR(255) NULL) RETURNS varchar(255) AS DECLARE splitArr array(varchar(255)) NULL; result varchar(255); BEGIN splitArr = SPLIT(str, delim); result = array_as_string(splitArr); return result; END// -- Unlimited whitespace delimited splitting (trims multiple whitespaces) CREATE OR REPLACE FUNCTION string_split_ws(str VARCHAR(255) NULL) RETURNS varchar(255) AS DECLARE splitArr array(varchar(255)) NULL; result varchar(255); BEGIN splitArr = SPLIT(str); result = array_as_string(splitArr); RETURN result; END// DELIMITER ;
The SPLIT()
function splits the string str and the result is stored in the splitArr array. The array_as_string()
function then normalizes the result (splitArr) from an array to a scalar value.
Using the UDF string_split
defined above, the following example uses the string “and” as the separator.
SELECT string_split("Thomas and Mary and Stan", "and"); **** +----------------------------------+ | RESULT | +----------------------------------+ | Values: [Thomas , Mary , Stan] | +----------------------------------+ 1 row in set (0.36 sec)
The following example splits the string using the string_split_ws
function and whitespace as delimiters.
SELECT string_split_ws("Thomas and Mary Stan and Clark Aaron and Drew Simon and Darryl" ) as RESULT; **** +-------------------------------------------------------------------------------------+ | RESULT | +-------------------------------------------------------------------------------------+ | Values: [Thomas, and, Mary, Stan, and, Clark, Aaron, and, Drew, Simon, and, Darryl] | +-------------------------------------------------------------------------------------+ 1 row in set (0.39 sec)
The following example shows how to use SPLIT
to extract a list element.
SELECT (SPLIT('1,2,3',','):>ARRAY(text))[1]; **** +--------------------------------------+ | (split('1,2,3',','):>array(text))[1] | +--------------------------------------+ | 2 | +--------------------------------------+
Implicit Collation
When character_set_server
is set to utf8
, string literals with characters using 4-byte encoding are implicitly assigned binary collation and processed as a sequence of bytes rather than characters. This implicit conversion to binary collation causes string functions to return unexpected results. To avoid using implicit binary collation, either use explicit type casting or use database columns defined with the utf8mb4
character set.
For more information, refer to Implicit Collation in Special Cases.