Warning
SingleStore 9.0 gives you the opportunity to preview, evaluate, and provide feedback on new and upcoming features prior to their general availability. In the interim, SingleStore 8.9 is recommended for production workloads, which can later be upgraded to SingleStore 9.0.
REGEXP_ MATCH()
On this page
The REGEXP_
function returns a JSON array of matching substring(s) within the first match of a regular expression pattern to a string.
REGEXP_
supports the extended and advanced regular expression formats.regexp_
to specify which format you want to use.
Syntax
REGEXP_MATCH (<source_string>, <pattern> [, <parameters> ] ] )
Arguments
-
source_
: The string from which to extract substrings that match a regular expression.string -
pattern
: A string representing a regular expression. -
parameters
:An optional argument, if present, has one of the following values:-
c
: Perform case-sensitive matching. -
i
: Perform case-insensitive matching. -
If
c
ori
is not specified, performs case-sensitive matching.
-
Remarks
-
If no match is found, the
REGEXP_
result will be NULL.MATCH -
If a match is found, and the pattern does not contain parenthesized sub-expressions, the result is a single-element text array containing the substring that matches the entire pattern.
This result is similar to the REGEXP_ SUBSTR() function. -
If a match is found, and the pattern contains parenthesized sub-expressions, the result is a JSON array of strings, whose n'th element is the substring matching the n'th parenthesized sub-expression of the pattern (not counting "non-capturing" parentheses).
See the examples section for more details.
Examples
The following examples demonstrate how REGEXP_
works when the global variable REGEXP_
is set to advanced.
SET GLOBAL REGEXP_FORMAT = 'advanced';
Searches for a group of characters between the alpha and gamma in the substrings.
SELECT REGEXP_MATCH('alphabetagammadelta', 'alpha(.*)gamma');
+-------------------------------------------------------+
| REGEXP_MATCH('alphabetagammadelta', 'alpha(.*)gamma') |
+-------------------------------------------------------+
| ["beta"] |
+-------------------------------------------------------+
Searches for a group of digits, which occurs after the word gamma.
SELECT REGEXP_MATCH('alphabetagamma579XYZ', 'gamma([0-9]+)');
+-------------------------------------------------------+
| REGEXP_MATCH('alphabetagamma579XYZ', 'gamma([0-9]+)') |
+-------------------------------------------------------+
| ["579"] |
+-------------------------------------------------------+
There is no capturing group (i.
SELECT REGEXP_MATCH('alphabetagammadelta', 'alpha.*gamma');
+-----------------------------------------------------+
| REGEXP_MATCH('alphabetagammadelta', 'alpha.*gamma') |
+-----------------------------------------------------+
| ["alphabetagamma"] |
+-----------------------------------------------------+
This example matches data based on the each part of the date (i.
To use \d
expression inside the pattern, the regexp_
SELECT REGEXP_MATCH('04-08-2023', '(\\d{2})-(\\d{2})-(\\d{4})');
+----------------------------------------------------------+
| REGEXP_MATCH('04-08-2023', '(\\d{2})-(\\d{2})-(\\d{4})') |
+----------------------------------------------------------+
| ["04","08","2023"] |
+----------------------------------------------------------+
Returns the first occurrences of a group of lowercase characters followed by groups of digits.
SELECT REGEXP_MATCH('abc357cde123abc', '[a-z]+([0-9]+)');
+---------------------------------------------------+
| REGEXP_MATCH('abc357cde123abc', '[a-z]+([0-9]+)') |
+---------------------------------------------------+
| ["357"] |
+---------------------------------------------------+
Returns NULL since there is no match.
SELECT REGEXP_MATCH('alphabetagammadelta', 'alpha(.*)eplison');
+---------------------------------------------------------+
| REGEXP_MATCH('alphabetagammadelta', 'alpha(.*)eplison') |
+---------------------------------------------------------+
| NULL |
+---------------------------------------------------------+
This is a case-sensitive search.
SELECT REGEXP_MATCH('ALPHAbetagammadelta', 'alpha(.*)gamma', 'c');
+------------------------------------------------------------+
| REGEXP_MATCH('ALPHAbetagammadelta', 'alpha(.*)gamma', 'c') |
+------------------------------------------------------------+
| NULL |
+------------------------------------------------------------+
This is a case-sensitive search.
SELECT REGEXP_MATCH('ALPHAbetagammadelta', 'alpha(.*)gamma', 'i');
+------------------------------------------------------------+
| REGEXP_MATCH('ALPHAbetagammadelta', 'alpha(.*)gamma', 'i') |
+------------------------------------------------------------+
| ["beta"] |
+------------------------------------------------------------+
The first group is non-capturing (?:) but needs to be matched.
SELECT REGEXP_MATCH('barbeta', '(?:ba.)(beta)');
+------------------------------------------+
| REGEXP_MATCH('barbeta', '(?:ba.)(beta)') |
+------------------------------------------+
| ["beta"] |
+------------------------------------------+
The second group is optional if it was not captured.
SELECT REGEXP_MATCH('alphabarbeta', '(bar)(.+)?(beta)');
+--------------------------------------------------+
| REGEXP_MATCH('alphabarbeta', '(bar)(.+)?(beta)') |
+--------------------------------------------------+
| ["bar",null,"beta"] |
+--------------------------------------------------+
The '?='
expression is used for a positive lookahead; it is not a captured group.
SELECT REGEXP_MATCH('barbeta', '(bar)(?=beta)');
+------------------------------------------+
| REGEXP_MATCH('barbeta', '(bar)(?=beta)') |
+------------------------------------------+
| ["bar"] |
+------------------------------------------+
Extracting a Matched Group From a JSON Array in a Table
The REGEXP_
builtin returns a JSON array of matches.JSON_
builtin or shorthand ::$ syntax
.
Examples
Create a table and insert some values.
CREATE TABLE filenames(source_strings text);INSERT INTO filenames VALUES("file_expense_record.pdf"), ("file_24082023.pdf"),("inventory_file_new.pdf.tmp");
This statement extracts the first element from the array (note there is a zero (0) indexing).
SELECT REGEXP_MATCH(source_strings, '^(file.+)\\.pdf$') AS a, JSON_EXTRACT_STRING(a, 0) AS b FROM filenames;
+-------------------------+---------------------+
| a | b |
+-------------------------+---------------------+
| ["file_expense_record"] | file_expense_record |
| NULL | NULL |
| ["file_24082023"] | file_24082023 |
+-------------------------+---------------------+
This statement also extracts the first element but uses the shorthand format.
SELECT REGEXP_MATCH(source_strings, '^(file.+)\\.pdf$') rmatch, rmatch::$`0` FROM filenames;
+-------------------------+---------------------+
| rmatch | rmatch::$`0` |
+-------------------------+---------------------+
| ["file_expense_record"] | file_expense_record |
| NULL | NULL |
| ["file_24082023"] | file_24082023 |
+-------------------------+---------------------+
Last modified: August 24, 2023