REGEXP_SUBSTR
Searches a string for a regular expression pattern and returns the matching substring.
REGEXP_SUBSTR
supports the extended and advanced regular expression formats. You can set the global variable regexp_format
to specify which format you want to use.
Syntax
REGEXP_SUBSTR ( source_string, pattern [, position [, occurrence [, parameters ] ] ] )
Arguments
source_string
A string expression, such as a column name, to be searched.
pattern
A string literal representing a regular expression.
position
Optional if occurrence
and parameters
are not specified.
A positive integer indicating the position within source_string
to begin searching. The position is based on the number of characters, not bytes, so that multibyte characters are counted as single characters. The default is 1
. If position
is less than 1
, the search begins at the first character of source_string
. If position
is greater than the number of characters in source_string
, the result is NULL
.
occurrence
Optional if parameters
is not specified.
A positive integer indicating which occurrence of pattern
in source_string
to return. The search for the occurrence begins at position
. If occurrence
is not present, defaults to 1
. If occurrence
is less than 1
or greater than the number of characters in source_string
, the search is ignored and REGEXP_SUBSTR
returns NULL
.
parameters
Optional. If present, has one of the following values:
c
: Perform case-sensitive matching.
i
: Perform case-insensitive matching.
If c
or i
is not specified, performs case-sensitive matching.
Remarks
REGEXP_SUBSTR
returns NULL
if no match is found, source_string
is NULL
, or pattern
is NULL
.
Examples
The following examples demonstrate how REGEXP_SUBSTR
works when the global variable regexp_format
is set to 'extended'
.
Call to | Result | Comments |
---|---|---|
|
| Returns the substring |
|
| Returns the character preceding the first occurrence of “o”, followed by the substring “o”. |
|
| Returns the first digit. |
|
| Returns the first alphabet. |
|
| Looks for |
|
| Returns the substring |
|
| Searches for |
|
| Returns the first occurrence of |
|
| Returns the second occurrence of |
|
| Returns the first occurrence of |
|
| Returns the first occurrence of |
The following examples demonstrate how REGEXP_SUBSTR
works when the global variable regexp_format
is set to 'advanced'
.
Call to | Result | Comments |
---|---|---|
|
| Returns the first digit. |
|
| Returns the first non-digit. |
|
| Returns the first alpha-numeric character. |
|
| Returns the first non-alpha-numeric character. |
|
| Returns the first occurrence of the word |
Using REGEXP_SUBSTR
With a Table
The following example shows how to call REGEXP_SUBSTR
once per table row. This example uses extended regular expression matching. Advanced regular expression matching could be used as well.
SET GLOBAL regexp_format = 'extended'; DROP DATABASE IF EXISTS singlestore_docs_example; CREATE DATABASE singlestore_docs_example; use singlestore_docs_example; CREATE TABLE regexample_table(id INT, text_field TEXT); INSERT INTO regexample_table VALUES (1, 'the'), (2, 'athe'), (3, 'abthe'), (4, 'abcthe'), (5, 'thethe'); SELECT id, text_field, REGEXP_SUBSTR(text_field,'..the') FROM regexample_table ORDER BY id; **** +------+------------+-----------------------------------+ | id | text_field | REGEXP_SUBSTR(text_field,'..the') | +------+------------+-----------------------------------+ | 1 | the | NULL | | 2 | athe | NULL | | 3 | abthe | abthe | | 4 | abcthe | bcthe | | 5 | thethe | hethe | +------+------------+-----------------------------------+