REGEXP_INSTR

Searches a string for a regular expression pattern and returns an integer indicating the beginning position or ending position of the matched substring.

REGEXP_INSTR supports the extended and advanced regular expression formats. You can set the global variable regexp_format to specify which format you want to use.

Syntax

REGEXP_INSTR (<source_string>, <pattern> [, <position> [, <occurrence> [, <option>
 [, <parameters> ] ] ] ] )

Arguments

  • source_string: A string expression, such as a column name, to be searched.

  • pattern: A string literal representing a regular expression.

  • position: Optional if occurrence,option and parameters are not specified.

  • A positive integer indicating the position within source_string to begin searching. The position is based on the number of characters, not bytes, so that multibyte characters are counted as single characters. The default is 1. If position is less than 1, the search begins at the first character of source_string. If position is greater than the number of characters in source_string, the result is 0.

  • occurrence: Optional if option and parameters are not specified. This is how to specify the "nth occurrence" of a string.

    • A positive integer indicating which occurrence of pattern in source_string to return. The search for the occurrence begins at position.

    • If occurrence is not present, defaults to 1.

    • If occurrence is less than 1 or greater than the number of characters in source_string, the search is ignored and REGEXP_INSTR returns 0.

  • option: Optional if parameters are not specified. If not present, defaults to 0. If present, has one of the following values.

    • 0: Returns the position of the first character of the match.

    • 1: Returns the position of the first character following the end of the match.

    • Any other value: Has the same behavior when the value is 1.

  • parameters: Optional. If present, has one of the following values.

    • c: Perform case-sensitive matching.

    • i: Perform case-insensitive matching.

    • If c or i is not specified, performs case-sensitive matching.

Remarks

  • If no match is found, then REGEXP_INSTR returns 0. If the source_string is NULL or the pattern is NULL, then REGEXP_INSTR returns NULL.

Examples

The following examples demonstrate how REGEXP_INSTR works when the global variable regexp_format is set to 'extended'.

Call to REGEXP_INSTR

Result

Comments

SELECT REGEXP_INSTR('adog','o');

3

Returns the first position of o.

SELECT REGEXP_INSTR('adog','.o');

2

Returns the starting position of the first occurrence of a character followed by o.

SELECT REGEXP_INSTR('dog58','[[:digit:]]');

4

Returns the position of the first digit.

SELECT REGEXP_INSTR('1dogbat','[[:alpha:]]');

2

Returns the position of the first alpha character.

SELECT REGEXP_INSTR('rat','b|c');

0

Looks for b or c. Since neither character is found, returns 0.

SELECT REGEXP_INSTR('mousecat','cat$');

6

Returns the starting position of cat at the end of the string.

SELECT REGEXP_INSTR('2dogcat','^dog');

0

Looks for dog at the beginning of the string. Since dog is not found at the beginning of the string, returns 0.

SELECT REGEXP_INSTR('catcat','ca.',1);

1

Returns the starting position of the first occurrence of ca followed by any character.

SELECT REGEXP_INSTR('catcat','cat',1,2);

4

Returns the starting position of the second occurrence of cat.

SELECT REGEXP_INSTR('catcat','cat',1,2,0);

4

Returns the starting position of the second occurrence of cat.

SELECT REGEXP_INSTR('catcat','cat',1,2,1);

7

Returns the ending position + 1 of the second occurrence of cat.

SELECT REGEXP_INSTR('bat3BaT','BaT',1,1,0,'c');

5

Returns the starting position of the first occurrence of BaT.

SELECT REGEXP_INSTR('bat3BaT','BaT',1,1,0,'i');

1

Returns the starting position of the first occurrence of BaT. Does a case-insensitive search.

The following examples demonstrate how REGEXP_INSTR works when the global variable regexp_format is set to 'advanced'.

Call to REGEXP_INSTR

Result

Comments

SELECT REGEXP_INSTR('dog58','\\d');

4

Returns the position of the first digit.

SELECT REGEXP_INSTR('dog58','\\D');

1

Returns the position of the first non-digit.

SELECT REGEXP_INSTR('%26dog','\\w');

2

Returns the position of the first alpha-numeric character.

SELECT REGEXP_INSTR('%26dog','\\W');

1

Returns the position of the first non-alpha-numeric character.

SELECT REGEXP_INSTR('cat dog','\\mdog');

5

Returns the starting position of the first occurrence of the word dog.

Using REGEXP_INSTR With a Table

The following example shows how to call REGEXP_INSTR once per table row. This example uses extended regular expression matching. Advanced regular expression matching could be used as well.

SET GLOBAL regexp_format = 'extended';
DROP DATABASE IF EXISTS singlestore_docs_example;
CREATE DATABASE singlestore_docs_example;
use singlestore_docs_example;
CREATE TABLE example_table(id INT, text_field TEXT);
INSERT INTO example_table VALUES (1, 'the'), (2, 'athe'), (3, 'abthe'),
(4, 'abcthe'), (5, 'thethe');
SELECT id, text_field, REGEXP_INSTR(text_field,'..the') FROM example_table
ORDER BY id;
+------+------------+----------------------------------+
| id   | text_field | REGEXP_INSTR(text_field,'..the') |
+------+------------+----------------------------------+
|    1 | the        |                                0 |
|    2 | athe       |                                0 |
|    3 | abthe      |                                1 |
|    4 | abcthe     |                                2 |
|    5 | thethe     |                                2 |
+------+------------+----------------------------------+

And this example shows how to find the beginning of the 2nd occurrence of "the" in a row in the same table:

SELECT id, text_field, REGEXP_INSTR(text_field,'the',1,2) FROM example_table
ORDER BY id;
+------+------------+------------------------------------+
| id   | text_field | REGEXP_INSTR(text_field,'the',1,2) |
+------+------------+------------------------------------+
|    1 | the        |                                  0 |
|    2 | athe       |                                  0 |
|    3 | abthe      |                                  0 |
|    4 | abcthe     |                                  0 |
|    5 | thethe     |                                  4 |
+------+------------+------------------------------------+

Last modified: December 6, 2023

Was this article helpful?