Regular Expression Functions

Using Regular Expression Commands

Suppose you want to find the starting position of a string within another string. Call the former the search string and the latter the target string. To find dog within The dog chased the cat, you would run the query SELECT INSTR('The dog chased the cat','dog');. This query returns 5, indicating the position where it found dog.

But INSTR does not allow search strings that contain placeholders. The following examples show search strings containing placeholders, denoted by the ? symbol.

Search String

Placeholder

The dog chased the ?

Any string

The ? chased the cat

Any three letter string

The dog ? the cat

The string chased or found

Regular expressions enable you to define search strings with placeholders. You specify these strings using patterns. For instance, the pattern The .{3,5} chased the mouse|frog matches many target strings. Two of them are The dog chased the frog and The rat chased the mouse.

Patterns contain literals and metacharacters. Literals are characters that a pattern matcher attempts to locate in the target string exactly as the characters are written in the search string. Metacharacters describe the placeholders that the pattern uses.

The pattern The .{3,5} chased the mouse|frog contains the literals The,chased,the,mouse and frog. It contains the metacharacters .,{3,5} and |.

Regular Expression Functions

SingleStore has five built-in functions that allow you to work with regular expressions.

  • REGEXP: Searches a string for a regular expression pattern and returns 1 if the string is found, 0 otherwise.

  • RLIKE: Has the same functionality as REGEXP.

  • REGEXP_INSTR: Searches a string for a regular expression pattern and returns an integer indicating the beginning position or ending position of the matched substring.

  • REGEXP_SUBSTR: Searches a string for a regular expression pattern and returns the matching substring.

  • REGEXP_REPLACE: Searches a string for the first occurrence of a regular expression pattern and replaces that occurrence with a replacement string. Can also perform a search and replace of all occurrences.

Regular Expression Format - Advanced Regular Expression (ARE)

SingleStore supports advanced regular expressions (ARE) as defined in PostgreSQL. ARE are nearly a superset of ERE and support common Perl regular expression extensions.

You can use the ARE format with the built-in SingleStore functions RLIKE, REGEXP,REGEXP_INSTR,REGEXP_SUBSTR, and REGEXP_REPLACE, which are described in the next section.

Set the global variable regexp_format to specify the regular expression format to be used by the built-in functions you call.

To use ARE globally, run:

SET GLOBAL regexp_format=advanced;

You should set regexp_format to 'advanced' if you are developing new regular expression logic.

Advanced Regular Expression (ARE) Support

SingleStore supports ARE as defined in PostgreSQL. This support is summarized below.

Backreferences

A pattern containing a backreference matches the same string that it matched earlier in the string. The earlier string must be enclosed in parentheses. A pattern may contain multiple backreferences that are denoted by \\n, where n is the number of the backreference.

For example, the pattern a(bc)d(ef)g\\1\\2 matches abcdefgbcef.

Escapes

Escapes make it easier to write some types of patterns. Common escapes are listed below. See section 9.7.3.3 of the PostgreSQL pattern matching documentation for a full explanation of the supported escapes.

Metacharacter

Meaning

\d

Matches any digit.

\s

Matches a space.

\w

Matches any alpha-numeric character including the underscore.

\D

Matches any character except a digit.

\S

Matches any character except a space.

\W

Matches any non-alpha-numeric character. Does not match the underscore.

\m

Matches the beginning of a word.

\M

Matches the end of a word.

Note

In PostgreSQL, escapes in patterns begin with one \ character.

In SingleStore, escapes in patterns also begin with one \ character. Additionally, a \ literal in an SingleStore string needs to escaped with a leading \. Hence, pattern strings containing escapes use two \ characters. For example, SELECT REGEXP_INSTR('dog58','\\d');

In this section

Last modified: July 29, 2024

Was this article helpful?