MATCH
On this page
For columnstore tables created with a FULLTEXT index, the text columns in that table can be searched by using the MATCH AGAINST
syntax.
The result of the MATCH
statement is a relevancy score generally greater than or equal to 0 which indicates the quality of the match.
Important
To include recent inserts/updates from the hidden rowstore table in with the results, run OPTIMIZE TABLE tbl_
before running your query.
Refer to Working with Full-Text Search for more conceptual information on this feature.
Syntax
The full-text search version 2 MATCH
syntax is:
MATCH (TABLE <table_name>) AGAINST (<expression>)
The legacy full-text search MATCH
syntax is:
MATCH (<column1>,<column2>,...) AGAINST (<expression>)
The VERSION 2
refers to SingleStore's next generation full-text search process.USING VERSION 2
syntax must be used in the CREATE TABLE
command to utilize the full-text search version 2 process.
The columns specified in a MATCH
clause must be from the same table in the legacy full-text search process.MATCH
clauses to search against multiple tables using the version 2 process.
Note
The legacy full-text search version is deprecated.VERSION 2
for new development of applications that use full-text search.
VERSION 2
does not allow the column names of full-text columns to contain a $
character.$
character.
Operators
Full-text search version 2 supports operators listed on the Java Lucene full-text search string syntax page.
The AGAINST
expression consists of a mix of text with zero or more of the following operators.
Operator |
Description |
---|---|
(no operator) |
When no operator is specified, the word is optional; however, the rows that contain it are rated higher. |
+ |
A leading plus sign indicates that this word must be present in each row returned. |
- |
A leading minus sign indicates that this word must not be present in any of the rows that are returned. |
NOT |
The |
AND |
The |
OR |
The |
() |
Parentheses group words into subexpressions. |
*, ? |
See the Wildcard support section below. |
"" |
A phrase that is enclosed within double quote (") characters matches the words in the quotes as if it is a single word. |
~ |
The tilde symbol is used to support fuzzy searches. |
Note
SingleStore supports only constant expressions (search filters) inside the AGAINST
clause, for example, a regular expression like (Stock*)
or a constant string like "DBC".AGAINST
expressions at runtime.
Stopwords
Certain words are ignored by full-text search due to their commonality resulting in less relevant results.
a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not,
of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with
Wildcard Support
Single and multiple character wildcard searches within single terms are supported (not within phrase queries).
To perform a single character wildcard search use the ?
symbol and to perform a multiple character wildcard search use the *
symbol.
The single character wildcard search looks for terms that match that with the single character replaced.text
or test
you can use the search: te?t
Multiple character wildcard searches looks for zero or more characters.test*
You can also use the wildcard searches in the middle of a term.te*t
Important
Neither ?
or *
are supported at the beginning of a term.
Special Characters
Depending on the use case, FULLTEXT
matching may not always be compatible with match expressions involving special characters.FULLTEXT
removes special characters from the search.
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
Relevancy Score
The relevancy score of an expression in a MATCH
statement denotes the ranking of the expression based on the following factors:
-
Number of times an expression appears in a column.
More occurrences of an expression in the matched column(s) increases its relevancy score. -
Rarity of the expression.
Rare words have a higher relevancy score than commonly used words. -
The length of the column containing the expression.
A column with a short expression has a higher relevancy score than a column with a long expression.
MATCH and BM25 Scoring
Full-text searches using MATCH
calculate scores at the segment level, so MATCH
scores are relative to other documents within the same segment.BM25
scores are relative to other documents within a partition.MATCH
are more efficient than searches using BM25
, but also may return lower-quality results.
For example, when using MATCH
, two rows that contain identical text may receive different scores in the same query if those rows are stored in different segments.BM25
search function uses term and collection statistics from all documents in a partition, ensuring that scores are computed consistently and accurately across the entire set of rows in the partition.
Searching with MATCH
is more efficient.MATCH
when efficiency is important and accuracy of results across segments is less important.
Full-Text Search Version 2 Examples
Create a Table with a Full-Text Version 2 Index
This example creates a FULLTEXT
index for both the title column and the body column.MATCH (TABLE <table_
, and the index on the column will be applied.
CREATE TABLE articles (id INT UNSIGNED,year int UNSIGNED,title VARCHAR(200),body TEXT,SORT KEY (id),FULLTEXT USING VERSION 2 art_ft_index (title, body));
INSERT INTO articles (id, year, title, body) VALUES(1, 2021, 'Introduction to SQL', 'SQL is a standard language for accessing and manipulating databases.'),(2, 2022, 'Advanced SQL Techniques', 'Explore advanced techniques and functions in SQL for better data manipulation.'),(3, 2020, 'Database Optimization', 'Learn about various optimization techniques to improve database performance.'),(4, 2023, 'SQL in Web Development', 'Discover how SQL is used in web development to interact with databases.'),(5, 2019, 'Data Security in SQL', 'An overview of best practices for securing data in SQL databases.'),(6, 2021, 'SQL and Data Analysis', 'Using SQL for effective data analysis and reporting.'),(7, 2022, 'Introduction to Database Design', 'Fundamentals of designing a robust and scalable database.'),(8, 2020, 'SQL Performance Tuning', 'Tips and techniques for tuning SQL queries for better performance.'),(9, 2023, 'Using SQL with Python', 'Integrating SQL with Python for data science and automation tasks.'),(10, 2019, 'NoSQL vs SQL', 'A comparison of NoSQL and SQL databases and their use cases.'),(11, 2020, 'Real-time Data Analysis', 'An introduction to real-time analytics.'),(12, 2021, 'Analysis for Beginners', 'Simple examples of real time analytics.'),(13, 2023, 'Data-Dictionary Design', 'Create and maintain effective data dictionaries.'),(14, 2024, 'Scalable Performance', 'Designing for scalability.');OPTIMIZE TABLE articles FLUSH;
Search for a Term
SELECT *FROM articlesWHERE MATCH (TABLE articles) AGAINST ('body:database');
+----+------+---------------------------------+-----------------------------------------------------------------------------+
| id | year | title | body |
+----+------+---------------------------------+-----------------------------------------------------------------------------+
| 7 | 2022 | Introduction to Database Design | Fundamentals of designing a robust and scalable database. |
| 3 | 2020 | Database Optimization | Learn about various optimization techniques to improve database performance.|
+----+------+---------------------------------+-----------------------------------------------------------------------------+
Use MATCH Twice in One Query
SELECT titleFROM articlesWHERE MATCH (TABLE articles) AGAINST ('title:SQL OR body:("Business Intelligence")')AND MATCH (TABLE articles) AGAINST ('body:web*');
+------------------------+
| title |
+------------------------+
| SQL in Web Development |
+------------------------+
Use Boolean Operators and Wildcards
SELECT titleFROM articlesWHERE MATCH (TABLE articles) AGAINST ('title:(+Data*) OR title:function?');
+---------------------------------+
| title |
+---------------------------------+
| Introduction to Database Design |
| SQL and Data Analysis |
| Database Optimization |
| Data Security in SQL |
| Data-Dictionary Design |
| Real-time Data Analysis |
+---------------------------------+
Use a SQL Predicate
SELECT count(*)FROM articlesWHERE year = 2021AND MATCH (TABLE articles) AGAINST ('body:SQL');
+----------+
| count(*) |
+----------+
| 2 |
+----------+
Search for the Word 'database'
across Two Columns
SELECT *FROM articlesWHERE MATCH (TABLE articles)AGAINST ('title:database OR body:database');
+------+------+---------------------------------+------------------------------------------------------------------------------+
| id | year | title | body |
+------+------+---------------------------------+------------------------------------------------------------------------------+
| 3 | 2020 | Database Optimization | Learn about various optimization techniques to improve database performance. |
| 7 | 2022 | Introduction to Database Design | Fundamentals of designing a robust and scalable database. |
+------+------+---------------------------------+------------------------------------------------------------------------------+
Use a Single Boolean Operator
SELECT titleFROM articlesWHERE MATCH (TABLE articles)AGAINST ('title:Database OR title:"Business Intelligence"');
+---------------------------------+
| title |
+---------------------------------+
| Database Optimization |
| Introduction to Database Design |
+---------------------------------+
Use Multiple Boolean Operators
The following example returns the title of an article that contains either 'Database'
or 'Business Intelligence'
and the string 'real-time analytics'
in the body.
SELECT titleFROM articlesWHERE MATCH (TABLE articles) AGAINST ('(title:Database OR title:"Analysis") AND (body:"real-time analytics")');
+--------------------------+
| title |
+--------------------------+
| Analysis for Beginners |
| Real-time Data Analysis |
+--------------------------+
Use a Wildcard
This example uses the wildcard ’*’
to return all articles with titles starting with the string 'Intro'
such as Introduction.
SELECT id, titleFROM articlesWHERE MATCH(TABLE articles) AGAINST ('title:Intro*');
+------+---------------------------------+
| id | title |
+------+---------------------------------+
| 1 | Introduction to SQL |
| 7 | Introduction to Database Design |
+------+---------------------------------+
Create a Relevance Score as an Output Column
SELECT id, title, MATCH(TABLE articles) AGAINST ('body:database') AS relevanceFROM articlesWHERE MATCH(TABLE articles) AGAINST('body:database');
+------+---------------------------------+---------------------+
| id | title | relevance |
+------+---------------------------------+---------------------+
| 3 | Database Optimization | 0.3346227705478668 |
| 7 | Introduction to Database Design | 0.13076457381248474 |
+------+---------------------------------+---------------------+
Search for Rows with a Relevance Score Greater than a Specific Value
SELECT id, title, MATCH(TABLE articles) AGAINST ('body:database')FROM articlesWHERE MATCH(TABLE articles) AGAINST ('body:database') > .8;
+------+---------------------------------+-------------------------------------------------+
| id | title | MATCH(TABLE articles) AGAINST ('body:database') |
+------+---------------------------------+-------------------------------------------------+
| 7 | Introduction to Database Design | 0.8195944428443909 |
+------+---------------------------------+-------------------------------------------------+
Exact values may differ in this small example depending on the database configuration.
UPDATE or DELETE Queries
The MATCH
command can be used in UPDATE
or DELETE
queries.OPTIMIZE TABLE <table_
after an UPDATE
or DELETE
to make the results of the UPDATE
or DELETE
immediately available.
UPDATE articles_for_update SET title = concat(title,".DATABASE")WHERE MATCH(TABLE articles_for_update) AGAINST ('body:database');
DELETE FROM articles_for_updateWHERE MATCH(TABLE articles_for_update) AGAINST ('body:database');
Full-Text Index over JSON Column
A full-text index can be created over a JSON column in the same manner it can be created over any other text-type column.
CREATE TABLE ft_records (id INT UNSIGNED,title VARCHAR(200),records JSON,SORT KEY(id),FULLTEXT USING VERSION 2 rec_ft_index (title, records));
INSERT INTO ft_records VALUES (1,'document','{"k1": "cucumber","k2": ["dragonfruit", "eggplant"],"k3": [{"k3_1": "fig", "k3_2": "grape"},{"k3_1": ["huckleberry", "iceberg lettuce"]},"jicama"]}');OPTIMIZE TABLE ft_records FLUSH;
Querying over the entire JSON column can be performed in the same way as with any other column part of the full-text index.
SELECT (MATCH (TABLE ft_records) AGAINST ('records:/.*cumber/')) AS cumberFROM ft_records;
+----------+
| cumber |
+----------+
| 1 |
+----------+
To query over a specific JSON keypath, use a query similar to the following:
SELECT (MATCH (TABLE ft_records) AGAINST ('records$k3.k3_1:fig')) AS figFROM ft_records;
+-------------------------+
| fig |
+-------------------------+
| 0.13076457381248474 |
+-------------------------+
The example above shows how you can search for the string fig
at the keypath k3.
in the records
column when using the field grouping syntax.
SELECT idFROM ft_recordsWHERE MATCH (TABLE ft_records) AGAINST ('records:(iceberg lettuce)');
+------+
| id |
+------+
| 1 |
+------+
SELECT idFROM ft_recordsWHERE MATCH (TABLE ft_records) AGAINST ('records:"cucumber dragonfruit"');
Empty set (0.008 sec)
SELECT idFROM ft_recordsWHERE MATCH (TABLE ft_records) AGAINST ('records:(cucumber dragonfruit~100)');
+------+
| id |
+------+
| 1 |
+------+
In the above examples, you can see that the phrase query cucumber dragonfruit
does not match because they belong to different leaf strings.fts2_
.
Full-Text Legacy Version 1 Examples
Create a Table with a Full-Text Index
CREATE TABLE articles (id INT UNSIGNED,year int UNSIGNED,title VARCHAR(200),body TEXT,SORT KEY (id),FULLTEXT USING VERSION 1 (title, body));
The USING VERSION 1
syntax is optional.
Use a SQL Predicate
SELECT count(*)FROM articlesWHERE year = 2017 AND MATCH (body) AGAINST ('memsql');
Search for the Word database
across Two Columns
SELECT * FROM articlesWHERE MATCH (title,body)AGAINST ('database');
Use a Single Operator
SELECT titleFROM articlesWHERE MATCH (title) AGAINST ('Database OR "Business Intelligence"');
Use Multiple Operators
The following example returns the title of an article that contains either database
or Business Intelligence
and the string real-time analytics
in the body.real time
and real-time
will be included.
SELECT titleFROM articlesWHERE MATCH (title) AGAINST ('Database OR "Business Intelligence"')AND MATCH(body) AGAINST ("real-time analytics");
Use a Wildcard
This example uses the wildcard ’*’ to return all articles with titles starting with the word Journal
such as Journalism
, Journalist
, Journals
, and so on.
SELECT *FROM articlesWHERE MATCH(title) AGAINST ('Journal*');
Create a Relevance Score as an Output Column
SELECT id, title, MATCH(body) AGAINST ('database') relevanceFROM articlesWHERE MATCH(body) AGAINST ('database');
Search with a Specific Relevance Score
SELECT id, title, MATCH(body) AGAINST ('database')FROM articlesWHERE MATCH(body) AGAINST ('database') > .12;
UPDATE or DELETE Queries
UPDATE articles set name = concat(name,".DATABASE") where MATCH(body) AGAINST ('database');
DELETE from articles where MATCH(body) AGAINST ('database');
Related Topics
Last modified: October 2, 2024