# Working with Full-Text Search

Full-text search allows you to search for words or phrases in a large body of text efficiently. The search can be exact or “fuzzy” and performed over the text types `JSON`, `CHAR`, `VARCHAR`, `TEXT`, and `LONGTEXT`.

## Overview

SingleStore provides full-text search compatible with Apache's Java Lucene ([Apache Lucene Core](https://lucene.apache.org/core/)) and which uses [BM25](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/bm-25.md) scoring.

Below are select examples of this `VERSION 2` full-text search. These examples are intended to provide a high-level introduction to SingleStore's full-text search. Details and additional examples are provided in the following sections.

The following `articles` table is used for the examples.

```sql
CREATE TABLE articles (
    id INT UNSIGNED,
    year int UNSIGNED,
    title VARCHAR(200),
    body TEXT,
    SORT KEY (id),    
    FULLTEXT USING VERSION 2 art_ft_index (title, body));
```

This example searches for articles containing the term `Database` in the `title` or the phrase `"Business Intelligence"` in the `body` of the article:

```sql
SELECT title, body
FROM articles
WHERE MATCH (TABLE articles) AGAINST ('title:Database OR body:("Business Intelligence")');
```

A proximity search can be used to find articles that contain the phrase `"SQL databases"` in the `body` of the article, with the specification that `SQL` and `databases` must appear within 5 words of each other.

```sql
SELECT title, body
FROM articles
WHERE MATCH (TABLE articles) AGAINST ('body:"SQL databases"~5');

```

A fuzzy search can be used to find articles with words in the `body` of the article that are within an edit distance of 2 of the term `dtabase`, which in the following example is deliberately misspelled.

```sql
SELECT title, body
FROM articles
WHERE MATCH (TABLE articles) AGAINST ('body:dtabase~2');
```

Regular expressions can be used in full-text searches. The following query uses a regular expression for articles with words that begin with `data` in the `body` of the article.

```sql
SELECT title, body
FROM articles
WHERE MATCH (TABLE articles) AGAINST ('body:/data.*/');
```

Finally, full-text search can also be used on JSON columns; refer to [Create a Version 2 Full-Text Index over JSON](https://docs.singlestore.com/#section-idm4593657285028834222863631365.md) examples.

The section [Related Topics](https://docs.singlestore.com/#section-idm4542112525337632485207490414.md) contains additional resources on full-text search including using full-text search in hybrid search and configuring full-text indexes for high performance.

## Version 2 SingleStore Process

SingleStore's `VERSION 2` full-text search system uses a JLucene service. The JLucene service is a Java process that provides an interface for the SingleStore engine. This interface allows the engine to perform full-text searches and create full-text search indexes for future use. The JLucene service uses software from the Apache Lucene project. The SingleStore engine and the JLucene service run on the same machine. Communication between the two processes occurs through domain sockets and shared memory. Typically, communication occurs on a per-segment basis.

The default analyzer for `VERSION 2` is the StandardAnalyzer from Apache Lucene configured to use the StandardTokenizer from Apache Lucene, a lower-case filter, and a set of [stopwords](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/match/#UUID-3ba95e6d-0a5b-7e67-43b4-fbf42b975d93.md). The StandardTokenizer is a grammar-based tokenizer which uses the word break rules from the Unicode Text Segmentation, as specified in [Unicode Standard Annex #29](https://unicode.org/reports/tr29/).

![](https://images.contentstack.io/v3/assets/bltac01ee6daa3a1e14/blt1935f451f5e7ad46/6a318be4a5457e0008764768/full-text-search-diagram_staticA-1120x333-light_-dH0xji.png)

## Create, Add, and Drop Full-Text Indexes

SingleStore recommends using `VERSION 2` full-text search for new development. SingleStore's [Legacy (Version 1)](https://docs.singlestore.com/#section-idm234388647864144.md) full-text search has been deprecated.

## Create a Version 2 Full-Text Index

Create a version 2 full-text index with a [CREATE TABLE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-definition-language-ddl/create-table.md) statement using the `FULLTEXT USING VERSION 2` index type. The `USING VERSION 2` syntax must be used in the `CREATE TABLE` command to utilize the `VERSION 2` process.

```sql
CREATE TABLE <table_name> ( 
    <column_definitions>, 
    FULLTEXT USING VERSION 2 [<fts_index_name>] 
    (<fts_col1>,..., <fts_coln>)
    [INDEX_OPTIONS '{...}']
);

```

Add a version 2 full-text index to a table [ALTER TABLE ADD FULLTEXT](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-definition-language-ddl/alter-table.md) statement.

```sql
ALTER TABLE <table_name> 
ADD FULLTEXT USING VERSION 2 [fts_index_name] 
(<fts_col1>,..., <fts_coln>)
[INDEX_OPTIONS '{...}']
;
```

To add a column to a full-text index, drop and recreate the index.

## Drop a Full-Text Index

Drop a full-text index using a [DROP INDEX](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-definition-language-ddl/drop-index.md) or an [ALTER TABLE DROP INDEX](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-definition-language-ddl/alter-table.md) statement.

```sql
DROP INDEX <fts_index_name> | <index_key_name> ON <table_name>;
```

```sql
ALTER TABLE <table_name> DROP INDEX <fts_index_name> | <index_key_name>;
```

If an index name was not designated when creating the table, the full-text index `index_key_name` must be used when dropping the index. The full-text index `index_key_name` is displayed when the `SHOW INDEXES FROM <table_name>` command is executed.

## Remarks

The following apply to `VERSION 2`:

* Full-text indexes are only supported on columnstore tables ([How the Columnstore Works](https://docs.singlestore.com/db/v9.1/create-a-database/columnstore/how-the-columnstore-works.md)).
* Only one full-text index is supported per table.

  * That one index can index multiple columns.
  * The different columns can have different tokenizers, stemming, and analyzers.
* Full-text indexes are not supported on Views.
* During indexing, column values are split into tokens, which are turned into indexed terms.  The maximum length of an indexed term is 32766 bytes.

  * If a column value is longer than that limit, you will see a message similar to: `Forwarding Error (<node>): Leaf Error (<node>): Document contains at least one immense term in field=<field> (whose UTF8 encoding is longer than the max length 32766), all of which were skipped...`
* A `MATCH … AGAINST` clause may not refer to a CTE ([WITH (Common Table Expressions)](https://docs.singlestore.com/db/v9.1/query-data/advanced-query-topics/with-common-table-expressions.md)) because the CTE produces a dynamic table which does not have a full-text index. Similarly, `MATCH … AGAINST` clauses may not refer to derived tables.

  * These restrictions apply to `MATCH`,  `BM25`, and `BM25_GLOBAL`.
* New inserts and updates into columnstore tables may initially be stored in a hidden rowstore table before being flushed to a segment file. The affected segment is re-indexed when the background flusher runs.

  * In that case, the full-text index in the columnstore will be updated asynchronously for new inserts and updates. Inserts and updates from this rowstore table can be force-pushed to the columnstore table by using the `OPTIMIZE TABLE <table_name> FLUSH` command.
* Since an index is created for each segment file, the distribution of words within the segment may affect the score of full-text queries, especially when the segments have very few rows and the columns have very few words.

## Upgrade to Full-Text Version 2

A table can be upgraded from legacy full-text search to `VERSION 2` full-text search. To do so:

* Drop the existing full-text index using the `DROP INDEX` command.
* Use the `ALTER TABLE` command with the `FULLTEXT USING VERSION 2` argument to create a `VERSION 2` full-text index.

Once you have upgraded your table to use a `VERSION 2` full-text index, you will also need to change your queries to use the `VERSION 2` query syntax.

## Version 2 Installation for Self-Managed Users

Full-text version 2 requires  Java 25 or later to be installed on all nodes. Refer to <https://www.oracle.com/java/technologies/downloads/> for more information on how to install Java.

On each node, the following engine variables must be set. These are not sync variables and must be set individually on each node.

* `fts2_java_home`: Specifies the path to the top-level directory of the Java installation which is stored in the `$JAVA_HOME` environment variable.
* `java_path`: Specifies the path to the Java binary. Java Runtime Environment 25+ (JRE 25+) is required.

> **📝 Note**: As of version 9.1, the variable `fts2_java_path` is deprecated and is superseded by `java_path`.As of version 9.1, JRE25+ is required.

For example, assuming all of the hosts have the same Java paths as reflected below, run the following commands to set the `fts2_java_home` and  `java_path` engine variables to the same Java path values on all nodes in the cluster:

```
sdb-admin update-config --all --set-global --key "fts2_java_home" --value "/usr/local/jdk-25"
sdb-admin update-config --all --set-global --key "java_path" --value "/usr/bin/java"
```

If the locations for the Java home directory and Java executable differ from those in the preceding commands, change the path names in the `--value` arguments to match your environment.

To restart your cluster with the updated configurations set, run the following command:

```
sdb-admin restart-node --all --yes
```

Refer to [sdb-admin update-config](https://docs.singlestore.com/db/v9.1/reference/singlestore-tools-reference/sdb-admin-commands/update-config.md) for more information about updating nodes in your cluster.

## Query Full-Text Indexes

A full-text index search matches a search term or terms to content in a table that has been full-text indexed.

The following query which uses the `articles` table from the [Overview](https://docs.singlestore.com/#section-idm234905408733053.md) finds all articles with the word `Database` in the `title`. The `title` column has a full-text index on it.

```sql
SELECT title, body
FROM articles
WHERE MATCH (TABLE articles) AGAINST ('title:Database');
```

Terms can be single terms or phrases, can be modified with wildcards or boosted, can be combined with boolean operators, and more as described in the following sections.

The [MATCH](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/match.md),  [BM25](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/bm-25.md), or [BM25\_GLOBAL](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/bm-25.md) functions can be used to search full-text indexed content. These functions provide different tradeoffs between efficiency and accuracy with `MATCH` being the most efficient. SingleStore recommends using `MATCH` for most applications. Refer to [MATCH and BM25 Scoring](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/match/#section-idm23455788535454.md) and [Comparison of BM25 and BM25\_GLOBAL](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/bm-25/#section-idm234863501492724.md) for more details.

SingleStore supports [custom analyzers](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/full-text-version-2-custom-analyzers.md) for full-text `VERSION 2` search. Users can customize full-text search by:

* Using built-in analyzers for a variety of languages. The built-in analyzers can be customized with custom stop-word lists.
* Using custom analyzers in which a user can specify a tokenizer, optional token and character filters, and an optional stop-word list.

[Custom column mappings](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/full-text-version-2-custom-analyzers/#section-idm234993094025062.md) can be used to index different columns in a table with different analyzers. For example, you can index columns with text in different languages with analyzers specific to those languages. Refer to [Example 13 Custom Column Mappings](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/full-text-version-2-custom-analyzers/#section-idm234993108252177.md).

JSON columns can be searched using full-text search as shown in [Example 7: Score Over JSON](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/bm-25/#section-idm234557974454307.md). In addition, custom column mappings can be used to index fields in a JSON document with different analyzers. Refer to [Example 13b JSON and BSON Keypath Analyzers](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/full-text-version-2-custom-analyzers/#section-idm234993129355599.md).

## Remarks

* Each `MATCH`, `BM25`, or `BM25_GLOBAL` clause applies to only one table.
* To search against multiple tables, specify multiple `MATCH`, `BM25`, or `BM25_GLOBAL` clauses.

## Terms in Full-Text Searches

There are two types of search terms: single terms and phrases. A single term is a single word such as *test* or *hello* which does not require quotes. A phrase is a group of words surrounded by double quotes such as "hello SingleStore". Multiple search terms can be combined with Boolean operators to form more complex queries.

Search terms can be modified to provide a wide range of search options as described below.

## Wildcard Support

Single and multiple character wildcard searches within single terms are supported, but not within search phrases.

* Use the `?` symbol to perform a single character wildcard search.
* Use the `*` symbol to perform a multi character wildcard search.

> **❗ Important**: Neither `?` or `*` are supported at the beginning of a term. For example, searching for *?ello* or *\*ello* will generate an error.Refer to [Example 12](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/full-text-version-2-custom-analyzers/#section-idm235055586170482.md) for an example of using a `n_gram` tokenizer to emulate using a wildcard at the start and end of a string and search for a substring.

A single character wildcard search matches words based on a single character. For example, to search for “text” or “test”, use the search term: `te?t`.

A multiple character wildcard search matches words based on zero or more characters. For example, to search for “test”, “tests”, or “tester”, use the search term `test*`. Wildcard searches in the middle of a search term can also be used, such as `te*t`.

The following examples demonstrate how these wildcard searches appear in sample queries:

```sql
SELECT * FROM wilsearch1 WHERE MATCH (TABLE wilsearch1) AGAINST ('col1:te?t');
```

```sql
SELECT * FROM wilsearch1 WHERE MATCH (TABLE wilsearch1) AGAINST ('col1:te*t');
```

## Boosting a Term

Boosting a search term means increasing the relevance or importance of that search term in the search results. To boost a search term, use the caret ("^") symbol with a boost factor (a number) at the end of the search term. The higher the boost factor, the more relevant the search term will be.

For example, if you are searching for `Single Store` and you want the term `Store` to be more relevant, boost it using the ^ symbol along with the boost factor next to the term, such as you could type `Single Store^4`. This will make rows with the term Store appear more relevant.

You can also boost phrase as in the example: `"Single Store"^4 "MySQL"`.

> **📝 Note**: The boost factor is 1 by default. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2), which leads to the term or phrase have less relevance.

The following examples show the use of boosting:

```sql
SELECT * FROM bstsearch1 WHERE MATCH (TABLE bstsearch1) AGAINST ('col1:"Single Store"^4 "MySQL"');
```

```sql
SELECT * FROM bstsearch1 WHERE MATCH (TABLE bstsearch1) AGAINST ('col1:SingleStore MySQL^0.02');
```

## Operators in Full-Text Search

Full-text version 2 supports operators listed on the [Java Lucene full-text search string syntax](https://lucene.apache.org/core/10_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package.description) page.

## Grouping Terms in a Query

## Grouping Single Terms

Parentheses can be used to group terms to form subqueries. This can be useful for controlling the boolean logic for a query.

For example, to search for either "Single" or "Store" and "MemSQL", use the query `(Single OR Store) AND MemSQL`. This ensures that "MemSQL" exists with either the "Single" or "Store" search terms.

## Grouping Multiple Terms into a Single Field

Parentheses can be used to group multiple clauses into a single field. The following query can be used to search for "SingleStore" in the `col1` full-text column or for "MemSQL" in the `col2` full-text column.

```sql
SELECT * FROM grpsearch1 WHERE MATCH (TABLE grpsearch1) AGAINST ('col1:SingleStore OR col2:MemSQL');
```

## Escaping Special Characters

Special characters that are part of the query syntax must be escaped to directly match them. These special characters are:

```sql
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
```

Use the `\` character before the special character to escape it. For example, to search for `(1+1):2` ,use the query` \(1\+1\)\:2`.

## Regular Expression Searches

Regular expression searches that match a pattern between forward slashes ("/") are supported. Regular expression grammar follows [Apache Lucene RegExp](https://lucene.apache.org/core/10_1_0/core/org/apache/lucene/util/automaton/RegExp.html) grammar.

For example, to find rows containing "moat" or "boat", use `/[mb]oat/`:

```sql
SELECT * FROM rexsearch1 WHERE MATCH (TABLE rexsearch1) AGAINST ('col1:/[mb]oat/'); 
```

The following example searches for dates with the format of month/day/year where the month and day can have one or two digits and year can have two or four digits.

```sql
SELECT * FROM rexsearch1 WHERE MATCH (TABLE rexsearch1) AGAINST ('col1:/\d{1,2}\/\d{1,2}\/\d{2,4}/');
```

## Fuzzy Searches

Fuzzy searches based on the [Damerau-Levenshtein Distance](https://en.wikipedia.org/wiki/Damerau-Levenshtein_distance) are supported. To perform a fuzzy search, use the tilde ("\~") symbol at the end of a single term. For example, to search for a term similar in spelling to "roam" use the fuzzy search: `roam~`, which will find terms like "foam" and "roams".

An *edit* is a change made to a search term to transform it into another string. An optional “edit” parameter can be used after the tilde to specify the maximum number of edits permitted. This value can be 0, 1, or 2. If not specified, the default value is 2 edits.

> **📝 Note**: Only integer (non-fractional) values are permitted.

The following is an example of a fuzzy search of the term "roam" without the optional parameter:

```sql
SELECT * FROM fuzsearch1 WHERE MATCH (TABLE fuzsearch1) AGAINST ('col1:roam~');
```

The following is an example of a fuzzy search of the term "roam" with the optional parameter:

```sql
SELECT * FROM fuzsearch1 WHERE MATCH (TABLE fuzsearch1) AGAINST ('col1:roam~1');
```

## Fuzzy Search OPTIONS Clause

Fuzzy searches can be augmented by using an optional `OPTIONS` clause within the `MATCH` syntax. The syntax for adding options is as follows:

```sql
MATCH (TABLE <table_name>) AGAINST (<col_name$key.path:search_term> OPTIONS <json_options>);
```

The set of supported options are:

* `fuzzy_prefix_length`: This is the number of characters at the start of a search term that must be identical (not fuzzy) to the query term if the query is to match the search term. The default value is `0`.
* `fuzzy_max_expansions` : This is the maximum number of terms to match against. The default value is `50`.
* `fuzzy_transpositions`: This allows transpositions with adjacent characters. This can capture some errors more efficiently, particularly mistyped adjacent characters. When transposition is allowed, matching against "ca\~" will be returned "ac" also. The default value is `TRUE`.

All of the options are case-sensitive, optional, and can be combined as required. For example:

```sql
SELECT * FROM articles WHERE MATCH (TABLE articles) AGAINST ('body:roam~' OPTIONS '{"fuzzy_prefix_length": 2}');
```

```sql
SELECT * FROM articles 
    WHERE MATCH (TABLE articles) AGAINST ('body:roam~' OPTIONS '{"fuzzy_prefix_length": 1, "fuzzy_transpositions": false}');
```

## Proximity Searches

Finding words within a specified distance away is supported. To perform a proximity search, use the tilde ("\~") symbol at the end of a search phrase. For example, to search for a "Single" and "Store" within 10 words of each other in a row, use the search `"Single Store"~10`:

```sql
SELECT * FROM prxsearch1 WHERE MATCH (TABLE prxsearch1) AGAINST ('col1:"Single Store"~10');
```

## Range Searches

Range searches match rows where the column(s) values are between the lower and upper bound specified by the query. Range searches can be inclusive or exclusive of the upper and lower bounds. Sorting is done lexicographically or by value depending on whether the search is over string or numeric values.

Square brackets ("\[]") indicate an inclusive search, curly brackets ("{}") indicate an exclusive search. A range clause of \[123 TO 345] implies a numeric range search, a range clause of \[abc TO def] implies a lexicographic range search, and \[$123$ TO $345$] forces a lexicographic range despite the numeric range values. Specific examples are provided below.

An inclusive range search matches all rows where the values are between the search terms, including values equal to the search terms. Inclusive range queries are denoted by square brackets (“\[ ]”). For example, the following query searches for rows where the numeric value of `col1` is 2008 through 2020:

```sql
SELECT * FROM rngsearch1 
WHERE MATCH (TABLE rngsearch1) AGAINST ('col1:[2008 TO 2020]');
```

An exclusive range search matches all rows where the values are between the search terms, but are not equal to the search terms. Exclusive range queries are denoted by curly brackets (“{ }”). For example, the following query searches for rows where the numeric value of `col1` is between 2008 and 2020, but is not equal to 2008 or 2020:

```sql
SELECT * FROM rngsearch1 
WHERE MATCH (TABLE rngsearch1) AGAINST ('col1:{2008 TO 2020}');
```

Range searches may be lexicographic or numeric. Typically searches over strings are expected to order lexicographically while searches over numeric values are expected to order by numeric value. The type of range search to use (lexicographic or numeric) is determined by the values entered in the range clause in the query. If both the upper and lower bounds of the range are numeric, then a numeric range search will be performed. If either value is not numeric, then a lexicographic query will be used.

The following query will use a numeric search as both the values 2008 and 2020 are numeric.

```sql
SELECT * FROM rngsearch1 
WHERE MATCH (TABLE rngsearch1) AGAINST ('col1:{2008 TO 2020}');
```

The following query will use a lexicographic search as both of the range values are strings that do not contain numeric values. Note that this query is inclusive of A, but exclusive of B.

```sql
SELECT * FROM rngsearch1 
WHERE MATCH (TABLE rngsearch1) AGAINST ('col1:[A TO B}');
```

This query will use a numeric search as both the values "2008" and "2020" can be parsed numerically.

```sql
SELECT * FROM rngsearch1 
WHERE MATCH (TABLE rngsearch1) AGAINST ('col1:{"2008" TO "2020"}');
```

It is possible to force a lexicographic search on a numeric-style range clause. To do so, surround the numeric upper and lower bounds with the '$' character. The following query will use a lexicographic search.

```sql
SELECT * FROM rngsearch1 
WHERE MATCH (TABLE rngsearch1) AGAINST ('col1:[$2008$ TO $2020$]')
```

## ORDER BY ... LIMIT Optimization

Full-text search queries can take advantage of an optimization that pushes down the value of `LIMIT` in an `ORDER BY … LIMIT` query into the full-text index search. This optimization will reduce the number of results returned from the full-text index search which can increase query performance.

For this optimization to work, the query must follow these rules:

* The `ORDER BY` clause must use `DESC` sort order.
* There must be only one full-text search function (`MATCH`, `BM25`, or `BM25_GLOBAL`) in the query.

Refer to [Example ORDER BY … LIMIT](https://docs.singlestore.com/#section-idm234621702469151.md) and [Full-text ORDER BY LIMIT Optimization](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/#configuring-full-text-and-vector-indexes.md) for examples.

## Relevancy Score

The relevancy score of an expression in a `MATCH` statement denotes the ranking of the expression based on the following factors:

* Number of times an expression appears in a column. More occurrences of an expression in the matched column(s) increases its relevancy score.
* Rarity of the expression. Rare words have a higher relevancy score than commonly used words.
* The length of the column containing the expression. A column with a short expression has a higher relevancy score than a column with a long expression.

## BM25 Scoring

Full-text search version 2 supports [BM25](https://en.wikipedia.org/wiki/Okapi_BM25) scoring. Refer to [BM25](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/bm-25.md) for more information.

## Configurations

A set of global variables is available to configure full-text search version 2. Refer to the full-text variables sections of [List of Engine Variables](https://docs.singlestore.com/db/v9.1/reference/configuration-reference/engine-variables/list-of-engine-variables.md) for details. Also, refer to [Configuring Full-Text Indexes](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/configuring-full-text-indexes.md) for information on configuring the engine for high performance for full-text indexes.

## Examples

## Create a Version 2 Full-Text Search Index and Query Using MATCH

This example creates a `FULLTEXT` index for both the title column and the body column. Either column can be queried separately using `MATCH (TABLE <table_name>) AGAINST (<expression>)`, and the index on the column will be applied.

```sql
CREATE TABLE articles (
    id INT UNSIGNED,
    year int UNSIGNED,
    title VARCHAR(200),
    body TEXT,
    SORT KEY (id),    
    FULLTEXT USING VERSION 2 art_ft_index (title, body));
```

```sql
INSERT INTO articles (id, year, title, body) VALUES
   (1, 2021, 'Introduction to SQL', 'SQL is a standard language for accessing and manipulating databases.'),
   (2, 2022, 'Advanced SQL Techniques', 'Explore advanced techniques and functions in SQL for better data manipulation.'),
   (3, 2020, 'Database Optimization', 'Learn about various optimization techniques to improve database performance.'),
   (4, 2023, 'SQL in Web Development', 'Discover how SQL is used in web development to interact with databases.'),
   (5, 2019, 'Data Security in SQL', 'An overview of best practices for securing data in SQL databases.'),
   (6, 2021, 'SQL and Data Analysis', 'Using SQL for effective data analysis and reporting.'),
   (7, 2022, 'Introduction to Database Design', 'Fundamentals of designing a robust and scalable database.'),
   (8, 2020, 'SQL Performance Tuning', 'Tips and techniques for tuning SQL queries for better performance.'),
   (9, 2023, 'Using SQL with Python', 'Integrating SQL with Python for data science and automation tasks.'),
   (10, 2019, 'NoSQL vs SQL', 'A comparison of NoSQL and SQL databases and their use cases.'),
   (11, 2020, 'Real-time Data Analysis', 'An introduction to real-time analytics.'),
   (12, 2021, 'Analysis for Beginners', 'Simple examples of real time analytics.'),
   (13, 2023, 'Data-Dictionary Design', 'Create and maintain effective data dictionaries.'),
   (14, 2024, 'Scalable Performance', 'Designing for scalability.');

OPTIMIZE TABLE articles FLUSH;
```

## Search for a Single Word

Search for rows with the word `database` in the `body` column.

```sql
SELECT * 
FROM articles 
WHERE MATCH (TABLE articles) AGAINST ('body:database');

```

```output

+----+------+---------------------------------+-----------------------------------------------------------------------------+
| id | year | title                           | body                                                                        |
+----+------+---------------------------------+-----------------------------------------------------------------------------+
| 7  | 2022 | Introduction to Database Design |	Fundamentals of designing a robust and scalable database.                   |
| 3  | 2020 | Database Optimization	      | Learn about various optimization techniques to improve database performance.|
+----+------+---------------------------------+-----------------------------------------------------------------------------+

```

## Boolean OR Search

Search for rows with the word `Database` in the title column or the phrase `Business Intelligence` in the body column.

```sql
SELECT title 
FROM articles 
WHERE MATCH (TABLE articles) AGAINST ('title:Database OR body:("Business Intelligence")');

```

```output

+---------------------------------+
| title                           |     
+---------------------------------+
| Introduction to Database Design |
| Database Optimization           |
+---------------------------------+
```

## Boolean AND Search

Search for rows with the word `SQL` in the `title` column and the phrase `"Data Security"` in the `title` column.

```sql
SELECT title 
FROM articles 
WHERE MATCH (TABLE articles) AGAINST ('title:SQL AND title:("Data Security")');

```

```output

+----------------------+
| title                |     
+----------------------+
| Data Security in SQL |
+----------------------+
```

## Multiple MATCH Clauses

Use two `MATCH` clauses to search for rows with the word `SQL` in the `title` or the phrase `Business Intelligence` in the `title`, and that also have the word `development` in the `body`. In this example, the `+` indicates that the search term is required.

```sql
SELECT title
    FROM articles  
    WHERE MATCH (TABLE articles) AGAINST ('title:SQL OR body:("Business Intelligence")')
    AND MATCH (TABLE articles) AGAINST ('body:web+');

```

```output

+------------------------+
| title                  |     
+------------------------+
| SQL in Web Development |
+------------------------+
```

## Use +, \*, and ?

Search for rows that have a word starting with `Data` and followed by an arbitrary number of characters in the `title`, or words like `function` followed by a single character (e.g. functions, but not functional) in the title. In this example, the `+` indicates that `Data` should appear at the beginning of the word.

```sql
SELECT title
    FROM articles  
    WHERE MATCH (TABLE articles) AGAINST ('title:(+Data*) OR title:function?');

```

```output

+---------------------------------+
| title                           |     
+---------------------------------+
| Introduction to Database Design |
| SQL and Data Analysis           |
| Database Optimization           |
| Data Security in SQL            |
| Data-Dictionary Design          |
| Real-time Data Analysis         |
+---------------------------------+

```

## Create a Version 2 Full-Text Index over JSON

A full-text index can be created over a JSON column in the same manner it can be created over any other text-type column.

```sql
CREATE TABLE ft_records (
    id INT UNSIGNED,
    title VARCHAR(200),
    records JSON,
    SORT KEY(id),
    FULLTEXT USING VERSION 2 rec_ft_index (title, records));
```

The full-text index is created over a JSON column by concatenating all leaf string values in the JSON as a multi-valued field. The engine variable `fts2_position_increment_gap` defines the logical spacing between concatenated leaf string values to prevent matching across different leaf string values. The default value is 100. In addition to having a field for each column in the full-text index, there will also be additional fields for each unique keypath in the JSON document.

```sql
INSERT INTO ft_records VALUES (
1, 
'document', 
'{
         	 	"k1": "cucumber",
         	 	"k2": ["dragonfruit", "eggplant"],
          		"k3": [
              			{"k3_1": "fig", "k3_2": "grape"},
              			{"k3_1": ["huckleberry", "iceberg lettuce"]},
              			"jicama"
          		      ]
 }');

OPTIMIZE TABLE ft_records FLUSH;
```

```sql
SELECT title, records 
FROM ft_records 
WHERE id = 1;

```

```output

+----------+--------------------------------------------------------------------------------------------------------------------------------------------+ 
| title	   | records                                                                                                                                    |
+----------+--------------------------------------------------------------------------------------------------------------------------------------------+
| document | {"k1":"cucumber","k2":["dragonfruit","eggplant"],"k3":[{"k3_1":"fig","k3_2":"grape"},{"k3_1":["huckleberry","iceberg lettuce"]},"jicama"]} | 
+----------+--------------------------------------------------------------------------------------------------------------------------------------------+

```

The JSON document for the `records` column for the row inserted above is shown below.

```json
{
	"title": "document",
	"records": ["cucumber", "dragonfruit", "eggplant",
                   "fig", "grape", "huckleberry", 
          	      "iceberg lettuce", "jicama"],
    	"records$k1": "cucumber",
    	"records$k2": ["dragonfruit", "eggplant"],
   	"records$k3": "jicama",
    	"records$k3.k3_1": ["fig", "huckleberry", "iceberg lettuce"],
    	"records$k3.k3_2": "grape"
}
```

Here the `records` column is a multi-valued field of all leaf string values in JSON. This field can be queried like any other full-text indexed field. An exception is that matching will not occur over separate values unless the maximum number of positions allowed between matching phrases, or the “slop,” exceeds the value of `fts2_position_increment_gap`.

Along with `records`, other fields like `records$k1` and `records$k3.k3_1` are created to allow searching at each unique keypath present in the JSON document. The dollar sign ("$") is used as a delimiter between the SQL column name and the JSON keypath.

## Query over a JSON column

Querying over the entire JSON column can be performed in the same way as with any other column that is part of the full-text index.

```sql
SELECT (MATCH (TABLE ft_records) AGAINST ('records:/.*cumber/')) AS cumber
FROM ft_records;

```

```output

+----------+ 
| cumber   |
+----------+ 
| 1        | 
+----------+

```

## Query over a JSON keypath

The following example shows how you can search for the string `fig` at the keypath `k3.k3_1` in the `records` column using the field grouping syntax.

```sql
SELECT (MATCH (TABLE ft_records) AGAINST ('records$k3.k3_1:fig')) AS fig
FROM ft_records;

```

```output

+------------------------+ 
| fig                    |
+------------------------+ 
| 0.13076457381248474    | 
+------------------------+
```

## Query for Two Terms

Enclose the terms `cucumber` and `raspberry` in parentheses to do a boolean `OR` search for the terms (`cucumber` and `raspberry`) in the document. The document matches because the term `cucumber` appears in the document.

```sql
SELECT id
FROM ft_records
WHERE MATCH (TABLE ft_records) AGAINST ('records:(cucumber raspberry)');

```

```output

+------+ 
| id   |
+------+ 
| 1    | 
+------+

```

## Phrase Query and Proximity Search

In the example below, quotes are placed around the words `cucumber` and `dragonfruit` to search for the phrase `"cucumber dragonfruit"`.

```sql
SELECT id 
FROM ft_records 
WHERE MATCH (TABLE ft_records) AGAINST ('records:"cucumber dragonfruit"');

```

```output

Empty set (0.008 sec)
```

The phrase `"cucumber dragonfruit"` does not match the JSON document in the `ft_records` table because the words `cucumber` and `dragonfruit` belong to different leaf strings in that document.

When a proximity search query for the phrase `"cucumber dragonfruit"` with a slop of `~100` is used, the JSON document matches.

```sql
SELECT id
FROM ft_records
WHERE MATCH (TABLE ft_records) AGAINST ('records:"cucumber dragonfruit"~100');

```

```output

+------+
| id   | 
+------+ 
| 1    | 
+------+
```

Slop indicates the maximum number of words allowed between words in a phrase for the phrase to be considered a match. That is, a slop of `~100` means if there are 100 words or less between `cucumber` and `dragonfruit` in the document, the document will be considered a match.

Further, a proximity search is only done on multi-valued fields such as JSON (or BSON) columns when the slop value is greater than or equal to `fts2_position_increment_gap`.

In this example slop is `100` and `fts2_position_increment_gap=100` (100 is the default value of `fts2_position_increment_gap`). Since slop is equal to `fts2_position_increment_gap` matching across separate values occurs. And since the words `cucumber` and `dragonfruit` are within 100 words of each other in the document, a match is returned.

## Use of \~ in Fuzzy Search and Proximity Search

As described above, the \~ symbol is used for both proximity searches and fuzzy searches. If the \~ appears after a phrase (multiple words delimited by quotes "), this indicates that a proximity search with a slop value of the number appearing after the \~ should be performed. If the \~ appears after a single word, this indicates that an edit distance (Levenshtein) comparison should be performed on that word (also called a fuzzy search).

The following example shows a fuzzy search.

```sql
SELECT id
FROM ft_records
WHERE MATCH (TABLE ft_records) AGAINST ('records:dronfruit~2');

```

```output

+------+
| id   |
+------+
|    1 |
+------+

```

A match is returned because `dronfruit`, which is intentionally misspelled, is within an edit distance of 2 of `dragonfruit`.

## Phrase Search vs. Boolean Term Search

It is important to understand the difference between a phrase search and a boolean term search as it relates to JSON fields.

Words in quotes (like `"cucumber dragonfruit"`)  are searched for as a single phrase in a single JSON field. Words that appear in parentheses (like `(cucumber dragonfruit)` are searched as if there is a boolean `OR` between the words. When terms are separated by logical operators, they may be matched in different JSON fields.

In the following example, `cucumber` and `dragonfruit` appear in fields `k1` and `k2` respectively, and thus a match occurs.

```sql
SELECT id
FROM ft_records
WHERE MATCH (TABLE ft_records) AGAINST ('records:(cucumber dragonfruit)');

```

```output

+------+
| id   |
+------+
|    1 |
+------+

```

## Example - `ORDER BY … LIMIT`

The query below will take advantage of the `ORDER BY...LIMIT` pushdown. In this query there is one `MATCH` function, named `match_res`, the columns in the `WHERE` and `ORDER BY` clauses are the same, and the sort order is `DESC`.

```sql
SELECT id, title, 
   MATCH (TABLE articles) AGAINST ('body:database') AS match_res
FROM articles 
WHERE match_res
ORDER BY match_res DESC LIMIT 25;
```

## Legacy (Version 1) SingleStore Process

SingleStore's legacy full-text search system uses a CLucene service which is embedded in the SingleStore database engine. The legacy (`VERSION 1`) full-text search has been deprecated, SingleStore recommends using `VERSION 2` full-text search for new development.

Version 1 full-text indexes are only supported on columnstore tables. They can only be enabled as part of a `CREATE TABLE` statement using the [FULLTEXT](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-definition-language-ddl/create-table.md) index type. This means version 1 full-text indexes cannot be altered after the table is created. If the table is dropped, then the index is deleted automatically.

```sql
CREATE TABLE <table_name> (... FULLTEXT USING VERSION 1 [<fts_index_name>] (<fts_col>))
```

Content in columns that are full-text indexed can be searched using the [MATCH](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/match.md) function. Each `MATCH` clause applies to only one table. To search against multiple tables, specify multiple `MATCH` clauses.

[Upgrade to Full-Text Version 2](https://docs.singlestore.com/#section-idm235055478427269.md) describes how to upgrade to full-text `VERSION 2`.

> **📝 Note**: New inserts and updates into columnstore tables may initially be stored in a hidden rowstore table before being flushed to a segment file. The affected segment is re-indexed when the background flusher runs.In that case, the full-text index in the columnstore will be updated asynchronously for new inserts and updates. Inserts and updates from this rowstore table can be force-pushed to the columnstore table by using the `OPTIMIZE TABLE <table_name> FLUSH` command.Since an index is created for each segment file, the distribution of words within the segment may affect the score of full-text queries, especially when the segments have very few rows and the columns have very few words.

The following example illustrates how to create a table with a legacy version full-text search index and how to query from that table.SingleStore recommends explicitly using the `VERSION 1` syntax.

```sql
CREATE TABLE articles (
    id INT UNSIGNED,
    year int UNSIGNED,
    title VARCHAR(200),
    body TEXT,
    SORT KEY (id),
    FULLTEXT USING VERSION 1 (title, body));

```

```sql
SELECT * FROM articles
    WHERE MATCH (title,body)
    AGAINST ('database');

```

Refer to the [MATCH](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/match.md) page for more examples.

## Related Topics

* [Working with Vector Data](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-vector-data.md)- Allows for semantic searching, which is searching based on meanings, not keywords.
* [Hybrid Search](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/hybrid-search-re-ranking-and-blending-searches.md) - Allows full-text and vector search methods in one query. Full-text and vector search ranking can be combined.
* [Configuring Full Text and Vector Indexes](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/#configuring-full-text-and-vector-indexes-2.md)
* [BM25 and BM25\_GLOBAL](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/bm-25.md)
* [MATCH](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/match.md)
* [HIGHLIGHT](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/highlight.md)
* Training: [Full-Text Index and Search](https://training.singlestore.com/learn/course/internal/view/elearning/635/full-text-index-and-search)

***

Modified at: May 15, 2026

Source: [/db/v9.1/developer-resources/functional-extensions/working-with-full-text-search/](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-full-text-search/)

(An index of the documentation is available at /llms.txt)
