Important

The SingleStore 9.1 release candidate (RC) gives you the opportunity to preview, evaluate, and provide feedback on new and upcoming features prior to their general availability. In the interim, SingleStore 9.0 is recommended for production workloads; it can later be upgraded to SingleStore 9.1.

ANALYZE FULLTEXT

Displays the tokens generated by a Lucene analyzer for a string.

This command shows the tokens that would be generated by a Lucene analyzer when creating a full-text index for a specified string.

Syntax

ANALYZE FULLTEXT "<text>" [OPTIONS '<analyzer options>'];

Arguments

  • <text> is the string to be analyzed.

  • <analyzer options> is a JSON string that specifies the analyzer used to tokenize the <text> string.

    • <analyzer options> uses the same syntax as the analyzer key in INDEX_OPTIONS for full-text indexes. Refer to Full Text VERSION 2 Custom Analyzers for more information.

    • If <analyzer options> is not provided, the Lucene StandardAnalyzer is used.

Use this command to preview the tokens that would be generated for a string by a full-text index that has been, or will be, built.

To see the tokens, set <analyzer options> in the ANALYZE FULLTEXT command to the value of analyzer in INDEX_OPTIONS in the index creation command.

For example, if an index was created with: INDEX_OPTIONS '{"analyzer" : "spanish"}', use OPTIONS '{"analyzer" : "spanish"}' in the ANALYZE FULLTEXT command.
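As a sketch, the pairing might look like the following. The table, column, and index names here are hypothetical, and the FULLTEXT USING VERSION 2 index definition follows the full-text index creation syntax documented elsewhere; only the matching analyzer value is the point.

```sql
-- Hypothetical table with a Spanish-language full-text index
-- (names are illustrative, not from this page).
CREATE TABLE articulos (
  id INT,
  cuerpo TEXT,
  FULLTEXT USING VERSION 2 cuerpo_idx (cuerpo)
      INDEX_OPTIONS '{"analyzer": "spanish"}',
  SORT KEY (id)
);

-- Preview the tokens that index would generate by passing the
-- same analyzer value to ANALYZE FULLTEXT.
ANALYZE FULLTEXT "los trenes llegan tarde" OPTIONS '{"analyzer": "spanish"}';
```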

Output

The output of ANALYZE FULLTEXT includes the tokens from the Lucene analyzer, along with attributes associated with each token that are derived from Lucene attributes.

The following table lists the attributes in the output of the ANALYZE FULLTEXT command and the Lucene Attributes used to create them.

SingleStore Attribute(s)  | Lucene Attribute           | Description
--------------------------|----------------------------|------------
token                     | CharTermAttribute          | The term text of the token.
position_length           | PositionLengthAttribute    | The number of positions occupied by the token.
type                      | TypeAttribute              | The type of the token.
start_offset, end_offset  | OffsetAttribute            | The start and end offsets of the token, in characters.
position                  | PositionIncrementAttribute | The absolute position of the token; the first token is at position 0. Lucene's PositionIncrementAttribute records the position of a token relative to the previous token, and the absolute position is derived from these increments.

Refer to Attribute and Attribute Source for additional information about the Lucene Attributes.

Examples

Example 1 - Use the StandardAnalyzer

The following command displays the tokens and associated attributes that are produced by the Lucene StandardAnalyzer for the string "the train is moving".

ANALYZE FULLTEXT "the train is moving";
+-----------------------------------------------+
| Response                                      |
+-----------------------------------------------+
| {"tokens": [
    {
      "position_length": 1,
      "end_offset": 3,
      "start_offset": 0,
      "position": 0,
      "type": "<ALPHANUM>",
      "token": "the"
    },
    {
      "position_length": 1,
      "end_offset": 9,
      "start_offset": 4,
      "position": 1,
      "type": "<ALPHANUM>",
      "token": "train"
    },
    {
      "position_length": 1,
      "end_offset": 12,
      "start_offset": 10,
      "position": 2,
      "type": "<ALPHANUM>",
      "token": "is"
    },
    {
      "position_length": 1,
      "end_offset": 19,
      "start_offset": 13,
      "position": 3,
      "type": "<ALPHANUM>",
      "token": "moving"
    }
  ]} |
+-----------------------------------------------+

Example 2 - Specify an Analyzer

The following command shows the tokens that would be generated by the english analyzer for the string "the train is moving".

ANALYZE FULLTEXT "the train is moving" OPTIONS '{"analyzer": "english"}';
+-----------------------------------------------+
| Response                                      |
+-----------------------------------------------+
| {"tokens": [
    {
      "position_length": 1,
      "end_offset": 9,
      "start_offset": 4,
      "position": 1,
      "type": "<ALPHANUM>",
      "token": "train"
    },
    {
      "position_length": 1,
      "end_offset": 19,
      "start_offset": 13,
      "position": 3,
      "type": "<ALPHANUM>",
      "token": "move"
    }
  ]} |
+-----------------------------------------------+

Example 3 - Custom Tokenizer and Token Filter

The following command generates the tokens for the string "MemSQL is SingleStore." using the specified custom analyzer. This analyzer uses the Lucene standard tokenizer with a pattern_replace token filter that replaces all occurrences of MemSQL with SingleStore.

ANALYZE FULLTEXT "MemSQL is SingleStore."
OPTIONS '{
  "analyzer": {
    "custom": {
      "tokenizer": "standard",
      "token_filters": [{
        "pattern_replace": {
          "pattern": "MemSQL",
          "replacement": "SingleStore"
        }
      }]
    }
  }
}';
+-----------------------------------------------+
| Response                                      |
+-----------------------------------------------+
| {"tokens": [
    {
      "position_length": 1,
      "end_offset": 6,
      "start_offset": 0,
      "position": 0,
      "type": "<ALPHANUM>",
      "token": "SingleStore"
    },
    {
      "position_length": 1,
      "end_offset": 9,
      "start_offset": 7,
      "position": 1,
      "type": "<ALPHANUM>",
      "token": "is"
    },
    {
      "position_length": 1,
      "end_offset": 21,
      "start_offset": 10,
      "position": 2,
      "type": "<ALPHANUM>",
      "token": "SingleStore"
    }
  ]} |
+-----------------------------------------------+

Example 4 - Custom Analyzer with Whitespace Tokenizer

The following command generates the tokens for the string "This guide teaches you how to build a multimodal Retrieval-Augmented Generation (RAG) application … " using the whitespace tokenizer, as in Example 1 on the Full Text VERSION 2 Custom Analyzers page.

ANALYZE FULLTEXT "This guide teaches you how to build a multimodal Retrieval-Augmented Generation (RAG) application using SingleStore, integrating various data types for enhanced AI responses."
OPTIONS '{
  "analyzer": {
    "custom": {
      "tokenizer": "whitespace"
    }
  }
}';
+-----------------------------------------------+
| Response                                      |
+-----------------------------------------------+
| {"tokens": [
    {
      "position_length": 1,
      "end_offset": 4,
      "start_offset": 0,
      "position": 0,
      "type": "word",
      "token": "This"
    },
    {
      "position_length": 1,
      "end_offset": 10,
      "start_offset": 5,
      "position": 1,
      "type": "word",
      "token": "guide"
    },
    {
      "position_length": 1,
      "end_offset": 18,
      "start_offset": 11,
      "position": 2,
      "type": "word",
      "token": "teaches"
    },
    {
      "position_length": 1,
      "end_offset": 22,
      "start_offset": 19,
      "position": 3,
      "type": "word",
      "token": "you"
    },
    {
      "position_length": 1,
      "end_offset": 26,
      "start_offset": 23,
      "position": 4,
      "type": "word",
      "token": "how"
    },
    {
      "position_length": 1,
      "end_offset": 29,
      "start_offset": 27,
      "position": 5,
      "type": "word",
      "token": "to"
    },
    {
      "position_length": 1,
      "end_offset": 35,
      "start_offset": 30,
      "position": 6,
      "type": "word",
      "token": "build"
    },
    {
      "position_length": 1,
      "end_offset": 37,
      "start_offset": 36,
      "position": 7,
      "type": "word",
      "token": "a"
    },
    {
      "position_length": 1,
      "end_offset": 48,
      "start_offset": 38,
      "position": 8,
      "type": "word",
      "token": "multimodal"
    },
    {
      "position_length": 1,
      "end_offset": 68,
      "start_offset": 49,
      "position": 9,
      "type": "word",
      "token": "Retrieval-Augmented"
    },
    {
      "position_length": 1,
      "end_offset": 79,
      "start_offset": 69,
      "position": 10,
      "type": "word",
      "token": "Generation"
    },
    {
      "position_length": 1,
      "end_offset": 85,
      "start_offset": 80,
      "position": 11,
      "type": "word",
      "token": "(RAG)"
    },
    {
      "position_length": 1,
      "end_offset": 97,
      "start_offset": 86,
      "position": 12,
      "type": "word",
      "token": "application"
    },
    {
      "position_length": 1,
      "end_offset": 103,
      "start_offset": 98,
      "position": 13,
      "type": "word",
      "token": "using"
    },
    {
      "position_length": 1,
      "end_offset": 116,
      "start_offset": 104,
      "position": 14,
      "type": "word",
      "token": "SingleStore,"
    },
    {
      "position_length": 1,
      "end_offset": 128,
      "start_offset": 117,
      "position": 15,
      "type": "word",
      "token": "integrating"
    },
    {
      "position_length": 1,
      "end_offset": 136,
      "start_offset": 129,
      "position": 16,
      "type": "word",
      "token": "various"
    },
    {
      "position_length": 1,
      "end_offset": 141,
      "start_offset": 137,
      "position": 17,
      "type": "word",
      "token": "data"
    },
    {
      "position_length": 1,
      "end_offset": 147,
      "start_offset": 142,
      "position": 18,
      "type": "word",
      "token": "types"
    },
    {
      "position_length": 1,
      "end_offset": 151,
      "start_offset": 148,
      "position": 19,
      "type": "word",
      "token": "for"
    },
    {
      "position_length": 1,
      "end_offset": 160,
      "start_offset": 152,
      "position": 20,
      "type": "word",
      "token": "enhanced"
    },
    {
      "position_length": 1,
      "end_offset": 163,
      "start_offset": 161,
      "position": 21,
      "type": "word",
      "token": "AI"
    },
    {
      "position_length": 1,
      "end_offset": 174,
      "start_offset": 164,
      "position": 22,
      "type": "word",
      "token": "responses."
    }
  ]} |
+-----------------------------------------------+

Last modified: February 11, 2026


