Models

Note

This is a Preview feature.

Models integrate directly with existing SingleStore workflows and provide enterprise-grade performance, security, and observability to support building GenAI-ready applications.

This Helios feature supports:

  1. LLM models: Deploy and run large language models (LLMs) for text generation and conversational tasks.

  2. Embedding models: Generate embeddings for semantic search, recommendations, and similarity tasks.

  3. ML models: Build, train, and run custom ML models for specific applications.

Create a Model Inference

LLM Model Inference

Create an LLM model inference by following these steps:

  1. Navigate to the Cloud Portal.

  2. In the left navigation, select AI > Models.

  3. Select the LLM Models tab, and then select New.

    Select Model

    Provider

    Select the provider from one of the following:

    • Amazon Bedrock

    • Azure AI Services

    Publisher

    Select the name of the publisher from one of the following:

    • Anthropic

    • OpenAI

    SingleStore supports the following LLM models:

    Provider            Publisher    Model
    Amazon Bedrock      Anthropic    Claude 3.5 Haiku, Claude Sonnet 4
    Azure AI Services   OpenAI       gpt-4.1-mini, gpt-4.1

    Select Next.

  4. In Inference API Settings, enter and review the following information:

    Name

    Enter a name for the selected model.

    Description

    Enter the selected model description.

    Region

    Select the region for the selected model.

    Estimated Cost

Displays the estimated cost, based on the maximum number of tokens allowed for the selected model.

  5. Accept the terms of service agreement and select Publish for the selected LLM model inference deployment.

Once the LLM model inference is deployed, follow the sample code to start integrating with the model in either Python or Node.js.
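In Python, such an integration typically resembles the sketch below. The endpoint URL, API key value, model name, and request schema here are placeholder assumptions (the sketch follows the common OpenAI-style chat completions convention); use the actual values and sample code shown on the model inference's Details page.

```python
import json
import urllib.request

# Placeholder values -- replace with the endpoint URL and Aura App API key
# shown for your deployed model inference.
ENDPOINT = "https://example-inference.singlestore.com/v1/chat/completions"
API_KEY = "YOUR_AURA_APP_API_KEY"

def build_chat_request(prompt, model="my-llm-inference",
                       max_tokens=512, temperature=0.7):
    """Build an OpenAI-style chat completion payload. The field names are a
    common convention; confirm the exact schema against the Details page."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def call_llm(prompt):
    """POST the payload to the inference endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same request can be issued from Node.js with `fetch` using the identical headers and body.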

Embedding Model Inference

Create an embedding model inference by following these steps:

  1. Navigate to the Cloud Portal.

  2. In the left navigation, select AI > Models.

  3. Select the Embedding Models tab, and then select New.

    Select Model

    Provider

Select the provider from one of the following:

    • Amazon Bedrock

    • Azure AI Services

    Publisher

Select the name of the publisher from one of the following:

    • Amazon

    • OpenAI

    SingleStore supports the following embedding models:

    Provider            Publisher    Model
    Amazon Bedrock      Amazon       Titan Embeddings G1 - Text, Titan Text Embeddings V2
    Azure AI Services   OpenAI       text-embedding-3-small, text-embedding-3-large

    Select Next.

  4. In Inference API Settings, enter and review the following information:

    Name

    Enter a name for the selected model.

    Description

    Enter the selected model description.

    Region

    Select the region for the selected model.

    Estimated Cost

Displays the estimated cost, based on the maximum number of tokens allowed for the selected model.

  5. Accept the terms of service agreement and select Publish for the selected embedding model inference deployment.

Once the embedding model inference is deployed, follow the sample code to start integrating with the model in either Python or Node.js.
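A minimal Python sketch of such an integration is shown below. The endpoint URL, API key value, model name, and request/response schema are placeholder assumptions (the sketch follows the common OpenAI-style embeddings convention); use the actual values and sample code from the model inference's Details page.

```python
import json
import urllib.request

# Placeholder values -- replace with the endpoint URL and Aura App API key
# shown for your deployed embedding model inference.
ENDPOINT = "https://example-inference.singlestore.com/v1/embeddings"
API_KEY = "YOUR_AURA_APP_API_KEY"

def build_embedding_request(texts, model="my-embedding-inference"):
    """Build an OpenAI-style embeddings payload. Field names are a common
    convention; confirm the exact schema against the Details page."""
    return {"model": model, "input": list(texts)}

def get_embeddings(texts):
    """POST the texts to the endpoint and return one vector per input,
    assuming an OpenAI-style response with vectors under "data"."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_embedding_request(texts)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```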

Train a New ML Model

Refer to Train a New ML Model for more information.

Manage LLMs and Embedding Model Inference

Existing LLM and embedding model inferences can be managed by performing the following actions:

  • Explore the existing model inferences in Playground

  • Update

  • Share

  • Delete

A notebook can be generated for the embedding model inference only.

Use an Existing Model Inference

To use the existing model inference, perform the following actions:

  • Select the model in the Name column and then select Details to view the sample code for the integration.

  • Create an Aura App API key to use the existing model inference. Refer to Aura App API Keys for related information.

Generate an Embeddings Notebook

Embedding notebooks can be generated for an embedding model inference. Select Details > Generate Notebook. The embedding notebook uses an existing embedding model inference to generate embeddings for a column in a specified table. It stores the resulting vectors in a new column within the same table, enabling you to manage and query the embeddings alongside the original data. In the Embeddings Notebook dialog, enter or select the following:

Workspace

Select a workspace.

Database

Select a database.

Table

Select a table. The selected table must have embeddable columns.

Source Column

Select a source column.

Destination Column

Enter the name of the destination column.

API Key Secret

Select the Aura App API key created for the embedding model inference.

Select Generate to generate the embedding notebook. A Vector Embedding Pipeline notebook is created. View the generated embedding notebook in Editor > Shared.
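Conceptually, the generated Vector Embedding Pipeline notebook adds a vector-typed destination column and populates it from the source column. The Python sketch below only illustrates the kind of SQL statements involved; the table name, column names, `id` primary key, and vector dimension (1536) are all hypothetical, and the actual notebook code may differ.

```python
# Illustrative sketch of the pipeline the generated notebook implements:
# add a VECTOR destination column to the table, then write one embedding
# per row. All names and the dimension are hypothetical placeholders.

def build_pipeline_sql(table, source_col, dest_col, dim):
    """Return the DDL and parameterized DML such a pipeline would issue."""
    add_column = f"ALTER TABLE {table} ADD COLUMN {dest_col} VECTOR({dim})"
    # Assumes a hypothetical `id` primary key to address individual rows.
    update_row = f"UPDATE {table} SET {dest_col} = %s WHERE id = %s"
    return add_column, update_row

add_column, update_row = build_pipeline_sql(
    "articles", "body", "body_embedding", 1536
)
```

Keeping the vectors in the same table means a single query can filter on the original columns and rank by vector similarity at once.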

Explore an Existing Model Inference in Playground

To explore an existing model inference in the playground, select the model inference in the Name column and then select Playground.

Embedding Model

In Text Input, enter the text to generate embeddings and view the results in Embedding Results.

Chat Completion Model

In Text Input, enter the text to generate text or chat with the model. Set model parameters as required.

Parameters

System Prompt

Enter the system prompt for the chat completion model.

Temperature

Set the temperature of the model. Lower values produce less random output; values near zero are nearly deterministic. The default value is 0.7. It ranges between 0 and 1.

Max. Tokens

Set the maximum number of tokens generated by the model. The default value is 512. It ranges between 1 and 1024.

Frequency Penalty

Set the frequency penalty of the model. It penalizes new tokens based on their existing frequency, which reduces repetition. The default value is 0. It ranges between -2 and 2.

Top P

Set the top P parameter of the model. It controls diversity via nucleus sampling (for example, 0.5 means only tokens comprising the top 50% of probability mass are considered). The default value is 1. It ranges between 0 and 1.
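The playground parameters above correspond to standard chat-completion request fields. The sketch below expresses the documented defaults as an OpenAI-style request body; the field names are an assumption, so confirm them against the sample code on the Details page.

```python
# Default playground parameter values from the table above, expressed as an
# OpenAI-style request body fragment (field names are an assumption).
def default_chat_params():
    return {
        "temperature": 0.7,        # 0 to 1; lower is less random
        "max_tokens": 512,         # 1 to 1024
        "frequency_penalty": 0.0,  # -2 to 2; positive values reduce repetition
        "top_p": 1.0,              # 0 to 1; nucleus sampling cutoff
    }
```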

Update an Existing Model Inference

To update an existing model inference, select the model in the Name column and then select Update in the upper right. Enter the updates required for the existing model inference, and select Update.

Share an Existing Model Inference

To share an existing model inference, navigate to Models > LLM Models/Embedding Models, select the ellipsis in the Actions column of the model, and select Share. Select the User or Team with which to share the model inference and select Save.

Delete an Existing Model Inference

To delete an existing model inference, select the ellipsis in the Actions column of the model, and select Delete.

Manage an Existing ML Model

Refer to Manage an Existing ML Model for more information.

Last modified: November 6, 2025
