# Models

> **📝 Note**: This is a Preview feature.

Models integrate directly with existing SingleStore workflows and provide enterprise-grade performance, security, and observability to support building GenAI-ready applications.

This Helios feature supports:

1. **LLM models**: Deploy and run large language models (LLMs) for text generation and conversational tasks.

2. **Embedding models**: Generate embeddings for semantic search, recommendations, and similarity tasks.

3. **ML models**: Build, train, and run custom ML models for specific applications.

SingleStore Models supports the following deployment options:

* Externally hosted:

  * Azure AI Services
  * Amazon Bedrock
* Aura hosted: This option hosts the model on SingleStore Aura container service that provides lowest latency (model co-located with compute and data) and ensures that data never leaves the SingleStore VPC. It supports custom and open-source models.

## LLM Models

SingleStore supports the following LLM models:

| Provider               | Publisher                                                   | Model                                                          |
| ---------------------- | ----------------------------------------------------------- | -------------------------------------------------------------- |
| Amazon Bedrock         | Anthropic                                                   | <ul> <li>Claude Opus 4.6</li> <li>Claude Sonnet 4.6</li> </ul> |
| Azure AI Services      | OpenAI                                                      | <ul> <li>gpt-5.4</li> <li>gpt-4.1-mini</li> </ul>              |
| Aura Hosted LLM Models | Open source models (Refer to License for individual models) | <ul> <li>Llama 3.2</li> <li>Qwen</li> </ul>                    |

## Embedding Models

SingleStore supports the following embedding models:

| Provider                     | Publisher                                                   | Model                                                                            |
| ---------------------------- | ----------------------------------------------------------- | -------------------------------------------------------------------------------- |
| Amazon Bedrock               | Amazon                                                      | <ul> <li>Titan Embeddings G1 - Text</li> <li>Titan Text Embeddings V2</li> </ul> |
| Amazon Bedrock               | Anthropic                                                   | <ul> <li>Amazon Nova Multimodal Embeddings</li> </ul>                            |
| Azure AI Services            | OpenAI                                                      | <ul> <li>text-embedding-3-small</li> <li>text-embedding-3-large</li> </ul>       |
| Aura Hosted embedding models | Open source models (Refer to License for individual models) | <ul> <li>qwen 0.6B</li> <li>qwen 4B</li> </ul>                                   |

## ML Models

Refer to [ML Functions](https://docs.singlestore.com/cloud/ai/ai-ml-functions/ml-functions.md) for more information.

## Create a Model Inference

## LLM Model Inference

Create an LLM model inference by following these steps:

1. Navigate to the [Cloud Portal](http://portal.singlestore.com).

2. In the left navigation, select **AI > Models**.

3. Select the **LLM Models** tab, and then select **New**.

4. In **Select Model**, select the **Provider** and the model of your choice, and then select **Next**.

5. In **Model Settings**, enter and review the following information:
   | **Name**           | Enter a name for the selected model.                                  |
   | ------------------ | --------------------------------------------------------------------- |
   | **Description**    | Enter the selected model description.                                 |
   | **Region**         | Select the region for the selected model.                             |
   | **Estimated Cost** | Displays the maximum number of tokens allowed for the selected model. |

6. Accept the terms of service agreement and select **Publish** for the selected LLM model inference deployment.

Once the LLM model inference is deployed, use the LLM model inference via the following options:

* API: Follow the sample code to start integrating with the model in either Python or Node.js.
* AI Functions: Refer to [Text Processing Functions](https://docs.singlestore.com/cloud/ai/ai-ml-functions/ai-functions/#section-id235175307180532.md) for more information.

## Embedding Model Inference

Create an embedding model inference by following these steps:

1. Navigate to the [Cloud Portal](http://portal.singlestore.com).

2. In the left navigation, select **AI > Models**.

3. Select the **Embedding Models** tab, and then select **New**.

4. In **Select Model**, select the **Provider** and the model of your choice, and then select **Next**.

5. In **Model Settings**, enter and review the following information:
   | **Name**           | Enter a name for the selected model.                                  |
   | ------------------ | --------------------------------------------------------------------- |
   | **Description**    | Enter the selected model description.                                 |
   | **Region**         | Select the region for the selected model.                             |
   | **Estimated Cost** | Displays the maximum number of tokens allowed for the selected model. |

6. Accept the terms of service agreement and select **Publish** for the selected embedding model inference deployment.

Once the embedding model inference is deployed, use the embedding model inference via the following options:

* API: Follow the sample code to start integrating with the model in either Python or Node.js.
* AI Functions: Refer to [Embedding Function](https://docs.singlestore.com/cloud/ai/ai-ml-functions/ai-functions/#section-id235175344319444.md) for more information.

## Train a New ML Model

Refer to [Train a New ML Model](https://docs.singlestore.com/cloud/ai/ai-ml-functions/ml-functions/#section-id23517557347527.md) for more information.

## Manage LLMs and Embedding Model Inference

Existing LLMs and embedding model inferences can be managed by performing following actions:

* Explore the existing model inferences in Playground
* Update
* Share
* Delete

A notebook can be generated for the embedding model inference only.

## Use an Existing Model Inference

To use the existing model inference, perform the following actions:

* Select the model in the **Name** column and then select **Details** to view the sample code for the integration.
* Create an Aura App API key to use the existing model inference. Refer to [Aura App API Keys](https://docs.singlestore.com/cloud/container-services/container-app-api-keys.md) for related information.

## Generate an Embeddings Notebook

Embedding notebooks can be generated for an embedding model inference. Select **Details > Generate Notebook**. The embedding notebook uses an existing embedding model inference to generate embeddings for a column in a specified table. It stores the resulting vectors in a new column within the same table that enables it to manage and query the embeddings alongside the original data. In the **Embeddings Notebook** dialog, enter or select the following:

| **Workspace**          | Select aworkspace.                                                     |
| ---------------------- | ---------------------------------------------------------------------- |
| **Database**           | Select a database.                                                     |
| **Table**              | Select a table. The selected table must have embeddable columns.       |
| **Source Column**      | Select a source column.                                                |
| **Destination Column** | Enter the name of the destination column.                              |
| **API Key Secret**     | Select the Aura App API key created for the embedding model inference. |

Select **Generate** to generate the embedding notebook. A **Vector Embedding Pipeline Notebook** is created. View the generated embedding notebook in **Editor > Shared**.

## Explore an Existing Model Inference in Playground

To explore an existing model inference in the playground, select the model inference in the **Name** column and then select **Playground**.

## Embedding Model

In **Text Input**, enter the text to generate embeddings and view the results in **Embedding Results**.

## Chat Completion Model

In **Text Input**, enter the text to generate text or chat with the model. Set model parameters as required.

**Parameters**

| **System Prompt**     | Enter the system prompt for the chat completion model.                                                                                                                            |
| --------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Temperature**       | Set the temperature of the model. Lower values are less random and near zero is deterministic. The default value is 0.7. It ranges between 0 and 1.                               |
| **Max. Tokens**       | Set the maximum number of tokens generated by the model. The default value is 512. It ranges between 1 and 1024.                                                                  |
| **Frequency Penalty** | Set the frequency penalty of the model. It penalizes new tokens based on existing frequency which reduces repetition. The default value is 0. It ranges between -2 and 2.         |
| **Top P**             | Set the top P parameter of the model. It controls diversity via nucleus sampling (for example, 0.5 considers half of options). The default value is 1. It ranges between 0 and 1. |

## Update an Existing Model Inference

To update an existing model inference, select the model in the **Name** column and then select **Update** in the upper right. Enter the updates required for the existing model inference, and select **Update**.

## Share an Existing Model Inference

To share an existing model inference, navigate to **Models > LLM Models/Embedding Models**, select the ellipsis in the **Actions** column of the model, and select **Share**. Select the **User** or **Team** with which to share the model inference and select **Save**.

> **📝 Note**: Custom models must be shared before use. Hence, to use a newly created model instance in an AI function, you have to explicitly share the model with the service account created during AI function installation.

## Delete an Existing Model Inference

To delete an existing model inference, select the ellipsis in the **Actions** column of the model, and select **Delete**.

## Manage an Existing ML Model

Refer to [Manage an Existing ML Model](https://docs.singlestore.com/cloud/ai/ai-ml-functions/ml-functions/#section-id235175587330847.md) for more information.

***

Modified at: June 1, 2026

Source: [/cloud/ai/models/](https://docs.singlestore.com/cloud/ai/models/)

(An index of the documentation is available at /llms.txt)
