Models

Note

This is a Preview feature.

Models integrate directly with existing SingleStore workflows and provide enterprise-grade performance, security, and observability to support building GenAI-ready applications.

This Helios feature supports:

  1. LLM models: Deploy and run large language models (LLMs) for text generation and conversational tasks.

  2. Embedding models: Generate embeddings for semantic search, recommendations, and similarity tasks.

  3. ML models: Build, train, and run custom ML models for specific applications.

Create a Model Inference

LLM Model Inference

Create an LLM model inference by following these steps:

  1. Navigate to the Cloud Portal.

  2. In the left navigation, select AI > Models.

  3. Select the LLM Models tab, and then select New.

    Select Model

    Provider

    Select the provider from one of the following:

    • Amazon Bedrock

    • Azure AI Services

    Publisher

    Select the name of the publisher from one of the following:

    • Anthropic

    • OpenAI

    SingleStore supports the following LLM models:

    Provider            Publisher    Model
    Amazon Bedrock      Anthropic    Claude 3.5 Haiku, Claude Sonnet 4
    Azure AI Services   OpenAI       gpt-4.1-mini, gpt-4.1

    Select Next.

  4. In Inference API Settings, enter and review the following information:

    Name

    Enter a name for the selected model.

    Description

    Enter the selected model description.

    Region

    Select the region for the selected model.

    Estimated Cost

Displays the estimated cost, based on the maximum number of tokens allowed for the selected model.

  5. Accept the terms of service agreement and select Publish for the selected LLM model inference deployment.

Once the LLM model inference is deployed, follow the sample code to start integrating with the model in either Python or Node.js.
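In Python, such an integration typically resembles the sketch below. The endpoint URL, API key value, model name, and request schema here are placeholder assumptions (the sketch follows the common OpenAI-style chat completions convention); use the actual values and sample code shown on the model inference's Details page.

```python
import json
import urllib.request

# Placeholder values -- replace with the endpoint URL and Aura App API key
# shown for your deployed model inference.
ENDPOINT = "https://example-inference.singlestore.com/v1/chat/completions"
API_KEY = "YOUR_AURA_APP_API_KEY"

def build_chat_request(prompt, model="my-llm-inference",
                       max_tokens=512, temperature=0.7):
    """Build an OpenAI-style chat completion payload. The field names are a
    common convention; confirm the exact schema against the Details page."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def call_llm(prompt):
    """POST the payload to the inference endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same request can be issued from Node.js with `fetch` using the identical headers and body.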

Embedding Model Inference

Create an embedding model inference by following these steps:

  1. Navigate to the Cloud Portal.

  2. In the left navigation, select AI > Models.

  3. Select the Embedding Models tab, and then select New.

    Select Model

    Provider

Select the provider from one of the following:

    • Amazon Bedrock

    • Azure AI Services

    Publisher

Select the name of the publisher from one of the following:

    • Amazon

    • OpenAI

    SingleStore supports the following embedding models:

    Provider            Publisher    Model
    Amazon Bedrock      Amazon       Titan Embeddings G1 - Text, Titan Text Embeddings V2
    Azure AI Services   OpenAI       text-embedding-3-small, text-embedding-3-large

    Select Next.

  4. In Inference API Settings, enter and review the following information:

    Name

    Enter a name for the selected model.

    Description

    Enter the selected model description.

    Region

    Select the region for the selected model.

    Estimated Cost

Displays the estimated cost, based on the maximum number of tokens allowed for the selected model.

  5. Accept the terms of service agreement and select Publish for the selected embedding model inference deployment.

Once the embedding model inference is deployed, follow the sample code to start integrating with the model in either Python or Node.js.
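A minimal Python sketch of such an integration is shown below. The endpoint URL, API key value, model name, and request/response schema are placeholder assumptions (the sketch follows the common OpenAI-style embeddings convention); use the actual values and sample code from the model inference's Details page.

```python
import json
import urllib.request

# Placeholder values -- replace with the endpoint URL and Aura App API key
# shown for your deployed embedding model inference.
ENDPOINT = "https://example-inference.singlestore.com/v1/embeddings"
API_KEY = "YOUR_AURA_APP_API_KEY"

def build_embedding_request(texts, model="my-embedding-inference"):
    """Build an OpenAI-style embeddings payload. Field names are a common
    convention; confirm the exact schema against the Details page."""
    return {"model": model, "input": list(texts)}

def get_embeddings(texts):
    """POST the texts to the endpoint and return one vector per input,
    assuming an OpenAI-style response with vectors under "data"."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_embedding_request(texts)).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```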

Train a New ML Model

Refer to Train a New ML Model for more information.

Manage LLMs and Embedding Model Inference

Existing LLM and embedding model inferences can be managed by performing the following actions:

  • Explore the existing model inferences in Playground

  • Update

  • Share

  • Delete

A notebook can be generated for the embedding model inference only.

Use an Existing Model Inference

To use the existing model inference, perform the following actions:

  • Select the model in the Name column and then select Details to view the sample code for the integration.

  • Create an Aura App API key to use the existing model inference. Refer to Aura App API Keys for related information.

Generate an Embeddings Notebook

Embedding notebooks can be generated for an embedding model inference. Select Details > Generate Notebook. The embedding notebook uses an existing embedding model inference to generate embeddings for a column in a specified table. It stores the resulting vectors in a new column within the same table, enabling you to manage and query the embeddings alongside the original data. In the Embeddings Notebook dialog, enter or select the following:

Workspace

Select a workspace.

Database

Select a database.

Table

Select a table. The selected table must have embeddable columns.

Source Column

Select a source column.

Destination Column

Enter the name of the destination column.

API Key Secret

Select the Aura App API key created for the embedding model inference.

Select Generate to generate the embedding notebook. A Vector Embedding Pipeline notebook is created. View the generated embedding notebook in Editor > Shared.
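Conceptually, the generated Vector Embedding Pipeline notebook adds a vector-typed destination column and populates it from the source column. The Python sketch below only illustrates the kind of SQL statements involved; the table name, column names, `id` primary key, and vector dimension (1536) are all hypothetical, and the actual notebook code may differ.

```python
# Illustrative sketch of the pipeline the generated notebook implements:
# add a VECTOR destination column to the table, then write one embedding
# per row. All names and the dimension are hypothetical placeholders.

def build_pipeline_sql(table, source_col, dest_col, dim):
    """Return the DDL and parameterized DML such a pipeline would issue."""
    add_column = f"ALTER TABLE {table} ADD COLUMN {dest_col} VECTOR({dim})"
    # Assumes a hypothetical `id` primary key to address individual rows.
    update_row = f"UPDATE {table} SET {dest_col} = %s WHERE id = %s"
    return add_column, update_row

add_column, update_row = build_pipeline_sql(
    "articles", "body", "body_embedding", 1536
)
```

Keeping the vectors in the same table means a single query can filter on the original columns and rank by vector similarity at once.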

Explore an Existing Model Inference in Playground

To explore an existing model inference in the playground, select the model inference in the Name column and then select Playground.

Embedding Model

In Text Input, enter the text to generate embeddings and view the results in Embedding Results.

Chat Completion Model

In Text Input, enter the text to generate text or chat with the model. Set model parameters as required.

Parameters

System Prompt

Enter the system prompt for the chat completion model.

Temperature

Set the temperature of the model. Lower values produce less random output; values near zero are nearly deterministic. The default value is 0.7. It ranges between 0 and 1.

Max. Tokens

Set the maximum number of tokens generated by the model. The default value is 512. It ranges between 1 and 1024.

Frequency Penalty

Set the frequency penalty of the model. It penalizes new tokens based on their existing frequency, which reduces repetition. The default value is 0. It ranges between -2 and 2.

Top P

Set the top P parameter of the model. It controls diversity via nucleus sampling (for example, 0.5 means only tokens comprising the top 50% of probability mass are considered). The default value is 1. It ranges between 0 and 1.
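The playground parameters above correspond to standard chat-completion request fields. The sketch below expresses the documented defaults as an OpenAI-style request body; the field names are an assumption, so confirm them against the sample code on the Details page.

```python
# Default playground parameter values from the table above, expressed as an
# OpenAI-style request body fragment (field names are an assumption).
def default_chat_params():
    return {
        "temperature": 0.7,        # 0 to 1; lower is less random
        "max_tokens": 512,         # 1 to 1024
        "frequency_penalty": 0.0,  # -2 to 2; positive values reduce repetition
        "top_p": 1.0,              # 0 to 1; nucleus sampling cutoff
    }
```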

Update an Existing Model Inference

To update an existing model inference, select the model in the Name column and then select Update in the upper right. Enter the updates required for the existing model inference, and select Update.

Share an Existing Model Inference

To share an existing model inference, navigate to Models > LLM Models/Embedding Models, select the ellipsis in the Actions column of the model, and select Share. Select the User or Team with which to share the model inference and select Save.

Delete an Existing Model Inference

To delete an existing model inference, select the ellipsis in the Actions column of the model, and select Delete.

Manage an Existing ML Model

Refer to Manage an Existing ML Model for more information.

Last modified: November 6, 2025
