Models
Note
This is a Preview feature.
Models integrate directly with existing SingleStore workflows and provide enterprise-grade performance, security, and observability to support building GenAI-ready applications.
This Helios feature supports:

- LLM models: Deploy and run large language models (LLMs) for text generation and conversational tasks.
- Embedding models: Generate embeddings for semantic search, recommendations, and similarity tasks.
- ML models: Build, train, and run custom ML models for specific applications.
Create a Model Inference
LLM Model Inference
Create an LLM model inference by following these steps:
- Navigate to the Cloud Portal.
- In the left navigation, select AI > Models.
- Select the LLM Models tab, and then select New.
Select Model

Provider

Select one of the following providers:

- Amazon Bedrock
- Azure AI Services

Publisher

Select one of the following publishers:

- Anthropic
- OpenAI

SingleStore supports the following LLM models:

| Provider | Publisher | Model |
|---|---|---|
| Amazon Bedrock | Anthropic | Claude 3.5 Haiku, Claude Sonnet 4 |
| Azure AI Services | OpenAI | gpt-4.1-mini, gpt-4.1 |

Select Next.
- In Inference API Settings, enter and review the following information:
  - Name: Enter a name for the selected model.
  - Description: Enter a description of the selected model.
  - Region: Select the region for the selected model.
  - Estimated Cost: Displays the maximum number of tokens allowed for the selected model.
- Accept the terms of service agreement and select Publish to deploy the selected LLM model inference.

Once the LLM model inference is deployed, use the provided sample code to start integrating with the model in either Python or Node.js.
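As an illustrative sketch only (the endpoint URL, model name, and payload shape below are assumptions modeled on an OpenAI-style chat completions API, not the documented SingleStore interface; copy the actual values from the model inference's Details page), a Python integration might look like:

```python
import json
import urllib.request

# Hypothetical placeholders -- replace with the endpoint URL shown on the
# model inference's Details page and an Aura App API key.
INFERENCE_URL = "https://example-host/v1/chat/completions"  # assumed path
API_KEY = "YOUR_AURA_APP_API_KEY"
MODEL_NAME = "my-llm-inference"  # assumed model identifier

def build_chat_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completion request body (assumed format)."""
    return {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(prompt: str) -> str:
    """Send one prompt to the deployed LLM inference and return its reply."""
    req = urllib.request.Request(
        INFERENCE_URL,
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The Details page of the deployed inference shows the authoritative sample code; the sketch above only illustrates the general request/response pattern.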
Embedding Model Inference
Create an embedding model inference by following these steps:
- Navigate to the Cloud Portal.
- In the left navigation, select AI > Models.
- Select the Embedding Models tab, and then select New.
Select Model

Provider

Select one of the following providers:

- Amazon Bedrock
- Azure AI Services

Publisher

Select one of the following publishers:

- Amazon
- OpenAI

SingleStore supports the following embedding models:

| Provider | Publisher | Model |
|---|---|---|
| Amazon Bedrock | Amazon | Titan Embeddings G1 - Text, Titan Text Embeddings V2 |
| Azure AI Services | OpenAI | text-embedding-3-small, text-embedding-3-large |

Select Next.
- In Inference API Settings, enter and review the following information:
  - Name: Enter a name for the selected model.
  - Description: Enter a description of the selected model.
  - Region: Select the region for the selected model.
  - Estimated Cost: Displays the maximum number of tokens allowed for the selected model.
- Accept the terms of service agreement and select Publish to deploy the selected embedding model inference.

Once the embedding model inference is deployed, use the provided sample code to start integrating with the model in either Python or Node.js.
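Again as an illustrative sketch (the endpoint path, model name, and response shape assume an OpenAI-style embeddings API and are not the documented SingleStore interface), a Python integration might look like:

```python
import json
import urllib.request

# Hypothetical placeholders -- replace with the endpoint URL shown on the
# model inference's Details page and an Aura App API key.
INFERENCE_URL = "https://example-host/v1/embeddings"  # assumed path
API_KEY = "YOUR_AURA_APP_API_KEY"
MODEL_NAME = "my-embedding-inference"  # assumed model identifier

def build_embedding_payload(texts) -> dict:
    """Build an OpenAI-style embeddings request body (assumed format)."""
    return {"model": MODEL_NAME, "input": list(texts)}

def embed(texts):
    """Return one embedding vector per input text."""
    req = urllib.request.Request(
        INFERENCE_URL,
        data=json.dumps(build_embedding_payload(texts)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```

The returned vectors can then be stored in a SingleStore vector column for semantic search or similarity queries.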
Train a New ML Model
Refer to Train a New ML Model for more information.
Manage LLMs and Embedding Model Inference
Existing LLM and embedding model inferences can be managed by performing the following actions:

- Explore the existing model inferences in the Playground
- Update
- Share
- Delete

A notebook can be generated for embedding model inferences only.
Use an Existing Model Inference
To use an existing model inference, perform the following actions:

- Select the model in the Name column, and then select Details to view the sample code for the integration.
- Create an Aura App API key to use the existing model inference. Refer to Aura App API Keys for related information.
Generate an Embeddings Notebook
Embedding notebooks can be generated for an embedding model inference. Configure the following settings:

| Setting | Description |
|---|---|
| Workspace | Select a workspace. |
| Database | Select a database. |
| Table | Select a table. |
| Source Column | Select a source column. |
| Destination Column | Enter the name of the destination column. |
| API Key Secret | Select the Aura App API key created for the embedding model inference. |

Select Generate to generate the embedding notebook.
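Conceptually, the generated notebook reads text from the source column, calls the embedding model inference, and writes the resulting vectors into the destination column. A hypothetical sketch of that pattern (table and column names are illustrative, and `embed` stands in for a call to the deployed inference):

```python
import json

# Illustrative names matching the notebook settings above -- not real schema.
TABLE = "products"
SOURCE_COLUMN = "description"
DEST_COLUMN = "description_embedding"

def build_update_sql(table: str, dest_col: str, key_col: str) -> str:
    """Parameterized UPDATE that writes one embedding back to one row."""
    return f"UPDATE {table} SET {dest_col} = %s WHERE {key_col} = %s"

def rows_to_updates(rows, embed):
    """Pair each row key with the JSON-encoded embedding of its source text.

    `rows` is an iterable of (key, source_text) tuples; `embed` is a
    placeholder for a call to the deployed embedding model inference.
    """
    return [(json.dumps(embed(text)), key) for key, text in rows]
```

Each (vector, key) pair would then be executed against the selected workspace and database with `build_update_sql(TABLE, DEST_COLUMN, "id")` through a SingleStore connection.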
Explore an Existing Model Inference in Playground
To explore an existing model inference in the playground, select the model inference in the Name column and then select Playground.
Embedding Model
In Text Input, enter the text to generate embeddings and view the results in Embedding Results.
Chat Completion Model
In Text Input, enter a prompt to generate text or chat with the model.
Parameters

| Parameter | Description |
|---|---|
| System Prompt | Enter the system prompt for the chat completion model. |
| Temperature | Set the temperature of the model. |
| Max. Tokens | Set the maximum number of tokens generated by the model. |
| Frequency Penalty | Set the frequency penalty of the model. |
| Top P | Set the top P parameter of the model. |
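When calling a chat completion model inference programmatically rather than through the Playground, these same parameters typically map onto request-body fields. The field names below assume an OpenAI-style API and are illustrative, not the documented interface:

```python
def build_generation_params(
    system_prompt: str = "You are a helpful assistant.",
    temperature: float = 0.7,
    max_tokens: int = 512,
    frequency_penalty: float = 0.0,
    top_p: float = 1.0,
) -> dict:
    """Map the Playground parameters onto an OpenAI-style request body
    (assumed field names)."""
    return {
        "messages": [{"role": "system", "content": system_prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
        "frequency_penalty": frequency_penalty,
        "top_p": top_p,
    }
```

User messages would be appended to `messages` after the system prompt before sending the request.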
Update an Existing Model Inference
To update an existing model inference, select the model in the Name column and then select Update in the upper right.
Share an Existing Model Inference
To share an existing model inference, navigate to Models > LLM Models/Embedding Models, select the ellipsis in the Actions column of the model, and select Share.
Delete an Existing Model Inference
To delete an existing model inference, select the ellipsis in the Actions column of the model, and select Delete.
Manage an Existing ML Model
Refer to Manage an Existing ML Model for more information.
Last modified: November 6, 2025