AI Functions
On this page
Install AI Functions
To install AI Functions, navigate to AI > AI & ML Functions, select the deployment on which to install AI Functions.
Once the AI Functions are installed, query them in the SQL Editor or SingleStore notebooks.
|
Category |
Function |
|---|---|
|
Text Processing Functions |
|
| |
| |
| |
| |
| |
|
Embedding and Vector Functions |
|
|
Text Processing Functions
AI_ COMPLETE
Provides batched LLM powered completion of every input text.
Syntax
AI_COMPLETE(texts, model)
Arguments
-
texts: A prompt. -
model: An LLM model.
Return Type
string
Usage
|
Basic usage with the default model |
|
|
Basic usage with a specific model |
|
|
Input example on database |
|
AI_ SENTIMENT
Provides sentiment classification and score for all user-defined inputs.
Syntax
AI_SENTIMENT(texts, model)
Arguments
-
texts: A prompt. -
model: An LLM model.
Return Type
string
Usage
|
Basic usage with default model |
|
|
Basic usage with selected model |
|
|
Input example on database |
|
AI_ TRANSLATE
Provides translation of user-provided documents from source language to target language.
Syntax
AI_TRANSLATE(texts, source_languages, target_languages, model)
Arguments
-
texts: A prompt. -
source_: The language in which the prompt is written.languages -
target_: The language to which the prompt gets translated.languages -
model: An LLM model.
Return Type
string
Usage
|
Basic usage with default model |
|
|
Basic usage with selected model |
|
|
Input example on database |
|
AI_ SUMMARIZE
Provides summary of user-provided documents within the specified length.
Syntax
AI_SUMMARIZE(texts, model, max_lengths)
Arguments
-
texts: A prompt. -
model: An LLM model. -
max_: Maximum length of the summary.lengths
Return Type
string
Usage
|
Basic usage with default model and length |
|
|
Basic usage with selected model |
|
|
Input example on database |
|
AI_ CLASSIFY
Provides classification of each input text into one of the given categories or labels.
Syntax
AI_CLASSIFY(texts, categories, model)
Arguments
-
texts: A prompt. -
categories: Categories for classification. -
model: An LLM model.
Return Type
string
Usage
|
Basic usage with default model |
|
|
Basic usage with selected model |
|
|
Input example on database |
|
AI_ EXTRACT
Extracts information from a block or text based on the specified natural language question.
Syntax
AI_EXTRACT(texts, questions, model)
Arguments
-
texts: A prompt. -
questions: Input natural language question on which the LLM model extracts information. -
model: An LLM model.
Return Type
string
Usage
|
Basic usage with default model |
|
|
Basic usage with selected model |
|
|
Input example on database |
|
Embedding and Vector Functions
VECTOR_ SIMILARITY
Returns similarity score of each input vector using the specified vector similarity method.
Note
VECTOR_ is a preview feature and is not intended for production use.
Syntax
VECTOR_SIMILARITY(vec1, vec2, methods)
Arguments
-
vec1: First vector input with typebytes. -
vec2: Second vector input with typebytes. -
methods: A SingleStore vector similarity method.
Return Type
float32
Usage
|
Basic usage |
|
|
Input example on database |
|
EMBED_ TEXT
Provides batched embeddings of all input text.
Syntax
EMBED_TEXT(texts, model)
Arguments
-
texts: A prompt. -
model: An embedding model.
Return Type
bytes
Usage
|
Basic usage with default model |
|
|
Basic usage with selected embedding model |
|
|
Input example on database |
|
Usage Recommendations for AI Functions
To optimize performance and control costs when using AI Functions, SingleStore recommends the following:
-
Use Common Table Expressions (CTEs) to filter rows before making calls to large language models (LLMs).
The query engine currently sends data to the LLM before applying LLM or WHEREfilters. -
LLM calls are expensive.
Begin with a small dataset to evaluate response quality and verify the results meet the requirements before scaling up. -
Enterprise plans support three model providers; Aura, Amazon Bedrock, and Azure AI Services.
Data is processed according to each provider’s policies. If a row violates provider rules, the system fails the batch that includes the row and returns errors for these rows. -
Strict usage quotas apply per model and per organization.
These quotas are not configurable by end users. For higher usage limits, contact SingleStore Support. Self-service quota configuration will be available in the future.
Demonstrate some common AI function usecases
Note
You can use your existing Standard or Premium workspace with this Notebook.
This feature is currently in Private Preview. Please reach out to support@singlestore.com to confirm if this feature can be enabled in your org.
This Jupyter notebook will help you:
- Load the Amazon Fine Foods Reviews dataset from Kaggle
- Store the data in SingleStore
- Demonstrate powerful AI Functions for text processing and analysis
Prerequisites: Ensure AI Functions are installed on your deployment (AI Services > AI & ML Functions).
Create some simple tables
This setup establishes a basic relational structure to store some reviews for restaurants. Ensure you have selected a database and have CREATE permissions to create/delete tables.
%%sqlCREATE DATABASE IF NOT EXISTS temp;USE temp;
%%sqlDROP TABLE IF EXISTS reviews;CREATE TABLE IF NOT EXISTS reviews (Id INT PRIMARY KEY,ProductId VARCHAR(20),UserId VARCHAR(50),ProfileName VARCHAR(255),HelpfulnessNumerator INT,HelpfulnessDenominator INT,Score INT,Time BIGINT,Summary TEXT,Text TEXT);
Install the required packages
!pip install -q httplib2 kagglehub pandas
Download and Load Dataset
import kagglehubimport pandas as pd# Download the Amazon Fine Foods Reviews dataset from Kaggleprint("Downloading dataset from Kaggle...")path = kagglehub.dataset_download("snap/amazon-fine-food-reviews")print(f"Dataset downloaded to: {path}")# Read the CSV filedf = pd.read_csv(f"{path}/Reviews.csv")# Display dataset infoprint(f"\nDataset shape: {df.shape}")print(f"Columns: {list(df.columns)}")print("\nFirst few rows:")df.head()
Load Data into SingleStore
import singlestoredb as s2# Create SQLAlchemy engine instead of regular connectionengine = s2.create_engine(database='temp')# Take a sample of 10,000 reviews for demo purposessample_df = df.head(10000).copy()print(f"Loading {len(sample_df)} reviews into SingleStore...")# Write dataframe to SingleStore table using SQLAlchemy enginesample_df.to_sql('reviews',con=engine, # Use engine instead of connectionif_exists='append',index=False,chunksize=1000)print("Data loaded successfully!")
Verify Data Load
%%sql-- Check the number of reviews loadedSELECT COUNT(*) as total_reviews FROM reviews;
Sample Data Preview
%%sql-- View sample reviewsSELECT Id, ProductId, Score, Summary, LEFT(Text, 100) as Review_PreviewFROM reviewsLIMIT 10;
AI Functions Demonstrations
Now let's explore the power of SingleStore AI Functions for text analysis and processing. Ensure that AI functions are enabled for the org and you are able to list the available AI functions
%%sqlSHOW functions in cluster;
%%sql-- AI_COMPLETE: Ask general questions and get LLM-powered completionsSELECT cluster.AI_COMPLETE('What is SingleStore?') AS completion;
%%sql-- AI_SENTIMENT: Analyze sentiment of customer reviews for a specific product-- WHERE ProductId = <Your choice>-- Remember to specify the datbase name. In this example 'temp' is the Database nameSELECTId,ProductId,Score,LEFT(Text, 80) as Review_Snippet,cluster.AI_SENTIMENT(Text) AS sentimentFROM temp.reviewsWHERE ProductId = 'B000NY8ODS'LIMIT 10;
%%sql-- Aggregate sentiment analysis across products-- Using CTE to filter and prepare data firstWITH filtered_reviews AS (SELECTProductId,TextFROM temp.reviewsWHERE ProductId IN (SELECT ProductIdFROM temp.reviewsGROUP BY ProductIdHAVING COUNT(*) >= 5)LIMIT 100),grouped_reviews AS (SELECTProductId,COUNT(*) as review_count,GROUP_CONCAT(Text SEPARATOR '. ') as combined_textFROM filtered_reviewsGROUP BY ProductIdLIMIT 5)SELECTProductId,review_count,cluster.AI_SENTIMENT(combined_text) as overall_sentimentFROM grouped_reviews;
%%sql-- AI_SUMMARIZE: Create concise summaries of lengthy reviews-- Filter long reviews first using CTEWITH long_reviews AS (SELECTId,ProductId,Text,LEFT(Text, 150) as Original_ReviewFROM temp.reviewsWHERE LENGTH(Text) > 200LIMIT 5)SELECTId,ProductId,Original_Review,cluster.AI_SUMMARIZE(Text,'aifunctions_chat_default',15) AS summaryFROM long_reviews;
%%sql-- AI_CLASSIFY: Classify customer feedback into categories-- Filter negative reviews first using CTEWITH negative_reviews AS (SELECTId,ProductId,Text,LEFT(Text, 100) as Review_TextFROM temp.reviewsWHERE Score <= 3LIMIT 10)SELECTId,ProductId,Review_Text,cluster.AI_CLASSIFY(Text,'[quality, price, shipping, taste]') AS classificationFROM negative_reviews;
%%sql-- AI_EXTRACT: Extract specific information from reviews-- Filter positive reviews first using CTEWITH positive_reviews AS (SELECTId,ProductId,Text,LEFT(Text, 100) as Review_TextFROM temp.reviewsWHERE Score >= 4LIMIT 10)SELECTId,ProductId,Review_Text,cluster.AI_EXTRACT(Text,'Does this customer indicate they will buy this product again? Answer with yes, no, or unclear only') AS repeat_purchase_intentFROM positive_reviews;
%%sql-- AI_EXTRACT: Identify reviews with high churn risk-- Filter low-rated reviews first using CTEWITH low_rated_reviews AS (SELECTId,ProductId,Score,Text,LEFT(Text, 120) as Review_TextFROM temp.reviewsWHERE Score <= 2LIMIT 10)SELECTId,ProductId,Score,Review_Text,cluster.AI_EXTRACT(Text,'Is this customer at high risk of not purchasing again? Answer with high, medium, or low only') AS churn_riskFROM low_rated_reviews;
%%sql-- AI_TRANSLATE: Translate text between languages-- Filter reviews with substantial summaries first using CTEWITH translatable_reviews AS (SELECTId,Summary as Original_EnglishFROM temp.reviewsWHERE Score = 5AND Summary IS NOT NULLAND LENGTH(Summary) > 20LIMIT 5)SELECTId,Original_English,cluster.AI_TRANSLATE(Original_English,'english','spanish') AS spanish_translationFROM translatable_reviews;
%%sql-- Combined AI Functions: Comprehensive product analysis-- Filter to products with multiple reviews firstWITH popular_products AS (SELECT ProductIdFROM temp.reviewsGROUP BY ProductIdHAVING COUNT(*) >= 10LIMIT 5),product_reviews AS (SELECTr.ProductId,r.Text,r.Score,LEFT(r.Text, 80) as Review_SampleFROM temp.reviews rINNER JOIN popular_products p ON r.ProductId = p.ProductIdLIMIT 10)SELECTProductId,Score,Review_Sample,cluster.AI_SENTIMENT(Text) as sentiment,cluster.AI_CLASSIFY(Text, '[quality, value, taste, packaging]') as category,cluster.AI_SUMMARIZE(Text, 'aifunctions_chat_default', 10) as brief_summaryFROM product_reviews;
Cleanup
%%sqlDROP TABLE IF EXISTS reviews;DROP DATABASE IF EXISTS temp;

Last modified: February 4, 2026