AI Functions

Install AI Functions

To install AI Functions, navigate to AI > AI & ML Functions and select the deployment on which to install them. In the AI Functions tab, select Install, review the AI Functions Summary, and then select Deploy.

Once the AI Functions are installed, query them in the SQL Editor or SingleStore notebooks. SingleStore provides the following AI Functions:

Category	Function
Text Processing Functions	AI_COMPLETE(texts, model)
	AI_SENTIMENT(texts, model)
	AI_TRANSLATE(texts, source_languages, target_languages, model)
	AI_SUMMARIZE(texts, model, max_lengths)
	AI_CLASSIFY(texts, categories, model)
	AI_EXTRACT(texts, questions, model)
Embedding and Vector Functions	VECTOR_SIMILARITY(vec1, vec2, methods)
	EMBED_TEXT(texts, model)

Text Processing Functions

AI_COMPLETE

Provides batched, LLM-powered completion of every input text. Use it for general-purpose text generation, completion, and complex reasoning.

Syntax

SQL

AI_COMPLETE(texts, model)

Arguments

texts: A prompt.
model: An LLM model.

Return Type

string

Usage

Basic usage with the default model

SQL

SELECT cluster.AI_COMPLETE('Hello, how are you?') AS completion;

Basic usage with a specific model

SQL

SELECT cluster.AI_COMPLETE('Hello, how are you?', 'claude-3-5-sonnet') AS completion;

Input example on database

SQL

SELECT cluster.AI_COMPLETE(column_1) FROM table;
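When AI_COMPLETE is called from Python rather than from the SQL Editor, the prompt can be passed as a query parameter so that quotes inside it need no manual escaping. The following is a minimal sketch, not part of the AI Functions API: build_ai_complete_query is a hypothetical helper, and the %s placeholder style is an assumption about the database driver in use.

```python
from typing import Optional, Tuple


def build_ai_complete_query(prompt: str, model: Optional[str] = None) -> Tuple[str, tuple]:
    """Return (sql, params) for a single AI_COMPLETE call.

    Hypothetical helper: the prompt and model travel as parameters, so the
    driver handles quoting. The %s placeholder style is an assumption.
    """
    if model is None:
        # One-argument form: the default model is used.
        return "SELECT cluster.AI_COMPLETE(%s) AS completion", (prompt,)
    # Two-argument form: the model name is passed as the second argument.
    return "SELECT cluster.AI_COMPLETE(%s, %s) AS completion", (prompt, model)


sql, params = build_ai_complete_query("What's SingleStore?", "claude-3-5-sonnet")
print(sql)
print(params)
```

The returned pair can be handed to a DB-API style cursor.execute(sql, params) call on a driver that accepts this placeholder style.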

AI_SENTIMENT

Provides a sentiment classification and score for each input text.

Syntax

SQL

AI_SENTIMENT(texts, model)

Arguments

texts: A prompt.
model: An LLM model.

Return Type

string

Usage

Basic usage with default model

SQL

SELECT cluster.AI_SENTIMENT('Wow, that food was rotten') AS sentiment;

Basic usage with selected model

SQL

SELECT cluster.AI_SENTIMENT('Wow, that ice cream tasted amazing', 'claude-3-5-sonnet') AS sentiment;

Input example on database

SQL

SELECT cluster.AI_SENTIMENT(column_1) FROM table;

AI_TRANSLATE

Provides translation of user-provided documents from source language to target language. Supports multi-language translation.

Syntax

SQL

AI_TRANSLATE(texts, source_languages, target_languages, model)

Arguments

texts: A prompt.
source_languages: The language in which the prompt is written.
target_languages: The language to which the prompt gets translated.
model: An LLM model.

Return Type

string

Usage

Basic usage with default model

SQL

SELECT cluster.AI_TRANSLATE('Hello, how are you?', 'English', 'Spanish') AS translation;

Basic usage with selected model

SQL

SELECT cluster.AI_TRANSLATE('Hello, how are you?', 'English', 'Spanish', 'claude-3-5-sonnet') AS translation;

Input example on database

SQL

SELECT cluster.AI_TRANSLATE(column_1, "source_language", "target_language") FROM table;

AI_SUMMARIZE

Provides a summary of each user-provided document within the specified maximum length.

Syntax

SQL

AI_SUMMARIZE(texts, model, max_lengths)

Arguments

texts: A prompt.
model: An LLM model.
max_lengths: Maximum length of the summary.

Return Type

string

Usage

Basic usage with default model and length

SQL

SELECT cluster.AI_SUMMARIZE('That movie was amazing ... I would definitely go again.') AS summary;

Basic usage with selected model and length

SQL

SELECT cluster.AI_SUMMARIZE('Hello, how are you?', 'claude-3-5-sonnet', 3) AS summary;

Input example on database

SQL

SELECT cluster.AI_SUMMARIZE(column_1) FROM table;

AI_CLASSIFY

Provides classification of each input text into one of the given categories or labels.

Syntax

SQL

AI_CLASSIFY(texts, categories, model)

Arguments

texts: A prompt.
categories: Categories for classification.
model: An LLM model.

Return Type

string

Usage

Basic usage with default model

SQL

SELECT cluster.AI_CLASSIFY('Hello, how are you?', '[greeting, goodbye]') AS classification;

Basic usage with selected model

SQL

SELECT cluster.AI_CLASSIFY('Hello, how are you?', '[greeting, goodbye]', 'claude-3-5-sonnet') AS classification;

Input example on database

SQL

SELECT cluster.AI_CLASSIFY(column_1, categories) FROM table;
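The usage examples pass categories as a single string such as '[greeting, goodbye]'. When the labels live in application code, a small formatter keeps that string consistent. This is a sketch only: format_categories is a hypothetical helper, and the bracketed, comma-separated layout is inferred from the examples above rather than from a documented grammar.

```python
from typing import Iterable


def format_categories(categories: Iterable[str]) -> str:
    """Join labels into the bracketed form used in the usage examples.

    Assumption: the expected layout is '[label1, label2, ...]', as shown in
    the examples; the exact grammar is not stated in this page.
    """
    # Drop surrounding whitespace and skip empty labels.
    labels = [c.strip() for c in categories if c.strip()]
    if not labels:
        raise ValueError("at least one category is required")
    for label in labels:
        # Brackets and commas would make the list ambiguous.
        if any(ch in label for ch in "[],"):
            raise ValueError(f"label {label!r} contains a reserved character")
    return "[" + ", ".join(labels) + "]"


print(format_categories(["greeting", "goodbye"]))  # prints [greeting, goodbye]
```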

AI_EXTRACT

Extracts information from a block of text based on the specified natural language question.

Syntax

SQL

AI_EXTRACT(texts, questions, model)

Arguments

texts: A prompt.
questions: The natural language question that the LLM answers using the input text.
model: An LLM model.

Return Type

string

Usage

Basic usage with default model

SQL

SELECT cluster.AI_EXTRACT('Hello, how are you?', 'What is the first word') AS answer;

Basic usage with selected model

SQL

SELECT cluster.AI_EXTRACT('Hello, how are you?', 'What is the first word', 'claude-3-5-sonnet') AS answer;

Input example on database

SQL

SELECT cluster.AI_EXTRACT(column_1, question) FROM table;

Embedding and Vector Functions

VECTOR_SIMILARITY

Returns the similarity score for each pair of input vectors, computed with the specified vector similarity method.

Note

VECTOR_SIMILARITY is a preview feature and is not intended for production use. For production vector search, SingleStore provides the DOT_PRODUCT and EUCLIDEAN_DISTANCE functions and Approximate Nearest Neighbor (ANN) indexes. Refer to Working with Vector Data for an overview of SingleStore's vector processing.

Syntax

SQL

VECTOR_SIMILARITY(vec1, vec2, methods)

Arguments

vec1: First vector input with type bytes.
vec2: Second vector input with type bytes.
methods: A SingleStore vector similarity method.

Return Type

float32

Usage

Basic usage

SQL

SELECT cluster.VECTOR_SIMILARITY(JSON_ARRAY_PACK('[2,3,4]'):>VECTOR(3), JSON_ARRAY_PACK('[2,3,4]'):>VECTOR(3), 'cosine') AS similarity;

Input example on database

SQL

SELECT cluster.VECTOR_SIMILARITY(column_1, column_2) FROM table;
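To sanity-check a result from the 'cosine' method, the same score can be computed locally. The sketch below assumes that 'cosine' refers to standard cosine similarity (the dot product divided by the product of the vector norms); this page does not define the methods, so treat the comparison as illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Standard cosine similarity: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The ratio is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same vectors as the basic usage example: identical inputs score close to 1.
print(cosine_similarity([2, 3, 4], [2, 3, 4]))
```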

EMBED_TEXT

Provides batched embeddings of all input text. Converts text into high-dimensional vector embeddings for semantic search and RAG applications.

Syntax

SQL

EMBED_TEXT(texts, model)

Arguments

texts: A prompt.
model: An embedding model.

Return Type

bytes

Usage

Basic usage with default model

SQL

SELECT cluster.EMBED_TEXT('Hello, how are you?') AS embedding;

Basic usage with selected embedding model

SQL

SELECT cluster.EMBED_TEXT('Hello, how are you?', 'openai-text-embed-large') AS embedding;

Input example on database

SQL

SELECT cluster.EMBED_TEXT(column_1) FROM table;
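Because EMBED_TEXT returns bytes, a client that fetches the value may want to decode it into a list of floats. The following sketch assumes that the bytes are packed little-endian 32-bit floats, which is the layout JSON_ARRAY_PACK produces; this page does not state the layout of EMBED_TEXT output, so verify it before relying on this. unpack_float32 is a hypothetical helper name.

```python
import struct
from typing import List


def unpack_float32(blob: bytes) -> List[float]:
    """Decode a blob of packed little-endian float32 values.

    Assumption: the embedding uses the same layout as JSON_ARRAY_PACK.
    """
    if len(blob) % 4 != 0:
        raise ValueError("blob length must be a multiple of 4 bytes")
    count = len(blob) // 4
    # '<' selects little-endian byte order, 'f' a 32-bit float.
    return list(struct.unpack(f"<{count}f", blob))


# Round trip with locally packed data, standing in for a fetched embedding.
sample = struct.pack("<3f", 2.0, 3.0, 4.0)
print(unpack_float32(sample))
```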

Example Notebook

The following notebook demonstrates AI Functions:

demonstrate-some-common-ai-function-usecases.ipynb

SingleStore Notebooks

Demonstrate some common AI function usecases

Note

You can use your existing Standard or Premium workspace with this Notebook.

This feature is currently in Private Preview. Please reach out to support@singlestore.com to confirm if this feature can be enabled in your org.

This Jupyter notebook will help you:

Load the Amazon Fine Foods Reviews dataset from Kaggle
Store the data in SingleStore
Demonstrate powerful AI Functions for text processing and analysis

Prerequisites: Ensure AI Functions are installed on your deployment (AI Services > AI & ML Functions).

Create some simple tables

This setup creates a single table to store product reviews. Ensure you have selected a database and have the permissions required to create and drop tables.

SQL

%%sql
CREATE DATABASE IF NOT EXISTS temp;
USE temp;

SQL

%%sql
DROP TABLE IF EXISTS reviews;

CREATE TABLE IF NOT EXISTS reviews (
    Id INT PRIMARY KEY,
    ProductId VARCHAR(20),
    UserId VARCHAR(50),
    ProfileName VARCHAR(255),
    HelpfulnessNumerator INT,
    HelpfulnessDenominator INT,
    Score INT,
    Time BIGINT,
    Summary TEXT,
    Text TEXT
);

Install the required packages

Python

!pip install -q httplib2 kagglehub pandas

Download and Load Dataset

Python

import kagglehub
import pandas as pd

# Download the Amazon Fine Foods Reviews dataset from Kaggle
print("Downloading dataset from Kaggle...")
path = kagglehub.dataset_download("snap/amazon-fine-food-reviews")
print(f"Dataset downloaded to: {path}")

# Read the CSV file
df = pd.read_csv(f"{path}/Reviews.csv")

# Display dataset info
print(f"\nDataset shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print("\nFirst few rows:")
df.head()

Load Data into SingleStore

Python

import singlestoredb as s2

# Create SQLAlchemy engine instead of regular connection
engine = s2.create_engine(database='temp')

# Take a sample of 10,000 reviews for demo purposes
sample_df = df.head(10000).copy()

print(f"Loading {len(sample_df)} reviews into SingleStore...")

# Write dataframe to SingleStore table using SQLAlchemy engine
sample_df.to_sql(
    'reviews',
    con=engine,  # Use engine instead of connection
    if_exists='append',
    index=False,
    chunksize=1000
)

print("Data loaded successfully!")

Verify Data Load

SQL

%%sql
-- Check the number of reviews loaded
SELECT COUNT(*) as total_reviews FROM reviews;

Sample Data Preview

SQL

%%sql
-- View sample reviews
SELECT Id, ProductId, Score, Summary, LEFT(Text, 100) as Review_Preview
FROM reviews
LIMIT 10;

AI Functions Demonstrations

Now let's explore the power of SingleStore AI Functions for text analysis and processing. Ensure that AI Functions are enabled for your organization and that you can list the available AI Functions:

SQL

%%sql
SHOW functions in cluster;

SQL

%%sql
-- AI_COMPLETE: Ask general questions and get LLM-powered completions
SELECT cluster.AI_COMPLETE(
    'What is SingleStore?'
) AS completion;

SQL

%%sql
-- AI_SENTIMENT: Analyze sentiment of customer reviews for a specific product
-- WHERE ProductId = <Your choice>
-- Remember to specify the database name. In this example, 'temp' is the database name.
SELECT
    Id,
    ProductId,
    Score,
    LEFT(Text, 80) as Review_Snippet,
    cluster.AI_SENTIMENT(Text) AS sentiment
FROM temp.reviews
WHERE ProductId = 'B000NY8ODS'
LIMIT 10;

SQL

%%sql
-- Aggregate sentiment analysis across products
-- Using CTE to filter and prepare data first
WITH filtered_reviews AS (
    SELECT
        ProductId,
        Text
    FROM temp.reviews
    WHERE ProductId IN (
        SELECT ProductId
        FROM temp.reviews
        GROUP BY ProductId
        HAVING COUNT(*) >= 5
    )
    LIMIT 100
),
grouped_reviews AS (
    SELECT
        ProductId,
        COUNT(*) as review_count,
        GROUP_CONCAT(Text SEPARATOR '. ') as combined_text
    FROM filtered_reviews
    GROUP BY ProductId
    LIMIT 5
)
SELECT
    ProductId,
    review_count,
    cluster.AI_SENTIMENT(combined_text) as overall_sentiment
FROM grouped_reviews;

SQL

%%sql
-- AI_SUMMARIZE: Create concise summaries of lengthy reviews
-- Filter long reviews first using CTE
WITH long_reviews AS (
    SELECT
        Id,
        ProductId,
        Text,
        LEFT(Text, 150) as Original_Review
    FROM temp.reviews
    WHERE LENGTH(Text) > 200
    LIMIT 5
)
SELECT
    Id,
    ProductId,
    Original_Review,
    cluster.AI_SUMMARIZE(
        Text,
        'aifunctions_chat_default',
        15
    ) AS summary
FROM long_reviews;

SQL

%%sql
-- AI_CLASSIFY: Classify customer feedback into categories
-- Filter negative reviews first using CTE
WITH negative_reviews AS (
    SELECT
        Id,
        ProductId,
        Text,
        LEFT(Text, 100) as Review_Text
    FROM temp.reviews
    WHERE Score <= 3
    LIMIT 10
)
SELECT
    Id,
    ProductId,
    Review_Text,
    cluster.AI_CLASSIFY(
        Text,
        '[quality, price, shipping, taste]'
    ) AS classification
FROM negative_reviews;

SQL

%%sql
-- AI_EXTRACT: Extract specific information from reviews
-- Filter positive reviews first using CTE
WITH positive_reviews AS (
    SELECT
        Id,
        ProductId,
        Text,
        LEFT(Text, 100) as Review_Text
    FROM temp.reviews
    WHERE Score >= 4
    LIMIT 10
)
SELECT
    Id,
    ProductId,
    Review_Text,
    cluster.AI_EXTRACT(
        Text,
        'Does this customer indicate they will buy this product again? Answer with yes, no, or unclear only'
    ) AS repeat_purchase_intent
FROM positive_reviews;

SQL

%%sql
-- AI_EXTRACT: Identify reviews with high churn risk
-- Filter low-rated reviews first using CTE
WITH low_rated_reviews AS (
    SELECT
        Id,
        ProductId,
        Score,
        Text,
        LEFT(Text, 120) as Review_Text
    FROM temp.reviews
    WHERE Score <= 2
    LIMIT 10
)
SELECT
    Id,
    ProductId,
    Score,
    Review_Text,
    cluster.AI_EXTRACT(
        Text,
        'Is this customer at high risk of not purchasing again? Answer with high, medium, or low only'
    ) AS churn_risk
FROM low_rated_reviews;

SQL

%%sql
-- AI_TRANSLATE: Translate text between languages
-- Filter reviews with substantial summaries first using CTE
WITH translatable_reviews AS (
    SELECT
        Id,
        Summary as Original_English
    FROM temp.reviews
    WHERE Score = 5
    AND Summary IS NOT NULL
    AND LENGTH(Summary) > 20
    LIMIT 5
)
SELECT
    Id,
    Original_English,
    cluster.AI_TRANSLATE(
        Original_English,
        'english',
        'spanish'
    ) AS spanish_translation
FROM translatable_reviews;

SQL

%%sql
-- Combined AI Functions: Comprehensive product analysis
-- Filter to products with multiple reviews first
WITH popular_products AS (
    SELECT ProductId
    FROM temp.reviews
    GROUP BY ProductId
    HAVING COUNT(*) >= 10
    LIMIT 5
),
product_reviews AS (
    SELECT
        r.ProductId,
        r.Text,
        r.Score,
        LEFT(r.Text, 80) as Review_Sample
    FROM temp.reviews r
    INNER JOIN popular_products p ON r.ProductId = p.ProductId
    LIMIT 10
)
SELECT
    ProductId,
    Score,
    Review_Sample,
    cluster.AI_SENTIMENT(Text) as sentiment,
    cluster.AI_CLASSIFY(Text, '[quality, value, taste, packaging]') as category,
    cluster.AI_SUMMARIZE(Text, 'aifunctions_chat_default', 10) as brief_summary
FROM product_reviews;

Cleanup

SQL

%%sql
DROP TABLE IF EXISTS reviews;
DROP DATABASE IF EXISTS temp;

Usage Recommendations for AI Functions

To optimize performance and control costs when using AI Functions, SingleStore recommends the following:

Use Common Table Expressions (CTEs) to filter rows before making calls to large language models (LLMs). The query engine currently sends data to the LLM before applying LIMIT or WHERE filters.
LLM calls are expensive. Begin with a small dataset to evaluate response quality and verify the results meet the requirements before scaling up.
Enterprise plans support three model providers: Aura, Amazon Bedrock, and Azure AI Services. Data is processed according to each provider’s policies. If a row violates a provider's rules, the system fails the batch that includes the row and returns errors for the rows in that batch.
Strict usage quotas apply per model and per organization. These quotas are not configurable by end users. For higher usage limits, contact SingleStore Support. Self-service quota configuration will be available in the future.
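
The first recommendation can be sketched as a small query builder that places the WHERE and LIMIT clauses inside a CTE, so that only the surviving rows reach the AI Function. This mirrors the pattern used in the example notebook. prefiltered_ai_query and the CTE name filtered are hypothetical, and the identifiers are interpolated directly into the SQL text, so pass only trusted values.

```python
def prefiltered_ai_query(function: str, column: str, table: str,
                         where: str, limit: int) -> str:
    """Build a query that filters rows in a CTE before calling an AI Function.

    Hypothetical helper. Identifiers and the WHERE clause are inserted as-is,
    so they must come from trusted code, never from user input.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "WITH filtered AS (\n"
        f"    SELECT {column}\n"
        f"    FROM {table}\n"
        f"    WHERE {where}\n"
        f"    LIMIT {int(limit)}\n"
        ")\n"
        # The AI Function runs only on rows that survived the CTE.
        f"SELECT {column}, cluster.{function}({column}) AS result\n"
        "FROM filtered;"
    )


query = prefiltered_ai_query("AI_SENTIMENT", "Text", "temp.reviews", "Score <= 2", 10)
print(query)
```

Starting with a small limit also follows the second recommendation: response quality can be reviewed on a few rows before the limit is raised.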
