# ML Functions

## Overview

Machine learning (ML) functions enable trained models to be run directly within SQL queries. They support real-time classification of new data and detection of anomalies without requiring custom code. These functions allow predictions to be embedded directly into workflows that operationalize insights at the data layer.

## Introduction to Machine Learning

Machine learning is a field of study in artificial intelligence that develops and applies methods for learning patterns from historical data and using those patterns to make predictions or decisions on new data. Unlike rule-based systems, ML models adapt automatically based on the data. Common use cases include fraud detection, predictive maintenance, and customer behavior analysis. Once trained, models can be deployed and invoked using ML functions to generate predictions at scale.

## Classification

Classification is a supervised learning technique that assigns each input to one of a predefined set of classes or labels. Models are trained on labeled datasets, where each input is paired with its correct output, and then used to classify new data. For example, a model may predict whether an incoming email is "spam" or "not spam" based on its content and metadata.

The following are the types of classification:

* **Binary Classification**: Predicts one of two possible outcomes (for example, fraud vs. non-fraud).
* **Multiclass Classification**: Predicts one label from multiple possible categories (for example, product type A, B, or C).
* **Multilabel Classification**: Assigns multiple labels to a single data point (for example, tagging an image with "beach" and "sunset").

Examples:

* A credit-card transaction classified as “fraudulent” or “legitimate.”
* Customer support tickets categorized as “billing,” “technical issue,” or “account upgrade.”

`ML_CLASSIFY` is a supervised machine learning function for classification tasks. It supports both binary (two classes) and multi-class (more than two classes) classification, and it leverages algorithms such as logistic regression, random forest, and gradient boosting. Call `ML_CLASSIFY` from a SQL query to return predicted class labels.

## Anomaly Detection

Anomaly detection identifies data points that deviate significantly from expected patterns. An anomaly is any value or pattern that does not match normal behavior, and it can signal critical issues such as fraud, equipment failure, or network intrusions. Anomalies can indicate the following:

* Performance issues (for example, server overload)
* System faults (for example, failed jobs or memory leaks)
* Opportunities (for example, traffic spikes caused by a marketing campaign)

For example, if a cluster's CPU usage normally stays between 20–60% and suddenly rises to 95%, the spike is an anomaly.
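
The idea of "deviating from normal behavior" can be sketched with a simple statistical rule. The query below is an illustration only and does not describe how `ML_ANOMALY_DETECT` works internally. It assumes a hypothetical table `cpu_metrics(ts, cpu_pct)` and flags rows whose value lies more than three standard deviations from the overall mean; the threshold of 3 is a common convention, not a value taken from this documentation.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE IF NOT EXISTS cpu_metrics (
    ts      DATETIME NOT NULL,
    cpu_pct DOUBLE   NOT NULL
);

-- Compute the overall mean and sample standard deviation once,
-- then keep only the rows that lie more than 3 standard deviations away.
WITH stats AS (
    SELECT AVG(cpu_pct)         AS mean_cpu,
           STDDEV_SAMP(cpu_pct) AS sd_cpu
    FROM cpu_metrics
)
SELECT m.ts,
       m.cpu_pct,
       -- NULLIF avoids division by zero when every value is identical.
       (m.cpu_pct - s.mean_cpu) / NULLIF(s.sd_cpu, 0) AS z_score
FROM cpu_metrics AS m
CROSS JOIN stats AS s
WHERE ABS(m.cpu_pct - s.mean_cpu) > 3 * s.sd_cpu
ORDER BY m.ts;
```

A global rule like this ignores when a value occurred, which is the gap that time-series methods address.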

## Time-Series Anomaly Detection

Time-series anomaly detection analyzes data collected over time, for example, CPU or memory utilization per minute or hour. It considers not only individual values but also the sequence and patterns in the data. It learns seasonal patterns (daily or weekly), long-term trends, and normal variability ranges, and flags values that deviate from these learned patterns.
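
As a rough sketch of comparing each value with its recent history, the query below scores every row against a rolling baseline built from the preceding 60 rows. It is a simplification: it does not model seasonality or trends, and it is not the method used by `ML_ANOMALY_DETECT`. The table `cpu_metrics(ts, cpu_pct)`, the window of 60 rows, and the threshold of 3 standard deviations are all assumptions made for illustration.

```sql
-- Hypothetical table: cpu_metrics(ts DATETIME, cpu_pct DOUBLE), one row per minute.
SELECT scored.ts,
       scored.cpu_pct,
       scored.rolling_mean,
       scored.rolling_sd
FROM (
    SELECT ts,
           cpu_pct,
           -- Baseline from the 60 rows before the current one (current row excluded).
           AVG(cpu_pct) OVER (
               ORDER BY ts ROWS BETWEEN 60 PRECEDING AND 1 PRECEDING
           ) AS rolling_mean,
           STDDEV_SAMP(cpu_pct) OVER (
               ORDER BY ts ROWS BETWEEN 60 PRECEDING AND 1 PRECEDING
           ) AS rolling_sd
    FROM cpu_metrics
) AS scored
-- Skip rows without enough history (NULL or zero spread), then apply the threshold.
WHERE scored.rolling_sd > 0
  AND ABS(scored.cpu_pct - scored.rolling_mean) > 3 * scored.rolling_sd
ORDER BY scored.ts;
```

Window functions cannot be referenced in a `WHERE` clause directly, which is why the scoring happens in a subquery and the filter is applied outside it.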

The following are the types of time-series anomaly detection:

* **Supervised**: Supervised models use labeled anomalies to learn failure patterns.
* **Unsupervised**: Unsupervised models learn normal behavior from historical data and flag deviations without labels.

The `ML_ANOMALY_DETECT` function currently performs unsupervised time-series anomaly detection. It supports statistical methods (e.g., z-score, interquartile range) and machine learning methods (e.g., Isolation Forest, One-Class SVM). The function returns a prediction for each row, identifying it as normal or anomalous; these predictions can trigger alerts or be recorded for further analysis.

## Install ML Functions

To install ML Functions, navigate to **AI > AI & ML Functions** and select the deployment on which to install ML Functions. In the **ML Functions** tab, select **Install**, review the **ML Functions Summary**, and then select **Deploy**.

Once the ML Functions are installed, query them in the SQL Editor or SingleStore Notebooks. SingleStore provides the following ML Functions:

| **Category**                         | **Function**                                                                                                           |
| ------------------------------------ | ---------------------------------------------------------------------------------------------------------------------- |
| Statistical and Predictive Functions | `ML_CLASSIFY(model_name, TO_JSON(selected_data.*))`<br>`ML_ANOMALY_DETECT(model_name, TO_JSON(selected_data.*))` |

## Statistical and Predictive Functions

## ML\_CLASSIFY

Performs binary and multi-class classification on a dataset using standard machine learning algorithms. Supports common algorithms including:

* Logistic Regression
* Random Forest
* Gradient Boosting

## Syntax

```sql
ML_CLASSIFY(model_name, TO_JSON(selected_data.*))
```

## Arguments

* `model_name`: Name of the trained ML model to use.
* `selected_data`: A row or set of rows selected for prediction.

## Return Type

`string`

## Usage

| Basic usage                     | `SELECT cluster.ML_CLASSIFY(model_name, TO_JSON(selected_data.*)) AS predictions FROM (SELECT * FROM table) AS selected_data;` |
| ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| Basic usage with `LIMIT`        | `SELECT cluster.ML_CLASSIFY(model_name, TO_JSON(selected_data.*)) AS predictions FROM (SELECT * FROM table WHERE column1 > 100000 LIMIT 100) AS selected_data;` |
| Insert predictions into a table | `INSERT INTO predictions_table (id, prediction) SELECT selected_data.id, cluster.ML_CLASSIFY(model_name, TO_JSON(selected_data.*)) AS prediction FROM (SELECT * FROM table LIMIT 100) AS selected_data;` |
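
The patterns above can be combined into a single filled-in statement. The following is a sketch under stated assumptions, not a verified reference call: the table names `transactions` and `fraud_predictions`, the column `amount`, and the model name `fraud_model` are hypothetical, and passing the model name as a string literal is an assumption. The `cluster.` prefix is copied from the usage patterns in this section.

```sql
-- Hypothetical objects: transactions (source), fraud_predictions (destination),
-- and a trained model named 'fraud_model'. Adjust all names to your deployment.
INSERT INTO fraud_predictions (id, prediction)
SELECT selected_data.id,
       -- Each selected row is serialized to JSON and passed to the model.
       cluster.ML_CLASSIFY('fraud_model', TO_JSON(selected_data.*)) AS prediction
FROM (
    SELECT *
    FROM transactions
    WHERE amount > 100000
    LIMIT 100
) AS selected_data;
```

Note that `INSERT INTO ... SELECT` is one statement, so there is no semicolon between the column list and the `SELECT`.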

## ML\_ANOMALY\_DETECT

Detects outliers and anomalies in datasets using statistical or machine learning-based methods. Suitable for security, monitoring, and anomaly detection applications. Supports the following methods:

* **Statistical**: z-score, interquartile range (IQR)
* **ML-based**: Isolation Forest, One-Class SVM

## Syntax

```sql
ML_ANOMALY_DETECT(model_name, TO_JSON(selected_data.*))
```

## Arguments

* `model_name`: Name of the trained ML model to use.
* `selected_data`: A row or set of rows selected for prediction.

## Return Type

`string`

## Usage

| Basic usage                     | `SELECT cluster.ML_ANOMALY_DETECT(model_name, TO_JSON(selected_data.*)) AS predictions FROM (SELECT * FROM table) AS selected_data;` |
| ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| Basic usage with `LIMIT`        | `SELECT cluster.ML_ANOMALY_DETECT(model_name, TO_JSON(selected_data.*)) AS predictions FROM (SELECT * FROM table WHERE column1 > 100000 LIMIT 100) AS selected_data;` |
| Insert predictions into a table | `INSERT INTO predictions_table (id, prediction) SELECT selected_data.id, cluster.ML_ANOMALY_DETECT(model_name, TO_JSON(selected_data.*)) AS prediction FROM (SELECT * FROM table LIMIT 100) AS selected_data;` |
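
Because the function returns one string per row, a quick way to inspect the result is to count rows per returned value. This is a minimal sketch: the table `cpu_metrics` and the model name `cpu_anomaly_model` are hypothetical, passing the model name as a string literal is an assumption, and the query deliberately makes no assumption about the exact label strings the function returns.

```sql
-- Hypothetical objects: cpu_metrics (source table) and a trained model
-- named 'cpu_anomaly_model'. Adjust both names to your deployment.
SELECT p.prediction,
       COUNT(*) AS row_count
FROM (
    SELECT cluster.ML_ANOMALY_DETECT('cpu_anomaly_model', TO_JSON(selected_data.*)) AS prediction
    FROM (SELECT * FROM cpu_metrics) AS selected_data
) AS p
-- Group by whatever values the function returns rather than hard-coding labels.
GROUP BY p.prediction;
```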

## Train a New ML Model

To train a new ML model, follow these steps:

1. Navigate to **AI > Models**.

2. Select the **ML Models** tab and then select **Train New ML Model**.

3. In the **Select Function** dialog, select one of the following ML functions:

   * `ML_CLASSIFY`
   * `ML_ANOMALY_DETECT`

   Select **Next** to configure the model.

**Configure Model**

| **Model Name**           | Enter the name of the ML model.                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Training Description** | Enter the training description.                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| **Workspace**            | Select the SingleStore deployment (workspace) the notebook connects to. Specifying a workspace allows natively connecting the SingleStore databases referenced in the notebook. |
| **Compute Size**         | Select one of the following compute sizes:<ul> <li>Small</li> <li>Medium</li> <li>GPU-T4</li> </ul>                                                                                                                                                                                                                                                                                                                                                                                                           |
| **Run as**               | Run the notebook for training a model with or without personal credentials. Select one of the following:<ul> <li><strong>Run as &lt;username&gt;</strong>: Runs the notebook using the permissions and access of the current user account.</li> <li><strong>Run as a Service Account</strong>: Runs the notebook independently of personal credentials, using a service account. <blockquote> <p><strong>📝 Note</strong>: Service accounts can only be created by Admin.</p> </blockquote> </li> </ul> |

Select **Next**.

**Select Training Data**

| **Database**               | Select the database that contains the training data.                             |
| -------------------------- | -------------------------------------------------------------------------------- |
| **Table**                  | Select the table from the selected database to train the machine learning model. |
| **Target Column**          | Select the column that represents the prediction target for the model.           |
| **Feature Selection Mode** | Specify how feature columns are selected.                                        |
| **Feature Column**         | Select one or more columns to be used as input features for training the model.  |

Preview the data and select **Next**.

Review the **Summary** and generated Fusion SQL syntax in the **Generated SQL Script**. The generated script performs the following:

* Creates and trains an ML model
* Uses data from the selected table in the selected database
* Predicts values of the target column
* Runs on the selected compute instance
* Uses all available features by default

The following is the syntax of the Fusion SQL script:

```sql
%s2ml train <machine_learning_algorithm>
    --model <model_name>
    --db <database_name>
    --input_table <table_name>
    --target_column <target_column>
    --description <training_description>
    --runtime <compute_instance>
    --selected_features { \"mode\": <feature_selection_mode>, \"features\": <feature_column> }
```

Select **Start Training** to train the ML model.

## Example Notebooks

The following notebooks demonstrate how to use ML Functions:

* ML Functions: Classification
* ML Functions: Anomaly Detection

## Manage an Existing ML Model

Existing ML models can be managed by performing the following actions:

* View details
* Run prediction
* Share
* Delete

## View Details of an Existing ML Model

To view details of an existing ML model, select the ellipsis under the **Actions** column of the trained ML model, and select **View Details**. Alternatively, select the ML model in the **Name** column. Select the **Details** tab to view training status, training configuration, training logs, and details about how to use the ML model.

## Run Prediction on an Existing ML Model

Run batch prediction on the existing ML model.

## Run a Batch Prediction

To run a batch prediction on the existing ML model, select the ellipsis under the **Actions** column of the trained ML model, and select **Run Prediction**.

**Select Prediction Data**

| **Database**         | Select the database.                                                           |
| -------------------- | ------------------------------------------------------------------------------ |
| **Target Table**     | Select the target table on which the prediction will be run.                   |
| **Target Column**    | Select the target column on which the prediction will focus.                     |
| **Timestamp Column** | Select the column that contains timestamp data. Available for `ML_ANOMALY_DETECT` only. |

Preview the data and select **Next**.

**Configure Destination**

| **Prediction Interval Width** | Select the interval width of prediction. Available for `ML_ANOMALY_DETECT` only. |
| ----------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Destination Table Name**    | Select the destination table in which the prediction results will be stored.                                                                                                                                                                                                                                                                                                                                                                                                                               |
| **Destination Column**        | Select the destination column in which the prediction data will be saved.                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| **Run as**                    | Run the notebook with or without personal credentials. Select one of the following:<ul> <li><strong>Run as &lt;username&gt;</strong>: Runs the notebook using the permissions and access of the current user account.</li> <li><strong>Run as a Service Account</strong>: Runs the notebook independently of personal credentials, using a service account. <blockquote> <p><strong>📝 Note</strong>: Service accounts can only be created by Admin.</p> </blockquote> </li> </ul> |

Review the **Summary** and generated Fusion SQL syntax in the **Generated SQL Script**. Select **Start Prediction** to run batch prediction on the trained ML model.

## View Predictions of an Existing ML Model

To view the predictions of the trained ML model, select the ML model in the **Name** column. Select the **Predictions** tab to view prediction metadata and status.

## Share an Existing ML Model

To share an existing ML model, select the ellipsis under the **Actions** column of the trained ML model, and select **Share**.

## Delete an Existing ML Model

To delete an existing ML model, select the ellipsis under the **Actions** column of the trained ML model, and select **Delete**.

## Status of ML Models

| **Status**     | **Description**                                                                               |
| -------------- | --------------------------------------------------------------------------------------------- |
| Pre-processing | The system is preparing data for ML model training (e.g., data cleaning, feature extraction). |
| Training       | The ML model is being trained, and results are not yet available.                             |
| Done           | The ML model has been successfully trained and is ready for use.                              |
| Error          | The ML model training or processing failed due to an error.                                   |

## In this section

* [ML Functions Release Notes](https://docs.singlestore.com/cloud/ai/ai-ml-functions/ml-functions/ml-functions-release-notes.md)

***

Modified at: May 22, 2026

Source: [/cloud/ai/ai-ml-functions/ml-functions/](https://docs.singlestore.com/cloud/ai/ai-ml-functions/ml-functions/)

