Python UDFs

Note

This is a Preview feature.

A Python user-defined function (UDF) is an external function that executes Python code outside of the SingleStore engine's process. It lets you extend SingleStore with custom Python logic written in a SingleStore Notebook. Python UDFs are especially useful when you need to integrate with AI applications or machine learning (ML) models, perform vector operations with libraries such as NumPy, Pandas, or Polars, or call external APIs.

Prerequisites

To enable Python UDFs in your SingleStore deployment, ensure the following:

  • SingleStore Version: SingleStore version 8.9 or later.

  • Environment: The SingleStore deployment must run in an AWS EKS IRSA-supported environment.

  • Engine Global Variable: enable_managed_functions must be enabled.

    SET GLOBAL enable_managed_functions = 1;

    Once this engine variable is enabled, you can create and deploy Python UDFs and TVFs.

Publish a Python UDF

Create a Python UDF

Python UDFs can be created only from a shared notebook. To create a new Python UDF, perform the following steps:

  1. In the left navigation, select Query Editor > Shared.

  2. Select Publish (on the top right).

New Python UDF

After selecting Publish, a new dialog box appears.

Publish Settings

Publish as

Select Python UDF.

Name

Enter a name for the Python UDF.

Description

Enter the Python UDF description.

Notebook

Select a shared notebook to publish as a Python UDF. The shared notebook is pre-selected when the Python UDF is published through Notebooks.

Deployment

Select the SingleStore deployment (workspace) your notebook will connect to.

Selecting a workspace allows you to connect to your SingleStore databases referenced in the Notebook natively.

Runtime

Select one of the following runtimes:

  • Small

  • Medium

  • GPU-T4

Idle Timeout

Select an idle timeout.

Select Next.

Select Publish to publish the notebook as a Python UDF. Once the Python UDF is published, you can call the function from the SQL Editor.

Manage an Existing Python UDF

You can view, update and delete Python UDFs in the Cloud Portal by navigating to Container Apps > Python UDFs.

View an Existing Python UDF

To view an existing Python UDF, select the Python UDF from the Name column. You can also update, delete, and view live logs of the selected Python UDF using the options on the right side.

View Live Logs

To view live logs of the selected Python UDF, select View Live Logs from the ellipsis on the right side. A new window appears, where you can view the Timestamp and the message in the Body column. You can also view Log JSON by selecting the eye icon.

Update an Existing Python UDF

To update an existing Python UDF, select the ellipsis in the Actions column of the Python UDF, and select Update.

Delete an Existing Python UDF

To delete an existing Python UDF, select the ellipsis in the Actions column of the Python UDF, and select Delete.

Defining Python UDFs

Each Python UDF must meet these requirements:

  1. The function's parameters and return types must be annotated.

  2. The function must be wrapped with the @udf decorator, which is located in singlestoredb.functions.

The @udf decorator is a critical component, as it automatically analyzes the type annotations to map Python data types to SingleStore data types. It then uses this mapping to generate the necessary CREATE EXTERNAL FUNCTION statement in the SingleStore database, ensuring a reliable connection between your Python code and your SQL queries. Refer to Equivalent Data Types for related information.
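The idea of deriving SQL types from annotations can be sketched in plain Python. This is only an illustration and not the singlestoredb implementation: the helper sketch_create_statement and the SQL_TYPES table are hypothetical, and the mapping is limited to the types that appear in the examples on this page (float to DOUBLE, int to BIGINT, str to TEXT).

```python
import inspect
from typing import get_type_hints

# Assumed mapping, taken from the SQL types shown in this page's examples.
SQL_TYPES = {int: "BIGINT", float: "DOUBLE", str: "TEXT"}


def sketch_create_statement(func, url="http://<svchost>/<endpoint>/invoke"):
    """Hypothetical helper: build a CREATE EXTERNAL FUNCTION statement
    from a function's type annotations."""
    hints = get_type_hints(func)
    if "return" not in hints:
        raise TypeError(f"{func.__name__}: return type must be annotated")
    params = []
    for name in inspect.signature(func).parameters:
        if name not in hints:
            raise TypeError(f"{func.__name__}: parameter '{name}' must be annotated")
        params.append(f"{name} {SQL_TYPES[hints[name]]} NOT NULL")
    return (
        f"CREATE EXTERNAL FUNCTION {func.__name__}({', '.join(params)})\n"
        f"RETURNS {SQL_TYPES[hints['return']]} NOT NULL\n"
        f'AS REMOTE SERVICE "{url}" FORMAT ROWDAT_1;'
    )


def multiply(x: float, y: float) -> float:
    return x * y


print(sketch_create_statement(multiply))
```

The sketch also shows why both requirements matter: a parameter or return value without an annotation gives the mapping step nothing to work with, so the helper rejects it.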

There are two main types of Python UDFs, defined by the type annotations you use:

  • Scalar

  • Vectorized

Scalar Python UDFs

Scalar Python UDFs are defined with standard Python type annotations, such as int, float, or str. When called from the database, the Python UDF server receives a batch of rows, but the Python UDF itself is invoked once for each individual row of data. This is useful for complex logic that operates on individual records.

The following example demonstrates a scalar Python UDF:

from singlestoredb.functions import udf
import singlestoredb.apps as apps
@udf
async def multiply(x: float, y: float) -> float:
    return x * y
# Start Python UDF server
connection_info = await apps.run_udf_app()
print("UDF server running. Connection info:", connection_info)

This creates the following external function:

CREATE EXTERNAL FUNCTION multiply(x DOUBLE NOT NULL, y DOUBLE NOT NULL)
RETURNS DOUBLE NOT NULL
AS REMOTE SERVICE "http://<svchost>/<endpoint>/invoke" FORMAT ROWDAT_1;

Use async def for improved cancellation handling.
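To illustrate why a coroutine is easier to cancel, the following sketch uses only the standard asyncio module. It does not show how the Python UDF server cancels work, which this page does not describe; it demonstrates the general asyncio behavior that a task can be cancelled at an await point. The names slow_multiply and process_batch are hypothetical.

```python
import asyncio


async def slow_multiply(x: float, y: float) -> float:
    # Each await is a point where a pending cancellation can be delivered.
    await asyncio.sleep(0.01)
    return x * y


async def process_batch(rows, results):
    # Invoke the function once per row, as a scalar function would be.
    for x, y in rows:
        results.append(await slow_multiply(x, y))


async def main():
    rows = [(float(i), 2.0) for i in range(1000)]
    results = []
    task = asyncio.create_task(process_batch(rows, results))
    await asyncio.sleep(0.05)  # let a few rows complete
    task.cancel()              # stand-in for an aborted query
    try:
        await task
    except asyncio.CancelledError:
        pass
    return len(results), task.cancelled()


processed, cancelled = asyncio.run(main())
print(f"cancelled={cancelled}, rows processed before cancel: {processed}")
```

The batch stops partway through instead of running all 1000 rows. A regular def function offers no await points, so task.cancel() cannot interrupt it in the middle of a call. In a notebook, where an event loop is typically already running, use await main() rather than asyncio.run(main()).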

Vectorized Python UDFs

Vectorized Python UDFs are defined with vector type annotations, such as numpy.ndarray, pandas.Series, polars.Series, or pyarrow.Array. The Python UDF is called only once for each batch of rows received from the SingleStore database. The entire batch is converted into vectorized inputs, where each column of data corresponds to a single vector object passed as a function parameter. This is useful for high-performance numerical processing.

The following example demonstrates a vectorized Python UDF.

import numpy as np
import numpy.typing as npt
from singlestoredb.functions import udf
@udf
async def vec_multiply(
    x: npt.NDArray[np.float64],
    y: npt.NDArray[np.float64]
) -> npt.NDArray[np.float64]:
    return x * y
# Start Python UDF server
import singlestoredb.apps as apps
connection_info = await apps.run_udf_app()

This creates the following external function:

CREATE EXTERNAL FUNCTION vec_multiply(x DOUBLE NOT NULL, y DOUBLE NOT NULL)
RETURNS DOUBLE NOT NULL
AS REMOTE SERVICE "http://<svchost>/<endpoint>/invoke" FORMAT ROWDAT_1;
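The difference between the two calling conventions can be sketched without any SingleStore code. In this illustration, plain Python lists stand in for NumPy arrays, and dispatch_scalar and dispatch_vectorized are hypothetical helpers that only mimic the behavior described above; they are not part of the singlestoredb package.

```python
def multiply(x: float, y: float) -> float:
    """Scalar style: receives the values of one row."""
    return x * y


def vec_multiply(xs: list, ys: list) -> list:
    """Vectorized style: receives one whole column per parameter."""
    return [x * y for x, y in zip(xs, ys)]


def dispatch_scalar(func, rows):
    # One call per row in the batch.
    return [func(*row) for row in rows], len(rows)


def dispatch_vectorized(func, rows):
    # Transpose rows into columns, then make a single call for the batch.
    columns = [list(col) for col in zip(*rows)]
    return func(*columns), 1


batch = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
scalar_out, scalar_calls = dispatch_scalar(multiply, batch)
vector_out, vector_calls = dispatch_vectorized(vec_multiply, batch)

print(scalar_out, scalar_calls)  # [2.0, 12.0, 30.0] 3
print(vector_out, vector_calls)  # [2.0, 12.0, 30.0] 1
```

Both paths produce the same values, which is consistent with the two functions generating the same SQL signature. What changes is the number of Python calls per batch, and with real array libraries the single vectorized call can also use optimized array arithmetic.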

Scalar Python TVFs

Scalar Python TVFs are defined in the same way as scalar Python UDFs, except that a scalar TVF uses a Table annotation to indicate that the function returns a table. The function must also return the final result wrapped in a Table object.

The following example demonstrates a scalar Python TVF:

import numpy as np
import numpy.typing as npt
from singlestoredb.functions import udf, Table
@udf
async def number_stats(
    n: npt.NDArray[np.int_],
) -> Table[npt.NDArray[np.int_], npt.NDArray[np.int_], npt.NDArray[np.float64]]:
    numbers = np.arange(1, n[0] + 1, dtype=np.int_)
    squares = numbers ** 2
    roots = np.sqrt(numbers).round(2)
    return Table(numbers, squares, roots)
# Start Python UDF server
import singlestoredb.apps as apps
connection_info = await apps.run_udf_app()

This creates the following external function:

CREATE EXTERNAL FUNCTION
`number_stats`(`n` BIGINT NOT NULL)
RETURNS TABLE(`a` BIGINT NOT NULL)
AS REMOTE SERVICE "http://<svchost>/<endpoint>/invoke" FORMAT ROWDAT_1;

SingleStore automatically generates generic column names if no names are associated with the return fields (a in this example). You can explicitly name result columns using overrides or schema classes. Refer to Overriding Parameters and Return Value Types for related information.

Vectorized Python TVFs

Vectorized Python TVFs operate in a way similar to vectorized Python UDFs. The main difference is that vectorized Python TVFs can return multiple columns as output. In vectorized Python TVFs, each returned vector corresponds to a column.

The following example demonstrates a vectorized Python TVF:

import numpy as np
import numpy.typing as npt
from singlestoredb.functions import udf, Table
@udf
def vec_table_function(
    n: npt.NDArray[np.int_],
) -> Table[npt.NDArray[np.int_], npt.NDArray[np.float64], npt.NDArray[np.str_]]:
    x = np.array([10] * n[0], dtype=np.int_)
    y = np.array([10.0] * n[0], dtype=np.float64)
    z = np.array(['ten'] * n[0], dtype=np.str_)
    # Returns a tuple of vectors (each column of the output)
    return Table(x, y, z)
# Start Python UDF server
import singlestoredb.apps as apps
connection_info = await apps.run_udf_app()

This creates the following external function:

CREATE EXTERNAL FUNCTION
`vec_table_function`(`n` BIGINT NOT NULL)
RETURNS TABLE(
`a` BIGINT NOT NULL,
`b` DOUBLE NOT NULL,
`c` TEXT NOT NULL
)
AS REMOTE SERVICE "http://<svchost>/<endpoint>/invoke" FORMAT ROWDAT_1;

SingleStore automatically generates generic column names if no names are associated with the return fields (a, b, c). You can explicitly name result columns using overrides or schema classes. Refer to Overriding Parameters and Return Value Types for related information.
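To make the column-to-row relationship concrete, the following sketch reproduces the logic of vec_table_function with plain Python lists and shows how three column vectors line up as result rows. The helper columns_to_rows is hypothetical and exists only for illustration, and the requirement that all columns have the same length is an assumption of this sketch rather than documented behavior.

```python
def vec_table_columns(n: int):
    """Return three columns, mirroring vec_table_function with plain lists."""
    x = [10] * n
    y = [10.0] * n
    z = ["ten"] * n
    return x, y, z


def columns_to_rows(columns, names=("a", "b", "c")):
    # Generic names mirror the a, b, c defaults described above.
    # Assumption: a table needs columns of equal length.
    lengths = {len(col) for col in columns}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    # zip(*columns) pairs the i-th value of every column into one row.
    return [dict(zip(names, values)) for values in zip(*columns)]


for row in columns_to_rows(vec_table_columns(2)):
    print(row)
```

Each dictionary corresponds to one row that a query against the table-valued function would see, with one key per returned vector.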

Last modified: September 18, 2025
