CREATE EXTERNAL FUNCTION

Syntax

SQL

CREATE [OR REPLACE] EXTERNAL FUNCTION [<database_name>.]function_name ([<parameter_list>])
RETURNS <return_type>
[DEFINER = CURRENT_USER]
AS REMOTE SERVICE "<service_endpoint>"
FORMAT <data_format> [USING DELIMITED BATCHES] [LINK [database_name.]<connection_link_name>];

parameter_list:
    <variable_name> <data_type> [<data_type_modifier> [...n]] [, ...]

data_type_modifier:
    DEFAULT <default_value> | NOT NULL | NULL | COLLATE collation_name

Return Type

return_type: A scalar value or a TABLE value. If a TABLE value, includes a list of <column name> <data_type> values such as TABLE(a TEXT, b INT).

Arguments

function_name:

The name of the function. Function names must be unique within a database, and cannot be duplicates of names for other tables, views, UDFs, TVFs, user-defined aggregate functions (UDAFs), or stored procedures.

You can also specify <database_name> and <function_name> together by replacing <function_name> with <database_name>.<function_name> instead of defaulting to the current context database. For example, you can write the following:
SQL
```
CREATE EXTERNAL FUNCTION db.some_func(a int)
...
```
data_format: One of the following: JSON or ROWDAT_1. For more information, refer to Data Formats.
connection_link_name: The name of a connection link, which is a secure link that stores connection details (credentials and configurations) for an HTTP connection. Refer to Configuring and Using Connection Links for more details.
DEFINER = current_user: (Optional) When specified, a special flag is stored in the metadata table which indicates if a different user is required to run the function.

Remarks

Note

SingleStore does not start, stop, or otherwise maintain the service that hosts the external function.

A call to an external UDF can be made anywhere in a SQL statement where an expression is allowed. For example:

SQL
```
SELECT example_external_udf(100,'foo');
```

SQL

SELECT * FROM t
WHERE example_external_udf(100,'foo') = 'bar';

SQL

SELECT a, COUNT(*) FROM t
GROUP BY a
HAVING a > example_external_udf(100,'foo');

A call to an external TVF can be made anywhere in a SQL statement where a table reference is allowed, including joining to a table. For example:
- SQL
```
SELECT * FROM example_external_tvf(100,'foo');
```
- SQL
```
SELECT * FROM t, example_external_tvf(100,'foo') AS e
WHERE t.id = e.id;
```

Enable External Functions

External functions must be enabled before they can be used. To enable external functions, set the enable_external_functions engine variable to either of the following as required:

Value	Description
`ON`	Enables the use of all external functions.
`ALLOWLIST`	Enables the use of all the external functions in a specified list. When `enable_external_functions` is set to `ALLOWLIST`, you must configure the `external_functions_allowlist` engine variable and specify the names of external functions to enable (allow). You can use wildcards in the names. For example, to enable the use of any external function located at `http://host1.example.com` and the functions named `lowercase` and `uppercase` located at `http://host2.example.com`, set the engine variable `external_functions_allowlist` to: `"{'endpoints' : ['http://host1.example.com/*', 'http://host2.example.com/uppercase','http://host2.example.com/lowercase']}"`

Value

Description

ON

Enables the use of all external functions.

ALLOWLIST

Enables the use of all the external functions in a specified list. When enable_external_functions is set to ALLOWLIST, you must configure the external_functions_allowlist engine variable and specify the names of external functions to enable (allow). You can use wildcards in the names.

For example, to enable the use of any external function located at http://host1.example.com and the functions named lowercase and uppercase located at http://host2.example.com, set the engine variable external_functions_allowlist to:

"{'endpoints' : ['http://host1.example.com/*', 'http://host2.example.com/uppercase','http://host2.example.com/lowercase']}"

By default, enable_external_functions is set to OFF.

Batching of Data Sent from SingleStore to an External Function

A batch contains multiple rows of data (as defined in the external function declaration). A row_id is added to each row of data in the batch.

Batching of data sent from SingleStore to an external TVF

SingleStore always sends data in batches to an external TVF.

Batching of data sent from SingleStore to an external UDF

External UDFs are rewritten by SingleStore to external TVFs to allow external UDFs to be pushed down to a table scan to support batch processing of inputs.

However, not all external UDFs can be rewritten as TVFs. Only UDFs contained in the projection list (the columns that are selected by a query) or the WHERE clause of a SELECT can be rewritten as external TVFs. Furthermore, these external UDFs must not receive aggregate functions, window functions, or nested SELECT statements as input.

The value of the engine variable batch_external_function controls the rewrites of external UDFs to external TVFs for batching. The possible values are as follows:

ALWAYS: Any query, which contains an external UDF that SingleStore cannot convert to an external TVF, will not be executed. External UDFs used in all other types of queries will run and will be rewritten to TVFs.
AUTO: SingleStore rewrites all external UDFs that it can to external TVFs for batching. Any external UDFs that cannot be rewritten will be called one row at a time.
NEVER: All external UDFs will be called without batching (i.e., one row at a time).

Send Delimited Batch Sets

Specify the USING DELIMITED BATCHES clause to keep the connection open after all the batches have been processed by SingleStore and accept a final batch from the external function. For example:

SQL

CREATE EXTERNAL FUNCTION extFunc(a TEXT)
   RETURNS TEXT AS REMOTE SERVICE 'http://<svchost>/<endpoint>'
   FORMAT ROWDAT_1 USING DELIMITED BATCHES;

By default, once SingleStore processes all the rows in the batch, it sends a final batch of zero size and closes the connection to the external function. However, the external function may sometimes send additional rows after all the rows have been processed. Enabling delimited batch sets allows the external function to acknowledge that all the rows have been received and send additional rows within the last response if required.

Implement Delimited Batch Sets in External Functions

When USING DELIMITED BATCHES is specified in CREATE EXTERNAL FUNCTION statements, SingleStore appends the batchdelim=true parameter to the function name in the URI, which indicates that delimited batch sets are enabled for this function. External functions can check this suffix to confirm if the function supports delimited batches.

If delimited batch sets are enabled for an external function, SingleStore waits for a response after sending the final zero-size batch. Hence, the external function must send a response after receiving the last batch of zero size as follows:

If there are no additional rows to send, the external function must respond with a HTTP status message and a response size of 0.
If additional rows exist, the row_id of the rows sent in the response must match the row_id of the original rows in the target SingleStore table. SingleStore matches the rows in the last batch with the corresponding rows in the table using the row_id.

Benefits of Batching

Batching reduces network and CPU utilization.
Batching does increase memory utilization. However, the benefits of the decrease in network and CPU utilization far outweigh the increase in memory utilization.
A global variable which assists with batching is external_functions_batch_size. It controls the batch size, in rows, for calls to the external process that services external functions. It has a default size of 512 rows. Most applications should simply use the default, but if the overhead of calling the external process becomes significant, you may wish to increase the batch size.

Data Formats

External functions support two data formats, JSON and ROWDAT_1. Specify the format to use in a CREATE [OR REPLACE] EXTERNAL FUNCTION statement. The selected format specifies how the input row(s) sent to the external function and output row(s) returned from the external function are serialized.

`JSON`

If batching is used, the data format is serialized as a JSON array of arrays as follows:

{"data":
    [[<row id>, <argument 1>, <argument 2>, …],
    [<row id>, <argument 1>, <argument 2>, …],
    ...
]}

If batching is not used, the data format is serialized as a JSON array of one array. The one array contains the row.

{"data":
    [[<row id>, <argument 1>, <argument 2>, …]]
}

Following is a description of the fields, in order, of the JSON sub-array:

<row id>: A row id, which is an integer greater than or equal to 0. The row id is generated for the rows sent as input to the external function. The output rows are listed in ascending order. There can be repeating row ids.
<argument n>: The serialization format supports three types:
- integer
- double
- variable-length string
- JSON
All other types are serialized as strings.

Note

Because a single row or multiple rows (a batch) may be sent as input to an external function, the function should be written to handle both cases.

`ROWDAT_1`

ROWDAT_1 is a SingleStore binary format. It is used for serializing data that is the input to and the output from an external function. The data is in a rowstore format which contains one row, or multiple rows if batching is used.

Following is a description of the fields, in order, in each row:

A row id, which is an integer greater than or equal to 0. The row id in a given row must be larger than the row id in the previous row. Multiple rows may contain the same row id.
The remaining fields contain the arguments to the external function. The serialization format supports three types:
- 64-bit signed integer
- 64-bit double (IEEE 754)
- length-encoded variable-length string
The serialization format is little-endian. A single byte denoting whether a value is NULL (1: NULL, 0: not-NULL) precedes all type values (except the row id which cannot be NULL).

All other types are serialized as string.

Note

Because a single row or multiple rows (a batch) may be sent as input to an external function, the function should be written to handle both cases.

CREATE [OR REPLACE] EXTERNAL FUNCTION ... AS REMOTE

Use the AS REMOTE option to create an external function, which is a function that is accessible through an HTTP service. Following are the specifics of the HTTP request and response format.

HTTP request

Request method

A standard HTTP POST request is used.

Standard HTTP request headers

The following standard HTTP headers are used:

accept/content-type: application/json or application/octet-stream
user-agent: SingleStore/<S2 version>

Custom HTTP request headers

The following custom HTTP headers are used:

s2-ef-version: <data format version>
s2-ef-name: <name of the external function>

HTTP response

When a remote function is successful, it returns a HTTP response with status code 200. A HTTP response with status code other than 200 is treated as error, and the response body contains the error message.

SingleStore Transaction Semantics Are Not Enforced Inside of External Functions

Warning

Any update to state made by an external function is not part of the calling SingleStore transaction. If the calling transaction rolls back, it does not roll back updates made by an external function.

For example, suppose an application runs a SQL query which calls an external function. The function runs and increments a variable v held in an external process by 1, from 100 to 101, and returns v. After the function completes, but before the transaction commits, the application running the transaction calls ROLLBACK. The state of the variable v in the external process remains at 101; it is not rolled back to 100.

Using HTTP Connection Links

An HTTP connection link can be used when creating an external function. Using a connection link is more secure than specifying credentials and other sensitive connection details directly in the command. Examples are provided at the end of this topic.

HTTP connections have the following limitations:

Only external functions can use them.
They cannot contain the following header keys: s2-ef-..., Accept, Connection, Content-Length, Content-Type, Upgrade, User-Agent.

Example: Create a HTTP Link Using the CREDENTIALS and CONFIG Clauses

The following example shows how to create an HTTP link, which you can use in a CREATE [OR REPLACE] EXTERNAL FUNCTION statement.

SQL

CREATE LINK HTTP_link AS HTTP
CREDENTIALS '{"headers":{"authorization":"Bearer <token>"}}'
CONFIG '{"headers":{"version":"2.3"}}';

Example: Create an HTTP Link Using Only the CREDENTIALS Clause

The following example shows how to create an HTTP link, which you can use in a CREATE [OR REPLACE] EXTERNAL FUNCTION statement.

SQL

CREATE LINK HTTP_link AS HTTP
CREDENTIALS '{"headers":{"authorization":"Bearer <token>", "version":"2.3"}}';

Encrypting HTTP Connection Links

To create an encrypted HTTP connection (encrypts the traffic between the database and the remote endpoint), set the following security-specific CONFIG/CREDENTIALS JSON:

{
  "security": {
    "security.protocol": "ssl",
    "ssl.key.location": "/path/to/key.pem",
    "ssl.key.password": "<password_for_private_key>",
    "ssl.certificate.location": "/path/to/cert.pem"
    "ssl.ca.location": "/path/to/ca.pem"
  }
}

Where:

security.protocol: must be "ssl". No other protocols are supported/implemented.
ssl.key.location: required. Specifies the path to the private key that will be used during SSL negotiation with a remote server.
ssl.key.password: optional. Used if ssl.key.location points to a password-protected private key.
ssl.certificate.location: required. Specifies the path to the certificate with the signature of the public part of the key during the SSL handshake.
ssl.ca.location: optional. If specified, the SingleStore Helios engine will verify the peer’s certificate against the CA bundle loaded from the provided path.

You can omit the headers object in the JSON when configuring encrypted connections.

Note

When configuring encrypted HTTP connections with your SingleStore Heliosworkspaces you can use the default keys and certificates, or you can upload your own. Which method you choose depends on how strict the remote endpoint is.

If the remote endpoint needs to uniquely identify each workspace/entity that tries to connect, upload your own certificates.

If the remote endpoint is just checking whether a connecting workspace is a SingleStore Heliosworkspace (no matter what user) use the default settings.

If the remote endpoint does not care about the client's origin, use the simplified configuration, with CA only (as shown in "l_public", below).

-- Default Values
{
  "security": {
    "security.protocol": "ssl",
    "ssl.key.location": "/etc/memsql/ssl/server-key.pem",
    "ssl.key.password": "<password_for_private_key>",
    "ssl.certificate.location": "/etc/memsql/ssl/server-cert.pem"
    "ssl.ca.location": "/etc/memsql/extra/<.crt file>"
  }
}

-- Locations for certificates manually uploaded via the Cloud Portal

{
  "security": {
    "security.protocol": "ssl",
    "ssl.key.location": "/etc/memsql/extra/<.crt file>",
    "ssl.key.password": "<password_for_private_key>",
    "ssl.certificate.location": "/etc/memsql/extra/<.crt file>"
    "ssl.ca.location": "/etc/memsql/extra/<.crt file>"
  }
}

See the SingleStore Heliosexamples.

Example: Encrypt Traffic Between Database and Remote Endpoint

SQL

CREATE LINK link_1 AS HTTP
CREDENTIALS '{"security":{"ssl.key.location":"<path_to_private_key>/server-key.pem"}}'
CONFIG '{"security":{
  "security.protocol":"ssl",
  "ssl.certificate.location":"<path_to_pem>/server-cert.pem",
  "ssl.ca.location":"/<path_to_ca_bundle>/ca-cert.pem"}}';

CREATE OR REPLACE EXTERNAL FUNCTION 'ef1' (input INT) RETURNS TEXT AS REMOTE SERVICE
  'https://localhost:8123/func1' FORMAT JSON LINK link_1;

SELECT ef1(42);

Example: Create a Basic Encrypted HTTP Link Using Only the CONFIG Clause

The following example shows how to create a basic encrypted HTTP link using only the CA bundle path in the security settings in the CONFIG clause. You can create an HTTP link using only the CA bundle path if the authentication of the SingleStore Helios engine is not required, and only encryption and remote certificate verification are needed. You do not need to specify "security.protocol":"ssl" in this case.

SQL

CREATE LINK link_2 AS HTTP
CREDENTIALS ''
CONFIG '{"security":{"ssl.ca.location":"<path_to_pe>/ca-cert.pem"}}';

Examples: SingleStore Helios

For the following examples, assume there are 4 HTTP servers running at "example.com" on ports 8881-8884. Each has different SSL requirements for inbound connections:

8881: Server uses a private key and a legitimate certificate from a well-known CA for the example.com domain. The client's certificate is optional, verified by the SingleStore Helios's onboard CentOS 7 standard CA bundle. This setup resembles a common configuration for any HTTPS website: for example, just like visiting https://example.com:8881 from your web browser.

8882: Server uses a private key and a legitimate certificate from a well-known CA for the example.com domain. The client's certificate is required and verified against singlestore_bundle.pem -- which SingleStore publicly distributes and is already included in SingleStore Helios. This means that the server will only accept inbound connections from SingleStore Helios (or SingleStore Helios) with the default keys.

8883: Server is running in full private mutual auth mode. The private key is signed with an internal CA for the server. The client's certificate is required and verified against the internal SingleStore Helios CA. Use this configuration to control how certificates are issued and use those certificates to identify a specific workspace in your SingleStore Helios instance. This way, only connections from desired peers are accepted.

8884: Server uses a private key and legitimate certificate from a well-known CA for the example.com domain. The client's certificate is required and verified against the internal CA. Use this configuration to control how certificates are issued and use those certificates to identify a specific workspace in your SingleStore Helios instance. This way, only connections from desired peers are accepted.

The links:

l_public: port 8881. Simplified configuration - CA only. The client sets the certificate bundle that accepts well-known CAs. For this case, a CA bundle is uploaded to Cloud Portal with the resulting filepath: /etc/memsql/extra/ca-93057.crt.

SQL

CREATE LINK l_public AS HTTP
  credentials ''
  config '{
    "security":{"ssl.ca.location":"/etc/memsql/extra/ca-93057.crt"}
    }';

CREATE OR REPLACE EXTERNAL FUNCTION 'ef_public'(input int) RETURNS text AS REMOTE SERVICE
    'https://example.com:8881/func1' FORMAT JSON LINK l_public;

l_public_s2: port 8882. As above, the client sets the certificate bundle for well-known CAs. To prove that the connection originates from S2MS, the key and certificate are set from the location available in every SingleStore Heliosworkspace. This key-cert pair is not unique within a single AWS/Azure/GCP region.

SQL

CREATE LINK l_public_s2 AS HTTP
    credentials '{
      "security":{"ssl.key.location":"/etc/memsql/ssl/server-key.pem"}
      }'
    config '{"security":{
      "security.protocol":"ssl",
      "ssl.certificate.location":"/etc/memsql/ssl/server-cert.pem",
      "ssl.ca.location":"/etc/memsql/extra/ca-93057.crt"}
      }';
      
CREATE OR REPLACE EXTERNAL FUNCTION 'ef_public_s2'(input int) RETURNS text AS REMOTE SERVICE
    'https://example.com:8882/func1' FORMAT JSON LINK l_public_s2;

l_private: port 8883. For this example, a combined certificate and private key is uploaded to SingleStore Helios as /etc/memsql/extra/ca-12c5d.crt. Here's an example of a combined file where a key is pasted into a certificate file:

-----BEGIN CERTIFICATE-----
MIIDODCCAiACAQEwDQYJKoZIhvcNAQELBQAwYDELMAkGA1UEBhMCVVMxCzAJBgN
[...]
vp9IHlbbNgAk4DpWvP8zYlAraVAClS0/sWAwhwkyU4OkB3VB8ti6KPBlc0V8ZoO
-----END CERTIFICATE-----


-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: AES-128-CBC,3EB18A38696F0637E720801EA2A1FAEF

azQ5ApyXYoL4SsUFjPJEld4qKYl4p8rVKyiZ7RHihhmtct6jgFOD8h6R3GvifdD
[...]
peT4q+YCrys36Z7ZH/aYJbGzQ305XYvstlL5BK5pKVxTxRucgIGRMa56gbMGdCK
-----END RSA PRIVATE KEY-----

Warning

Please note that the private key is protected (encrypted) with a passphrase in the example above. Sensitive data should not be transferred in plain text.

The certificate is signed with a private "CA for database". Note that the same file location is used for ssl.key.location and ssl.cert.location. Additionally, a separate "CA for remote" certificate is uploaded as /etc/memsql/extra/ca-4860e.crt and used as ssl.ca.location. The SingleStore Helios engine will verify the remote HTTPS server's certificate against the given CA. Also note that "CA for database" and "CA for remote" may be the same within a single company.

SQL

CREATE LINK l_private AS HTTP
    credentials '{"security":{
      "ssl.key.location":"/etc/memsql/extra/ca-12c5d.crt",
      "ssl.key.password":"muchsecret"}
      }'
    config '{"security":{
      "security.protocol":"ssl",
      "ssl.certificate.location":"/etc/memsql/extra/ca-12c5d.crt",
      "ssl.ca.location":"/etc/memsql/extra/ca-4860e.crt"}
      }';

CREATE OR REPLACE EXTERNAL FUNCTION 'ef_private'(input int) RETURNS text AS REMOTE SERVICE
    'https://example.com:8883/func1' FORMAT JSON LINK l_private;

l_private_le: port 8884. This is a combination of the three previous examples. The remote HTTPS server verifies the client against the internal CA; however, the SingleStore Helios engine expects a certificate from a well-known CA.

SQL

CREATE LINK l_private_le AS HTTP
    credentials '{"security":{
      "ssl.key.location":"/etc/memsql/extra/ca-12c5d.crt",
      "ssl.key.password":"muchsecret"}
      }'
    config '{"security":{
      "security.protocol":"ssl",
      "ssl.certificate.location":"/etc/memsql/extra/ca-12c5d.crt",
      "ssl.ca.location":"/etc/memsql/extra/ca-93057.crt"}
      }';

CREATE OR REPLACE EXTERNAL FUNCTION 'ef_private_le'(input int) RETURNS text AS REMOTE SERVICE
    'https://example.com:8884/func1' FORMAT JSON LINK l_private_le;

Examples

Example: An External UDF

Suppose you are writing an external function that implements an elementary grammar checker. The checker scans one paragraph at a time. The checker corrects each paragraph by adding missing periods at the end of sentences and capitalizes the first letter of sentences (if the first letter is lowercase). Also, the checker optionally replaces digits, that are standalone words, in the sentence with the corresponding numbers spelled out. If no corrections to a paragraph are needed, the checker outputs the paragraph, unchanged.

Create the paragraphs table which stores the paragraphs to be checked and whether to check the digits in each paragraph.

SQL

CREATE TABLE paragraphs(paragraph TEXT, check_digits BOOL);

The table contains the following data.

+---------------------------------------------------------------+----------------+
| paragraph                                                     | check_digits   |
+---------------------------------------------------------------+----------------+
| this paragraph has 1 sentence                                 |              1 |
| this paragraph has 2 sentences. this is the second sentence   |              0 |
| This paragraph has one sentence and doesn't need corrections. |              1 |
+-------------------------------------------------------------+------------------+

Create the definition for an external function that checks a paragraph for grammar mistakes. If 1 is sent as input to the check_digits parameter, then digits are included as part of the grammar check.

SQL

CREATE EXTERNAL FUNCTION check_paragraph(paragraph TEXT, check_digits BOOL)
RETURNS TEXT
AS REMOTE SERVICE '<endpoint_URL>'
FORMAT JSON;

Implement the body of the external function and store the implementation on the HTTP server. An example implementation using an AWS Lambda function written in Python follows. Note that the implementation of the helper functions convert_numbers_to_words, capitalize_sentences and add_missing_periods_to_sentences is not shown.

Python

import json

def lambda_handler(event, context):
    request = json.loads(event['body'])
    response = []
    for row_id, paragraph, check_digits in request["data"]:
        if check_digits:
            paragraph = convert_numbers_to_words(paragraph)
        paragraph = capitalize_sentences(paragraph)
        paragraph = add_missing_periods_to_sentences(paragraph)
        response.append([row_id, paragraph])
    return { 'statusCode': 200, 'body': json.dumps({"data": response}) }

Query the paragraphs table and call the external function as part of the query.

SQL

SELECT check_paragraph(paragraph, check_digits) as corrected_paragraph,
check_digits FROM paragraphs;

The external UDF is rewritten to an external TVF; the query calls the external function only once with the following input:

{"data":
    [[0, "this paragraph has 1 sentence", 1],
    [1, "this paragraph has 2 sentences. this is the second sentence", 0],
    [2, "This paragraph has one sentence and doesn't need corrections.", 1]]
}

After processing the input data, the external function returns the following data to the calling query. The data contains the corrected paragraphs. Paragraphs not needing any corrections are also included.

{"data":
    [[0, "This paragraph has one sentence."],
    [1, "This paragraph has 2 sentences. This is the second sentence."],
    [2, "This paragraph has one sentence and doesn't need corrections."]]
}

The query returns the following output.

+----------------------------------------------------------------+--------------+
| corrected_paragraph                                            | check_digits |
+----------------------------------------------------------------+--------------+
| This paragraph has one sentence.                               |            1 |
| This paragraph has 2 sentences. This is the second sentence.   |            0 |
| This paragraph has one sentence and doesn't need corrections.  |            1 |
+----------------------------------------------------------------+--------------+

Example: External UDF using DEFINER

SQL

CREATE EXTERNAL FUNCTION ext_func()
RETURNS TEXT DEFINER ='root'@'%'
AS REMOTE SERVICE  '...' FORMAT JSON;

To view a list of the functions and see the definer used:

SQL

SHOW FUNCTIONS;

+------------------------+------------------+----------+-------------+-----------------+------+
| Functions_in_func_proc | Function Type    | Definer  | Data Format | Runtime Type    | Link |
+------------------------+------------------+----------+-------------+-----------------+------+
| ext_func               | External User    | root@%   | JSON        | Remote Service  |      |
|                        | Defined Function |          |             |                 |      |
+------------------------+------------------+----------+-------------+-----------------+------+

Example: An External TVF

Suppose you want to read the contents of an external file and store each line in a row in a table.

Create the definition for the external function, which gets the contents of a file that is passed as input.

SQL

CREATE EXTERNAL FUNCTION get_file_lines(filename TEXT)
RETURNS TABLE(line TEXT)
AS REMOTE SERVICE '<endpoint_URL>'
FORMAT JSON;

Implement the body of the external function and store the implementation on the HTTP server. The implementation is not shown here.

Query the external function:

SQL

SELECT line FROM get_file_lines("simple_file.txt");

When the external function runs, it returns a JSON array of arrays, which contains the data in simple_file.txt:

{"data":
    [[0, "This is the first sentence."],
    [0, "This is the second sentence."],
    [0, "This is the third sentence."]]
}

When the query that calls the external function completes, the query returns the following output.

SQL

+------------------------------+
| line                         |
+------------------------------+
| This is the first sentence.  |
| This is the second sentence. |
| This is the third sentence.  |
+------------------------------+

Implementing High-Performance Parallel External Functions

Fast, large-batch processing can be accomplished, when in an on-premises configuration, by implementing high-performance parallel external functions. This is done by configuring every node in your SingleStore system to act as a parallel, multi-threaded application server.

By using this approach, rows are batched when sent to a UDF, cross-process calls are not done for every row, and data is not sent across the network. This allows for very fast per-row processing speed. Also, this scales the amount of CPU used for external function computation proportionally to the sizes of your cluster. This avoids potential CPU bottlenecks that could occur if all calls are directed to an external URL (depending on how your networking and the application service to run functions are designed). Other advantages are: any code can be utilized and legacy code and DLLs can be run without needing to recompile the DLLs.

Example: Using Remote Service

The default service type is REMOTE and the endpoint is expected to be the URL of a remote HTTP service. The following example uses an AWS endpoint as the remote service:

SQL

CREATE OR REPLACE EXTERNAL FUNCTION uppercase_string(string_col TEXT)
     RETURNS TEXT    	
     AS REMOTE SERVICE 	
     'https://fnc.execute-api.us-east-1.amazonaws.com/prod/upcase_string' 
     FORMAT JSON;

Example: Using localhost as Remote Service

SQL

CREATE EXTERNAL FUNCTION ext_udf2(col_a INT, col_b INT)
     RETURNS TABLE(col_1 TEXT, col_2 BOOL)
     AS REMOTE SERVICE 'http://localhost:3305/cool_table_func'
     FORMAT JSON;

Last modified: September 26, 2025

CREATE EXTERNAL FUNCTION

On this page

Syntax

Return Type

Arguments

Remarks

Enable External Functions

Batching of Data Sent from SingleStore to an External Function

Batching of data sent from SingleStore to an external TVF

Batching of data sent from SingleStore to an external UDF

Send Delimited Batch Sets

Implement Delimited Batch Sets in External Functions

Benefits of Batching

Data Formats

`JSON`

`ROWDAT_1`

CREATE [OR REPLACE] EXTERNAL FUNCTION ... AS REMOTE

HTTP request

Request method

Standard HTTP request headers

Custom HTTP request headers

HTTP response

SingleStore Transaction Semantics Are Not Enforced Inside of External Functions

Using HTTP Connection Links

Example: Create a HTTP Link Using the CREDENTIALS and CONFIG Clauses

Example: Create an HTTP Link Using Only the CREDENTIALS Clause

Encrypting HTTP Connection Links

Example: Encrypt Traffic Between Database and Remote Endpoint

Example: Create a Basic Encrypted HTTP Link Using Only the CONFIG Clause

Examples: SingleStore Helios

Examples

Example: An External UDF

Example: External UDF using DEFINER

Example: An External TVF

Implementing High-Performance Parallel External Functions

Example: Using Remote Service

Example: Using localhost as Remote Service

Was this article helpful?

On this page

Was this article helpful?