CREATE AGGREGATE
Creates a user-defined aggregate function (UDAF). A UDAF is a callable routine that accepts input parameters, executes programmatic logic in the function body, and returns a scalar-type value.
SingleStoreDB also supports Wasm-based UDAFs.
Syntax
CREATE [OR REPLACE] AGGREGATE function_name ( [parameter_list] )
RETURNS { data_type [data_type_modifier] }
WITH STATE data_type
INITIALIZE WITH udf_function_name
ITERATE WITH udf_function_name
MERGE WITH udf_function_name
TERMINATE WITH udf_function_name
DEFINER = 'user'@
parameter_list:
data_type [data_type_modifier [, ...] ] [, ...]
data_type_modifier:
DEFAULT default_value | NOT NULL | NULL | COLLATE collation_name
Syntax for Wasm-based UDAFs
CREATE [OR REPLACE] AGGREGATE function_name ( [parameter_list] ) RETURNS { data_type [data_type_modifier] } WITH STATE <data_type | HANDLE> AS WASM FROM content_src [WITH WIT FROM content_src] INITIALIZE WITH udf_function_name ITERATE WITH udf_function_name MERGE WITH udf_function_name TERMINATE WITH udf_function_name [ SERIALIZE WITH udf_function_name DESERIALIZE WITH udf_function_name ] DEFINER = 'user'@ parameter_list: data_type [data_type_modifier [, ...] ] [, ...] data_type_modifier: DEFAULT default_value | NOT NULL | NULL | COLLATE collation_name contentSrc: BASE64 '"' <base64> '" | HTTP '"' <url> '"' <cred> <config> | LINK <linkName> <connStr> | S3 <s3_config> | AZURE <azure_config> | GCS <gcs_config>
Arguments
OR REPLACE
If specified, replaces a UDAF if one already exists with the same name.
function_name
The name of the function. By their nature, UDAF names may override existing builtin aggregate function names for scalar data types, such as SUM()
. UDAF names cannot be use the same name as stored procedures, tables, views, user-defined scalar-value functions (UDFs), or user-defined table-valued functions (TVFs).
You can also specify database_name
and function_name
together by replacing function_name
with database_name.function_name
instead of specifying the database in USING database_name
. For example, you can write the following:
CREATE AGGREGATE db.some_func(int) ...
Function names are not case-sensitive. For details on case-sensitivity, refer to the Database Object Case-Sensitivity topic.
WITH STATE
The STATE
type of a UDAF may be a scalar type, ARRAY
type, or a RECORD
type.
The STATE
type HANDLE
is only supported with Wasm-based UDAFs. When using the HANDLE
state, you must specify the SERIALIZE
and DESERIALIZE
functions.
parameter_list
Input parameters are optional. Any number of input parameters can be specified, and each must be delimited by a comma (,
). Each input parameter is specified only by its data type and optional modifier.
UDAFs only allow scalar data types as input parameters. See the Data Types topic for more information.
The following example shows how to declare a single input parameter:
CREATE AGGREGATE my_sum(BIGINT) ...
The following example demonstrates how to declare more than one input parameters that also specify a data type modifier:
CREATE AGGREGATE multi_param_example(INT, VARCHAR(255) NOT NULL COLLATE utf8_bin, DECIMAL(19,4)) ...
udf_function_name
The name of each UDF function to execute for the INITIALIZE WITH
, ITERATE WITH
, MERGE WITH
, and TERMINATE WITH
clauses. The INITIALIZE
function takes in no arguments, and it returns a STATE
data type. The ITERATE
function takes in a STATE
data type and the input parameter data type, and it returns a STATE
data type. If the UDAF has n
parameters, the ITERATE
function will take in n+1
arguments, with the first argument being the STATE
type. The MERGE
function takes in two STATE
data types, and it returns a STATE
data type. The TERMINATE
function takes in a STATE
data type, and it returns the type specified in the RETURNS
clause.
MySQL Client Delimiters
When creating a UDF using a MySQL-compatible client connected to SingleStoreDB Cloud, you must change the client delimiter to ensure that the function definition is correctly passed to the server and then set it back to a semicolon after the alternate delimiter is no longer needed. See the MySQL Client Delimiters topic for details on MySQL client delimiters.
Security and Permissions
The invoker of a UDAF must have EXECUTE
permissions on the UDAF. Also, the UDAF’s definer must have EXECUTE
permissions on each of the four UDFs that the UDAF uses.
Using optional parameter DEFINER
UDFs can be created or run based on the DEFINER
. The DEFINER
is equivalent to the current user. When the DEFINER
optional parameter is used a special flag is stored in the metadata table which indicates if a different user is required to run the procedure.
Remarks
This command causes implicit commits. See COMMIT for more information.
Wasm-based UDAFs
The life-cycle of the HANDLE
state must be managed in the Wasm code. The implicit contract states that you need to create a unique handle for each call to the INITIALIZE
function. At the same time, ITERATE
calls can freely update the state associated with a handle or return a new handle. However, any state that is no longer accessible by the handle must be cleaned up, i.e., for any handle that is not returned, its state must be cleaned up by the functions. This applies to INITIALIZE
, MERGE
, TERMINATE
, SERIALIZE
, and DESERIALIZE
functions
Here's an example. Two states are passed to the MERGE
function. The user can reuse any of the provided states or create a new one. The following pseudo-code blocks demonstrate these use cases:
-- Reuse one of the existing states def merge(a, b): a = a + b delete b return a -- Return a new state def merge(a, b): c = a + b; delete a delete b return c
Where a
, b
, and c
are synonymous to states. In this example, any state that is no longer in use, is deleted (cleaned up).
Because each Wasm UDF runs in its own in-process sandbox, the UDAF needs to import all the Wasm functions (INITIALIZE
, ITERATE
, MERGE
, TERMINATE
, SERIALIZE
, and DESERIALIZE
) from the same Wasm module in a single CREATE AGGREGATE
statement. Use the AS WASM
clause in the function definition to import the Wasm function.
For information on how to import Wasm functions, refer to Create Wasm UDFs/Create Wasm TVFs. Because Wasm functions are sandboxed, combining Wasm functions and PSQL functions in the same CREATE AGGREGATE
statement is not supported.
Example - Create UDAF
The following example creates a new UDAF named avg_udaf
, which uses a RECORD
state type and has the same behavior as the builtin AVG
function. Note that before we can create the UDAF, each of the prerequisite user-defined scalar functions (UDFs) must be created, as the UDAF definition depends on their existence.
Create UDF Dependencies
The avg_udaf
example UDAF depends on the following UDFs:
DELIMITER // CREATE FUNCTION avg_init() RETURNS RECORD(s BIGINT, c BIGINT) AS BEGIN RETURN ROW(0, 0); END // DELIMITER ;
DELIMITER // CREATE FUNCTION avg_iter(state RECORD(s BIGINT, c BIGINT), value BIGINT) RETURNS RECORD(s BIGINT, c BIGINT) AS BEGIN RETURN ROW(state.s + value, state.c + 1); END // DELIMITER ;
DELIMITER // CREATE FUNCTION avg_merge(state1 RECORD(s BIGINT, c BIGINT), state2 RECORD(s BIGINT, c BIGINT)) RETURNS RECORD(s BIGINT, c BIGINT) AS BEGIN RETURN row(state1.s + state2.s, state1.c + state2.c); END // DELIMITER ;
DELIMITER // CREATE FUNCTION avg_terminate(state RECORD(s BIGINT, c BIGINT)) RETURNS BIGINT AS BEGIN RETURN state.s / state.c; END // DELIMITER ;
Create UDAF Example
Once the UDF dependencies have been created, you can create the UDAF. Execute the following statement:
CREATE AGGREGATE avg_udaf(BIGINT) RETURNS BIGINT WITH STATE RECORD(s BIGINT, c BIGINT) INITIALIZE WITH avg_init ITERATE WITH avg_iter MERGE WITH avg_merge TERMINATE WITH avg_terminate;
After the UDAF has been successfully created, execute the following commands to try it:
CREATE TABLE t (i BIGINT); INSERT INTO t VALUES (1), (2), (3), (4), (5); SELECT avg_udaf(i) FROM t; **** +-------------+ | avg_udaf(i) | +-------------+ | 3 | +-------------+ 1 row in set
Create aggregate using Definer
CREATE AGGREGATE avg_udaf(BIGINT) RETURNS BIGINT WITH STATE RECORD(s BIGINT, c BIGINT) INITIALIZE WITH avg_init ITERATE WITH avg_iter MERGE WITH avg_merge TERMINATE WITH avg_terminate DEFINER = 'user'@'%';
To view a list of the functions and see the definer used:
SHOW FUNCTIONS; **** +------------------------+--------------------------------+---------+-------------+---------------+-----+ | Functions_in_func_proc | Function Type | Definer | Data Format | Runtime Type | Link| +------------------------+--------------------------------+---------+-------------+---------------+-----+ | avg_init | User Defined Function | root@% | | PSQL | | | avg_iter | User Defined Function | root@% | | PSQL | | | avg_merge | User Defined Function | root@% | | PSQL | | | avg_terminate | User Defined Function | root@% | | PSQL | | +------------------------+--------------------------------+---------+-------------+---------------+-----+