SELECT

Retrieves data from a table.

Syntax

[with_clause]
SELECT
[ALL | DISTINCT | DISTINCTROW]
select_expr [[AS] alias], ...
[COLLATE collation]
[INTO variable [, ...]]
[ FROM table_references
[WHERE expr]
[GROUP BY [{{col_name | expr | position}, ...} | ALL]]
| extended_grouping_expr}]
[HAVING expr]
[ORDER BY [{col_name | expr | position} [ASC | DESC ][ NULLS {FIRST | LAST}], ... | ALL]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]
[FOR UPDATE]
]
[{ INTO OUTFILE 'file_name' |
INTO FS 'destination_directory/file_name' [WITH COMPRESSION] |
INTO S3 bucket/target CONFIG configuration_json CREDENTIALS credentials_json [WITH COMPRESSION] |
INTO HDFS '<hdfs://<namenode DNS> | <IP address>:<port>/<directory>' [ CONFIG configuration_json ] |
INTO GCS bucket/path CONFIG configuration_json CREDENTIALS credentials_json [WITH COMPRESSION] |
INTO KAFKA kafka_topic_endpoint [kafka_configuration] [kafka_credentials] [KAFKA KEY <column_name>] |
INTO AZURE "container/blob-prefix" CREDENTIALS credentials_json [WITH COMPRESSION]
}
[format_options]
]
[INTO LINK [db name.]connection_name 'backup path']
format_options:
csv_options | external_format_options
csv_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
external_format_options:
FORMAT PARQUET
table_references:
table_factor | join_table
table_factor:
tbl_name [[AS] alias] [sample_ratio_clause]
| (subquery) [[AS] alias]

Remarks

  • The join_table clause is documented in JOIN and Subqueries.

  • The join and using clause is documented in JOIN and USING.

  • COLLATE specifies the collation of a string literal.

    SELECT "My string literal" COLLATE utf8mb4_unicode_ci;
  • CONFIG and CREDENTIALS can be specified in either order (CONFIG followed by CREDENTIALS or CREDENTIALS followed by CONFIG). For configuration examples refer BACKUP DATABASE

  • format_options clauses are documented in the Format Options section.

  • sample_ratio_clause is documented in the SELECT … WITH (SAMPLE_RATIO = <value>) section.

  • with_clause is documented on the WITH (Common Table Expressions) page.

  • If you are querying against a columnstore table with a FULLTEXT index defined, see MATCH and HIGHLIGHT for syntax and examples on how to do full-text search queries.

  • Non-aggregated data can also be transformed into a pivot table output format. See PIVOT for syntax and examples.

  • extended_grouping_expr clauses include CUBE and ROLLUP. See CUBE and ROLLUP for syntax and examples.

  • A subquery does not require an alias, assuming that removing the alias does not create ambiguity.

  • In a transaction, you can read from multiple databases.

  • You can specify a table named DUAL as a placeholder table for use with queries that do not reference a table. DUAL contains one logical row. This convenience is provided for situations where you may be required to include a FROM clause in every SQL statement, even statements that do not need one in order to function. It can also help when porting applications from other database products that support DUAL.

    SELECT CURRENT_DATE from DUAL;
    +--------------+
    | CURRENT_DATE |
    +--------------+
    | 2023-04-03   |
    +--------------+

    SELECT CURRENT_DATE from DUAL is identical to SELECT CURRENT_DATE.

    SELECT CURRENT_DATE;
    +--------------+
    | CURRENT_DATE |
    +--------------+
    | 2023-04-03   |
    +--------------+
  • A column expression can reference a column alias. For example, in the query SELECT a + b as c, c + 10 FROM t;, the column expression c + 10 references the column alias c.

  • A WHERE clause can reference a column alias. For example, in the query SELECT a + b as c FROM t WHERE c < 100;, the WHERE clause references the column alias c.

  • A HAVING clause can reference a column alias. For example, in the query SELECT a + 50 as b,COUNT(*) FROM t GROUP BY a HAVING b < 1000;, the HAVING clause references the column alias b.

  • Queries using the WHERE. . .IN clause with multiple columns is not supported.

    SELECT * FROM <table_name_1> WHERE (column_name_1, column_name_2) IN (SELECT column_name_1, column_name_2 FROM <table_name_2>;)
  • The following table and dataset were used for the examples

    CREATE TABLE assets(
    asset_id INT,
    asset_type VARCHAR(50),
    asset_desc VARCHAR(50),
    asset_value DECIMAL(6,2),
    emp_id NUMERIC (5));
    INSERT INTO assets VALUES('1049','laptop','mac_book_pro','1999.00','14'),
    ('49', 'cell_phone', 'iphone_12','879.00','102'),
    ('1100', 'laptop', 'mac_book_pro','1999.00','298'),
    ('2037', 'laptop', 'mac_book_air_M2','1199.00', '399'),
    ('58', 'cell_phone', 'iphone_12', '879.00','410'),
    ('130', 'cell_phone', 'iphone_13', '699','110'),
    ('210', 'laptop', 'mac_book_pro','2500.00','312'),
    ('111', 'laptop', 'mac_book_pro','2100.00', '089'),
    ('099', 'laptop', 'mac_book_air_M1','999','075'),
    ('140', 'cell_phone', 'iphone_13_pro','999.00', '263'),
    ('2100', 'laptop', 'mac_book_pro_M2', '2500.00', '691'),
    ('160', 'cell_phone', 'iphone_14_pro_max','1200.00', '691'),
    ('2110', 'laptop', 'mac_book_pro_M2', '2500.00', '817'),
    ('2120', 'laptop', 'mac_book_pro_M2', '2500.00', NULL),
    ('150', 'cell_phone', 'iphone_14_pro_','1100.00', NULL);
  • NULLS FIRST and NULLS LAST added to an ORDER BY clause sorts NULL values to the beginning or end of the results set. Ascending order is assumed for the non-NULL values. DESC can be added to the clause to order the non-NULL values in descending order.

    SELECT * FROM assets ORDER BY emp_id NULLS FIRST;
    +----------+------------+-------------------+-------------+--------+
    | asset_id | asset_type | asset_desc        | asset_value | emp_id |
    +----------+------------+-------------------+-------------+--------+
    |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
    |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
    |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
    |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
    |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
    |       49 | cell_phone | iphone_12         |      879.00 |    102 |
    |      130 | cell_phone | iphone_13         |      699.00 |    110 |
    |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
    |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
    |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
    |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
    |       58 | cell_phone | iphone_12         |      879.00 |    410 |
    |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
    |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
    |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
    +----------+------------+-------------------+-------------+--------+
    SELECT * FROM assets ORDER BY emp_id DESC NULLS FIRST;
    +----------+------------+-------------------+-------------+--------+
    | asset_id | asset_type | asset_desc        | asset_value | emp_id |
    +----------+------------+-------------------+-------------+--------+
    |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
    |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
    |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
    |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
    |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
    |       58 | cell_phone | iphone_12         |      879.00 |    410 |
    |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
    |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
    |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
    |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
    |      130 | cell_phone | iphone_13         |      699.00 |    110 |
    |       49 | cell_phone | iphone_12         |      879.00 |    102 |
    |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
    |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
    |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
    +----------+------------+-------------------+-------------+--------+
    SELECT * FROM assets ORDER BY emp_id DESC NULLS LAST;
    +----------+------------+-------------------+-------------+--------+
    | asset_id | asset_type | asset_desc        | asset_value | emp_id |
    +----------+------------+-------------------+-------------+--------+
    |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
    |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
    |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
    |       58 | cell_phone | iphone_12         |      879.00 |    410 |
    |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
    |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
    |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
    |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
    |      130 | cell_phone | iphone_13         |      699.00 |    110 |
    |       49 | cell_phone | iphone_12         |      879.00 |    102 |
    |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
    |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
    |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
    |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
    |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
    +----------+------------+-------------------+-------------+--------+
  • ORDER BY ALL sorts by all the columns in the same order they are written. ASC/DESC may be used but it will apply to all the columns.

    SELECT asset_id, asset_type, asset_desc FROM assets ORDER BY ALL;
    +----------+------------+-------------------+
    | asset_id | asset_type | asset_desc        |
    +----------+------------+-------------------+
    |       49 | cell_phone | iphone_12         |
    |       58 | cell_phone | iphone_12         |
    |       99 | laptop     | mac_book_air_M1   |
    |      111 | laptop     | mac_book_pro      |
    |      130 | cell_phone | iphone_13         |
    |      140 | cell_phone | iphone_13_pro     |
    |      150 | cell_phone | iphone_14_pro_    |
    |      160 | cell_phone | iphone_14_pro_max |
    |      210 | laptop     | mac_book_pro      |
    |     1049 | laptop     | mac_book_pro      |
    |     1100 | laptop     | mac_book_pro      |
    |     2037 | laptop     | mac_book_air_M2   |
    |     2100 | laptop     | mac_book_pro_M2   |
    |     2110 | laptop     | mac_book_pro_M2   |
    |     2120 | laptop     | mac_book_pro_M2   |
    +----------+------------+-------------------+
  • The GROUP BY clause is used with a SELECT statement to organize data into GROUPS. Aggregates are applied within those groups. One row of output is produced for each unique combination of values for the specified grouping expressions.

    SELECT asset_type, SUM(asset_value) AS total_value
    FROM assets GROUP BY asset_type;
    +------------+-------------+
    | asset_type | total_value |
    +------------+-------------+
    | cell_phone |     5756.00 |
    | laptop     |    18296.00 |
    +------------+-------------+
  • The GROUP BY ALL clause causes the query to group by all non-aggregate expressions in the SELECT list, so all the columns to be grouped by do not have to be entered explicitly.

    SELECT asset_id, asset_type, asset_desc, SUM(asset_value)
    FROM assets GROUP BY asset_id, asset_type, asset_desc;
    SELECT asset_id, asset_type, asset_desc, SUM(asset_value)
    FROM assets GROUP BY ALL;
    +----------+------------+-------------------+------------------+
    | asset_id | asset_type | asset_desc        | SUM(asset_value) |
    +----------+------------+-------------------+------------------+
    |      160 | cell_phone | iphone_14_pro_max |          1200.00 |
    |     2100 | laptop     | mac_book_pro_M2   |          2500.00 |
    |     2110 | laptop     | mac_book_pro_M2   |          2500.00 |
    |      140 | cell_phone | iphone_13_pro     |           999.00 |
    |       99 | laptop     | mac_book_air_M1   |           999.00 |
    |      210 | laptop     | mac_book_pro      |          2500.00 |
    |      150 | cell_phone | iphone_14_pro_    |          1100.00 |
    |       58 | cell_phone | iphone_12         |           879.00 |
    |       49 | cell_phone | iphone_12         |           879.00 |
    |     1049 | laptop     | mac_book_pro      |          1999.00 |
    |     2120 | laptop     | mac_book_pro_M2   |          2500.00 |
    |     1100 | laptop     | mac_book_pro      |          1999.00 |
    |      130 | cell_phone | iphone_13         |           699.00 |
    |     2037 | laptop     | mac_book_air_M2   |          1199.00 |
    |      111 | laptop     | mac_book_pro      |          2100.00 |
    +----------+------------+-------------------+------------------+
  • SELECT ... INTO a 3rd party file storage system (S3, HDFS, Azure Blob, GSC, or Kafka) requires SELECT, FILE WRITE , and OUTBOUND privileges.

    Without the required privileges, an error will be generated:

    SELECT ... INTO:
    singlestore> SELECT * FROM t1
    INTO S3 'testing/output'
    CONFIG '{"region":"us-east-1"}'
    CREDENTIALS '{"aws_access_key_id":"your_access_key_id","aws_secret_access_key":"your_secret_access_key"}';
    ERROR 1142 (42000): OUTBOUND command denied to user 'user1'@'%' for table 't1'

    Note

    Users will still be able to execute SELECT ... FROM; but only if outbound connectivity is not created.

    Here's how to set the privileges for a user to be allowed to execute this command:

    SELECT ... INTO:
    singlestore> GRANT SELECT, FILE WRITE, OUTBOUND ON *.* to user1;

    The OUTBOUND privilege will be displayed in the SHOW GRANTS command:

    SHOW GRANTS FOR user1;
    +-----------------------------------------------------------+
    | Grants for user1@%                                        |
    +-----------------------------------------------------------+
    | GRANT SELECT, FILE WRITE, OUTBOUND ON *.* TO 'user1'@'%   |
    +-----------------------------------------------------------+
  • Below is an S3 example for SELECT * FROM <TABLE> INTO OUTFILE using compression.

    SELECT * FROM <table_name> into S3 's3://<bucket_name>/<filename>'
    CONFIG '{"region":"us-east-1"}'
    CREDENTIALS
    {"aws_access_key_id": "<xxxxxxxx>",
    "aws_secret_access_key": "<xxxxxxxxxxxxxxx>"}'
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    WITH COMPRESSION GZIP;

Aggregations for Expression Syntax

SingleStore supports these Aggregate Functions for expression syntax in SELECT statements:

All aggregate functions exclude NULLs from their computations. For example, SELECT COUNT(c) FROM t returns the number of rows in t where c is not NULL. SELECT AVG(c) FROM t returns the average of the non-NULL c’s in t.

COUNT(*) is equivalent to COUNT(1). It counts every row matching the query, even rows whose column data contain nothing but NULLs.

SELECT … LIMIT

Syntax

LIMIT {[offset,] row_count | row_count OFFSET offset}

Remarks

  • The LIMIT clause constrains the number of rows returned by the SELECT statement.

  • Both the arguments must be non-negative integer constants.

  • The row_count specifies the number of rows to return from the beginning of the result set, and the offset specifies the offset of the first row to return.

  • The offset of the first row in a table is 0 (not 1).

Examples

SELECT * FROM hrRec;
+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Adam      | Gale      | Brooklyn |     40 |
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
| Todd      | Bridges   | DC       |     30 |
| Ron       | Fletcher  | New York |     23 |
+-----------+-----------+----------+--------+
SELECT * FROM hrRec LIMIT 2;
+-----------+----------+----------+--------+
| FirstName | LastName | City     | Tenure |
+-----------+----------+----------+--------+
| Adam      | Gale     | Brooklyn |     40 |
| Samantha  | Beck     | New York |     44 |
+-----------+----------+----------+--------+
SELECT * FROM hrRec LIMIT 1,2;
+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
+-----------+-----------+----------+--------+
SELECT * FROM hrRec LIMIT 2 OFFSET 1;
+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
+-----------+-----------+----------+--------+

SELECT … FOR UPDATE

The SELECT ... FOR UPDATE command is intended to be used inside of a transaction. This command takes write locks on the rows returned from the SELECT query and the locks are held until the end of the transaction. Other transactions are blocked and cannot write to the locked rows until the locks are released.

SingleStore recommends the following when using SELECT ... FOR UPDATE:

  • SingleStore recommends to commit or abort the transaction immediately so that the locks are released. Too many locks in a transaction can result in a huge queue of transactions waiting on the locked rows.

  • To limit the number of rows that are locked, SingleStore recommends to use a column with unique values in the WHERE clause of the SELECT statement, for example the PRIMARY KEY column.

Example

The following example uses the Orders table:

DESCRIBE Orders;
+--------------+-------------+------+------+---------+-------+
| Field        | Type        | Null | Key  | Default | Extra |
+--------------+-------------+------+------+---------+-------+
| OrderNumber  | int(11)     | NO   | PRI  | NULL    |       |
| Status       | varchar(20) | NO   |      | NULL    |       |
| CustomerName | char(20)    | YES  |      | NULL    |       |
+--------------+-------------+------+------+---------+-------+
SELECT * FROM Orders;
+-------------+------------+--------------+
| OrderNumber | Status     | CustomerName |
+-------------+------------+--------------+
|           1 | Delivered  | John         |
|           3 | In Transit | Bon          |
|           2 | Delivered  | Kerry        |
|           4 | Delivered  | Tom          |
+-------------+------------+--------------+

The following transaction locks the row where OrderNumber is 3, using the FOR UPDATE clause. The row will remain locked as long as the transaction is open.

BEGIN;
SELECT * FROM Orders WHERE OrderNumber = 3 FOR UPDATE;
+-------------+------------+--------------+
| OrderNumber | Status     | CustomerName |
+-------------+------------+--------------+
|           3 | In Transit | Bon          |
+-------------+------------+--------------+

Now, execute the following query in a different connection:

UPDATE Orders SET Status = "Delivered" WHERE OrderNumber=3;
ERROR 1205 (HY000): Leaf Error (127.0.0.1:3307): Lock wait timeout exceeded; try restarting transaction.  Lock owned by connection id 77, query `open idle transaction`

The above query returns an error since the rows are locked by the previous transaction. To remove the lock on the rows, commit or rollback the open transaction. See the Query Errors topic for more information on locked rows.

JOIN and Subqueries

Syntax

SingleStore Helios supports the following JOIN and subquery syntax for the table_reference part of SELECT statements:

join_table:
table_reference {LEFT | RIGHT | FULL} [OUTER] JOIN table_factor join_condition
| table_reference [INNER | CROSS] JOIN table_factor [join_condition]
| table_reference NATURAL {LEFT | RIGHT} [OUTER] JOIN table_factor
| table_reference STRAIGHT_JOIN table_factor [join_condition]
| table_reference, table_factor
join_condition:
ON conditional_expr

Remarks

  • STRAIGHT_JOIN forces tables to be joined in the order in which they are listed in the FROM clause

  • FULL OUTER JOIN requires the join condition to be an equality.

  • SingleStore supports joining tables across databases:

    CREATE DATABASE test1;
    USE test1;
    CREATE TABLE t1(id INT, col1 VARCHAR(10));
    CREATE DATABASE test2;
    USE test2;
    CREATE TABLE t2(id INT, col1 VARCHAR(10));
    SELECT test1.t1.*, test2.t2.* FROM test1.t1 JOIN test2.t2 ON t1.id = t2.id;

Examples

SELECT * FROM my_MemSQL_table WHERE col = 1;
SELECT COUNT(*), user_name, page_url from clicks, users, pages
-> WHERE clicks.user_id = users.user_id AND pages.page_id = clicks.page_id
-> GROUP BY users.user_id, pages.page_id
-> ORDER BY COUNT(*) DESC;
+- ---------+- ----------+- -------------------------------+
| COUNT(*) | user_name | page_url |
+- ---------+- ----------+- -------------------------------+
| 5 | jake | memsql.com |
| 2 | john | http://www.singlestore.com/download |
| 1 | jake | docs.singlestore.com |
| 1 | jake | memsql.com |
| 1 | jake | http://www.singlestore.com/download |
+- ---------+- ----------+- -------------------------------+
5 rows in set (0.00 sec)
SELECT t1.*, t2.* FROM t1 FULL OUTER JOIN t2 ON t1.a = t2.a;

Nested Scalar Sub-Selects

SELECT statements can be nested in SingleStore Helios queries.

SELECT ... [SELECT ...[SELECT [...]]]

Remarks

  • For scalar sub-selects, sub-select queries must not return more than one row.

  • The maximum allowed depth of nested sub-select queries is 40.

  • Sub-selects are not supported inside GROUP BY/ORDER BY/HAVING clauses, for nested sub-select queries of depth > 2.

Examples

The following examples show the use of nested sub-selects.

SELECT cust_id FROM customers
WHERE EXISTS
( SELECT order_id FROM orders
WHERE order_id IN
( SELECT id FROM transaction WHERE count > 5)
);
DELETE FROM records
WHERE
id = ( SELECT order_id
FROM orders
WHERE order_date > ( SELECT CURRENT_DATE() + 30)
);

JOIN and USING

Syntax

Using a JOIN clause with the USING clause will match only one column when there are more than one columns that match.

SELECT <table1_name>.<column_name> AS <column_name>
FROM <table_name> JOIN <table_name> USING (<column_name>);

Remarks

  • You can utilize the USING clause instead of the ON clause in JOIN operations with an explicit JOIN clause.

Examples

TABLE 1:
CREATE TABLE assets(
asset_id int,
asset_type varchar(50),
asset_desc varchar(50),
emp_id numeric (5)
);
TABLE 2:
CREATE TABLE
employees(
emp_id numeric (5),
emp_name varchar(75)
);
SELECT employees.emp_name AS employees, asset_desc as assets, assets.asset_type
    FROM assets JOIN employees USING (emp_id);
+------------------------------------+
|employees |assets      |asset_type  |
+------------------------------------+
|T_Willams |macbook pro | laptop     |
+------------------------------------+
|A_Young   |iphone 12   | cell phone |
+------------------------------------+
|D_Karras  |macbook air | laptop     |
+------------------------------------+

SELECT … INTO <variable>

SELECT ... INTO variable statement is used to initialize variables in stored procedures, anonymous code blocks, and session variables. It allows you to query a single value, a single column, or multiple columns from one or more tables and store the selected values into variable(s).

For information about creating user defined variables for use outside of stored procedures, see User-Defined Variables.

Remarks

  • The SELECT ... INTO statement must return only a single result row.

  • The number of columns/expressions in the SELECT query must be the same as the number of variables being initialized in the INTO list.

  • SELECT ... INTO variable statements must be used inside PSQL procedure blocks.

  • The INTO variable clause can only be used once inside a SELECT query.

  • A SELECT ... INTO statement cannot be used inside of a sub-select query.

  • The variables in SELECT ... INTO statements must be declared with scalar data types.

  • SELECT ... INTO variable and SELECT ... INTO OUTFILE/S3/KAFKA cannot be used in the same query.

  • If the SELECT ... INTO statement returns 0 rows, SingleStore throws an error: ER_INTO_VARIABLES_NO_ROWS. To accommodate for this error, you can specify an EXCEPTION in the DECLARE block of a stored procedure (see Example 3 below).

Examples

Note: The following examples use the hrRec table.

DESC hrRec;
+-----------+-------------+------+------+---------+-------+
| Field     | Type        | Null | Key  | Default | Extra |
+-----------+-------------+------+------+---------+-------+
| FirstName | varchar(20) | YES  |      | NULL    |       |
| LastName  | varchar(20) | YES  |      | NULL    |       |
| City      | varchar(20) | YES  |      | NULL    |       |
| Tenure    | int(11)     | YES  |      | NULL    |       |
+-----------+-------------+------+------+---------+-------+

Example 1

The following example queries multiple columns from a single row and stores them in variables.

DELIMITER //
CREATE OR REPLACE PROCEDURE into_var ()
AS
DECLARE
fname VARCHAR(20); lname VARCHAR(20); city VARCHAR(20); ten INT;
BEGIN
SELECT FirstName, LastName, City, Tenure
INTO fname, lname, city, ten
FROM hrRec
WHERE Tenure > 40;
ECHO SELECT CONCAT(fname, " ", lname) AS "Name", city AS "City", ten AS "Tenure";
END //
DELIMITER ;
CALL into_var();
+---------------+----------+--------+
| Name          | City     | Tenure |
+---------------+----------+--------+
| Samantha Beck | New York |     44 |
+---------------+----------+--------+

Example 2

The following example queries multiple aggregate functions and stores their values in variables.

DELIMITER //
CREATE OR REPLACE PROCEDURE into_var () AS
DECLARE
row_c INT; sum_t INT;
BEGIN
SELECT COUNT(*), SUM(Tenure) INTO row_c, sum_t FROM hrRec;
ECHO SELECT row_c AS "Row Count", sum_t AS "Tenure";
END //
DELIMITER ;
CALL into_var();
+-----------+--------+
| Row Count | Tenure |
+-----------+--------+
|         5 |    170 |
+-----------+--------+

Alternatively, you can query the values in a dynamic query and store them in variables, like

DELIMITER //
CREATE OR REPLACE PROCEDURE into_var () AS
DECLARE
row_c INT; sum_t INT; qry VARCHAR(50);
BEGIN
qry = "SELECT COUNT(*), SUM(Tenure) FROM hrRec";
EXECUTE IMMEDIATE qry INTO row_c, sum_t;
ECHO SELECT row_c AS "Row Count", sum_t AS "Tenure";
END //
DELIMITER ;
CALL into_var();
+-----------+--------+
| Row Count | Tenure |
+-----------+--------+
|         5 |    170 |
+-----------+--------+

Example 3

The following example shows how to assign a value to a variable if the SELECT ... INTO statement returns 0 rows.

DELIMITER //
CREATE OR REPLACE PROCEDURE cityCount () AS
DECLARE cnum INT;
BEGIN
SELECT City INTO cnum FROM hrRec WHERE Tenure > 50;
ECHO SELECT cnum;
END //
DELIMITER ;
CALL cityCount();
ERROR 2439 (HY000): Unhandled exception
Type: ER_INTO_VARIABLES_NO_ROWS (2439)
Message: Query with 'INTO VARIABLES' clause returned zero rows whereas expected 1 row

Add an exception to the stored procedure.

DELIMITER //
CREATE OR REPLACE PROCEDURE cityCount () AS
DECLARE cnum INT;
BEGIN
SELECT City INTO cnum FROM hrRec WHERE Tenure > 50;
ECHO SELECT cnum;
EXCEPTION
WHEN ER_INTO_VARIABLES_NO_ROWS THEN
cnum = 'No city found!';
ECHO SELECT cnum;
END //
DELIMITER ;
CALL cityCount();
+----------------+
| cnum           |
+----------------+
| No city found! |
+----------------+

You can also add an exception for a scenario where the SELECT ... INTO statement returns more than one row. For example,

EXCEPTION
WHEN ER_SUBQUERY_NO_1_ROW THEN
ECHO SELECT 'SELECT ... INTO statement returned more than 1 row.';

SELECT ... ALL DISTINCT DISTINCTROW

Return rows with or without duplicates.

Syntax

SELECT [ALL | DISTINCT | DISTINCTROW] FROM table_reference

Remarks

  • The ALL clause returns all the matching rows including duplicates.

    The DISTINCT or DISTINCTROW clause returns the matching rows but eliminates the duplicate rows from the output. DISTINCTROW is a synonym of DISTINCT.

Examples

SELECT * FROM Product;
+------------+------------+----------+
| Product_id | Brand_name | City     |
+------------+------------+----------+
|          1 | Nike       | London   |
|          3 | Nike       | New York |
|          2 | Adidas     | Paris    |
|          3 | Puma       | Spain    |
+------------+------------+----------+
SELECT ALL Brand_name FROM Product;
+------------+
| Brand_name |
+------------+
| Nike       |
| Nike       |
| Adidas     |
| Puma       |
+------------+
SELECT DISTINCT Brand_name FROM Product;
+------------+
| Brand_name |
+------------+
| Nike       |
| Adidas     |
| Puma       |
+------------+

SELECT ... INTO AZURE

SELECT ... INTO AZURE performs a distributed SELECT into an Azure container. The CREDENTIALS clause is required and its value is defined in credentials_json as follows:

credentials_json:
'{"account_name": "your_account_name_here",
"account_key": "your_account_key_here"
}'

The WITH COMPRESSION GZIP clause, if specified, must be included after the CREDENTIALS clause. If WITH COMPRESSION GZIP is specified, the results of the SELECT query are written to the Azure container in compressed gzip files. A compressed file with the .gz extension is created for each partition.

The WITH COMPRESSION clause can be specified instead of the WITH COMPRESSION GZIP clause. Both clauses are equivalent. However, if other compression formats are supported in the future, WITH COMPRESSION may use a compression format other than gzip.

format_options , if specified, must be included after the CREDENTIALS clause, or after the WITH COMPRESSION clause (if it is specified). The format_options are similar to the parsing options used with LOAD DATA. See Format Options for more information.

format_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]

As an alternative to using a SELECT ... INTO AZURE statement where you specify the CREDENTIALS clause, you can use a SELECT ... INTO LINK statement, where you reference a connection link. For more information, see Configuring and Using Connection Links.

SELECT … INTO HDFS

SELECT INTO ... HDFS writes the result of a SELECT query to HDFS. The configuration options for SELECT INTO ... HDFS are similar to the HDFS Pipelines syntax.

Example

The following example demonstrates how to write the contents of a table to HDFS at a specified path.

SELECT * FROM stockTo
INTO HDFS 'hdfs://hadoop-namenode:8020/stock_dir/records_file.csv'

SELECT ... INTO LINK writes the results of a SELECT query to S3, GCS, HDFS, or Kafka using a connection link. To use this command, you only need to know the connection link name, not the connection details and configuration. However, you require the SHOW LINK permission, provided by your administrator, to use a connection link.

Note

SASL OAUTHBEARER is supported with CREATE PIPELINE but not with SELECT INTO KAFKA.

Example

The following example writes the contents of the table t1, to the S3 bucket at the specified path, using the S3 connection link S3con stored in the db1 database:

USE db1;
SELECT * FROM t1 INTO LINK S3con 'testing/output';

Note: The connection link S3Con should already exist in db1. Refer to the CREATE LINK topic for details on creating a connection link.

SELECT … INTO S3

SELECT ... INTO S3 performs a distributed select into a S3 bucket. The CREDENTIALS clause is required and its value is defined in credentials_json as follows.

configuration_json:
'{"region":"your_region" [, "multipart_chunk_size_mb":<size_in_MB>]}'
credentials_json:
'{"aws_access_key_id": "replace_with_your_access_key_id",
"aws_secret_access_key": "replace_with_your_secret_access_key",
["aws_session_token": "replace_with_your_temp_session_token",]
}'

The WITH COMPRESSION GZIP clause, if specified, must be included after the CREDENTIALS clause. If WITH COMPRESSION GZIP is specified, the results of the SELECT query are written to the S3 bucket in compressed gzip files. A compressed file with the .gz extension is created for each partition.

The WITH COMPRESSION clause can be specified instead of the WITH COMPRESSION GZIP clause. Both clauses are equivalent. However, if other compression formats are supported in the future, WITH COMPRESSION may use a compression format other than gzip.

format_options , if specified, must be included after the CREDENTIALS clause, or after the WITH COMPRESSION clause (if it is specified). The format_options are similar to the parsing options used with LOAD DATA. See Format Options for more information.

format_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]

Remarks

Warning

The S3 bucket needs to be created before running this command.

  • The multipart_chunk_size_mb must be in the range of [5..500]. By default, the chunk size is 5 MB. A larger chunk size allows users to upload large files without going over Amazon’s limitation on maximum number of parts per upload. Although, a larger chunk size increases the chance of a network error during the upload to S3. If a chunk fails to upload, SingleStore Helios retries uploading it until the limit set on the number of retries by AWS is reached.

    Note

    Each partition will use multipart_chunk_size_mb MB(s) of additional memory.

  • The output of SELECT ... INTO S3 is stored with the content type binary/octet-stream in the S3 bucket.

  • If the insert select logic determines that the SELECT ... INTO S3 query can be run as distributed, the query will be pushed down to each leaf and executed on each partition. The name of each object will be:

    <bucket/target>_<partition ID>
  • If the insert select logic determines that the SELECT ... INTO S3 query can only be on the aggregator because it contains aggregations, ORDER BY, GROUP BY, etc. then the query will be run on each leaf but the result will be collected on the aggregator and then output to S3. The object name will just be:

    <bucket/target>
  • The SELECT query will validate if the <bucket/target> or <bucket/target>_ already exists on the S3 bucket first and fail if any of the object(s) already exist.

  • As an alternative to using a SELECT ... INTO S3 statement where you specify the CONFIG and CREDENTIALS clauses, you can use a SELECT ... INTO LINK statement, where you reference a connection link. For more information, see Configuring and Using Connection Links.

Examples

The following simple example shows how to save all rows in table t1 to an S3 bucket using an AWS access key.

SELECT *
FROM t1
INTO S3 'testing/output'
CONFIG '{"region":"us-east-1"}'
CREDENTIALS '{"aws_access_key_id":"your_access_key_id","aws_secret_access_key":"your_secret_access_key"}'

The following example uses the format options to output the data in CSV format.

SELECT *
FROM t
INTO S3 'tmp/a'
CONFIG '{"region":"us-east-1"}'
CREDENTIALS '{"aws_access_key_id":"your_access_key_id","aws_secret_access_key":"your_secret_access_key"}'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'

SELECT … INTO GCS

SELECT ... INTO GCS performs a distributed select into a Google Cloud Storage (GCS) bucket.The CONFIG clause is not required, and can be excluded or left empty as defined in the configuration_json that follows. The CREDENTIALS clause is required and its value is defined in the credentials_json that follows.

configuration_json:
'{}'
credentials_json:
'{"access_id": "replace_with_your_google_access_key",
"secret_key": "replace_with_your_google_secret_key"
}'

The WITH COMPRESSION GZIP clause, if specified, must be included after the CREDENTIALS clause. If WITH COMPRESSION GZIP is specified, the results of the SELECT query are written to the GCS bucket in compressed gzip files. A compressed file with the .gz extension is created for each partition.

The WITH COMPRESSION clause can be specified instead of the WITH COMPRESSION GZIP clause. Both clauses are equivalent. However, if other compression formats are supported in the future, WITH COMPRESSION may use a compression format other than gzip.

format_options , if specified, must be included after the CREDENTIALS clause, or after the WITH COMPRESSION clause (if it is specified). The format_options are similar to the parsing options used with LOAD DATA. See Format Options for more information.

format_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]

Remarks

  • Unlike other filesystem options, with SELECT ... INTO GCS, data written by a single partition is not divided into chunks. However, there may still be multiple partitions and thus multiple files (one per partition) will appear in GCS.

  • The maximum object size supported by GCS is 5 TB; this limit also applies to SELECT ... INTO GCS.

  • The CONFIG clause may optionally specify an endpoint_url.

  • The CREDENTIALS clause is required.

  • We support only HMAC keys.

  • The CREDENTIALS clause should be a JSON object with two fields:

access_id: usually a 24 or 60 character alphanumeric string, which is linked to the Google account, typically all uppercase and starts with GOOG.

secret_key: usually a 40 character Base-64 encoded string that is linked to a specific access_id.

As an alternative to using a SELECT ... INTO GCS statement where you specify the CONFIG and CREDENTIALS clauses, you can use a SELECT ... INTO LINK statement, where you reference a connection link. For more information, see Configuring and Using Connection Links.

Examples

The following simple example shows how to save all rows in table table_name to a GCS bucket using a Google access key, and outputs them as a CSV.

CREATE TABLE table_name (column_name INT);
INSERT INTO table_name VALUES (1), (2), (3);
SELECT *
FROM table_name
INTO GCS 'bucket/path'
CREDENTIALS '{"access_id": "replace_with_your_google_access_key", "secret_key": "replace_with_your_google_secret_key"}' FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';

SELECT … INTO KAFKA …

SELECT ... INTO KAFKA ... runs a SELECT query, constructs a Kafka message for each row in the result set, and publishes the messages to a Kafka topic.

When SELECT ... INTO KAFKA ... constructs a Kafka message, it includes every column value in the result set's row and separates the column values by a delimiter. To configure the delimiter and other message formatting settings, use the Format Options clauses.

kafka_configuration:
CONFIG 'string'
kafka_credentials:
CREDENTIALS 'string'
format_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[TERMINATED BY 'string']
[STARTING BY 'string']
]
kafka_topic_endpoint:
host:port[, ...]/topic

Arguments

kafka_configuration

Optional. The Kafka configuration properties, specified in JSON format, that are used while SingleStore Helios publishes messages to a topic. The configuration properties are the same properties that are stored in the server.properties file on each Kafka broker.

kafka_credentials

Optional. The credentials, in JSON format, used to connect to Kafka.

format_options

Optional. Clauses that specify how a Kafka message is formatted when it is constructed from a row in the SELECT result set. See the Format Options section for more information.

See an example of how to use the clauses.

kafka_topic_endpoint

The list of Kafka brokers, followed by the topic to which SingleStore Helios will publish messages. For each broker in the list, specify its host and port.

KAFKA KEY <column_name>

Optional. Specify the provided column name as the Kafka key.

KAFKA KEY should be used only for kafka targets.

The Kafka key column name must be a valid column name in the database table.

The Kafka key column name must be the same as the first column name of the SELECT INTO statement.

Remarks

  • When possible, SingleStore Helios queries leaf nodes directly, bypassing the aggregator. This allows SingleStore Helios to send data directly from the leaves to Kafka partitions.

  • SingleStore Helios constructs a Kafka message as an array of bytes.

  • As an alternative to using a SELECT ... INTO KAFKA statement where you specify the CONFIG and CREDENTIALS clauses, you can use a SELECT ... INTO LINK statement, where you reference a connection link. For more information, see Configuring and Using Connection Links.

Examples

Example: Specifying the Kafka Message Format

The following query uses the FIELDS and LINES clauses to format the Kafka messages that are constructed from the SELECT result set.

SELECT col1, col2, col3 FROM t
ORDER BY col1
INTO KAFKA 'host.example.com:9092/test-topic'
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY "\t"
LINES TERMINATED BY '}' STARTING BY '{';

Suppose the result set returned by SELECT col1, col2, col3 FROM t ORDER BY col1 is:

+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| a | b | c |
| d | e | f |
| g | h\ti | j |
+------+------+------+

This result set will be converted to three Kafka messages having the following format:

Message 1: {a,b,c}

Message 2: {d,e,f}

Message 3: {g,h<tab character>i,j}

<tab character> will be replaced by the number of spaces that your tab is set to.

Note

If a SELECT ... INTO ... KAFKA query does not include the FIELDS... and LINES... clauses, the default Kafka message formatting settings are used. The default settings would look something like this:

SELECT col1, col2, col3 FROM t
ORDER BY col1
INTO KAFKA 'host.example.com:9092/test-topic'
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY '';

Example: Publishing Kafka Messages to Confluent Cloud

SELECT text FROM t INTO
KAFKA '<Confluent Cloud workspace endpoint>/test-topic'
CONFIG '{"sasl.username": "<CLUSTER_API_KEY>",
"sasl.mechanism": "PLAIN",
"security.protocol": "SASL_SSL",
"ssl.ca.location": "/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem"}'
CREDENTIALS '{"sasl.password": "<CLUSTER_API_SECRET>"}';

Prior to running the SELECT ... INTO statement, you must upload an SSL certificate to the SingleStore Helios. This is explained in the next paragraph.

Use the following steps to enable TLS/SSL encryption between SingleStore Helios and Kafka.

  1. From the Workspaces page on the Cloud Portal, click on the workspace on which to enable TLS/SSL connections.

  2. Click the Security tab at the top of the page.

  3. Click on the Upload Certificate button to upload your CA certificate. This will make it available to all the nodes and will allow you to secure outbound connections via TLS/SSL.

Example: Specifying One Kafka Broker

The following example imports the data in the column text from table t into the Kafka topic test-topic. One Kafka broker is specified; it is located on port 9092 at host.example.com.

SELECT text FROM t INTO
KAFKA 'host.example.com:9092/test-topic'

Example: Specifying Multiple Kafka Brokers

The following example imports the data in the column text from table t into the Kafka topic test-topic. Three Kafka brokers are specified, located on port 9092 , at host1.example.com, host2.example.com and host3.example.com.