SELECT

Retrieves data from a table.

Syntax

[with_clause]
SELECT
[ALL | DISTINCT | DISTINCTROW]
select_expr [[AS] alias], ...
[COLLATE collation]
[INTO variable [, ...]]
[ FROM table_references
[WHERE expr]
[GROUP BY [{{col_name | expr | position}, ...} | ALL]]
| extended_grouping_expr}]
[HAVING expr]
[ORDER BY [{col_name | expr | position} [ASC | DESC ][ NULLS {FIRST | LAST}], ... | ALL]]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]
[FOR UPDATE]
]
[{ INTO OUTFILE 'file_name' |
INTO FS 'destination_directory/file_name' [WITH COMPRESSION] |
INTO S3 bucket/target CONFIG configuration_json CREDENTIALS credentials_json [WITH COMPRESSION] |
INTO HDFS '<hdfs://<namenode DNS> | <IP address>:<port>/<directory>' [ CONFIG configuration_json ] |
INTO GCS bucket/path CONFIG configuration_json CREDENTIALS credentials_json [WITH COMPRESSION] |
INTO KAFKA kafka_topic_endpoint [kafka_configuration] [kafka_credentials] [KAFKA KEY <column_name>] |
INTO AZURE "container/blob-prefix" CREDENTIALS credentials_json [WITH COMPRESSION]
}
[format_options]
]
[INTO LINK [db name.]connection_name 'backup path']
format_options:
csv_options | external_format_options
csv_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
external_format_options:
FORMAT PARQUET
table_references:
table_factor | join_table
table_factor:
tbl_name [[AS] alias] [sample_ratio_clause]
| (subquery) [[AS] alias]

Remarks

  • The join_table clause is documented in JOIN and Subqueries.

  • The join and using clause is documented in JOIN and USING.

  • The LATERAL keyword can be placed before any subquery (also known as a lateral derived table) used as an inner table within a join. Refer to Lateral Join for details.

  • COLLATE specifies the collation of a string literal.

    SELECT "My string literal" COLLATE utf8mb4_unicode_ci;
  • CONFIG and CREDENTIALS can be specified in either order (CONFIG followed by CREDENTIALS or CREDENTIALS followed by CONFIG). For configuration examples refer BACKUP DATABASE

  • format_options clauses are documented in the Format Options section.

  • sample_ratio_clause is documented in the SELECT … WITH (SAMPLE_RATIO = <value>) section.

  • with_clause is documented on the WITH (Common Table Expressions) page.

  • If you are querying against a columnstore table with a FULLTEXT index defined, see MATCH and HIGHLIGHT for syntax and examples on how to do full-text search queries.

  • Non-aggregated data can also be transformed into a pivot table output format. See PIVOT for syntax and examples.

  • extended_grouping_expr clauses include CUBE and ROLLUP. See CUBE and ROLLUP for syntax and examples.

  • A subquery does not require an alias, assuming that removing the alias does not create ambiguity.

  • In a transaction, you can read from multiple databases.

  • This command must be run on the master aggregator or a child aggregator node (see Cluster Management Commands).

  • You can specify a table named DUAL as a placeholder table for use with queries that do not reference a table. DUAL contains one logical row. This convenience is provided for situations where you may be required to include a FROM clause in every SQL statement, even statements that do not need one in order to function. It can also help when porting applications from other database products that support DUAL.

    SELECT CURRENT_DATE from DUAL;
    +--------------+
    | CURRENT_DATE |
    +--------------+
    | 2023-04-03   |
    +--------------+

    SELECT CURRENT_DATE from DUAL is identical to SELECT CURRENT_DATE.

    SELECT CURRENT_DATE;
    +--------------+
    | CURRENT_DATE |
    +--------------+
    | 2023-04-03   |
    +--------------+
  • A column expression can reference a column alias. For example, in the query SELECT a + b as c, c + 10 FROM t;, the column expression c + 10 references the column alias c.

  • A WHERE clause can reference a column alias. For example, in the query SELECT a + b as c FROM t WHERE c < 100;, the WHERE clause references the column alias c.

  • A HAVING clause can reference a column alias. For example, in the query SELECT a + 50 as b,COUNT(*) FROM t GROUP BY a HAVING b < 1000;, the HAVING clause references the column alias b.

  • Queries using the WHERE. . .IN clause with multiple columns is not supported.

    SELECT * FROM <table_name_1> WHERE (column_name_1, column_name_2) IN (SELECT column_name_1, column_name_2 FROM <table_name_2>;)
  • The following table and dataset were used for the examples

    CREATE TABLE assets(
    asset_id INT,
    asset_type VARCHAR(50),
    asset_desc VARCHAR(50),
    asset_value DECIMAL(6,2),
    emp_id NUMERIC (5));
    INSERT INTO assets VALUES('1049','laptop','mac_book_pro','1999.00','14'),
    ('49', 'cell_phone', 'iphone_12','879.00','102'),
    ('1100', 'laptop', 'mac_book_pro','1999.00','298'),
    ('2037', 'laptop', 'mac_book_air_M2','1199.00', '399'),
    ('58', 'cell_phone', 'iphone_12', '879.00','410'),
    ('130', 'cell_phone', 'iphone_13', '699','110'),
    ('210', 'laptop', 'mac_book_pro','2500.00','312'),
    ('111', 'laptop', 'mac_book_pro','2100.00', '089'),
    ('099', 'laptop', 'mac_book_air_M1','999','075'),
    ('140', 'cell_phone', 'iphone_13_pro','999.00', '263'),
    ('2100', 'laptop', 'mac_book_pro_M2', '2500.00', '691'),
    ('160', 'cell_phone', 'iphone_14_pro_max','1200.00', '691'),
    ('2110', 'laptop', 'mac_book_pro_M2', '2500.00', '817'),
    ('2120', 'laptop', 'mac_book_pro_M2', '2500.00', NULL),
    ('150', 'cell_phone', 'iphone_14_pro_','1100.00', NULL);
  • NULLS FIRST and NULLS LAST added to an ORDER BY clause sorts NULL values to the beginning or end of the results set. Ascending order is assumed for the non-NULL values. DESC can be added to the clause to order the non-NULL values in descending order.

    SELECT * FROM assets ORDER BY emp_id NULLS FIRST;
    +----------+------------+-------------------+-------------+--------+
    | asset_id | asset_type | asset_desc        | asset_value | emp_id |
    +----------+------------+-------------------+-------------+--------+
    |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
    |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
    |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
    |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
    |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
    |       49 | cell_phone | iphone_12         |      879.00 |    102 |
    |      130 | cell_phone | iphone_13         |      699.00 |    110 |
    |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
    |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
    |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
    |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
    |       58 | cell_phone | iphone_12         |      879.00 |    410 |
    |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
    |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
    |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
    +----------+------------+-------------------+-------------+--------+
    SELECT * FROM assets ORDER BY emp_id DESC NULLS FIRST;
    +----------+------------+-------------------+-------------+--------+
    | asset_id | asset_type | asset_desc        | asset_value | emp_id |
    +----------+------------+-------------------+-------------+--------+
    |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
    |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
    |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
    |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
    |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
    |       58 | cell_phone | iphone_12         |      879.00 |    410 |
    |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
    |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
    |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
    |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
    |      130 | cell_phone | iphone_13         |      699.00 |    110 |
    |       49 | cell_phone | iphone_12         |      879.00 |    102 |
    |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
    |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
    |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
    +----------+------------+-------------------+-------------+--------+
    SELECT * FROM assets ORDER BY emp_id DESC NULLS LAST;
    +----------+------------+-------------------+-------------+--------+
    | asset_id | asset_type | asset_desc        | asset_value | emp_id |
    +----------+------------+-------------------+-------------+--------+
    |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
    |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
    |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
    |       58 | cell_phone | iphone_12         |      879.00 |    410 |
    |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
    |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
    |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
    |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
    |      130 | cell_phone | iphone_13         |      699.00 |    110 |
    |       49 | cell_phone | iphone_12         |      879.00 |    102 |
    |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
    |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
    |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
    |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
    |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
    +----------+------------+-------------------+-------------+--------+
  • ORDER BY ALL sorts by all the columns in the same order they are written. ASC/DESC may be used but it will apply to all the columns.

    SELECT asset_id, asset_type, asset_desc FROM assets ORDER BY ALL;
    +----------+------------+-------------------+
    | asset_id | asset_type | asset_desc        |
    +----------+------------+-------------------+
    |       49 | cell_phone | iphone_12         |
    |       58 | cell_phone | iphone_12         |
    |       99 | laptop     | mac_book_air_M1   |
    |      111 | laptop     | mac_book_pro      |
    |      130 | cell_phone | iphone_13         |
    |      140 | cell_phone | iphone_13_pro     |
    |      150 | cell_phone | iphone_14_pro_    |
    |      160 | cell_phone | iphone_14_pro_max |
    |      210 | laptop     | mac_book_pro      |
    |     1049 | laptop     | mac_book_pro      |
    |     1100 | laptop     | mac_book_pro      |
    |     2037 | laptop     | mac_book_air_M2   |
    |     2100 | laptop     | mac_book_pro_M2   |
    |     2110 | laptop     | mac_book_pro_M2   |
    |     2120 | laptop     | mac_book_pro_M2   |
    +----------+------------+-------------------+
  • The GROUP BY clause is used with a SELECT statement to organize data into GROUPS. Aggregates are applied within those groups. One row of output is produced for each unique combination of values for the specified grouping expressions.

    SELECT asset_type, SUM(asset_value) AS total_value
    FROM assets GROUP BY asset_type;
    +------------+-------------+
    | asset_type | total_value |
    +------------+-------------+
    | cell_phone |     5756.00 |
    | laptop     |    18296.00 |
    +------------+-------------+
  • The GROUP BY ALL clause causes the query to group by all non-aggregate expressions in the SELECT list, so all the columns to be grouped by do not have to be entered explicitly.

    SELECT asset_id, asset_type, asset_desc, SUM(asset_value)
    FROM assets GROUP BY asset_id, asset_type, asset_desc;
    SELECT asset_id, asset_type, asset_desc, SUM(asset_value)
    FROM assets GROUP BY ALL;
    +----------+------------+-------------------+------------------+
    | asset_id | asset_type | asset_desc        | SUM(asset_value) |
    +----------+------------+-------------------+------------------+
    |      160 | cell_phone | iphone_14_pro_max |          1200.00 |
    |     2100 | laptop     | mac_book_pro_M2   |          2500.00 |
    |     2110 | laptop     | mac_book_pro_M2   |          2500.00 |
    |      140 | cell_phone | iphone_13_pro     |           999.00 |
    |       99 | laptop     | mac_book_air_M1   |           999.00 |
    |      210 | laptop     | mac_book_pro      |          2500.00 |
    |      150 | cell_phone | iphone_14_pro_    |          1100.00 |
    |       58 | cell_phone | iphone_12         |           879.00 |
    |       49 | cell_phone | iphone_12         |           879.00 |
    |     1049 | laptop     | mac_book_pro      |          1999.00 |
    |     2120 | laptop     | mac_book_pro_M2   |          2500.00 |
    |     1100 | laptop     | mac_book_pro      |          1999.00 |
    |      130 | cell_phone | iphone_13         |           699.00 |
    |     2037 | laptop     | mac_book_air_M2   |          1199.00 |
    |      111 | laptop     | mac_book_pro      |          2100.00 |
    +----------+------------+-------------------+------------------+
  • Below is an S3 example for SELECT * FROM <TABLE> INTO OUTFILE using compression.

    SELECT * FROM <table_name> into S3 's3://<bucket_name>/<filename>'
    CONFIG '{"region":"us-east-1"}'
    CREDENTIALS
    {"aws_access_key_id": "<xxxxxxxx>",
    "aws_secret_access_key": "<xxxxxxxxxxxxxxx>"}'
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    WITH COMPRESSION GZIP;

Aggregations for Expression Syntax

SingleStore supports these Aggregate Functions for expression syntax in SELECT statements:

All aggregate functions exclude NULLs from their computations. For example, SELECT COUNT(c) FROM t returns the number of rows in t where c is not NULL. SELECT AVG(c) FROM t returns the average of the non-NULL c’s in t.

COUNT(*) is equivalent to COUNT(1). It counts every row matching the query, even rows whose column data contain nothing but NULLs.

SELECT … LIMIT

Syntax

LIMIT {[offset,] row_count | row_count OFFSET offset}

Remarks

  • The LIMIT clause constrains the number of rows returned by the SELECT statement.

  • Both the arguments must be non-negative integer constants.

  • The row_count specifies the number of rows to return from the beginning of the result set, and the offset specifies the offset of the first row to return.

  • The offset of the first row in a table is 0 (not 1).

Examples

SELECT * FROM hrRec;
+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Adam      | Gale      | Brooklyn |     40 |
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
| Todd      | Bridges   | DC       |     30 |
| Ron       | Fletcher  | New York |     23 |
+-----------+-----------+----------+--------+
SELECT * FROM hrRec LIMIT 2;
+-----------+----------+----------+--------+
| FirstName | LastName | City     | Tenure |
+-----------+----------+----------+--------+
| Adam      | Gale     | Brooklyn |     40 |
| Samantha  | Beck     | New York |     44 |
+-----------+----------+----------+--------+
SELECT * FROM hrRec LIMIT 1,2;
+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
+-----------+-----------+----------+--------+
SELECT * FROM hrRec LIMIT 2 OFFSET 1;
+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
+-----------+-----------+----------+--------+

SELECT … FOR UPDATE

The SELECT ... FOR UPDATE command is intended to be used inside of a transaction. This command takes write locks on the rows returned from the SELECT query and the locks are held until the end of the transaction. Other transactions are blocked and cannot write to the locked rows until the locks are released.

SingleStore recommends the following when using SELECT ... FOR UPDATE:

  • SingleStore recommends to commit or abort the transaction immediately so that the locks are released. Too many locks in a transaction can result in a huge queue of transactions waiting on the locked rows.

  • To limit the number of rows that are locked, SingleStore recommends to use a column with unique values in the WHERE clause of the SELECT statement, for example the PRIMARY KEY column.

Example

The following example uses the Orders table:

DESCRIBE Orders;
+--------------+-------------+------+------+---------+-------+
| Field        | Type        | Null | Key  | Default | Extra |
+--------------+-------------+------+------+---------+-------+
| OrderNumber  | int(11)     | NO   | PRI  | NULL    |       |
| Status       | varchar(20) | NO   |      | NULL    |       |
| CustomerName | char(20)    | YES  |      | NULL    |       |
+--------------+-------------+------+------+---------+-------+
SELECT * FROM Orders;
+-------------+------------+--------------+
| OrderNumber | Status     | CustomerName |
+-------------+------------+--------------+
|           1 | Delivered  | John         |
|           3 | In Transit | Bon          |
|           2 | Delivered  | Kerry        |
|           4 | Delivered  | Tom          |
+-------------+------------+--------------+

The following transaction locks the row where OrderNumber is 3, using the FOR UPDATE clause. The row will remain locked as long as the transaction is open.

BEGIN;
SELECT * FROM Orders WHERE OrderNumber = 3 FOR UPDATE;
+-------------+------------+--------------+
| OrderNumber | Status     | CustomerName |
+-------------+------------+--------------+
|           3 | In Transit | Bon          |
+-------------+------------+--------------+

Now, execute the following query in a different connection:

UPDATE Orders SET Status = "Delivered" WHERE OrderNumber=3;
ERROR 1205 (HY000): Leaf Error (127.0.0.1:3307): Lock wait timeout exceeded; try restarting transaction.  Lock owned by connection id 77, query `open idle transaction`

The above query returns an error since the rows are locked by the previous transaction. To remove the lock on the rows, commit or rollback the open transaction. See the Query Errors topic for more information on locked rows.

JOIN and Subqueries

Syntax

SingleStore supports the following JOIN and subquery syntax for the table_reference part of SELECT statements:

join_table:
           table_reference {LEFT | RIGHT | FULL} [OUTER] JOIN table_factor join_condition
         | table_reference [INNER | CROSS] JOIN table_factor [join_condition]
         | table_reference NATURAL {LEFT | RIGHT} [OUTER] JOIN table_factor
         | table_reference STRAIGHT_JOIN table_factor [join_condition]
         | table_reference, table_factor
         | table_reference LEFT [OUTER] JOIN LATERAL (subquery) [[AS] alias] join_condition
         | table_reference [INNER | CROSS] JOIN LATERAL (subquery) [[AS] alias] [join_condition]
         | table_reference, LATERAL (subquery) [[AS] alias]


join_condition:
         ON conditional_expr

Remarks

  • This command must be run on the master aggregator or a child aggregator node (see Cluster Management Commands)

  • STRAIGHT_JOIN forces tables to be joined in the order in which they are listed in the FROM clause

  • FULL OUTER JOIN requires the join condition to be an equality.

  • SingleStore supports joining tables across databases:

    CREATE DATABASE test1;
    USE test1;
    CREATE TABLE t1(id INT, col1 VARCHAR(10));
    CREATE DATABASE test2;
    USE test2;
    CREATE TABLE t2(id INT, col1 VARCHAR(10));
    SELECT test1.t1.*, test2.t2.* FROM test1.t1 JOIN test2.t2 ON t1.id = t2.id;

Examples

SELECT * FROM my_MemSQL_table WHERE col = 1;
SELECT COUNT(*), user_name, page_url from clicks, users, pages
-> WHERE clicks.user_id = users.user_id AND pages.page_id = clicks.page_id
-> GROUP BY users.user_id, pages.page_id
-> ORDER BY COUNT(*) DESC;
+- ---------+- ----------+- -------------------------------+
| COUNT(*) | user_name | page_url |
+- ---------+- ----------+- -------------------------------+
| 5 | jake | memsql.com |
| 2 | john | http://www.singlestore.com/download |
| 1 | jake | docs.singlestore.com |
| 1 | jake | memsql.com |
| 1 | jake | http://www.singlestore.com/download |
+- ---------+- ----------+- -------------------------------+
5 rows in set (0.00 sec)
SELECT t1.*, t2.* FROM t1 FULL OUTER JOIN t2 ON t1.a = t2.a;

Nested Scalar Sub-Selects

SELECT statements can be nested in SingleStore queries.

SELECT ... [SELECT ...[SELECT [...]]]

Remarks

  • For scalar sub-selects, sub-select queries must not return more than one row.

  • The maximum allowed depth of nested sub-select queries is 40.

  • Sub-selects are not supported inside GROUP BY/ORDER BY/HAVING clauses, for nested sub-select queries of depth > 2.

Examples

The following examples show the use of nested sub-selects.

SELECT cust_id FROM customers
WHERE EXISTS
( SELECT order_id FROM orders
WHERE order_id IN
( SELECT id FROM transaction WHERE count > 5)
);
DELETE FROM records
WHERE
id = ( SELECT order_id
FROM orders
WHERE order_date > ( SELECT CURRENT_DATE() + 30)
);

Lateral Join

Lateral join allows a subquery in the FROM clause of a SQL query to reference another table in that same FROM clause. Typically such references are not allowed in SQL; by using the LATERAL keyword such references are allowed.

For example, the following query will generate an error because the subquery (SELECT b FROM t2 WHERE b = t1.a) references t1 which is not allowed.

SELECT *
FROM t1 LEFT JOIN (SELECT b FROM t2 WHERE b = t1.a) AS t3 ...;

Adding the LATERAL keyword before the subquery, as shown below, makes the reference to t1 allowed and the following query is legal. This query is called a lateral join, the subquery (SELECT b from t2 WHERE b = t1.a) is referred to as a lateral derived table, and the clause b = t1.a is referred to as the correlating condition.

SELECT *
FROM t1 LEFT JOIN LATERAL (SELECT b FROM t2 WHERE b = t1.a) AS t3 ...;

Syntax

Lateral joins use the same syntax as joins with the LATERAL keyword inserted before the lateral derived table as shown in the queries above. The syntax for lateral join is below.

join_table:
table_reference LEFT [OUTER] JOIN LATERAL (subquery) [[AS] alias] join_condition
| table_reference [INNER | CROSS] JOIN LATERAL (subquery) [[AS] alias] [join_condition]
| table_reference, LATERAL (subquery) [[AS] alias]
join_condition:
ON conditional_expr

The (subquery) [[AS] alias] clause is referred to as the lateral derived table.

Remarks

The following are restrictions on lateral join:

  • Lateral derived tables can appear only in FROM clauses. A comma-separated list of tables or a join specification with JOIN, INNER JOIN, CROSS JOIN, LEFT [OUTER] JOIN are supported.

  • The correlating condition can appear only in the WHERE clause of the lateral derived table.

  • The correlating condition must be an equality condition (uses =).

  • The correlating condition may use AND, but may not use OR.

  • Lateral derived tables may not use GROUP BY.

  • If a lateral derived table references an aggregate function, the function's aggregation query cannot be the one that owns the FROM clause in which the lateral derived table occurs.

  • If the lateral derived table has an ORDER BY … LIMIT clause, then the LIMIT must be >0.

  • TABLE functions are implicitly executed as a lateral join. The LATERAL keyword is not allowed before TABLE functions.

Examples

These examples use a table of artists and a table of albums by those artists. For simplicity we assume that each album is created by a single artist.

The statements below create artists and albums tables and insert data into those tables. Albums are linked to artists through the artist_id. Sales are in units of millions of dollars and the data is courtesy of List of best-selling albums - Wikipedia.

CREATE TABLE artists(id INT, name TEXT);
CREATE TABLE albums(album_id INT, title TEXT,
genre ENUM("R&B", "Soft Rock", "Jazz"),
artist_id INT, sales DOUBLE);
INSERT INTO artists VALUES
(1, "Fleetwood Mac"),
(2, "Whitney Huston"),
(3, "The Eagles"),
(4, "Norah Jones"),
(5, "ABBA");
INSERT INTO albums VALUES
(1, "The Bodyguard", "R&B", 2, 45),
(2, "Their Greatest Hits (1971-1975)", "Soft Rock", 3, 44),
(3, "Hotel California", "Soft Rock", 3, 42),
(4, "Rumours", "Soft Rock", 1, 40),
(5, "Come Away with Me", "Jazz", 4, 28);

Example 1 - INNER JOIN - Maximum Album Sales by Artist

The following query calculates, for each artist, the sales of their highest-selling album. This query can be written with LATERAL or with a GROUP BY as shown below.

The query below uses a comma-separated list of tables to specify the join.

SELECT AR.name, AL1.max_sales
FROM artists AR,
LATERAL (SELECT MAX(sales) as max_sales
FROM albums WHERE artist_id = AR.id) AS AL1
ORDER BY AL1.max_sales DESC;
+----------------+-----------+
| name           | max_sales |
+----------------+-----------+
| Whitney Huston |        45 |
| The Eagles     |        44 |
| Fleetwood Mac  |        40 |
| Norah Jones    |        28 |
+----------------+-----------+

LATERAL can also be used with a JOIN specification as shown in the query below.

SELECT AR.name, AL1.max_sales
FROM artists AR
JOIN LATERAL (SELECT MAX(sales) as max_sales
FROM albums
WHERE artist_id = AR.id) AS AL1
ORDER BY AL1.max_sales DESC;
+----------------+-----------+
| name           | max_sales |
+----------------+-----------+
| Whitney Huston |        45 |
| The Eagles     |        44 |
| Fleetwood Mac  |        40 |
| Norah Jones    |        28 |
+----------------+-----------+

The query below shows the same query with a GROUP BY instead of a LATERAL.

SELECT name, MAX(sales)
FROM artists JOIN albums ON artists.id = albums.artist_id
GROUP BY artists.id
ORDER BY sales DESC;
+----------------+------------+
| name           | MAX(sales) |
+----------------+------------+
| Whitney Huston |         45 |
| The Eagles     |         44 |
| Fleetwood Mac  |         40 |
| Norah Jones    |         28 |
+----------------+------------+

Example 2 - LEFT JOIN - Include Artists with No Album Sales Data

The queries in Example 1 found the maximum album sales for each artist, but did not include artists without album sales data. Including those artists requires a LEFT JOIN.

The following query uses LEFT JOIN LATERAL to find the maximum sales of any album by an artist including artists for which there is no album sales data. This query cannot be done with a GROUP BY. This query can also be written with the syntax LEFT OUTER JOIN.

LEFT JOIN requires a join condition, thus an ON clause is required despite the fact that the join is specified in the lateral derived table. Thus this query uses an empty join clause (ON = TRUE). The syntax INNER JOIN LATERAL could also be in this query.

SELECT AR.name, AL1.max_sales
FROM artists AR
LEFT JOIN LATERAL (SELECT sales, MAX(sales) as max_sales
FROM albums WHERE artist_id = AR.id) AS AL1 ON TRUE
ORDER BY AL1.max_sales DESC;
+----------------+-----------+
| name           | max_sales |
+----------------+-----------+
| Whitney Huston |        45 |
| The Eagles     |        44 |
| Fleetwood Mac  |        40 |
| Norah Jones    |        28 |
| ABBA           |      NULL |
+----------------+-----------+

Example 3 - Most Sales by Artist - With Album Name

The queries in Example 1 found the maximum album sales for each artist, but did not include the title of the album with those sales. The query below shows the artist, the album, and the sales for the album with the maximum sales for that artist. If two albums have equivalent maximum sales for an artist, both will be listed.

SELECT AR.name AS artist, AL2.title, AL1.max_sales
FROM artists AR
JOIN LATERAL (SELECT MAX(sales) as max_sales
FROM albums
WHERE artist_id = AR.id) AS AL1 ON TRUE
JOIN LATERAL (SELECT title
FROM albums
WHERE artist_id = AR.id
AND sales = AL1.max_sales) AS AL2 ON TRUE
ORDER BY AL1.max_sales DESC;
+----------------+---------------------------------+-----------+
| artist         | title                           | max_sales |
+----------------+---------------------------------+-----------+
| Whitney Huston | The Bodyguard                   |        45 |
| The Eagles     | Their Greatest Hits (1971-1975) |        44 |
| Fleetwood Mac  | Rumours                         |        40 |
| Norah Jones    | Come Away with Me               |        28 |
+----------------+---------------------------------+-----------+

Lateral joins are implemented with query rewrites. The command EXPLAIN DUPLICATE can be used to show the rewritten query. Below is the rewritten query that finds artist, album, and max sales. This query includes several subqueries. Using LATERAL can often simplify a query.

EXPLAIN DUPLICATE
SELECT AR.name AS artist, AL2.title, AL1.max_sales
FROM artists AR
JOIN LATERAL (SELECT MAX(sales) as max_sales
FROM albums
WHERE artist_id = AR.id) AS AL1 ON TRUE
JOIN LATERAL (SELECT title
FROM albums
WHERE artist_id = AR.id
AND sales = AL1.max_sales) AS AL2 ON TRUE
ORDER BY AL1.max_sales DESC;
SELECT `artists_4`.`name` AS `artist`, `SUBQ_VWW_2_6`.`title` AS `title`, 
       `SUBQ_VWW_1_5`.`max_sales` AS `max_sales` 
FROM (( ( 
  SELECT WITH(NO_MERGE_THIS_SELECT=1) `artists_4_0`.`name` AS `name`,     
        `artists_4_0`.`id` AS `id` 
  FROM  `test`.`artists` as `artists_4_0`  ) AS `artists_4` 
  JOIN  (
    SELECT WITH(NO_MERGE_THIS_SELECT=1) `SUBQ_VWW_1_5_0`.
      `max_sales` AS `max_sales`, `SUBQ_VWW_1_5_0`.`artist_id` AS `artist_id` 
    FROM  ( 
      SELECT `albums_2`.`artist_id` AS `artist_id`, 
             MAX(`albums_2`.`sales`) AS `max_sales` 
      FROM  `test`.`albums` as `albums_2`  GROUP BY 1 ) AS `SUBQ_VWW_1_5_0` ) 
                                                    AS `SUBQ_VWW_1_5`) 
   JOIN  ( 
    SELECT WITH(NO_MERGE_THIS_SELECT=1) `SUBQ_VWW_2_6_0`.`title` AS `title`,
       `SUBQ_VWW_2_6_0`.`artist_id` AS `artist_id`, 
       `SUBQ_VWW_2_6_0`.`sales` AS `sales` 
    FROM  `test`.`albums` as `SUBQ_VWW_2_6_0`  ) AS `SUBQ_VWW_2_6`) 
WHERE ((`SUBQ_VWW_2_6`.`artist_id` = `artists_4`.`id`) 
 AND (`SUBQ_VWW_2_6`.`sales` = `SUBQ_VWW_1_5`.`max_sales`) 
 AND (`SUBQ_VWW_1_5`.`artist_id` = `artists_4`.`id`) 
 AND (`SUBQ_VWW_1_5`.`artist_id` = `SUBQ_VWW_2_6`.`artist_id`)) 
ORDER BY 3 DESC

Example 4 - Album with the Most Sales by Genre

This example attempts to find the albums with the most sales by genre. Two queries which might be tried to answer this question, but which do not work, are shown below followed by a use of LATERAL which does work to answer this question. This exemplifies situations in which a user should consider using LATERAL.

The following is an attempt to find albums with the most sales by genre using a GROUP BY. This query is legal, but it generates only the max sales by genre, it does not include the album titles.

SELECT genre, MAX(sales)
FROM albums
GROUP BY genre;
+-----------+------------+
| genre     | MAX(sales) |
+-----------+------------+
| R&B       |         45 |
| Jazz      |         28 |
| Soft Rock |         44 |
+-----------+------------+

To try to add the album titles, the query below adds the album title to the SELECT statement. If using ANSI SQL this query will generate an error as the title is not in the GROUP BY attributes. If ANSI SQL is not used, this query will generate a warning and may generate unexpected results.

SELECT genre, title, MAX(sales)
FROM albums
GROUP BY genre;

Query run with strict ANSI SQL and the error message produced.

SET SQL_MODE = ANSI;
SELECT genre, title, MAX(sales)
FROM albums
GROUP BY genre;
ERROR 2295 (HY000): Column 'albums.title' from project list 
is used outside an aggregate function and does not appear in 
the GROUP BY clause. This is illegal because there may be 
multiple possible values for the field. If it is in fact 
uniquely determined, or if you wish to use one of the many 
possible values, you can use the ANY_VALUE aggregate function.

Query run without strict ANSI SQL and the warnings produced. The warnings can be seen with SHOW WARNINGS; however, those warnings only occur the first time the query is run.

SELECT genre, title, MAX(sales)
FROM albums
GROUP BY genre;
SHOW WARNINGS;
Warning | 2295 | Column 'albums.title' from project list is used 
outside an aggregate function and does not appear in the GROUP BY 
clause. This is illegal because there may be multiple possible values 
for the field. If it is in fact uniquely determined, or if you wish to 
use one of the many possible values, you can use the ANY_VALUE aggregate 
function.

Finally, the query below uses JOIN LATERAL to find the albums with the most sales by genre. The table reference AL must be used in the inner queries to refer to the albums table instance in the outer query.

SELECT AL2.genre, AL2.title, AL1.max_sales
FROM albums AL
JOIN LATERAL (SELECT MAX(sales) as max_sales
FROM albums
WHERE genre = AL.genre) AS AL1 ON TRUE
JOIN LATERAL (SELECT title, genre
FROM albums
WHERE title = AL.title
AND genre = AL.genre
AND sales = AL1.max_sales) AS AL2 ON TRUE
ORDER BY AL1.max_sales DESC;
+-----------+---------------------------------+-----------+
| genre     | title                           | max_sales |
+-----------+---------------------------------+-----------+
| R&B       | The Bodyguard                   |        45 |
| Soft Rock | Their Greatest Hits (1971-1975) |        44 |
| Jazz      | Come Away with Me               |        28 |
+-----------+---------------------------------+-----------+

If the dataset was larger and the request was to find the top 10 best selling albums per genre, that query will require a LATERAL or alternatively a window function.

Example 5 - TABLE

When the TABLE keyword is used, there is an implicit lateral join. As a result, the LATERAL keyword is not allowed to be used with TABLE. The first query below which joins a table t with a TABLE expression is allowed; the second query will return an error.

SELECT * FROM t, TABLE(JSON_TO_ARRAY('[1,2,3]'));
SELECT * FROM t, LATERAL TABLE(JSON_TO_ARRAY('[1,2,3]'));

Example 6 - RIGHT JOIN

RIGHT JOIN LATERAL is not supported and will produce the following error.

SELECT artists.name, AL1.max_sales
FROM artists
RIGHT JOIN LATERAL (SELECT sales, MAX(sales) as max_sales
FROM albums WHERE artist_id = artists.id) AS AL1 ON TRUE
ORDER BY AL1.max_sales DESC;
ERROR 1054 (42S22): Unknown column 'artists' in 'previous table'

Example 7 - Access to Multiple Levels Up

Lateral joins can reference fields from more than one level up in the query tree. The query below is the same as in Example 1 except that it outputs sales in dollars instead of millions of dollars. In this query the clause AR.id references a table two levels up in the query tree.

SELECT AR.name AS artist, AL2.max_sales
FROM artists AR
JOIN LATERAL
(SELECT FORMAT(AL1.max_sales_in_mil*1000000,0) as max_sales
FROM
(SELECT MAX(sales) as max_sales_in_mil
FROM albums
WHERE artist_id = AR.id) AS AL1
) AS AL2 ON TRUE
ORDER BY AL2.max_sales DESC;
+----------------+------------+
| artist         | max_sales  |
+----------------+------------+
| Whitney Huston | 45,000,000 |
| The Eagles     | 44,000,000 |
| Fleetwood Mac  | 40,000,000 |
| Norah Jones    | 28,000,000 |
+----------------+------------+

JOIN and USING

Syntax

Using a JOIN clause with the USING clause will match only one column when there are more than one columns that match.

SELECT <table1_name>.<column_name> AS <column_name>
FROM <table_name> JOIN <table_name> USING (<column_name>);

Remarks

  • You can utilize the USING clause instead of the ON clause in JOIN operations with an explicit JOIN clause.

Examples

TABLE 1:
CREATE TABLE assets(
asset_id int,
asset_type varchar(50),
asset_desc varchar(50),
emp_id numeric (5)
);
TABLE 2:
CREATE TABLE
employees(
emp_id numeric (5),
emp_name varchar(75)
);
SELECT employees.emp_name AS employees, asset_desc as assets, assets.asset_type
    FROM assets JOIN employees USING (emp_id);
+------------------------------------+
|employees |assets      |asset_type  |
+------------------------------------+
|T_Willams |macbook pro | laptop     |
+------------------------------------+
|A_Young   |iphone 12   | cell phone |
+------------------------------------+
|D_Karras  |macbook air | laptop     |
+------------------------------------+

SELECT … INTO <variable>

SELECT ... INTO variable statement is used to initialize variables in stored procedures, anonymous code blocks, and session variables. It allows you to query a single value, a single column, or multiple columns from one or more tables and store the selected values into variable(s).

For information about creating user defined variables for use outside of stored procedures, see User-Defined Variables.

Remarks

  • The SELECT ... INTO statement must return only a single result row.

  • The number of columns/expressions in the SELECT query must be the same as the number of variables being initialized in the INTO list.

  • SELECT ... INTO variable statements must be used inside PSQL procedure blocks.

  • The INTO variable clause can only be used once inside a SELECT query.

  • A SELECT ... INTO statement cannot be used inside of a sub-select query.

  • The variables in SELECT ... INTO statements must be declared with scalar data types.

  • SELECT ... INTO variable and SELECT ... INTO OUTFILE/S3/KAFKA cannot be used in the same query.

  • If the SELECT ... INTO statement returns 0 rows, SingleStore throws an error: ER_INTO_VARIABLES_NO_ROWS. To accommodate for this error, you can specify an EXCEPTION in the DECLARE block of a stored procedure (see Example 3 below).

Examples

Note: The following examples use the hrRec table.

DESC hrRec;
+-----------+-------------+------+------+---------+-------+
| Field     | Type        | Null | Key  | Default | Extra |
+-----------+-------------+------+------+---------+-------+
| FirstName | varchar(20) | YES  |      | NULL    |       |
| LastName  | varchar(20) | YES  |      | NULL    |       |
| City      | varchar(20) | YES  |      | NULL    |       |
| Tenure    | int(11)     | YES  |      | NULL    |       |
+-----------+-------------+------+------+---------+-------+

Example 1

The following example queries multiple columns from a single row and stores them in variables.

DELIMITER //
CREATE OR REPLACE PROCEDURE into_var ()
AS
DECLARE
fname VARCHAR(20); lname VARCHAR(20); city VARCHAR(20); ten INT;
BEGIN
SELECT FirstName, LastName, City, Tenure
INTO fname, lname, city, ten
FROM hrRec
WHERE Tenure > 40;
ECHO SELECT CONCAT(fname, " ", lname) AS "Name", city AS "City", ten AS "Tenure";
END //
DELIMITER ;
CALL into_var();
+---------------+----------+--------+
| Name          | City     | Tenure |
+---------------+----------+--------+
| Samantha Beck | New York |     44 |
+---------------+----------+--------+

Example 2

The following example queries multiple aggregate functions and stores their values in variables.

DELIMITER //
CREATE OR REPLACE PROCEDURE into_var () AS
DECLARE
row_c INT; sum_t INT;
BEGIN
SELECT COUNT(*), SUM(Tenure) INTO row_c, sum_t FROM hrRec;
ECHO SELECT row_c AS "Row Count", sum_t AS "Tenure";
END //
DELIMITER ;
CALL into_var();
+-----------+--------+
| Row Count | Tenure |
+-----------+--------+
|         5 |    170 |
+-----------+--------+

Alternatively, you can query the values in a dynamic query and store them in variables, like

DELIMITER //
CREATE OR REPLACE PROCEDURE into_var () AS
DECLARE
row_c INT; sum_t INT; qry VARCHAR(50);
BEGIN
qry = "SELECT COUNT(*), SUM(Tenure) FROM hrRec";
EXECUTE IMMEDIATE qry INTO row_c, sum_t;
ECHO SELECT row_c AS "Row Count", sum_t AS "Tenure";
END //
DELIMITER ;
CALL into_var();
+-----------+--------+
| Row Count | Tenure |
+-----------+--------+
|         5 |    170 |
+-----------+--------+

Example 3

The following example shows how to assign a value to a variable if the SELECT ... INTO statement returns 0 rows.

DELIMITER //
CREATE OR REPLACE PROCEDURE cityCount () AS
DECLARE cnum INT;
BEGIN
SELECT City INTO cnum FROM hrRec WHERE Tenure > 50;
ECHO SELECT cnum;
END //
DELIMITER ;
CALL cityCount();
ERROR 2439 (HY000): Unhandled exception
Type: ER_INTO_VARIABLES_NO_ROWS (2439)
Message: Query with 'INTO VARIABLES' clause returned zero rows whereas expected 1 row

Add an exception to the stored procedure.

DELIMITER //
CREATE OR REPLACE PROCEDURE cityCount () AS
DECLARE cnum INT;
BEGIN
SELECT City INTO cnum FROM hrRec WHERE Tenure > 50;
ECHO SELECT cnum;
EXCEPTION
WHEN ER_INTO_VARIABLES_NO_ROWS THEN
cnum = 'No city found!';
ECHO SELECT cnum;
END //
DELIMITER ;
CALL cityCount();
+----------------+
| cnum           |
+----------------+
| No city found! |
+----------------+

You can also add an exception for a scenario where the SELECT ... INTO statement returns more than one row. For example,

EXCEPTION
WHEN ER_SUBQUERY_NO_1_ROW THEN
ECHO SELECT 'SELECT ... INTO statement returned more than 1 row.';

SELECT ... ALL DISTINCT DISTINCTROW

Return rows with or without duplicates.

Syntax

SELECT [ALL | DISTINCT | DISTINCTROW] FROM table_reference

Remarks

  • The ALL clause returns all the matching rows including duplicates.

    The DISTINCT or DISTINCTROW clause returns the matching rows but eliminates the duplicate rows from the output. DISTINCTROW is a synonym of DISTINCT.

Examples

SELECT * FROM Product;
+------------+------------+----------+
| Product_id | Brand_name | City     |
+------------+------------+----------+
|          1 | Nike       | London   |
|          3 | Nike       | New York |
|          2 | Adidas     | Paris    |
|          3 | Puma       | Spain    |
+------------+------------+----------+
SELECT ALL Brand_name FROM Product;
+------------+
| Brand_name |
+------------+
| Nike       |
| Nike       |
| Adidas     |
| Puma       |
+------------+