# SELECT

Retrieves data from a table.

## Syntax

```sql
[with_clause]
SELECT
    [ALL | DISTINCT | DISTINCTROW]
    select_expr [[AS] alias], ...
    [COLLATE collation]
    [INTO variable [, ...]]
    [ FROM table_references
          [WHERE expr]
          [GROUP BY [{{col_name | expr | position}, ...} | ALL]]
            | extended_grouping_expr}]
          [HAVING expr]
          [ORDER BY [{col_name | expr | position} [ASC | DESC ][ NULLS {FIRST | LAST}], ... | ALL]]
          [LIMIT {[offset,] row_count | row_count OFFSET offset}]
          [FOR UPDATE]
    ]
    [{INTO OUTFILE 'file_name' |
        INTO FS 'destination_directory/file_name' [format_options] [WITH COMPRESSION] |
        INTO S3 bucket/target CONFIG configuration_json CREDENTIALS credentials_json [format_options] [WITH COMPRESSION] |
        INTO HDFS '<hdfs://<namenode DNS> | <IP address>:<port>/<directory>' [ CONFIG configuration_json ] [format_options] |
        INTO GCS bucket/path CONFIG configuration_json CREDENTIALS credentials_json [format_options] [WITH COMPRESSION] |
        INTO KAFKA kafka_topic_endpoint [kafka_configuration] [kafka_credentials] [KAFKA KEY <column_name>]  [format_options] |
        INTO AZURE "container/blob-prefix" CREDENTIALS credentials_json [format_options] [WITH COMPRESSION]
     }
    ]
    [INTO LINK [db name.]connection_name 'backup path']
format_options:
    csv_options | external_format_options
      csv_options:
        [{FIELDS | COLUMNS}
          [TERMINATED BY 'string']
          [[OPTIONALLY] ENCLOSED BY 'char']
          [ESCAPED BY 'char']
        ]
        [LINES
          [STARTING BY 'string']
          [TERMINATED BY 'string']
        ]

      external_format_options:
        FORMAT PARQUET

table_references:
         table_factor | join_table

table_factor:
           tbl_name [[AS] alias] [sample_ratio_clause]
         | (subquery) [[AS] alias]

```

## Remarks

* The `join_table` clause is documented in [JOIN and Subqueries](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#join-and-subqueries-2.md).
* The `join` and `using` clause is documented in [JOIN and USING](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#join-and-using-2.md).
* The `LATERAL` keyword can be placed before any subquery (also known as a lateral derived table) used as an inner table within a join. Refer to [Lateral Join](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#lateral-join-2.md) for details.
* `COLLATE` specifies the collation of a string literal.
  ```sql
  SELECT "My string literal" COLLATE utf8mb4_unicode_ci;
  ```
* `CONFIG` and `CREDENTIALS` can be specified in either order (`CONFIG` followed by `CREDENTIALS` or `CREDENTIALS` followed by `CONFIG`). For configuration examples refer [BACKUP DATABASE](https://docs.singlestore.com/db/v9.1/reference/sql-reference/operational-commands/backup-database.md)
* `format_options` clauses are documented in the [Format Options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-8.md) section.
* `sample_ratio_clause` is documented in the [SELECT … WITH (SAMPLE\_RATIO = \<value>)](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#select-with-sample-ratio-value-2.md) section.
* `with_clause` is documented on the [WITH (Common Table Expressions)](https://docs.singlestore.com/db/v9.1/query-data/advanced-query-topics/with-common-table-expressions.md) page.
* If you are querying against a columnstore table with a [FULLTEXT index](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/match.md) defined, see [MATCH](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/match.md) and [HIGHLIGHT](https://docs.singlestore.com/db/v9.1/reference/sql-reference/full-text-search-functions/highlight.md) for syntax and examples on how to do full-text search queries.
* Non-aggregated data can also be transformed into a pivot table output format. See [PIVOT](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/pivot.md) for syntax and examples.
* `extended_grouping_expr` clauses include CUBE and ROLLUP. See [CUBE and ROLLUP](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/cube-and-rollup.md) for syntax and examples.
* A subquery does not require an alias, assuming that removing the alias does not create ambiguity.
* In a transaction, you can read from multiple databases.
* This command must be run on the master aggregator or a child aggregator node (see [Cluster Management Commands](https://docs.singlestore.com/db/v9.1/reference/sql-reference/cluster-management-commands.md)).
* You can specify a table named `DUAL` as a placeholder table for use with queries that do not reference a table. `DUAL` contains one logical row. This convenience is provided for situations where you may be required to include a FROM clause in every SQL statement, even statements that do not need one in order to function. It can also help when porting applications from other database products that support `DUAL`.
  ```sql
  SELECT CURRENT_DATE from DUAL;

  ```
  ```output

  +--------------+
  | CURRENT_DATE |
  +--------------+
  | 2023-04-03   |
  +--------------+

  ```
  `SELECT CURRENT_DATE from DUAL` is identical to `SELECT CURRENT_DATE`.
  ```sql
  SELECT CURRENT_DATE;

  ```
  ```output

  +--------------+
  | CURRENT_DATE |
  +--------------+
  | 2023-04-03   |
  +--------------+

  ```
* A column expression can reference a column alias. For example, in the query `SELECT a + b as c, c + 10 FROM t;`, the column expression `c + 10` references the column alias `c`.
* A `WHERE` clause can reference a column alias. For example, in the query `SELECT a + b as c FROM t WHERE c < 100;`, the `WHERE` clause references the column alias `c`.
* A `HAVING` clause can reference a column alias. For example, in the query `SELECT a + 50 as b,COUNT(*) FROM t GROUP BY a HAVING b < 1000;`, the `HAVING` clause references the column alias `b`.
* Queries using the `WHERE. . .IN` clause with multiple columns is not supported.
  ```sql
  SELECT * FROM <table_name_1> WHERE (column_name_1, column_name_2) IN (SELECT column_name_1, column_name_2 FROM <table_name_2>;)
  ```
* The following table and dataset were used for the examples
  ```sql
  CREATE TABLE assets(
    asset_id INT,
    asset_type VARCHAR(50),
    asset_desc VARCHAR(50),
    asset_value DECIMAL(6,2),
    emp_id NUMERIC (5));

  INSERT INTO assets VALUES('1049','laptop','mac_book_pro','1999.00','14'),
  ('49', 'cell_phone', 'iphone_12','879.00','102'),
  ('1100', 'laptop', 'mac_book_pro','1999.00','298'),
  ('2037', 'laptop', 'mac_book_air_M2','1199.00', '399'),
  ('58', 'cell_phone', 'iphone_12', '879.00','410'),
  ('130', 'cell_phone', 'iphone_13', '699','110'),
  ('210', 'laptop', 'mac_book_pro','2500.00','312'),
  ('111', 'laptop', 'mac_book_pro','2100.00', '089'),
  ('099', 'laptop', 'mac_book_air_M1','999','075'),
  ('140', 'cell_phone', 'iphone_13_pro','999.00', '263'),
  ('2100', 'laptop', 'mac_book_pro_M2', '2500.00', '691'),
  ('160', 'cell_phone', 'iphone_14_pro_max','1200.00', '691'),
  ('2110', 'laptop', 'mac_book_pro_M2', '2500.00', '817'),
  ('2120', 'laptop', 'mac_book_pro_M2', '2500.00', NULL),
  ('150', 'cell_phone', 'iphone_14_pro_','1100.00', NULL);

  ```
* `NULLS FIRST` and `NULLS LAST` added to an `ORDER BY` clause sorts NULL values to the beginning or end of the results set. Ascending order is assumed for the non-NULL values. `DESC` can be added to the clause to order the non-NULL values in descending order.
  ```sql
  SELECT * FROM assets ORDER BY emp_id NULLS FIRST;


  ```
  ```output

  +----------+------------+-------------------+-------------+--------+
  | asset_id | asset_type | asset_desc        | asset_value | emp_id |
  +----------+------------+-------------------+-------------+--------+
  |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
  |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
  |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
  |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
  |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
  |       49 | cell_phone | iphone_12         |      879.00 |    102 |
  |      130 | cell_phone | iphone_13         |      699.00 |    110 |
  |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
  |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
  |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
  |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
  |       58 | cell_phone | iphone_12         |      879.00 |    410 |
  |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
  |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
  |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
  +----------+------------+-------------------+-------------+--------+
  ```
  ```sql
  SELECT * FROM assets ORDER BY emp_id DESC NULLS FIRST;


  ```
  ```output

  +----------+------------+-------------------+-------------+--------+
  | asset_id | asset_type | asset_desc        | asset_value | emp_id |
  +----------+------------+-------------------+-------------+--------+
  |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
  |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
  |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
  |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
  |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
  |       58 | cell_phone | iphone_12         |      879.00 |    410 |
  |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
  |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
  |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
  |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
  |      130 | cell_phone | iphone_13         |      699.00 |    110 |
  |       49 | cell_phone | iphone_12         |      879.00 |    102 |
  |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
  |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
  |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
  +----------+------------+-------------------+-------------+--------+
  ```
  ```sql
  SELECT * FROM assets ORDER BY emp_id DESC NULLS LAST;


  ```
  ```output

  +----------+------------+-------------------+-------------+--------+
  | asset_id | asset_type | asset_desc        | asset_value | emp_id |
  +----------+------------+-------------------+-------------+--------+
  |     2110 | laptop     | mac_book_pro_M2   |     2500.00 |    817 |
  |     2100 | laptop     | mac_book_pro_M2   |     2500.00 |    691 |
  |      160 | cell_phone | iphone_14_pro_max |     1200.00 |    691 |
  |       58 | cell_phone | iphone_12         |      879.00 |    410 |
  |     2037 | laptop     | mac_book_air_M2   |     1199.00 |    399 |
  |      210 | laptop     | mac_book_pro      |     2500.00 |    312 |
  |     1100 | laptop     | mac_book_pro      |     1999.00 |    298 |
  |      140 | cell_phone | iphone_13_pro     |      999.00 |    263 |
  |      130 | cell_phone | iphone_13         |      699.00 |    110 |
  |       49 | cell_phone | iphone_12         |      879.00 |    102 |
  |      111 | laptop     | mac_book_pro      |     2100.00 |     89 |
  |       99 | laptop     | mac_book_air_M1   |      999.00 |     75 |
  |     1049 | laptop     | mac_book_pro      |     1999.00 |     14 |
  |      150 | cell_phone | iphone_14_pro_    |     1100.00 |   NULL |
  |     2120 | laptop     | mac_book_pro_M2   |     2500.00 |   NULL |
  +----------+------------+-------------------+-------------+--------+
  ```
* `ORDER BY ALL` sorts by all the columns in the same order they are written. ASC/DESC may be used but it will apply to all the columns.
  ```sql
  SELECT asset_id, asset_type, asset_desc FROM assets ORDER BY ALL;


  ```
  ```output

  +----------+------------+-------------------+
  | asset_id | asset_type | asset_desc        |
  +----------+------------+-------------------+
  |       49 | cell_phone | iphone_12         |
  |       58 | cell_phone | iphone_12         |
  |       99 | laptop     | mac_book_air_M1   |
  |      111 | laptop     | mac_book_pro      |
  |      130 | cell_phone | iphone_13         |
  |      140 | cell_phone | iphone_13_pro     |
  |      150 | cell_phone | iphone_14_pro_    |
  |      160 | cell_phone | iphone_14_pro_max |
  |      210 | laptop     | mac_book_pro      |
  |     1049 | laptop     | mac_book_pro      |
  |     1100 | laptop     | mac_book_pro      |
  |     2037 | laptop     | mac_book_air_M2   |
  |     2100 | laptop     | mac_book_pro_M2   |
  |     2110 | laptop     | mac_book_pro_M2   |
  |     2120 | laptop     | mac_book_pro_M2   |
  +----------+------------+-------------------+


  ```
* The `GROUP BY` clause is used with a `SELECT` statement to organize data into `GROUPS`. Aggregates are applied within those groups. One row of output is produced for each unique combination of values for the specified grouping expressions.
  ```sql
  SELECT asset_type, SUM(asset_value) AS total_value
    FROM assets GROUP BY asset_type;


  ```
  ```output

  +------------+-------------+
  | asset_type | total_value |
  +------------+-------------+
  | cell_phone |     5756.00 |
  | laptop     |    18296.00 |
  +------------+-------------+

  ```
* The `GROUP BY ALL` clause causes the query to group by all non-aggregate expressions in the SELECT list, so all the columns to be grouped by do not have to be entered explicitly.
  ```sql
  SELECT asset_id, asset_type, asset_desc, SUM(asset_value) 
    FROM assets GROUP BY asset_id, asset_type, asset_desc;

  SELECT asset_id, asset_type, asset_desc, SUM(asset_value) 
    FROM assets GROUP BY ALL;


  ```
  ```output

  +----------+------------+-------------------+------------------+
  | asset_id | asset_type | asset_desc        | SUM(asset_value) |
  +----------+------------+-------------------+------------------+
  |      160 | cell_phone | iphone_14_pro_max |          1200.00 |
  |     2100 | laptop     | mac_book_pro_M2   |          2500.00 |
  |     2110 | laptop     | mac_book_pro_M2   |          2500.00 |
  |      140 | cell_phone | iphone_13_pro     |           999.00 |
  |       99 | laptop     | mac_book_air_M1   |           999.00 |
  |      210 | laptop     | mac_book_pro      |          2500.00 |
  |      150 | cell_phone | iphone_14_pro_    |          1100.00 |
  |       58 | cell_phone | iphone_12         |           879.00 |
  |       49 | cell_phone | iphone_12         |           879.00 |
  |     1049 | laptop     | mac_book_pro      |          1999.00 |
  |     2120 | laptop     | mac_book_pro_M2   |          2500.00 |
  |     1100 | laptop     | mac_book_pro      |          1999.00 |
  |      130 | cell_phone | iphone_13         |           699.00 |
  |     2037 | laptop     | mac_book_air_M2   |          1199.00 |
  |      111 | laptop     | mac_book_pro      |          2100.00 |
  +----------+------------+-------------------+------------------+

  ```
* The following is an example for `SELECT * FROM <TABLE> INTO S3` using compression.
  ```sql
  SELECT * FROM <table_name> INTO S3 's3://<bucket_name>/<filename>'
  CONFIG '{"region":"us-east-1"}'
  CREDENTIALS
  {"aws_access_key_id": "<xxxxxxxx>",
    "aws_secret_access_key": "<xxxxxxxxxxxxxxx>"}'
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
  WITH COMPRESSION GZIP;
  ```

## Aggregations for Expression Syntax

SingleStore supports these [Aggregate Functions](https://docs.singlestore.com/db/v9.1/reference/sql-reference/aggregate-functions.md) for expression syntax in `SELECT` statements:

All aggregate functions exclude NULLs from their computations. For example, SELECT COUNT(c) FROM t returns the number of rows in t where c is not NULL. SELECT AVG(c) FROM t returns the average of the non-NULL c’s in t.

`COUNT(*)` is equivalent to `COUNT(1)`. It counts every row matching the query, even rows whose column data contain nothing but NULLs.

## SELECT … LIMIT

## Syntax

```
LIMIT {[offset,] row_count | row_count OFFSET offset}

```

## Remarks

* The `LIMIT` clause constrains the number of rows returned by the `SELECT` statement.
* Both the arguments must be non-negative integer constants.
* The `row_count` specifies the number of rows to return from the beginning of the result set, and the `offset` specifies the offset of the first row to return.
* The offset of the first row in a table is **0** (not 1).

## Examples

```sql
SELECT * FROM hrRec;

```

```output

+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Adam      | Gale      | Brooklyn |     40 |
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
| Todd      | Bridges   | DC       |     30 |
| Ron       | Fletcher  | New York |     23 |
+-----------+-----------+----------+--------+

```

```sql
SELECT * FROM hrRec LIMIT 2;

```

```output

+-----------+----------+----------+--------+
| FirstName | LastName | City     | Tenure |
+-----------+----------+----------+--------+
| Adam      | Gale     | Brooklyn |     40 |
| Samantha  | Beck     | New York |     44 |
+-----------+----------+----------+--------+

```

```sql
SELECT * FROM hrRec LIMIT 1,2;

```

```output

+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
+-----------+-----------+----------+--------+

```

```sql
SELECT * FROM hrRec LIMIT 2 OFFSET 1;

```

```output

+-----------+-----------+----------+--------+
| FirstName | LastName  | City     | Tenure |
+-----------+-----------+----------+--------+
| Samantha  | Beck      | New York |     44 |
| Clara     | Wakefield | DC       |     24 |
+-----------+-----------+----------+--------+

```

## SELECT … FOR UPDATE

The `SELECT ... FOR UPDATE` command is intended to be used inside of a transaction. This command takes write locks on the rows returned from the `SELECT` query and the locks are held until the end of the transaction. Other transactions are blocked and cannot write to the locked rows until the locks are released.

SingleStore recommends the following when using `SELECT ... FOR UPDATE`:

* SingleStore recommends to commit or abort the transaction immediately so that the locks are released. Too many locks in a transaction can result in a huge queue of transactions waiting on the locked rows.
* To limit the number of rows that are locked, SingleStore recommends to use a column with unique values in the `WHERE` clause of the `SELECT` statement, for example the `PRIMARY KEY` column.

## Example

The following example uses the `Orders` table:

```sql
DESCRIBE Orders;

```

```output

+--------------+-------------+------+------+---------+-------+
| Field        | Type        | Null | Key  | Default | Extra |
+--------------+-------------+------+------+---------+-------+
| OrderNumber  | int(11)     | NO   | PRI  | NULL    |       |
| Status       | varchar(20) | NO   |      | NULL    |       |
| CustomerName | char(20)    | YES  |      | NULL    |       |
+--------------+-------------+------+------+---------+-------+

```

```sql
SELECT * FROM Orders;

```

```output

+-------------+------------+--------------+
| OrderNumber | Status     | CustomerName |
+-------------+------------+--------------+
|           1 | Delivered  | John         |
|           3 | In Transit | Bon          |
|           2 | Delivered  | Kerry        |
|           4 | Delivered  | Tom          |
+-------------+------------+--------------+

```

The following transaction locks the row where `OrderNumber` is `3`, using the `FOR UPDATE` clause. The row will remain locked as long as the transaction is open.

```sql
BEGIN;
SELECT * FROM Orders WHERE OrderNumber = 3 FOR UPDATE;

```

```output

+-------------+------------+--------------+
| OrderNumber | Status     | CustomerName |
+-------------+------------+--------------+
|           3 | In Transit | Bon          |
+-------------+------------+--------------+

```

Now, execute the following query in a different connection:

```sql
UPDATE Orders SET Status = "Delivered" WHERE OrderNumber=3;

```

```output

ERROR 1205 (HY000): Leaf Error (127.0.0.1:3307): Lock wait timeout exceeded; try restarting transaction.  Lock owned by connection id 77, query `open idle transaction`

```

The above query returns an error since the rows are locked by the previous transaction. To remove the lock on the rows, [commit](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/commit.md) or [rollback](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/rollback.md) the open transaction. See the [Query Errors](https://docs.singlestore.com/db/v9.1/reference/troubleshooting-reference/query-errors/error-1205-hy-000-lock-wait-timeout-exceeded-try-restarting-transaction.md) topic for more information on locked rows.

## JOIN and Subqueries

## Syntax

SingleStore supports the following `JOIN` and subquery syntax for the `table_reference` part of `SELECT` statements:

```sql
join_table:
           table_reference {LEFT | RIGHT | FULL} [OUTER] JOIN table_factor join_condition
         | table_reference [INNER | CROSS] JOIN table_factor [join_condition]
         | table_reference NATURAL {LEFT | RIGHT} [OUTER] JOIN table_factor
         | table_reference STRAIGHT_JOIN table_factor [join_condition]
         | table_reference, table_factor
         | table_reference LEFT [OUTER] JOIN LATERAL (subquery) [[AS] alias] join_condition
         | table_reference [INNER | CROSS] JOIN LATERAL (subquery) [[AS] alias] [join_condition]
         | table_reference, LATERAL (subquery) [[AS] alias]


join_condition:
         ON conditional_expr

```

## Remarks

* This command must be run on the master aggregator or a child aggregator node (see [Cluster Management Commands](https://docs.singlestore.com/db/v9.1/reference/sql-reference/cluster-management-commands.md))
* `STRAIGHT_JOIN` forces tables to be joined in the order in which they are listed in the `FROM` clause
* `FULL OUTER JOIN` requires the join condition to be an equality.&#x20;
* SingleStore supports joining tables across databases:
  ```sql
  CREATE DATABASE test1;
  USE test1;
  CREATE TABLE t1(id INT, col1 VARCHAR(10));

  CREATE DATABASE test2;
  USE test2;
  CREATE TABLE t2(id INT, col1 VARCHAR(10));

  SELECT test1.t1.*, test2.t2.* FROM test1.t1 JOIN test2.t2 ON t1.id = t2.id;

  ```

## Examples

```sql
SELECT * FROM my_MemSQL_table WHERE col = 1;


SELECT COUNT(*), user_name, page_url from clicks, users, pages
 ->     WHERE clicks.user_id = users.user_id AND pages.page_id = clicks.page_id
 ->     GROUP BY users.user_id, pages.page_id
 ->     ORDER BY COUNT(*) DESC;
 +- ---------+- ----------+- ---------------------------------+
 | COUNT(*) | user_name | page_url                            |
 +- ---------+- ----------+- ---------------------------------+
 |        5 | jake      | memsql.com                          |
 |        2 | john      | http://www.singlestore.com/download |
 |        1 | jake      | docs.singlestore.com                |
 |        1 | jake      | memsql.com                          |
 |        1 | jake      | http://www.singlestore.com/download |
 +- ---------+- ----------+- ---------------------------------+
 5 rows in set (0.00 sec)

```

```sql
SELECT t1.*, t2.* FROM t1 FULL OUTER JOIN t2 ON t1.a = t2.a;

```

## Nested Scalar Sub-Selects

`SELECT` statements can be nested in SingleStore queries.

```sql
SELECT ... [SELECT ...[SELECT [...]]]

```

## Remarks

* For scalar sub-selects, sub-select queries must not return more than one row.
* The maximum allowed depth of nested sub-select queries is 40.
* Sub-selects are not supported inside `GROUP BY/ORDER BY/HAVING` clauses, for nested sub-select queries of depth > 2.

## Examples

The following examples show the use of nested sub-selects.

```sql
SELECT cust_id FROM customers
WHERE EXISTS
  ( SELECT order_id FROM orders
    WHERE order_id IN
      ( SELECT id FROM transaction WHERE count > 5)
  );

```

```sql
DELETE FROM records
WHERE
  id = ( SELECT order_id
         FROM orders
         WHERE order_date > ( SELECT CURRENT_DATE() + 30)
       );

```

## Lateral Join

Lateral join allows a subquery in the `FROM` clause of a SQL query to reference another table in that same `FROM` clause. Typically such references are not allowed in SQL; by using the `LATERAL` keyword such references are allowed.

For example, the following query will generate an error because the subquery `(SELECT b FROM t2 WHERE b = t1.a)` references `t1` which is not allowed.

```sql
SELECT * 
FROM t1 LEFT JOIN (SELECT b FROM t2 WHERE b = t1.a) AS t3 ...;
```

Adding the `LATERAL` keyword before the subquery, as shown below, makes the reference to `t1` allowed and the following query is legal. This query is called a lateral join, the subquery `(SELECT b from t2 WHERE b = t1.a)` is referred to as a lateral derived table, and the clause `b = t1.a` is referred to as the correlating condition.

```sql
SELECT * 
FROM t1 LEFT JOIN LATERAL (SELECT b FROM t2 WHERE b = t1.a) AS t3 ...;
```

## Syntax

Lateral joins use the same syntax as joins with the `LATERAL` keyword inserted before the lateral derived table as shown in the queries above. The syntax for lateral join is below.

```sql
join_table:
          table_reference LEFT [OUTER] JOIN LATERAL (subquery) [[AS] alias] join_condition
         | table_reference [INNER | CROSS] JOIN LATERAL (subquery) [[AS] alias] [join_condition]
         | table_reference, LATERAL (subquery) [[AS] alias]

join_condition:
         ON conditional_expr

```

The `(subquery) [[AS] alias]` clause is referred to as the lateral derived table.

## Remarks

The following are restrictions on lateral join:

* Lateral derived tables can appear only in `FROM` clauses. A comma-separated list of tables or a join specification with `JOIN`, `INNER JOIN`, `CROSS JOIN`, `LEFT [OUTER] JOIN` are supported.
* The correlating condition can appear only in the `WHERE` clause of the lateral derived table.
* The correlating condition must be an equality condition (uses =).
* The correlating condition may use `AND`, but may not use `OR`.
* Lateral derived tables may not use `GROUP BY`.
* If a lateral derived table references an aggregate function, the function's aggregation query cannot be the one that owns the `FROM` clause in which the lateral derived table occurs.
* If the lateral derived table has an `ORDER BY … LIMIT` clause, then the `LIMIT` must be `>0`.
* `TABLE` functions are implicitly executed as a lateral join. The `LATERAL` keyword is not allowed before `TABLE` functions.

## Examples

These examples use a table of artists and a table of albums by those artists. For simplicity we assume that each album is created by a single artist.

The statements below create `artists` and `albums` tables and insert data into those tables. Albums are linked to artists through the `artist_id`. Sales are in units of millions of dollars and the data is courtesy of [List of best-selling albums - Wikipedia](https://en.wikipedia.org/wiki/List_of_best-selling_albums).

```sql
CREATE TABLE artists(id INT, name TEXT);

CREATE TABLE albums(album_id INT, title TEXT, 
                    genre ENUM("R&B", "Soft Rock", "Jazz"),  
                    artist_id INT, sales DOUBLE);

INSERT INTO artists VALUES
(1, "Fleetwood Mac"), 
(2, "Whitney Huston"),
(3, "The Eagles"),
(4, "Norah Jones"),
(5, "ABBA");

INSERT INTO albums VALUES 
(1, "The Bodyguard", "R&B", 2, 45),
(2, "Their Greatest Hits (1971-1975)", "Soft Rock", 3, 44),
(3, "Hotel California", "Soft Rock", 3, 42),
(4, "Rumours", "Soft Rock", 1, 40), 
(5, "Come Away with Me", "Jazz", 4, 28);
```

## Example 1 - INNER JOIN - Maximum Album Sales by Artist

The following query calculates, for each artist, the sales of their highest-selling album. This query can be written with `LATERAL` or with a `GROUP BY` as shown below.

The query below uses a comma-separated list of tables to specify the join.

```sql
SELECT AR.name, AL1.max_sales
FROM artists AR, 
LATERAL (SELECT MAX(sales) as max_sales
          FROM albums WHERE artist_id = AR.id) AS AL1 
ORDER BY AL1.max_sales DESC;


```

```output

+----------------+-----------+
| name           | max_sales |
+----------------+-----------+
| Whitney Huston |        45 |
| The Eagles     |        44 |
| Fleetwood Mac  |        40 |
| Norah Jones    |        28 |
+----------------+-----------+

```

`LATERAL` can also be used with a `JOIN` specification as shown in the query below.

```sql
SELECT AR.name, AL1.max_sales
FROM artists AR 
JOIN LATERAL (SELECT MAX(sales) as max_sales
              FROM albums 
              WHERE artist_id = AR.id) AS AL1
ORDER BY AL1.max_sales DESC;


```

```output

+----------------+-----------+
| name           | max_sales |
+----------------+-----------+
| Whitney Huston |        45 |
| The Eagles     |        44 |
| Fleetwood Mac  |        40 |
| Norah Jones    |        28 |
+----------------+-----------+
```

The query below shows the same query with a `GROUP BY` instead of a `LATERAL`.

```sql
SELECT name, MAX(sales)
FROM artists JOIN albums ON artists.id = albums.artist_id 
GROUP BY artists.id
ORDER BY sales DESC;

```

```output

+----------------+------------+
| name           | MAX(sales) |
+----------------+------------+
| Whitney Huston |         45 |
| The Eagles     |         44 |
| Fleetwood Mac  |         40 |
| Norah Jones    |         28 |
+----------------+------------+
```

## Example 2 - LEFT JOIN - Include Artists with No Album Sales Data

The queries in Example 1 found the maximum album sales for each artist, but did not include artists without album sales data. Including those artists requires a `LEFT JOIN`.

The following query uses `LEFT JOIN LATERAL` to find the maximum sales of any album by an artist including artists for which there is no album sales data. This query cannot be done with a `GROUP BY`. This query can also be written with the syntax `LEFT OUTER JOIN`.

`LEFT JOIN` requires a join condition, thus an `ON` clause is required despite the fact that the join is specified in the lateral derived table. Thus this query uses an empty join clause (`ON = TRUE`). The syntax `INNER JOIN LATERAL` could also be in this query.

```sql
SELECT AR.name, AL1.max_sales
FROM artists AR 
LEFT JOIN LATERAL (SELECT sales, MAX(sales) as max_sales
          FROM albums WHERE artist_id = AR.id) AS AL1 ON TRUE
ORDER BY AL1.max_sales DESC;


```

```output


+----------------+-----------+
| name           | max_sales |
+----------------+-----------+
| Whitney Huston |        45 |
| The Eagles     |        44 |
| Fleetwood Mac  |        40 |
| Norah Jones    |        28 |
| ABBA           |      NULL |
+----------------+-----------+
```

## Example 3 - Most Sales by Artist - With Album Name

The queries in Example 1 found the maximum album sales for each artist, but did not include the title of the album with those sales. The query below shows the artist, the album, and the sales for the album with the maximum sales for that artist. If two albums have equivalent maximum sales for an artist, both will be listed.

```sql
SELECT AR.name AS artist, AL2.title, AL1.max_sales
FROM artists AR
 JOIN LATERAL (SELECT MAX(sales) as max_sales
          FROM albums 
          WHERE artist_id = AR.id) AS AL1 ON TRUE
 JOIN LATERAL (SELECT title 
          FROM albums 
          WHERE artist_id = AR.id 
          AND sales = AL1.max_sales) AS AL2 ON TRUE
ORDER BY AL1.max_sales DESC;


```

```output


+----------------+---------------------------------+-----------+
| artist         | title                           | max_sales |
+----------------+---------------------------------+-----------+
| Whitney Huston | The Bodyguard                   |        45 |
| The Eagles     | Their Greatest Hits (1971-1975) |        44 |
| Fleetwood Mac  | Rumours                         |        40 |
| Norah Jones    | Come Away with Me               |        28 |
+----------------+---------------------------------+-----------+

```

Lateral joins are implemented with query rewrites. The command `EXPLAIN DUPLICATE` can be used to show the rewritten query. Below is the rewritten query that finds artist, album, and max sales. This query includes several subqueries. Using `LATERAL` can often simplify a query.

```sql
EXPLAIN DUPLICATE
SELECT AR.name AS artist, AL2.title, AL1.max_sales
FROM artists AR
 JOIN LATERAL (SELECT MAX(sales) as max_sales
          FROM albums 
          WHERE artist_id = AR.id) AS AL1 ON TRUE
 JOIN LATERAL (SELECT title 
          FROM albums 
          WHERE artist_id = AR.id 
          AND sales = AL1.max_sales) AS AL2 ON TRUE
ORDER BY AL1.max_sales DESC;


```

```output


SELECT `artists_4`.`name` AS `artist`, `SUBQ_VWW_2_6`.`title` AS `title`, 
       `SUBQ_VWW_1_5`.`max_sales` AS `max_sales` 
FROM (( ( 
  SELECT WITH(NO_MERGE_THIS_SELECT=1) `artists_4_0`.`name` AS `name`,     
        `artists_4_0`.`id` AS `id` 
  FROM  `test`.`artists` as `artists_4_0`  ) AS `artists_4` 
  JOIN  (
    SELECT WITH(NO_MERGE_THIS_SELECT=1) `SUBQ_VWW_1_5_0`.
      `max_sales` AS `max_sales`, `SUBQ_VWW_1_5_0`.`artist_id` AS `artist_id` 
    FROM  ( 
      SELECT `albums_2`.`artist_id` AS `artist_id`, 
             MAX(`albums_2`.`sales`) AS `max_sales` 
      FROM  `test`.`albums` as `albums_2`  GROUP BY 1 ) AS `SUBQ_VWW_1_5_0` ) 
                                                    AS `SUBQ_VWW_1_5`) 
   JOIN  ( 
    SELECT WITH(NO_MERGE_THIS_SELECT=1) `SUBQ_VWW_2_6_0`.`title` AS `title`,
       `SUBQ_VWW_2_6_0`.`artist_id` AS `artist_id`, 
       `SUBQ_VWW_2_6_0`.`sales` AS `sales` 
    FROM  `test`.`albums` as `SUBQ_VWW_2_6_0`  ) AS `SUBQ_VWW_2_6`) 
WHERE ((`SUBQ_VWW_2_6`.`artist_id` = `artists_4`.`id`) 
 AND (`SUBQ_VWW_2_6`.`sales` = `SUBQ_VWW_1_5`.`max_sales`) 
 AND (`SUBQ_VWW_1_5`.`artist_id` = `artists_4`.`id`) 
 AND (`SUBQ_VWW_1_5`.`artist_id` = `SUBQ_VWW_2_6`.`artist_id`)) 
ORDER BY 3 DESC
```

## Example 4 - Album with the Most Sales by Genre

This example attempts to find the albums with the most sales by genre. Two queries which might be tried to answer this question, but which do not work, are shown below followed by a use of `LATERAL` which does work to answer this question. This exemplifies situations in which a user should consider using `LATERAL`.

The following is an attempt to find albums with the most sales by genre using a `GROUP BY`. This query is legal, but it generates only the max sales by genre, it does not include the album titles.

```sql
SELECT genre, MAX(sales)
FROM 	albums
GROUP BY genre;


```

```output

+-----------+------------+
| genre     | MAX(sales) |
+-----------+------------+
| R&B       |         45 |
| Jazz      |         28 |
| Soft Rock |         44 |
+-----------+------------+
```

To try to add the album titles, the query below adds the album title to the `SELECT` statement. If using ANSI SQL this query will generate an error as the title is not in the `GROUP BY` attributes. If ANSI SQL is not used, this query will generate a warning and may generate unexpected results.

```sql
SELECT genre, title, MAX(sales)
FROM 	albums
GROUP BY genre;

```

Query run with strict ANSI SQL and the error message produced.

```sql
SET SQL_MODE = ANSI;

SELECT genre, title, MAX(sales)
FROM 	albums
GROUP BY genre;


```

```output

ERROR 2295 (HY000): Column 'albums.title' from project list 
is used outside an aggregate function and does not appear in 
the GROUP BY clause. This is illegal because there may be 
multiple possible values for the field. If it is in fact 
uniquely determined, or if you wish to use one of the many 
possible values, you can use the ANY_VALUE aggregate function.
```

Query run without strict ANSI SQL and the warnings produced. The warnings can be seen with `SHOW WARNINGS`; however, those warnings only occur the first time the query is run.

```sql
SELECT genre, title, MAX(sales)
FROM 	albums
GROUP BY genre;

SHOW WARNINGS;


```

```output

Warning | 2295 | Column 'albums.title' from project list is used 
outside an aggregate function and does not appear in the GROUP BY 
clause. This is illegal because there may be multiple possible values 
for the field. If it is in fact uniquely determined, or if you wish to 
use one of the many possible values, you can use the ANY_VALUE aggregate 
function.
```

Finally, the query below uses `JOIN LATERAL` to find the albums with the most sales by genre. The table reference `AL` must be used in the inner queries to refer to the `albums` table instance in the outer query.

```sql
SELECT AL2.genre, AL2.title, AL1.max_sales
FROM albums AL
 JOIN LATERAL (SELECT MAX(sales) as max_sales
          FROM albums 
          WHERE genre = AL.genre) AS AL1 ON TRUE
 JOIN LATERAL (SELECT title, genre 
          FROM albums 
          WHERE title = AL.title 
          AND genre = AL.genre
          AND sales = AL1.max_sales) AS AL2 ON TRUE
ORDER BY AL1.max_sales DESC;

```

```output

+-----------+---------------------------------+-----------+
| genre     | title                           | max_sales |
+-----------+---------------------------------+-----------+
| R&B       | The Bodyguard                   |        45 |
| Soft Rock | Their Greatest Hits (1971-1975) |        44 |
| Jazz      | Come Away with Me               |        28 |
+-----------+---------------------------------+-----------+

```

If the dataset was larger and the request was to find the top 10 best selling albums per genre, that query will require a `LATERAL` or alternatively a window function.

## Example 5 - TABLE

When the `TABLE` keyword is used, there is an implicit lateral join. As a result, the `LATERAL` keyword is not allowed to be used with `TABLE`. The first query below which joins a table `t` with a `TABLE` expression is allowed; the second query will return an error.

```sql
SELECT * FROM t, TABLE(JSON_TO_ARRAY('[1,2,3]'));

SELECT * FROM t, LATERAL TABLE(JSON_TO_ARRAY('[1,2,3]'));
```

## Example 6 - RIGHT JOIN

`RIGHT JOIN LATERAL` is not supported and will produce the following error.

```sql
SELECT artists.name, AL1.max_sales
FROM artists 
   RIGHT JOIN LATERAL (SELECT sales, MAX(sales) as max_sales
          FROM albums WHERE artist_id = artists.id) AS AL1 ON TRUE
ORDER BY AL1.max_sales DESC;


```

```output

ERROR 1054 (42S22): Unknown column 'artists' in 'previous table'

```

## Example 7 - Access to Multiple Levels Up

Lateral joins can reference fields from more than one level up in the query tree. The query below is the same as in Example 1 except that it outputs sales in dollars instead of millions of dollars. In this query the clause `AR.id` references a table two levels up in the query tree.

```sql
SELECT AR.name AS artist, AL2.max_sales
FROM artists AR
 JOIN LATERAL 
    (SELECT FORMAT(AL1.max_sales_in_mil*1000000,0) as max_sales
     FROM
       (SELECT MAX(sales) as max_sales_in_mil
          FROM albums 
          WHERE artist_id = AR.id) AS AL1
    ) AS AL2 ON TRUE
ORDER BY AL2.max_sales DESC;


```

```output

+----------------+------------+
| artist         | max_sales  |
+----------------+------------+
| Whitney Huston | 45,000,000 |
| The Eagles     | 44,000,000 |
| Fleetwood Mac  | 40,000,000 |
| Norah Jones    | 28,000,000 |
+----------------+------------+

```

## JOIN and USING

## Syntax

Using a `JOIN` clause with the `USING` clause will match only one column when there are more than one columns that match.

```sql
SELECT <table1_name>.<column_name> AS <column_name> 
    FROM <table_name> JOIN <table_name> USING (<column_name>);
```

## Remarks

* You can utilize the `USING` clause instead of the `ON` clause in `JOIN` operations with an explicit `JOIN` clause.

## Examples

```sql
TABLE 1:

CREATE TABLE assets(
asset_id int,
asset_type varchar(50),
asset_desc varchar(50),
emp_id numeric (5)
);

TABLE 2:

CREATE TABLE
employees(
emp_id numeric (5),
emp_name varchar(75)
);

SELECT employees.emp_name AS employees, asset_desc as assets, assets.asset_type
    FROM assets JOIN employees USING (emp_id);

```

```output

+------------------------------------+
|employees |assets      |asset_type  |
+------------------------------------+
|T_Willams |macbook pro | laptop     |
+------------------------------------+
|A_Young   |iphone 12   | cell phone |
+------------------------------------+
|D_Karras  |macbook air | laptop     |
+------------------------------------+

```

## SELECT … INTO \<variable>

`SELECT ... INTO variable` statement is used to initialize variables in stored procedures, anonymous code blocks, and session variables. It allows you to query a single value, a single column, or multiple columns from one or more tables and store the selected values into variable(s).

For information about creating user defined variables for use outside of stored procedures, see [User-Defined Variables](https://docs.singlestore.com/db/v9.1/reference/sql-reference/user-defined-variables.md).

## Remarks

* The `SELECT ... INTO` statement must return only a single result row.
* The number of columns/expressions in the `SELECT` query must be the same as the number of variables being initialized in the `INTO` list.
* `SELECT ... INTO variable` statements must be used inside PSQL procedure blocks.
* The `INTO variable` clause can only be used once inside a `SELECT` query.
* A `SELECT ... INTO` statement cannot be used inside of a sub-select query.
* The variables in `SELECT ... INTO` statements must be declared with scalar data types.
* `SELECT ... INTO variable` and `SELECT ... INTO OUTFILE/S3/KAFKA` cannot be used in the same query.
* If the `SELECT ... INTO` statement returns `0` rows, SingleStore throws an error: `ER_INTO_VARIABLES_NO_ROWS`. To accommodate for this error, you can specify an `EXCEPTION` in the `DECLARE` block of a stored procedure (see Example 3 below).

## Examples

**Note**: The following examples use the **hrRec** table.

```sql
DESC hrRec;

```

```output

+-----------+-------------+------+------+---------+-------+
| Field     | Type        | Null | Key  | Default | Extra |
+-----------+-------------+------+------+---------+-------+
| FirstName | varchar(20) | YES  |      | NULL    |       |
| LastName  | varchar(20) | YES  |      | NULL    |       |
| City      | varchar(20) | YES  |      | NULL    |       |
| Tenure    | int(11)     | YES  |      | NULL    |       |
+-----------+-------------+------+------+---------+-------+

```

## Example 1

The following example queries multiple columns from a single row and stores them in variables.

```sql
DELIMITER //

CREATE OR REPLACE PROCEDURE into_var ()
AS
DECLARE
fname VARCHAR(20); lname VARCHAR(20); city VARCHAR(20); ten INT;
BEGIN
SELECT FirstName, LastName, City, Tenure
 INTO fname, lname, city, ten
FROM hrRec
WHERE Tenure > 40;
ECHO SELECT CONCAT(fname, " ", lname) AS "Name", city AS "City", ten AS "Tenure";
END //

DELIMITER ;

CALL into_var();

```

```output

+---------------+----------+--------+
| Name          | City     | Tenure |
+---------------+----------+--------+
| Samantha Beck | New York |     44 |
+---------------+----------+--------+

```

## Example 2

The following example queries multiple aggregate functions and stores their values in variables.

```sql
DELIMITER //

CREATE OR REPLACE PROCEDURE into_var () AS
DECLARE
  row_c INT; sum_t INT;
BEGIN
  SELECT COUNT(*), SUM(Tenure) INTO row_c, sum_t FROM hrRec;
  ECHO SELECT row_c AS "Row Count", sum_t AS "Tenure";
END //

DELIMITER ;

CALL into_var();

```

```output

+-----------+--------+
| Row Count | Tenure |
+-----------+--------+
|         5 |    170 |
+-----------+--------+

```

Alternatively, you can query the values in a dynamic query and store them in variables, like

```sql
DELIMITER //

CREATE OR REPLACE PROCEDURE into_var () AS
DECLARE
  row_c INT; sum_t INT; qry VARCHAR(50);
BEGIN
  qry = "SELECT COUNT(*), SUM(Tenure) FROM hrRec";
  EXECUTE IMMEDIATE qry INTO row_c, sum_t;
  ECHO SELECT row_c AS "Row Count", sum_t AS "Tenure";
END //

DELIMITER ;

CALL into_var();

```

```output

+-----------+--------+
| Row Count | Tenure |
+-----------+--------+
|         5 |    170 |
+-----------+--------+

```

## Example 3

The following example shows how to assign a value to a variable if the `SELECT ... INTO` statement returns `0` rows.

```sql
DELIMITER //

CREATE OR REPLACE PROCEDURE cityCount () AS
DECLARE cnum INT;
BEGIN
  SELECT City INTO cnum FROM hrRec WHERE Tenure > 50;
  ECHO SELECT cnum;
END //

DELIMITER ;

CALL cityCount();

```

```output

ERROR 2439 (HY000): Unhandled exception
Type: ER_INTO_VARIABLES_NO_ROWS (2439)
Message: Query with 'INTO VARIABLES' clause returned zero rows whereas expected 1 row
```

Add an exception to the stored procedure.

```sql
DELIMITER //

CREATE OR REPLACE PROCEDURE cityCount () AS
DECLARE cnum INT;
BEGIN
  SELECT City INTO cnum FROM hrRec WHERE Tenure > 50;
  ECHO SELECT cnum;
EXCEPTION
  WHEN ER_INTO_VARIABLES_NO_ROWS THEN
  cnum = 'No city found!';
  ECHO SELECT cnum;
END //

DELIMITER ;

CALL cityCount();

```

```output

+----------------+
| cnum           |
+----------------+
| No city found! |
+----------------+
```

You can also add an exception for a scenario where the `SELECT ... INTO` statement returns more than one row. For example,

```sql
EXCEPTION
  WHEN ER_SUBQUERY_NO_1_ROW THEN
  ECHO SELECT 'SELECT ... INTO statement returned more than 1 row.';
```

## SELECT ... ALL DISTINCT DISTINCTROW

Return rows with or without duplicates.

## Syntax

```sql
SELECT [ALL | DISTINCT | DISTINCTROW] FROM table_reference
```

## Remarks

* The `ALL` clause returns all the matching rows including duplicates.&#x20;

  The `DISTINCT` or `DISTINCTROW` clause returns the matching rows but eliminates the duplicate rows from the output. `DISTINCTROW` is a synonym of `DISTINCT`.

## Examples

```sql
SELECT * FROM Product;

```

```output

+------------+------------+----------+
| Product_id | Brand_name | City     |
+------------+------------+----------+
|          1 | Nike       | London   |
|          3 | Nike       | New York |
|          2 | Adidas     | Paris    |
|          3 | Puma       | Spain    |
+------------+------------+----------+

```

```sql
SELECT ALL Brand_name FROM Product;

```

```output

+------------+
| Brand_name |
+------------+
| Nike       |
| Nike       |
| Adidas     |
| Puma       |
+------------+
```

```sql
SELECT DISTINCT Brand_name FROM Product;

```

```output

+------------+
| Brand_name |
+------------+
| Nike       |
| Adidas     |
| Puma       |
+------------+
```

## SELECT … INTO OUTFILE

`SELECT ... INTO OUTFILE` formats and writes the results of a `SELECT` query to a text file. The `format_options` are similar to the parsing options used with `LOAD DATA`. See [Format Options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-16.md) for more information.

```
format_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]

```

## Remarks

* If a relative path that that does not start with `/` is specified, SingleStore writes the file to the directory specified in the global variable `datadir`. To specify another location, enter the absolute path to the file as the `file_name` parameter.
* The default text formatting, used when the user omits the `FIELDS` and `LINES` blocks, is to separate values with tabs (`\t`) and rows with newlines (`\n`).

## Example

The following query writes the contents of `table_name` to a file in the home directory of username. The file is formatted as a comma-separated values (CSV) file.

```sql
SELECT * FROM table_name INTO OUTFILE '/home/username/file_name.csv'
        FIELDS TERMINATED BY ','
        LINES TERMINATED BY '\n'

```

## SELECT ... INTO AZURE

`SELECT ... INTO AZURE` performs a distributed `SELECT` into an Azure container. The `CREDENTIALS` clause is required and its value is defined in `credentials_json` as follows:

```
credentials_json:
'{"account_name": "your_account_name_here",
  "account_key": "your_account_key_here"
 }'
```

If `WITH COMPRESSION GZIP` is specified, the results of the `SELECT` query are written to the Azure container in compressed gzip files. A compressed file with the `.gz` extension is created for each partition.

The `WITH COMPRESSION` clause can be specified instead of the `WITH COMPRESSION GZIP` clause. Both clauses are equivalent. However, if other compression formats are supported in the future, `WITH COMPRESSION` may use a compression format other than gzip by default.

`format_options`, if specified, must be included after the `CREDENTIALS` clause, or before the `WITH COMPRESSION` clause (if it is specified). The `format_options` are similar to the parsing options used with `LOAD DATA`. Refer to [Format Options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-9.md) for more information.

```
format_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]

```

As an alternative to using a `SELECT ... INTO AZURE` statement where you specify the  `CREDENTIALS` clause, you can use a `SELECT ... INTO LINK` statement, where you reference a connection link. For more information, refer to [Configuring and Using Connection Links](https://docs.singlestore.com/db/v9.1/security/authentication/configuring-and-using-connection-links.md).

## SELECT … INTO HDFS

`SELECT INTO ... HDFS` writes the result of a `SELECT` query to HDFS. The configuration options for `SELECT INTO ... HDFS` are similar to the [HDFS Pipelines](https://docs.singlestore.com/db/v9.1/reference/sql-reference/pipelines-commands/create-pipeline.md) syntax.

## Example

The following example demonstrates how to write the contents of a table to HDFS at a specified path.

```sql
SELECT * FROM stockTo
INTO HDFS 'hdfs://hadoop-namenode:8020/stock_dir/records_file.csv'

```

## SELECT … INTO LINK

`SELECT ... INTO LINK` writes the results of a `SELECT` query to S3, GCS, HDFS, or Kafka using a [connection link](https://docs.singlestore.com/db/v9.1/security/authentication/configuring-and-using-connection-links.md). To use this command, you only need to know the connection link name, not the connection details and configuration. However, you require the `SHOW LINK` permission, provided by your administrator, to use a connection link.

> **📝 Note**: SASL OAUTHBEARER is supported with `CREATE PIPELINE` but not with `SELECT INTO KAFKA`.

## Example

The following example writes the contents of the table `t1`, to the S3 bucket at the specified path, using the S3 connection link `S3con` stored in the `db1` database:

```sql
USE db1;
SELECT * FROM t1 INTO LINK S3con 'testing/output';

```

**Note**: The connection link `S3Con` should already exist in `db1`. Refer to the [CREATE LINK](https://docs.singlestore.com/db/v9.1/reference/sql-reference/security-management-commands/create-link.md) topic for details on creating a connection link.

## SELECT … INTO FS

`SELECT ... INTO FS` works similarly to `SELECT INTO OUTFILE`, except that if the `SELECT` logic determines that the results can be computed in parallel (e.g. there is no `order by` clause, etc.), the query is pushed down to each leaf and executed on each partition, and a separate file is written for each partition. `SELECT INTO OUTFILE` always writes to a single file.

When writing to multiple files, the file names will be:

```
destination_directory/file_name_0
destination_directory/file_name_1
destination_directory/file_name_2
etc.

```

Otherwise:

```
destination_directory/file_name

```

If `WITH COMPRESSION GZIP` is specified, the results of the `SELECT` query are written to the filesystem in compressed gzip files. A compressed file with the `.gz` extension is created for each partition.

The `WITH COMPRESSION` clause can be specified instead of the `WITH COMPRESSION GZIP` clause. Both clauses are equivalent. However, if other compression formats are supported in the future, `WITH COMPRESSION` may use a compression format other than gzip by default.

The `format_options` are similar to the parsing options used with `LOAD DATA`. See [Format Options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-10.md) for more information.

```
format_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]

```

## Remarks

* By default, SingleStore writes the files to the directory specified in the global variable `basedir`. To specify another location, enter the absolute path to the file as the `destination_directory` parameter.
* The default text formatting, used when the user omits the `FIELDS` and `LINES` blocks, is to separate values with tabs (`\t`) and rows with newlines (`\n`).

## Example

The following query writes the contents of `table_name` to a set of compressed files in the `/tmp` directory on each leaf node (`/tmp/a_0`, `/tmp/a_1`, `/tmp/a_2`, etc.). The files are comma-separated value (CSV) with one output file per partition (`a_0` is for partition 0, `a_1` is for partition 1, etc.).

```sql
SELECT * FROM table_name INTO FS '/tmp/a'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
WITH COMPRESSION;

```

## SELECT … INTO S3

`SELECT ... INTO S3` performs a distributed select into a S3 bucket. The `CREDENTIALS` clause is required and its value is defined in `credentials_json` as follows.

```
configuration_json:
'{"region":"your_region" [, "multipart_chunk_size_mb":<size_in_MB>]}'

credentials_json:
'{"aws_access_key_id": "replace_with_your_access_key_id",
  "aws_secret_access_key": "replace_with_your_secret_access_key",
  ["aws_session_token": "replace_with_your_temp_session_token",]
  ["role_arn":"replace_with_your_role_arn"]
}'

```

If `WITH COMPRESSION GZIP` is specified, the results of the `SELECT` query are written to the S3 bucket in compressed gzip files. A compressed file with the `.gz` extension is created for each partition.

The `WITH COMPRESSION` clause can be specified instead of the `WITH COMPRESSION GZIP` clause. Both clauses are equivalent. However, if other compression formats are supported in the future, `WITH COMPRESSION` may use a compression format other than gzip by default.

`format_options`, if specified, must be included after the `CREDENTIALS` clause, or before the `WITH COMPRESSION` clause (if it is specified). The `format_options` are similar to the parsing options used with `LOAD DATA`. Refer to [Format Options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-11.md) for more information.

```
format_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]

```

To overwrite existing files in an S3 bucket, specify `ENABLE_OVERWRITE` after the `CREDENTIALS` clause in the `SELECT … INTO S3` statement. By default, overwriting existing files is disabled for AWS S3 destinations.

```sql
SELECT *
FROM t1
INTO S3 'testing/output'
CONFIG '{"region":"us-east-1"}'
CREDENTIALS '{"aws_access_key_id":"your_access_key_id","aws_secret_access_key":"your_secret_access_key"}'
ENABLE_OVERWRITE;
```

To overwrite existing files in a connection link destination/target, specify `ENABLE_OVERWRITE` after the `INTO LINK` clause. By default, overwriting existing files is disabled.

```sql
SELECT *
FROM t1
INTO LINK S3con 'testing/output'
ENABLE_OVERWRITE
Fields terminated by ‘,’
Lines terminated by ‘\n’;
```

## Remarks

> **⚠️ Warning**: The S3 bucket needs to be created before running this command.

* The `multipart_chunk_size_mb` must be in the range of \[5..500]. By default, the chunk size is `5 MB`. A larger chunk size allows users to upload large files without going over Amazon’s limitation on maximum number of parts per upload. Although, a larger chunk size increases the chance of a network error during the upload to S3. If a chunk fails to upload, SingleStore retries uploading it until the limit set on the number of retries by AWS is reached.
  > **📝 Note**: Each partition will use “multipart\_chunk\_size\_mb” MB(s) of additional memory.
* The output of `SELECT ... INTO S3` is stored with the content type binary/octet-stream in the S3 bucket.
* The `ENABLE_OVERWRITE` clause is only supported in **SELECT ... INTO {S3 | LINK }** statements.
* If the insert select logic determines that the `SELECT ... INTO S3` query can be run as distributed, the query will be pushed down to each leaf and executed on each partition. The name of each object will be:
  ```
  <bucket/target>_<partition ID>

  ```
* If the insert select logic determines that the `SELECT ... INTO S3` query can only be on the aggregator because it contains aggregations, ORDER BY, GROUP BY, etc. then the query will be run on each leaf but the result will be collected on the aggregator and then output to S3. The object name will just be:
  ```
  <bucket/target>

  ```
* It supports EKS IRSA. For more details refer [Enable EKS IRSA](https://docs.singlestore.com/db/v9.1/reference/singlestore-operator-reference/enable-eks-irsa.md).
* It supports outfiles using backup binary to access S3. It also supports EKS IRSA
* The `SELECT` query will validate if the `<bucket/target>` or `<bucket/target>_` already exists on the S3 bucket first and fail if any of the object(s) already exist.
* As an alternative to using a `SELECT ... INTO S3` statement where you specify the `CONFIG` and `CREDENTIALS` clauses, you can use a `SELECT ... INTO LINK` statement, where you reference a connection link. For more information, see [Configuring and Using Connection Links](https://docs.singlestore.com/db/v9.1/security/authentication/configuring-and-using-connection-links.md).

## Examples

The following simple example shows how to save all rows in table t1 to an S3 bucket using an AWS access key.

```sql
SELECT *
FROM t1
INTO S3 'testing/output'
CONFIG '{"region":"us-east-1"}'
CREDENTIALS '{"aws_access_key_id":"your_access_key_id","aws_secret_access_key":"your_secret_access_key"}'

```

The following example saves the result set of a `SELECT` query with a `GROUP BY` clause and sends the file in chunks to an S3 bucket using an Amazon Resource Name (ARN) for AWS Identity and Access Management (IAM).

```sql
SELECT t1.a, t2.a
FROM t1, t2
WHERE t1.a = t2.a
GROUP BY t1.a
INTO S3 'bucket_name/file_name'
CONFIG '{"region":"us-east-1", "multipart_chunk_size_mb":100}'
CREDENTIALS '{"role_arn": "arn:aws:iam::<AccountID>:role/EC2AmazonS3FullAccess"}'

```

The following example uses the [format options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-12.md) and `WITH COMPRESSION`to output the data in CSV format using compression.

```sql
SELECT *
FROM t
INTO S3 'tmp/a'
CONFIG '{"region":"us-east-1"}'
CREDENTIALS '{"aws_access_key_id":"your_access_key_id","aws_secret_access_key":"your_secret_access_key"}'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
WITH COMPRESSION;

```

## SELECT … INTO GCS

`SELECT ... INTO GCS` performs a distributed select into a Google Cloud Storage (GCS) bucket.The `CONFIG` clause is not required, and can be excluded or left empty as defined in the `configuration_json` that follows. The `CREDENTIALS` clause is required and its value is defined in the `credentials_json` that follows.

```
configuration_json:
'{}'

credentials_json:
'{"access_id": "replace_with_your_google_access_key",
  "secret_key": "replace_with_your_google_secret_key"
}'

```

The `WITH COMPRESSION` clause can be specified instead of the `WITH COMPRESSION GZIP` clause. Both clauses are equivalent. However, if other compression formats are supported in the future, `WITH COMPRESSION` may use a compression format other than gzip by default.

`format_options`, if specified, must be included after the `CREDENTIALS` clause, or before the `WITH COMPRESSION` clause (if it is specified). The `format_options` are similar to the parsing options used with `LOAD DATA`. Refer to [Format Options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-13.md) for more information.

```
format_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]

```

## Remarks

* Unlike other filesystem options, with `SELECT ... INTO GCS`, data written by a single partition is not divided into chunks. However, there may still be multiple partitions and thus multiple files (one per partition) will appear in GCS.
* The maximum object size supported by GCS is 5 TB; this limit also applies to `SELECT ... INTO GCS`.
* The `CONFIG` clause may optionally specify an endpoint\_url.
* The `CREDENTIALS` clause is required.
* We support only [HMAC keys](https://cloud.google.com/storage/docs/authentication/hmackeys).
* The `CREDENTIALS` clause should be a JSON object with two fields:

`access_id`: usually a 24 or 60 character alphanumeric string, which is linked to the Google account, typically all uppercase and starts with `GOOG`.

`secret_key`: usually a 40 character Base-64 encoded string that is linked to a specific `access_id`.

As an alternative to using a `SELECT ... INTO GCS` statement where you specify the `CONFIG` and `CREDENTIALS` clauses, you can use a `SELECT ... INTO LINK` statement, where you reference a connection link. For more information, see [Configuring and Using Connection Links](https://docs.singlestore.com/db/v9.1/security/authentication/configuring-and-using-connection-links.md).

## Examples

The following simple example shows how to save all rows in table `table_name` using compression to a GCS bucket using a Google access key, and outputs them as a CSV.

```sql
CREATE TABLE table_name (column_name INT);
INSERT INTO table_name VALUES (1), (2), (3);
SELECT *
FROM table_name
INTO GCS 'bucket/path'
CREDENTIALS '{"access_id": "replace_with_your_google_access_key", "secret_key": "replace_with_your_google_secret_key"}' 
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
WITH COMPRESSION;

```

## SELECT … INTO KAFKA …

`SELECT ... INTO KAFKA ...` runs a `SELECT` query, constructs a Kafka message for each row in the result set, and publishes the messages to a Kafka topic.

When `SELECT ... INTO KAFKA ...` constructs a Kafka message, it includes every column value in the result set's row and separates the column values by a delimiter. To configure the delimiter and other message formatting settings, use the [Format Options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-14.md) clauses.

```
kafka_configuration:
    CONFIG 'string'

kafka_credentials:
    CREDENTIALS 'string'

format_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [TERMINATED BY 'string']
        [STARTING BY 'string']
    ]

kafka_topic_endpoint:
    host:port[, ...]/topic

```

## Arguments

* `kafka_configuration`: Optional. The Kafka configuration properties, specified in JSON format, that are used while SingleStore publishes messages to a topic. The configuration properties are the same properties that are stored in the `server.properties` file on each Kafka broker.
* `kafka_credentials`: Optional. The credentials, in JSON format, used to connect to Kafka.
* `format_options`: Optional. Clauses that specify how a Kafka message is formatted when it is constructed from a row in the `SELECT` result set. See the [Format Options](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/#format-options-15.md) section for more information. Refer to [this example](https://docs.singlestore.com/#UUID-d0d5040c-0c20-90d8-5226-71ae336212f7.md) on how to use the clauses.&#x20;
* `kafka_topic_endpoint`: The list of Kafka brokers, followed by the topic to which SingleStore will publish messages. For each broker in the list, specify its host and port.&#x20;
* `KAFKA KEY <column_name>`: Optional. Specify the provided column name as the Kafka key.

  * `KAFKA KEY` must be used only for Kafka targets.&#x20;
  * The Kafka key column name must be a valid column name in the database table.
  * The Kafka key column name must be the same as the first column name of the `SELECT INTO` statement.&#x20;

## Remarks

* When possible, SingleStore queries leaf nodes directly, bypassing the aggregator. This allows SingleStore to send data directly from the leaves to Kafka partitions.
* SingleStore constructs a Kafka message as an array of bytes.
* As an alternative to using a `SELECT ... INTO KAFKA` statement where you specify the `CONFIG` and `CREDENTIALS` clauses, you can use a `SELECT ... INTO LINK` statement, where you reference a connection link. For more information, see [Configuring and Using Connection Links](https://docs.singlestore.com/db/v9.1/security/authentication/configuring-and-using-connection-links.md).

## Examples

## Example: Specifying the Kafka Message Format

The following query uses the `FIELDS` and `LINES` clauses to format the Kafka messages that are constructed from the `SELECT` result set.

```sql
SELECT col1, col2, col3 FROM t
ORDER BY col1
INTO KAFKA 'host.example.com:9092/test-topic'
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY "\t"
LINES TERMINATED BY '}' STARTING BY '{';

```

Suppose the result set returned by `SELECT col1, col2, col3 FROM t ORDER BY col1` is:

```
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| a    | b    | c    |
| d    | e    | f    |
| g    | h\ti | j    |
+------+------+------+

```

This result set will be converted to three Kafka messages having the following format:

Message 1: `{a,b,c}`

Message 2: `{d,e,f}`

Message 3: `{g,h<tab character>i,j}`

`<tab character>` will be replaced by the number of spaces that your tab is set to.

> **📝 Note**: If a `SELECT ... INTO ... KAFKA` query does not include the `FIELDS...` and `LINES..`. clauses, the default Kafka message formatting settings are used. The default settings would look something like this:```sql
> SELECT col1, col2, col3 FROM t
> ORDER BY col1
> INTO KAFKA 'host.example.com:9092/test-topic'
> FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
> LINES TERMINATED BY '\n' STARTING BY '';
> ```

## Example: Using the `CONFIG` and `CREDENTIALS` Clauses

The following example uses the `CONFIG` and `CREDENTIALS` clauses.

```sql
SELECT text FROM t INTO
KAFKA 'host.example.com:9092/test-topic'
CONFIG '{"security.protocol": "ssl",
"ssl.certificate.location": "/var/private/ssl/client_memsql_client.pem",
"ssl.key.location": "/var/private/ssl/client_memsql_client.key",
"ssl.ca.location": "/var/private/ssl/ca-cert.pem"}'
CREDENTIALS '{"ssl.key.password": "abcdefgh"}'

```

For more information on what settings to include in the `CONFIG` and `CREDENTIALS` clauses, see [Securely Connect to Kafka](https://docs.singlestore.com/db/v9.1/load-data/data-sources/load-data-from-kafka.md).

## Example: Specifying One Kafka Broker

The following example imports the data in the column `text` from table `t` into the Kafka topic `test-topic`. One Kafka broker is specified; it is located on port `9092` at `host.example.com`.

```sql
SELECT text FROM t INTO
KAFKA 'host.example.com:9092/test-topic'

```

## Example: Specifying Multiple Kafka Brokers

The following example imports the data in the column `text` from table `t` into the Kafka topic `test-topic`. Three Kafka brokers are specified, located on port `9092` , at `host1.example.com`, `host2.example.com` and `host3.example.com`.

```sql
SELECT text FROM t INTO
KAFKA 'host1.example.com:9092,host2.example.com:9092,host3.example.com:9092/test-topic'

```

## Example: Specifying a Kafka Key

For this example, suppose table kktest has three columns: column\_a, column\_b, and column\_c. In order to use column\_b as the Kafka key, the `SELECT ... INTO KAFKA` statement might look like this:

```
SELECT column_b, column_a, column_b, column_c FROM kktest INTO 
KAFKA 'localhost:1234/kafkaWithKey' KAFKA KEY "column_b";
```

## SELECT INTO ... FORMAT PARQUET

`SELECT INTO ... FORMAT PARQUET` allows SingleStore data to be exported into Apache Parquet (Parquet) format for the given SQL type. See Apache Parquet documentation for detailed information.

Some SQL types may not fit into a particular Parquet type. This is dependent on the actual stored values for that type. For data that does not fit, the user will receive suggestions for the appropriate type for storing data in Parquet format.

## Syntax

```sql
FORMAT PARQUET
```

## Remarks

* The `FORMAT PARQUET` clause will go at the end of the `SELECT` statement.
* Time and date types are stored in the engine timezone without adjusting to UTC. Below are some examples of the differences between SQL and Parquet time and date formats. If any value of SQL TIME(6) is outside the range of the Parquet TIME, the following error message may occur:
  ```
  can not save SQL TIME(6) … because parquet's TIME is within a single day
  (hour should be between 0 and 23); consider casting this column to BIGINT
  ```
* If any value of TIMESTAMP is zero, which is outside the range of the Parquet TIMESTAMP, the following error message may occur:
  ```
  can not be saved as TIMESTAMP with millisecond precision because 
  zero value is outside the scope of the parquet type; consider casting 
  this column to INT64
  ```

## Examples

The examples below have been shortened to illustrate the use of the Parquet format.  The data ingestion, creation of table2, and start pipeline are not shown.

## Example 1

Creates a table.

```sql
CREATE TABLE table1 (f_CHAR_2_5 CHAR(2), f_LONGTEXT4 LONGTEXT NOT NULL, 
f_DATETIME_6_3 DATETIME(6), f_NUMERIC_27_13__UNSIGNED2 NUMERIC(27,13) UNSIGNED, f_TINYINT1 TINYINT);

```

Uses the Parquet format.

```sql
SELECT * FROM table1 into fs ‘/tmp/parquet_files’ FORMAT PARQUET;
```

Creates the pipeline using Parquet syntax.

```sql
CREATE PIPELINE pipe1 AS LOAD DATA fs ‘/tmp/parquet_files’ 
INTO TABLE table2 FORMAT PARQUET (f_CHAR_2_5 <- %::f_CHAR_2_5, f_LONGTEXT4 <- %::f_LONGTEXT4, 
@f_DATETIME_6_3 <- %::f_DATETIME_6_3, f_NUMERIC_27_13__UNSIGNED2 <- %::f_NUMERIC_27_13__UNSIGNED2, 
f_TINYINT1 <- %::f_TINYINT1 )  SET f_DATETIME_6_3=DATE_ADD('1970-01-01', INTERVAL @f_DATETIME_6_3 MICROSECOND);
```

## Example 2

Creates a table.

```sql
CREATE TABLE table1 (f_REAL3 REAL NOT NULL, f_TIME2 TIME, f_ENUM__small___medium___large__1 ENUM('small','medium','large'));

```

Uses the Parquet format.

```sql
SELECT * FROM table1 into fs '/tmp/parquet_files' FORMAT PARQUET;
```

Creates the pipeline using Parquet syntax.

```sql
CREATE  PIPELINE pipe2 AS LOAD DATA fs '/tmp/parquet_files' INTO TABLE table2 FORMAT PARQUET 
(f_REAL3 <- %::f_REAL3, @f_TIME2 <- %::f_TIME2, f_ENUM__small___medium___large__1 <- %::f_ENUM__small___medium___large__1 )
SET f_TIME2=sec_to_time(@f_TIME2/1000);
```

## Example 3

Creates a table.

```sql
CREATE ROWSTORE TABLE table1 (f_TEXT5 TEXT NOT NULL, 
f_GEOGRAPHY4 GEOGRAPHY NOT NULL, f_VARCHAR_200_3 VARCHAR(200), 
f_DECIMAL_27_13_2 DECIMAL(27,13), f_INT_UNSIGNED1 INT UNSIGNED);

```

Uses the Parquet format.

```sql
SELECT * FROM table1 INTO OUTFILE '/tmp/parquet_files3' FORMAT PARQUET;
```

Create Link.

```sql
CREATE LINK p_link_20_56_14_557666 AS fs CREDENTIALS '{}' CONFIG '{}';
```

Creates the pipeline using Parquet syntax.

```sql
CREATE PIPELINE pipe3 AS LOAD DATA LINK p_link_20_56_14_557666
'/tmp/parquet_files3' INTO TABLE table2 FORMAT PARQUET (f_TEXT5 <- %::f_TEXT5,
f_GEOGRAPHY4 <- %::f_GEOGRAPHY4, f_VARCHAR_200_3 <- %::f_VARCHAR_200_3, 
f_DECIMAL_27_13_2 <- %::f_DECIMAL_27_13_2, f_INT_UNSIGNED1 <- %::f_INT_UNSIGNED1);
```

## Format Options

```
format_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]

```

* `FIELDS TERMINATED BY 'string'`: The delimiter to use between each column value in the message. If not specified, defaults to `\t`.
* `FIELDS ENCLOSED BY 'char'`: Prefixes and suffixes each column value of the message with the “char” value. If not specified, defaults to nothing. The `OPTIONALLY` keyword does not affect the behavior of this option; it exists to maintain compatibility with MySQL.
* `FIELDS ESCAPED BY 'char'`: Translates the escape sequence in the “char” value, instead of using the literal “char” value. Translation is done for all occurrences of the “char” value in the message. If not specified, defaults to `\\`.
* `LINES TERMINATED BY 'string'`: A string to insert at the end of the message. If not specified, defaults to `\n`.
* `LINES STARTING BY 'string'`: A string to insert at the beginning of the message. If not specified, defaults to nothing.

## Specify Partition ID and Timestamp in Output Filename in SELECT INTO

Users may define placeholders for partition ID ([ORDINAL](https://docs.singlestore.com/db/v9.1/reference/information-schema-reference/cluster-component/distributed-partitions.md)) and timestamp as part of the output filename in `SELECT INTO` statements. These placeholders allow the user to create output filenames suitable for their application, and to preserve filename extensions (e.g. .parquet).&#x20;

The placeholders for partition ID (`${PARTITION_ID}`) and timestamp (`${TIME_STAMP}`) are automatically replaced with their respective values during query execution. If no placeholder is specified in the filename, the partition ID is appended to the filename.

In the query below, the `${PARTITION_ID}` placeholder is replaced with the value of the partition ID in the output filenames. If the partition ID were `789` and the format was parquet, the query and output would be as follows:

```sql
SELECT * FROM tablename INTO S3 's3://bucket/test/filename_${PARTITION_ID}.parquet';

```

```output

bucket/test/filename_789.parquet

```

Similar functionality is provided for the `${TIME_STAMP}` placeholder as shown in the examples below.

This placeholder functionality works for outputs into FS, S3, Azure, HDFS, Link, and GCS. The functionality is not applicable for Kafka.

## Examples

The examples below show how to use the placeholders.

These examples assume the existence of a table `t` and assume that `TIME_STAMP` = `123456` and that the table `t` has two partitions with partition IDs `1` and `2`.

## Example 1: Resolve Placeholders in Filenames

Placeholders are dynamically replaced in filenames with actual values to generate context-specific filenames based on the actual values of the placeholders.

```sql
SELECT * FROM t 
INTO S3 's3://bucket/test/file_${TIME_STAMP}_${PARTITION_ID}.parquet';

```

```output

bucket/test/file_123456_.parquet
bucket/test/file_123456_1.parquet
bucket/test/file_123456_2.parquet

```

## Example 2: Default Placeholder

The partition ID must always be included in the filenames. If the `${PARTITION_ID}` placeholder is not in the path, the partition ID is appended to the output filenames.

```sql
SELECT * FROM t 
INTO S3 's3://bucket/test/file_${TIME_STAMP}.parquet';

```

```output

bucket/test/file_123456.parquet
bucket/test/file_123456.parquet_1
bucket/test/file_123456.parquet_2

```

## Example 3: Paths Without Placeholders

When no placeholders are found in the file path, the partition IDs are appended to the filename. This functionality maintains consistency in file naming conventions by always including the partition ID in the output filenames.

```sql
SELECT * FROM t 
INTO S3 's3://bucket/test/file.parquet';

```

```output

bucket/test/file.parquet
bucket/test/file.parquet_1
bucket/test/file.parquet_2

```

## Example 4: Placeholders in Directory Paths

Placeholders are resolved in the filename, but not in the directory path.

When the placeholder `${PARTITION_ID}` is used in the directory path as in the query below, `${PARTITION_ID}` will not be translated to the value of the partition ID. When the placeholder `${PARTITION_ID}` is used in the filename, as is also shown in the query below, `${PARTITION_ID}` will be translated to the value of the partition ID. The same is true for the `${TIMESTAMP}` placeholder.

```sql
SELECT * FROM t 
INTO S3 's3://bucket/${PARTITION_ID}/file_${PARTITION_ID}.parquet';

```

```output

/bucket/${PARTITION_ID}/file.parquet
/bucket/${PARTITION_ID}/file_1.parquet
/bucket/${PARTITION_ID}/file_2.parquet

```

## SELECT … WITH (SAMPLE\_RATIO = \<value>)

`SELECT ... WITH (SAMPLE_RATIO = <value>)` retrieves a random sample of data from a table (specifically, a [Bernoulli sample](https://en.wikipedia.org/wiki/Bernoulli_sampling)).

```
sample_ratio_clause: WITH (SAMPLE_RATIO = ratio)

```

## Argument

**ratio**

A decimal between 0.0 and 1.0 specifying the probability that each row will be retrieved from the table. The probability is applied to each row independently.

## Remarks

* When you run multiple `SELECT` statements against the same table using `WITH (SAMPLE_RATIO = <value>)`, you will receive similar results. This is the case even when your sample ratio is different in your `SELECT` statements. For example, the sample returned for a sample ratio of `0.25` is often a subset of the sample ratio returned for `0.5`.
* Using a sample ratio that is one divided by a power of two allows `WITH (SAMPLE_RATIO = <value>)` to run more efficiently.
* You can use `WITH (SAMPLE_RATIO = <value>)` in a `SELECT` subquery.
* `SAMPLE_RATIO` is not applicable on Common Table Expressions (CTE) but on the base tables.

> **⚠️ Warning**: You can use `WITH (SAMPLE_RATIO = <value>)` with any table. It operates most efficiently when used with a rowstore table whose primary key is not a hash key, or a columnstore table with `AUTOSTATS_SAMPLING` enabled (the default). Otherwise, it will scan all rows, or all rows in a range, to do sampling.

## Examples

**Simple `SELECT` statements**

The following example populates a table with the integers `1` to `10` and selects two random samples.

```sql
DROP DATABASE IF EXISTS singlestore_docs_example;
CREATE DATABASE singlestore_docs_example;
USE singlestore_docs_example;
CREATE TABLE example_table(num INT);

INSERT INTO example_table VALUES (5), (6), (10), (11), (12),
  (14), (20), (24), (33), (40);

```

The query below returns a random sample of the `num` column. Each row in the sample has a seventy percent probability of being returned.

```sql
SELECT num FROM example_table WITH (SAMPLE_RATIO = 0.7) ORDER BY num;

```

Example output follows. You will likely see different results, as the query returns a random sample.

```
+------+
| num  |
+------+
|    6 |
|   11 |
|   14 |
|   20 |
|   24 |
|   33 |
+------+

```

The query below returns the average of a random sample of the `num` column. Each row in the sample has an eighty percent probability being included in the sample and therefore in the `AVG` computation.

```sql
SELECT AVG(num) FROM example_table WITH (SAMPLE_RATIO = 0.8);

```

Example output follows. You will likely see different results, as the query returns a random sample.

```
+----------+
| AVG(num) |
+----------+
|  18.1111 |
+----------+

```

**Using `WITH (SAMPLE_RATIO = <value>)` with `SELECT` Subqueries**

You may use `WITH (SAMPLE_RATIO = <value>)`as part of a `SELECT` subquery, as shown in the following example.

```sql
CREATE TABLE example_table_2 AS SELECT * FROM example_table
  WITH (SAMPLE_RATIO = 0.6);
SELECT num FROM example_table2 ORDER BY num;

```

Example output follows. You will likely see different results, as the query returns a random sample.

```
+------+
| num  |
+------+
|   10 |
|   12 |
|   14 |
|   24 |
|   33 |
|   40 |
+------+

```

**Using `WITH (SAMPLE_RATIO = <value>)` with Joins**

You may use `WITH (SAMPLE_RATIO = <value>)` in a `SELECT` statement that contains a join. However, you should do so carefully to ensure the join yields the results you intend. For example, consider joining a `CUSTOMER` table to an `ORDER` table using the `customer_id` field in the `ORDER` table. The placement of the `WITH (SAMPLE_RATIO = <value>)` clause within the join will greatly impact the result, as shown below.

The following query selects a sample of customers and returns the orders for those customers only.

```sql
SELECT c.name, o.order_id, o.order_total
FROM customer WITH (SAMPLE_RATIO = 0.4) c
JOIN order o
ON c.customer_id = o.customer_id

```

The following query selects a sample of all orders and returns only those orders along with their associated customers.

```sql
SELECT c.name, o.order_id, o.order_total
FROM customer c
JOIN order o WITH (SAMPLE_RATIO = 0.4)
ON c.customer_id = o.customer_id

```

***

Modified at: December 11, 2025

Source: [/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-manipulation-language-dml/select/)

(An index of the documentation is available at /llms.txt)
