# Code Generation

A large contributor to SingleStore's superior query execution performance is its system of code generation (An industrial compiler to produce highly efficient machine code that enables low-level optimizations, which are not possible when executing queries via interpretation alone. By default, queries are interpreted first and then asynchronously compiled in the background for use in later executions. This speeds up query execution time for long and complex queries, while at the same time providing efficient query plans for later use.). In contrast to a traditional interpreter-based execution model for databases, SingleStore embeds an industrial compiler to produce highly efficient machine code that enables low-level optimizations, which are not possible when executing queries via interpretation alone. And now by default, queries are interpreted first and then asynchronously compiled in the background for use in later executions. This speeds up query execution time for long and complex queries, while at the same time providing efficient query plans for later use.

As stated above, the first time SingleStore encounters a given query shape (Patterns or structures in query. Some query shapes are unsupported in SingleStore.), it will optimize and compile the query asynchronously for future invocations. This incurs minimal overhead which does not depend on the amount of data to be processed, but rather the complexity of the query. The process of code generation involves extracting parameters from the query then transforming the normalized query into a SingleStore-specific intermediate representation tailored to the system. Subsequent requests with the same shape can reuse this plan to complete queries both quickly and consistently.

Code generation applies to all [Data Manipulation Language DML](https://docs.singlestore.com/cloud/reference/sql-reference/data-manipulation-language-dml.md) queries. In addition, SingleStore generates code during [CREATE TABLE](https://docs.singlestore.com/cloud/reference/sql-reference/data-definition-language-ddl/create-table.md) and [ALTER TABLE](https://docs.singlestore.com/cloud/reference/sql-reference/data-definition-language-ddl/alter-table.md) statements. These [Data Definition Language DDL](https://docs.singlestore.com/cloud/reference/sql-reference/data-definition-language-ddl.md) queries generate code in order to reduce the compilation time of subsequent DML queries against the table.

## The Interpreter Mode’s Effects on Code Generation

The `interpreter_mode` setting controls whether SingleStore interprets or compiles query shapes, or does both.

This section explains how this mode operates when SingleStore encounters a new query shape.

## `interpreter_mode` is set to `compile` or `llvm`

In this mode, SingleStore compiles a query’s shape the first time it is encountered. As shown in the following example, the query `SELECT * FROM t WHERE col = 1;` takes longer to complete on the first run; this is a result of the compile overhead during the first step. Also note that the `WHERE` clauses are different in the second and third query, but both query shapes are identical. Therefore, SingleStore reuses the second query’s plan to run the third query.

```sql
SELECT * FROM t WHERE col = 1;

```

```output

Empty set (0.13 sec)

```

```sql
SELECT * FROM t WHERE col = 1;

```

```output

Empty set (0.00 sec)
```

```sql
SELECT * FROM t WHERE col = 100000;

```

```output

Empty set (0.00 sec)
```

> **❗ Important**: Run the [PROFILE](https://docs.singlestore.com/cloud/reference/sql-reference/data-manipulation-language-dml/profile.md) command followed by the `SHOW PROFILE JSON` command to view detailed statistics on compilation time (displayed under `compile_time_stats` in the output).

## `interpreter_mode` is set to `compile_lite`

In this mode, a query's shape is compiled, without any optimizations, the first time the query shape is encountered. This results in faster compilation than when `interpreter_mode` is set to `compile`. However, because `compile_lite` applies no optimizations, it will execute queries slower than `compile`.

## `interpreter_mode` is set to `compile_lite_first`

In this mode, two parallel compilation threads are run: A foreground thread that compiles the query shape in `compile_lite` mode and a background thread that compiles the query shape in `compile` mode. When the background thread completes, the foreground thread switches to the background thread's compiled query plan. Generally, SingleStore recommends using `compile_lite_first` mode instead of `compile_lite` mode. An exception is when using `compile_lite_first` would consume too many CPU resources (as the same query shape is compiled twice).

## `interpreter_mode` is set to `interpret` or `mbc`

In this mode, queries are interpreted and not compiled.

## `interpreter_mode` is set to `interpret_first`

In this mode, SingleStore interprets and compiles a query shape in parallel, upon encountering the query shape for the first time. The query runs in interpreted mode until compilation of the query shape completes. In the following example, the first `SELECT * FROM t WHERE col = 1;` query runs faster than the first query in the previous example.

```sql
SELECT * FROM t WHERE col = 1;

```

```output

Empty set (0.02 sec)

```

```sql
SELECT * FROM t WHERE col = 1;

```

```output

Empty set (0.00 sec)
```

```sql
SELECT * FROM t WHERE col = 100000;

```

```output

Empty set (0.00 sec)
```

As of MemSQL 6.8, `interpreter_mode` is set to `interpret_first` by default.

## Setting the Interpreter Mode

`interpreter_mode` can be set:

## Using a Query Option

To use the query option, add the `OPTION (interpreter_mode = <interpreter_mode_setting>)` clause to the end of a query. For example:

```sql
SELECT * FROM t WHERE col = 1 OPTION (interpreter_mode = compile);
```

The `interpreter_mode` option is non-operational when a query containing the option is run on a leaf node.

## Understanding the Plancache

When a query is first run, it activates code generation, gets optimized, and translated into lower-level machine language. On this first run of the query, the code generation for it will be recorded in the logs. This lower-level version of the query, with the wildcard @ replacing for parameters is stored on-disk in the plancache. You can see the plancache files in `/var/lib/memsql/<node ID>/plancache`/. Note the directories in this location, named for their IDs. In the ID dirs, you will find the .mo, .mpl, and .mu machine language files.

If the data has undergone a lot of changes, then you might want the plancache to be updated with the new stats by running: [ANALYZE TABLE table\_name;](https://docs.singlestore.com/cloud/reference/sql-reference/operational-commands/analyze.md). This will collect new statistics and invalidate previously cached plans on tables that experienced significant changes,

After code generation, the compiled query plans are saved for later use in a plancache. Each SingleStore Helios node (aggregator and leaf) has its own plans and plancache. Each user query (on the aggregator) is associated with a plan on that aggregator, and some number of plans (zero, one, or more) on the leaves.

There are two layers of this plancache: the in-memory plancache and the on-disk plancache (located in the `@@global.plancachedir` directory). Plans are cached in the in-memory plancache until they expire or until that SingleStore Helios node restarts. A plan expires after the interval of time configured by the `plan_expiration_minutes` global variable has elapsed (the default value of this variable is 720, or 12 hours).

When a plan expires from the in-memory plancache, it remains in the on-disk plancache and is loaded back into memory the next time the query is run. If the global variables `enable_disk_plan_expiration` and `disk_plan_expiration_minutes` are set, then a plan that has not been read from disk during the specified `disk_plan_expiration_minutes` limit will expire from the on-disk plancache and be removed from disk.

To delete all plans from the in-memory and on-disk plancaches without waiting for garbage collection, refer to [Drop all Plans from the Plancache](https://docs.singlestore.com/cloud/user-and-workspace-administration/workspace-health-and-performance/managing-plancache-memory-and-disk-usage/#section-idm234875781961294.md) for more information.

To see DML query plans that have been cached in the in-memory plancache, run the [SHOW PLANCACHE](https://docs.singlestore.com/cloud/reference/sql-reference/show-commands/show-plancache.md) command or query against the `information_schema.PLANCACHE` table.

The `information_schema.PLANCACHE` and `information_schema.MV_PLANCACHE` views contain information about query plans that have been compiled and executed, cumulative query execution statistics about each plan, and plancache information for all nodes.

Please note the details on how the `plan_id` and `activity_name` columns can change in the plancache views. The following examples are based on this query: `SELECT * FROM table_name ORDER BY column_name...`

* When the plan expires from memory as set by the `plan_expiration_minutes` engine variable, if the query is re-run, note that the `plan_id` changes, the `activity_name` does not:
  ```sql
  SELECT plan_id, activity_name FROM information_schema.PLANCACHE 
  WHERE query_text LIKE 'SELECT * FROM planc1 ORDER BY%';

  ```
  ```output

  +---------+--------------------------------+
  | plan_id | activity_name                  | 
  +---------+--------------------------------+
  |      13 | Select_planc1_901e63e0a36b09c1 | 
  +---------+--------------------------------+

  ```
  ```sql
  SELECT plan_id, activity_name FROM information_schema.PLANCACHE 
  WHERE query_text LIKE 'SELECT * FROM planc1 ORDER BY%';

  ```
  ```output

  +---------+--------------------------------+
  | plan_id | activity_name                  | 
  +---------+--------------------------------+
  |      33 | Select_planc1_901e63e0a36b09c1 | 
  +---------+--------------------------------+

  ```
* If plans are dropped from both memory and disk, then both the `plan_id` and `activity_name` changes.
  ```sql
  SELECT plan_id, activity_name FROM information_schema.PLANCACHE 
  WHERE query_text LIKE 'SELECT * FROM planc1 ORDER BY%';

  ```
  ```output

  +---------+--------------------------------+
  | plan_id | activity_name                  | 
  +---------+--------------------------------+
  |      33 | Select_planc1_901e63e0a36b09c1 | 
  +---------+--------------------------------+

  ```
  ```sql
  DROP ALL FROM PLANCACHE;
  ```
  ```sql
  SELECT plan_id, activity_name FROM information_schema.PLANCACHE 
  WHERE query_text LIKE 'SELECT * FROM planc1 ORDER BY%';

  ```
  ```output

  +---------+--------------------------------+
  | plan_id | activity_name                  | 
  +---------+--------------------------------+
  |      38 | Select_planc1_2950b7e9280f64bb | 
  +---------+--------------------------------+

  ```
* If the definition of the table queried changes or an engine variable that affects query plans changes, then the `plan_id` and `activity_name` columns both change in the plancache. The following example illustrates the change of both the `plan_id` and `activity_name` when a column is added to the table:
  ```sql
  SELECT plan_id, activity_name FROM information_schema.PLANCACHE 
  WHERE query_text LIKE 'SELECT * FROM planc1 ORDER BY%';

  ```
  ```output

  +---------+--------------------------------+
  | plan_id | activity_name                  | 
  +---------+--------------------------------+
  |      38 | Select_planc1_2950b7e9280f64bb | 
  +---------+--------------------------------+

  ```
  ```sql
  ALTER TABLE planc1 ADD COLUMN d INT;
  ```
  ```sql
  SELECT plan_id, activity_name FROM information_schema.PLANCACHE 
  WHERE query_text LIKE 'SELECT * FROM planc1 ORDER BY%';

  ```
  ```output

  +---------+--------------------------------+
  | plan_id | activity_name                  |
  +---------+--------------------------------+
  |      39 | Select_planc1_4d6188772954c27c | 
  +---------+--------------------------------+

  ```
* The following example illustrates what happens when a plan-affecting engine variable is changed. In this example, auto profiling is enabled. The previous plan remains valid; however, there are two different plans for the same query.
  ```sql
  SELECT plan_id, activity_name FROM information_schema.PLANCACHE 
  WHERE query_text LIKE 'SELECT * FROM planc1 ORDER BY%';

  ```
  ```output

  +---------+--------------------------------+
  | plan_id | activity_name                  |
  +---------+--------------------------------+
  |      39 | Select_planc1_4d6188772954c27c | 
  +---------+--------------------------------+

  ```
  ```sql
  SET GLOBAL enable_auto_profile = 1;

  ```
  ```sql
  SELECT plan_id, activity_name FROM information_schema.PLANCACHE 
  WHERE query_text LIKE 'SELECT * FROM planc1 ORDER BY%';

  ```
  ```output

  +---------+----------------------------------------+
  | plan_id | activity_name                          |
  +---------+----------------------------------------+
  |      39 | Select_planc1_4d6188772954c27c         | 
  |      45 | Select_profile_planc1_cb0dd9eed1c43d24 | 
  +---------+----------------------------------------+ 

  ```

## Overriding Parameterization of Queries

Parameterization is when the constants in a query are converted into parameters, so that queries with the same structure can reuse the query plan. Some query shapes defeat the purpose of parameterization, thereby causing large plancaches.

To override parameterization, use the [NOPARAM](https://docs.singlestore.com/cloud/reference/sql-reference/code-generation-functions/noparam.md) built-in function.

## Non-parametrized Function Parameters

Following is a list of functions whose arguments are not parametrization and hence will create a new plan for each different value.

| Function                                                                        | Which argument is not parameterized            | Explanation                                                                                                                  |
| ------------------------------------------------------------------------------- | ---------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| `EXTRACT(unit FROM datetime)`                                                   | 1st argument (unit)                            | The date/time unit (for example,`DAY`,`MONTH`) is fixed into the plan; different units use different plans.                  |
| `ROUND(num, decimals)`                                                          | 2nd argument (decimals)                        | The number of decimal places is baked into the plan, so`ROUND`(x, 2) and`ROUND`(x, 3) do not share a plan.                   |
| `TRUNC(num, decimals)`                                                          | 2nd argument (decimals)                        | Same idea as`ROUND`: the decimals argument is not parameterized.                                                             |
| `TRUNCATE(num, decimals)`                                                       | 2nd argument (decimals)                        | Same as above; changing decimals gives a different plan.                                                                     |
| `LPAD(str, len [, padstr])`                                                     | 2nd argument (len)                             | The target length is fixed into the plan; different lengths lead to different plans.                                         |
| `RPAD(str, len [, padstr])`                                                     | 2nd argument (len)                             | Same as`LPAD`for the len argument.                                                                                           |
| `DATE_FORMAT(date, format [, ...])`                                             | 2nd and later arguments (starting with format) | The format string (and any extra format arguments) are fixed; queries with different formats do not share a plan.            |
| `TIME_FORMAT(time, format [, ...])`                                             | 2nd and later arguments                        | Same behavior as`DATE_FORMAT`.                                                                                               |
| `TO_DATE(str, format [, ...])`                                                  | 2nd and later arguments (starting with format) | The format is not parameterized so the optimizer can specialize on it.                                                       |
| `TO_NUMBER(str, format [, ...])`                                                | 2nd argument (format)                          | The numeric parsing format is fixed into the plan.                                                                           |
| `FROM_UNIXTIME(unix_ts [, format, ...])`                                        | 2nd and later arguments (starting with format) | The output format and related arguments are not parameterized.                                                               |
| `GROUPING(expr)`                                                                | 1st argument (expr)                            | The grouping expression is treated as part of the plan shape, not as a parameter.                                            |
| `NOW([fsp])`                                                                    | All arguments (normally the optional fsp)      | The fractional-seconds precision (fsp) is fixed into the plan when present.                                                  |
| `UTC_TIMESTAMP([fsp])`                                                          | All arguments                                  | Same as`NOW`for the optional fsp.                                                                                            |
| `UTC_TIME([fsp])`                                                               | All arguments                                  | Same behavior as above.                                                                                                      |
| `CURTIME([fsp])`                                                                | All arguments                                  | Same behavior as above.                                                                                                      |
| `GET_FORMAT(type, format)`                                                      | 1st argument (type)                            | The type (for example,`DATE`,`DATETIME)`is fixed into the plan.                                                              |
| `JSON_MATCH_ANY(json_doc, path, pattern, ...); BSON_MATCH_ANY(...)`             | 3rd and later arguments (patterns)             | The pattern and any additional pattern arguments are not parameterized.                                                      |
| `JSON_MATCH_ANY_EXISTS(json_doc, path, ...); BSON_MATCH_ANY_EXISTS(...)`        | 2nd and later arguments (paths / patterns)     | The path / pattern arguments are fixed into the plan.                                                                        |
| `JSON_EXTRACT(...)`family                                                       | 2nd and later arguments (paths), in some cases | The JSON path arguments are not parameterized when JSON key parameterization is disabled or inside a`JSON_MATCH_ANY`subtree. |
| `BSON_EXTRACT_*`family                                                          | 2nd and later arguments (paths)                | The BSON path arguments are not parameterized.                                                                               |
| `JSON_INCLUDE_MASK / JSON_EXCLUDE_MASK / BSON_INCLUDE_MASK / BSON_EXCLUDE_MASK` | 2nd and later arguments (mask / path)          | The mask / path arguments are fixed, not parameterized.                                                                      |
| Vector sort functions (`VECTOR_SORT_I8 … VECTOR_SORT`)                          | 2nd argument                                   | The second argument (for example, sort direction or related option) is not parameterized.                                    |
| `_MEMSQL_TABLE_ID_LOOKUP (internal)`                                            | 1st argument                                   | Internal helper; first argument is not parameterized.                                                                        |
| `INTERNAL_TEXT_TO_JSON(text, path) (internal)`                                  | 2nd argument (path)                            | Internal helper; the JSON path is not parameterized.                                                                         |

## Codegen Architecture

Code generation produces a specialized programming language, the SingleStore (MemSQL) Plan Language. MPL is simple, tightly managed and designed specifically for SingleStore rather than for general purpose applications. SQL operator trees are converted directly to MPL abstract syntax trees, eliminating the need for expensive parsing and semantic resolution at query compilation time.

In addition to the simplicity of the language, MPL benefits from exposing as primitives many relational operations such as “index seek” and scalar operations such as “`BIGINT` addition”. These operators are compiled statically along with the server and can easily be combined to form a query plan. In this way, much of the work of compilation can be performed ahead of time.

## SingleStore Bytecode

When compiling a query, SingleStore first flattens MPL into a compact format known as SingleStore (MemSQL) Bytecode (MBC). Plans in this form can easily be serialized, interpreted or transformed to [LLVM](http://www.wikipedia.com/wiki/LLVM) bitcode for compilation to `x86_64` machine code. Interpreting the bytecode directly nearly eliminates the first-time cost of running a query, but can hamper execution performance:

```sql
SELECT * FROM t WHERE j > 0;

```

```output

Empty set (0.05 sec)

```

```sql
SELECT * FROM t WHERE j > 0;

```

```output

Empty set (0.05 sec)
```

In this example, there is no additional latency on the first request because the query is interpreted.

```sql
SELECT * FROM t WHERE j > 0;  -- With code generation

```

```output

Empty set (0.08 sec)

```

```sql
SELECT * FROM t WHERE j > 0;  -- Using the cached plan

```

```output

Empty set (0.02 sec)
```

Here, the additional latency on the first request is the result of code generation. However, the subsequent request is more than twice as fast as the interpreted execution above. This goes to show the power of SingleStore code generation.

***

Modified at: April 2, 2026

Source: [/cloud/query-data/advanced-query-topics/code-generation/](https://docs.singlestore.com/cloud/query-data/advanced-query-topics/code-generation/)

(An index of the documentation is available at /llms.txt)
