# Audit Log File Format

> **⚠️ Warning**: As of the time of this publication, audit logging features are made available and licensed only as part of the SingleStore Enterprise edition.

Audit logs are written to the directory specified by the `auditlogsdir` variable. See [Configure Audit Logging](https://docs.singlestore.com/db/v9.1/security/audit-logging/configuring-audit-logging.md) for details on how to set this variable.

Each node in a cluster will create its own files in the destination directory using the following naming convention:

```
auditlog_<hostname>-<port>_<year>-<month>-<day>_<hour>-<minute>-<second>.log

```

For example, consider the following log file name:

```
auditlog_myVmbox-3306_2016-08-30_06-38-46.log

```

If a log file’s rotation size is very small, an auto-incrementing integer may be appended to the end of the log file. For example:

```
auditlog_myVmbox-3306_2016-08-30_06-38-46_1.log
auditlog_myVmbox-3306_2016-08-30_06-38-46_2.log
auditlog_myVmbox-3306_2016-08-30_06-38-46_3.log

```

Data within the log file is written in comma-separated values (CSV) format. Log files are updated immediately as queries are executed; they are not cached locally and then eventually written to the destination directory.

The `auditlog_disk_sync` engine variable controls whether the audit is written to disk synchronously. This can be set to ON/OFF to specify synchronous/asynchronous writes to disk. Note that if it is set to ON (synchronous writes), there can be an impact to performance as log entries must be flushed to disk before the query executes.

## Log Entry File Schemas

There are four types of schemas for log file entries, as explained below.

## Schema for First Line in a Log File

The first line in a log file has a unique schema that differs from the generic case. Consider the following log entry, which indicates that a new log was opened because the audit log level was changed to `ADMIN-ONLY`:

```
0,2020-08-11 19:04:54.580,INFO: Log opened on MemSQL startup. Format version 2. Log level changed to ADMIN-ONLY

```

The first line above demonstrates the following schema:

```
<log-entry-id>,<timestamp>,INFO: <log-file-event-origin>. <log-format-version>. <log-file-event-action>

```

This line’s schema is described in the table below:

| Example Value           | Definition                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `log-entry-id`          | This value is always`0`for the first line in a logfile.                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| `timestamp`             | This value is identical to a normal[timestamp](https://docs.singlestore.com/db/v9.1/security/audit-logging/audit-log-file-format.md).                                                                                                                                                                                                                                                                                                                                                                                                 |
| `log-file-event-origin` | Describes the reason or event that caused the creation of the log file. There are currently two possible values:**Log rotated**- Indicates that a new file was required because rotation conditions were met. When a log is rotated, either the`auditlog_rotation_size`or the`auditlog_rotation_time`engine variables’ conditions were met.**Log opened on MemSQL startup**- Indicates that a new file was created because the node was either started or restarted.                                                                  |
| `log-file-event-action` | Indicates the reason or event that opened or rotated the log file. It can have the following values:**Log level initiated at`<Auditlog-Level>`**- Indicates that audit logging was first enabled at a given level and then the log file was opened.**Log level changed to`<Auditlog-Level>`**- Indicates that logging was already enabled, but the audit logging level was changed.**Logging at level`<Auditlog-Level>`**- Indicates that the system was restarted or the log was rotated, but the audit log level remained the same. |

## Schema for ALL-RESULTS Database Responses

If your node is configured to use the `ALL-RESULTS` logging mode, the database’s responses are logged uniquely from other entries. Consider the following example result entry for the `SHOW VARIABLES LIKE '%master_aggregator%'` query:

```
237,R,master aggregator

```

The log entry above demonstrates the following schema:

```
<parent-query-log-entry-id>,R,<data>

```

Query results share the same entry ID as the query itself. In the example above, the log entry for the query also has an `<entry-id>` value of `237`. The comma-separated `R` denotes that the log entry only contains query result data. Lastly, the returned data is included as the last value in the entry.

## General Log Entry Schema

All entries in a log file adhere to a predefined schema. To help understand the log file schema, consider the following example entry for the statement `UPDATE t SET a = 0 WHERE a = 1`:

```
237,2016-08-25 11:29:09,PDT,localhost:3306,agg,1,99999,root,db,temp_1_14_16,9959767090931131257,UPDATE t SET a = 0 WHERE a = 1

```

The log entry above demonstrates the following schema:

```
<log-entry-id>,<timestamp>,<time-zone>,<hostname>:<port>,<node-type>,<aggregator-id>,<thread-id>,<username>,<database-name>,<correlation-id>,<query-hash>,<query>

```

Therefore, each of the 12 values can be mapped in the following way:

| Example Value                    | Definition       |
| -------------------------------- | ---------------- |
| `237`                            | `log-entry-id`   |
| `2016-08-25 11:29:09.875`        | `timestamp`      |
| `PDT`                            | `time-zone`      |
| `localhost:3306`                 | `hostname:port`  |
| `agg`                            | `node-type`      |
| `1`                              | `aggregator-id`  |
| `99999`                          | `thread-id`      |
| `root`                           | `username`       |
| `db`                             | `database-name`  |
| `temp_1_14_16`                   | `correlation-id` |
| `9959767090931131257`            | `query-hash`     |
| `UPDATE t SET a = 0 WHERE a = 1` | `query`          |

## log-entry-id

The first value is `log-entry-id`, which represents the entry’s unique identifier within the scope of a single log file. Every new file always starts with an entry that has an ID of `0`, incrementing by one afterwards. A `log-entry-id` is a 64-bit unsigned integer value.

The entry ID for any given operation is unique to the node, regardless of how a query is processed across the cluster. For example, imagine a three node cluster with a master aggregator and two leaves. When the master aggregator receives a distributed query and executes it on both leaf nodes, each leaf’s entry ID for the operation may be completely different.

If your database is configured to use the `ALL-RESULTS` logging mode, the results will appear as additional log entries with the same `<log-entry-id>` as the parent. For more information, see [Schema for ALL-RESULTS Database Responses](https://docs.singlestore.com/db/v9.1/security/audit-logging/audit-log-file-format.md).

## timestamp

The second value is `timestamp`, which specifies the node’s local system time when an operation was executed. The timestamp is printed in `%Y-%m-%d %H:%M:%S.%f` format.

## time-zone

The third value is `time-zone`, which specifies the local time zone when an operation was executed. The time zone is printed as an initialism, such as `GMT` (Greenwich Mean Time).

## hostname:port

The fourth value is `hostname:port`, which specifies the node’s hostname and port. The hostname is the same the database’s `hostname` global variable. Together, these values represent the identifier for a node.

## node-type

The fifth value is `node-type`, which specifies whether the node is an aggregator (`agg`) or a leaf (`leaf`).

## aggregator-id

The sixth value is `aggregator-id`, which specifies the ID of the aggregator that issued any given query. An `aggregator-id` is a 64-bit unsigned integer value.

An `aggregator-id` may contain an invalid value of `18446744073709551615` if the aggregator has not yet been given an ID.

In a cluster with multiple aggregators, the ID of the master aggregator never changes, but the ID of a child aggregator may change. For example, if a child aggregator temporarily loses network connectivity and then reconnects, it may acquire a new aggregator ID. The child aggregators that remained connected might also have their aggregator IDs altered during the event.

When correlating a master aggregator’s query with its resulting downstream leaf node queries, you can use the combination of an entry’s `correlation-id` and `aggregator-id` values as a unique identifier. For example, if a query is executed against a cluster with one master aggregator and one leaf node, the query will be logged in the master aggregator’s log file and the leaf node’s log file. To quickly correlate entries for the same query across large log files, use the combination of the `correlation-id` and `aggregator-id` as a key.

## thread-id

The seventh value is `thread-id`, which specifies the ID of the node’s thread that is executing the query. A `thread-id` is a 32-bit integer value.

## username

The eight value is `username`, which specifies the username of the authenticated user who is executing the query.

## database-name

The ninth value is `database-name`, which specifies the name of the database that is being queried.

This value may appear as `[unknown]` in a log entry under a few conditions:

* If the query is not related to a specific database (e.g. `SHOW VARIABLES`).
* If the user does not specify `USE <database-name>` before executing their query. However, in this case the query does contain the database, such as `SELECT * from <db-name>.t`.

## correlation-id

The tenth value is `correlation-id`, which specifies an identifier for the current query. Multiple successive log entries can reference the same correlation-id, especially when a query requires data from more than one leaf partition.

When correlating a master aggregator’s query with its resulting downstream leaf node queries, you can use the combination of an entry’s `correlation-id` and `aggregator-id` values as a unique identifier. For example, if a query is executed against a cluster with one master aggregator and one leaf node, the query will be logged in the master aggregator’s log file and the leaf node’s log file. To quickly correlate entries for the same query across large log files, use the combination of the `correlation-id` and `aggregator-id` as a key.

Under some circumstances, a `correlation-id` value may not appear in the log entry. This value is usually missing when a DDL operation is executed.

## query-hash

The 11th value is `query-hash`, which specifies a unique hash of either the query signature or the query text. The `query-hash` is a 64-bit unsigned integer value.

The full query text will only be hashed if the node’s logging level is set to `ALL-QUERIES-PLAINTEXT` or `ALL-QUERIES-PLAINTEXT-INCLUDING-PARSE-FAILS`. Otherwise, just the query signature will be hashed.

## query

The 12th value is the `query`, which specifies either the query signature or the query text.

The full query text will only be written to the log if the node’s logging level is set to `ALL-QUERIES-PLAINTEXT` or `ALL-QUERIES-PLAINTEXT-INCLUDING-PARSE-FAILS`. Otherwise, just query signature will be hashed.

The `USING` clause of queries is not captured as part of audit logging and can be viewed via [SHOW PROCESSLIST](https://docs.singlestore.com/db/v9.1/reference/sql-reference/show-commands/show-processlist.md) where applicable.

## Log Entry Schema for LOGINS-ONLY Level

If your node is configured to use the `LOGINS-ONLY` logging mode, the log entry uses the following schema:

`<log-entry-id>,<timestamp>,<time-zone>,<hostname>:<port>,<node-type>,USER_LOGIN,<thread-id>,<username>,<remote-host>,<user-grant>,<auth-type>,<success or failure>:<reason>,<tls_cipher>,<app_name>`

**Note**: Login data and the database activity can coexist in one file.

The following log entry uses the schema specified above, and it represents a user `pac` with a successful login attempt:

`3,2020-08-11 18:53:18.150,UTC,c770dd909a9c:3306,agg,USER_LOGIN,100000,pac,192.168.0.1,pac@%,mysql_native_password,SUCCESS,ECDHE-RSA-AES256-GCM-SHA384,mysql`

The following table describes the fields in the schema, and identifies each of the values in the example above:

| Entry                         | Description                                                                                                                                                                                                                                                                                                             | Example Value                 |
| ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------- |
| `log-entry-id`                | Represents the log entry’s unique identifier.                                                                                                                                                                                                                                                                           | 3                             |
| `timestamp`                   | The node’s local system time when an operation was executed.                                                                                                                                                                                                                                                            | 2020-08-11 18:53:18.150       |
| `time-zone`                   | Specifies the local time zone when an operation was executed.                                                                                                                                                                                                                                                           | UTC                           |
| `hostname:port`               | Specifies the node’s hostname and port.                                                                                                                                                                                                                                                                                 | c770dd909a9c:3306             |
| `node-type`                   | Specifies whether the node is an`agg`(aggregator) or a`leaf`node.                                                                                                                                                                                                                                                       | agg                           |
| `USER LOGIN`                  | A constant which indicates that the record format represents a user login.                                                                                                                                                                                                                                              | USER\_LOGIN                   |
| `thread-id`                   | Id of the thread executing the statement.                                                                                                                                                                                                                                                                               | 100000                        |
| `user-name`                   | Specifies the username that is trying to log in.                                                                                                                                                                                                                                                                        | pac                           |
| `remote-host`                 | Hostname/address from which the client is connecting. This value is detected by the server and not reported by the client.                                                                                                                                                                                              | 192.168.0.1                   |
| `user-grant`                  | The`username`@`hostname`specification from the`GRANT`that defines the user for whom authentication is being attempted.                                                                                                                                                                                                  | pac@%                         |
| `auth-type`                   | Indicates the authentication method which is active (for example native password, pam, Kerberos, etc.), as specified by the`GRANT ... IDENTIFIED VIA`statement.                                                                                                                                                         | mysql\_native\_password       |
| `Success`/`Failure: <reason>` | Indicates the success or failure of the login attempt. The`<reason>`is populated only for a failed login attempt, it specifies why the login attempt failed.                                                                                                                                                            | SUCCESS                       |
| `tls_cipher`                  | The TLS encryption algorithm used during the login session. For unencrypted connections, the value is`plaintext`.                                                                                                                                                                                                       | `ECDHE-RSA-AES256-GCM-SHA384` |
| `app_name`                    | The application name retrieved from the session's`program_name`attribute, if set.Query the[MV\_CONNECTION\_ATTRIBUTES](https://docs.singlestore.com/db/v9.1/reference/information-schema-reference/query-performance-workload-management-and-statistics/mv-connection-attributes.md)view for the connection attributes. | `mysql`                       |

For more information on common log entries, refer to [General Log Entry Schema](https://docs.singlestore.com/db/v9.1/security/audit-logging/audit-log-file-format.md).

***

Modified at: November 18, 2024

Source: [/db/v9.1/security/audit-logging/audit-log-file-format/](https://docs.singlestore.com/db/v9.1/security/audit-logging/audit-log-file-format/)

(An index of the documentation is available at /llms.txt)
