SingleStore Managed Service

General Log Entry Schema

All entries in a log file adhere to a predefined schema. To help understand the log file schema, consider the following example entry for the statement UPDATE t SET a = 0 WHERE a = 1:

237,2016-08-25 11:29:09,PDT,localhost:3306,agg,1,99999,root,db,temp_1_14_16,9959767090931131257,UPDATE t SET a = 0 WHERE a = 1

The log entry above demonstrates the following schema:

<log-entry-id>,<timestamp>,<time-zone>,<hostname>:<port>,<node-type>,<aggregator-id>,<thread-id>,<username>,<database-name>,<correlation-id>,<query-hash>,<query>

Therefore, each of the 12 values can be mapped in the following way:

Example Value

Definition

237

log-entry-id

2016-08-25 11:29:09.875

timestamp

PDT

time-zone

localhost:3306

hostname:port

agg

node-type

1

aggregator-id

99999

thread-id

root

username

db

database-name

temp_1_14_16

correlation-id

9959767090931131257

query-hash

UPDATE t SET a = 0 WHERE a = 1

query

log-entry-id

The first value is log-entry-id, which represents the entry’s unique identifier within the scope of a single log file. Every new file always starts with an entry that has an ID of 0, incrementing by one afterwards. A log-entry-id is a 64-bit unsigned integer value.

The entry ID for any given operation is unique to the node, regardless of how a query is processed across the cluster. For example, imagine a three node cluster with a master aggregator and two leaves. When the master aggregator receives a distributed query and executes it on both leaf nodes, each leaf’s entry ID for the operation may be completely different.

If your database is configured to use the ALL-RESULTS logging mode, the results will appear as additional log entries with the same <log-entry-id> as the parent. For more information, see Schema for ALL-RESULTS Database Responses.

timestamp

The second value is timestamp, which specifies the node’s local system time when an operation was executed. The timestamp is printed in %Y-%m-%d %H:%M:%S.%f format.

time-zone

The third value is time-zone, which specifies the local time zone when an operation was executed. The time zone is printed as an initialism, such as GMT (Greenwich Mean Time).

hostname:port

The fourth value is hostname:port, which specifies the node’s hostname and port. The hostname is the same the database’s hostname global variable. Together, these values represent the identifier for a node.

node-type

The fifth value is node-type, which specifies whether the node is an aggregator (agg) or a leaf (leaf).

aggregator-id

The sixth value is aggregator-id, which specifies the ID of the aggregator that issued any given query. An aggregator-id is a 64-bit unsigned integer value.

An aggregator-id may contain an invalid value of 18446744073709551615 if the aggregator has not yet been given an ID.

In a cluster with multiple aggregators, the ID of the master aggregator never changes, but the ID of a child aggregator may change. For example, if a child aggregator temporarily loses network connectivity and then reconnects, it may acquire a new aggregator ID. The child aggregators that remained connected might also have their aggregator IDs altered during the event.

When correlating a master aggregator’s query with its resulting downstream leaf node queries, you can use the combination of an entry’s correlation-id and aggregator-id values as a unique identifier. For example, if a query is executed against a cluster with one master aggregator and one leaf node, the query will be logged in the master aggregator’s log file and the leaf node’s log file. To quickly correlate entries for the same query across large log files, use the combination of the correlation-id and aggregator-id as a key.

thread-id

The seventh value is thread-id, which specifies the ID of the node’s thread that is executing the query. A thread-id is a 32-bit integer value.

username

The eight value is username, which specifies the username of the authenticated user who is executing the query.

database-name

The ninth value is database-name, which specifies the name of the database that is being queried.

This value may appear as [unknown] in a log entry under a few conditions:

  • If the query is not related to a specific database (e.g. SHOW VARIABLES).

  • If the user does not specify USE <database-name> before executing their query. However, in this case the query does contain the database, such as SELECT * from <db-name>.t.

correlation-id

The tenth value is correlation-id, which specifies an identifier for the current query. Multiple successive log entries can reference the same correlation-id, especially when a query requires data from more than one leaf partition.

When correlating a master aggregator’s query with its resulting downstream leaf node queries, you can use the combination of an entry’s correlation-id and aggregator-id values as a unique identifier. For example, if a query is executed against a cluster with one master aggregator and one leaf node, the query will be logged in the master aggregator’s log file and the leaf node’s log file. To quickly correlate entries for the same query across large log files, use the combination of the correlation-id and aggregator-id as a key.

Under some circumstances, a correlation-id value may not appear in the log entry. This value is usually missing when a DDL operation is executed.

query-hash

The 11th value is query-hash, which specifies a unique hash of either the query signature or the query text. The query-hash is a 64-bit unsigned integer value.

The full query text will only be hashed if the node’s logging level is set to ALL-QUERIES-PLAINTEXT or ALL-QUERIES-PLAINTEXT-INCLUDING-PARSE-FAILS. Otherwise, just the query signature will be hashed.

query

The 12th value is the query, which specifies either the query signature or the query text.

The full query text will only be written to the log if the node’s logging level is set to ALL-QUERIES-PLAINTEXT or ALL-QUERIES-PLAINTEXT-INCLUDING-PARSE-FAILS. Otherwise, just query signature will be hashed.

The USING clause of queries is not captured as part of audit logging and can be viewed via SHOW PROCESSLIST where applicable.