Managing Disk Space Used by Transaction Logs

Calculating Disk Space Needed by Transaction Logs

Transaction logs will require (snapshot_trigger_size * snapshots_to_keep) of space over time (approximately), or about 4 GB per partition using the default values of these variables. It is recommended to have twice this amount of disk space available for long-term log storage.

Transaction logs can quickly grow in size and thereby consume a lot of disk space. For example, in an empty database with default settings, each partition requires 2 GB * 2 = 4 GB of disk space for preallocated log space. For a leaf node with 8 partitions, transaction logs can use up to 4 GB * 8 partitions per leaf node = 32 GB. If the cluster has High Availability enabled, this value will double, resulting in 64 GB. Extending that figure to a cluster with 10 databases, transaction logs could use up to 64 GB * 10 databases = 640 GB. Therefore, a highly available cluster with 10 databases and 8 partitions per leaf node can easily use 640 GB just for transaction logs.

Moreover, log files created between snapshots are split into multiple pre-sized files, resulting in several small log files. In this case, each partition would generate up to 2048 MB/256 MB * 2 snapshots = 16 log files before a snapshot is taken.

Controlling Individual Log File Size

Individual log file size can be controlled by adjusting the log_file_size_partitions and log_file_size_ref_dbs global engine variables. Note that changing the log file size will not affect existing databases. However, reducing the size of existing log files or allocating new ones will have random and severe performance impacts on your workload when new log files need to be allocated.

These engine variables impact write throughput when there is a sustained burst of rowstore writes. Dips in throughput indicate the need to create new log file . If your workload is mostly columnstore it's unlikely you would notice anything.

Reducing Disk Space Used by Transaction Logs

To reduce the total disk space occupied by all log files, tune the snapshot_trigger_size and snapshots_to_keep global engine variables while keeping the following in mind.

  • A large snapshot-trigger-size value decreases the frequency at which snapshots are taken but increases the time taken to replay the snapshot.

  • A large snapshots-to-keep value increases the number of snapshots available and therefore increases the amount of space needed to store snapshots and transaction logs.

Managing Transaction Log Growth

To manage the growing size of your transaction logs, you can safely expand the disk size to accommodate new log files.

Deleting Transaction Logs

Warning

Transaction logs should never be deleted manually as doing so may result in the loss or corruption of your data.

To delete transaction logs, run SNAPSHOT on the database in which the logs are growing. The SNAPSHOT command forces the cleanup of log files, thereby allowing you to manually control how much disk space exists between snapshots. Once you issue the SNAPSHOT command, all the logs will be committed to the snapshot and deleted. (note that the engine always keeps two logs files created ahead).

You can run the du Linux command to determine which databases have the largest log files.

In the following situations, SNAPSHOT is run implicitly, in accordance with the value of the snapshots_to_keep variable. This causes deletion of the transaction logs:

  • Transaction logs reach the size defined by the snapshot_trigger_size variable and, as a result, a snapshot is triggered.

  • The BACKUP command is run and, as a result, a snapshot is triggered.

  • Node maintenance operations occur (such as cluster upgrades) and, as a result, a snapshot is triggered.