Managing Disk Space Used by Transaction Logs
Calculating Disk Space Needed by Transaction Logs
Transaction logs require (snapshot_trigger_size
* snapshots_to_keep
) of space over time (approximately), or about 4 GB per partition using the default values of these variables. Hence the transaction logs can grow to 2GB before a snapshot is taken. It is recommended to have twice this amount of disk space available for long-term log storage.
Each database partition has two preallocated transaction log files. By default, these files are 256 MB each. The size of these preallocated transaction log files is defined by the variable log_file_size_partitions
. For example, if each leaf has 8 partitions, an empty database will require 2 log files * 256 MB * 8 partitions = 4 GB of space in transaction logs per leaf. It is important to remember that paired mode high availability is on by default. This means that each leaf will have one set of master database partitions and one set of replica database partitions, and therefore the minimum amount of space required for transaction logs is doubled: 2 log files * 256 MB * 16 partitions = 8 GB per leaf node. Extending this to a cluster with 10 databases, the transaction logs will consume at least 8 GB * 10 databases = 80 GB.
Moreover, when queries are submitted to the database transaction log files are filled. Once the preallocated logs become full, more will be created. The transaction logs will only be truncated once a snapshot is taken. Snapshots occur when a partition has enough transactions to meet the snapshot_trigger_size
, which is 2GB by default. See the section Deleting Transaction Logs below for more information on snapshots.
Controlling Individual Log File Size
Individual log file size can be controlled by adjusting the log_file_size_partitions
and log_file_size_ref_dbs
global engine variables. These variables are set to large values so that the preallocated files can handle sudden high throughput. Note that changing the log file size will not affect existing databases. Reducing the size of existing log files will have a random and severe performance impact on your workload when new log files need to be allocated.
These engine variables impact write throughput when there is a sustained burst of rowstore writes. Dips in throughput indicate the need to create a new log file. If your workload is mostly columnstore it's unlikely you would notice anything.
Reducing Disk Space Used by Transaction Logs
To reduce the total disk space occupied by all log files, tune the snapshot_trigger_size
and snapshots_to_keep
global engine variables while keeping the following in mind.
A lower
snapshot_trigger_size
value will result in more frequent snapshots, thereby reducing the accumulation of transaction logs. However, a higher setting will lead to fewer snapshots and result in a longer replay time for the snapshots taken.A large
snapshots_to_keep
value increases the number of snapshots available and therefore increases the amount of space needed to store snapshots.
Managing Transaction Log Growth
To manage the growing size of your transaction logs, you can safely expand the disk size to accommodate new log files.
Deleting Transaction Logs
Warning
Transaction log files should never be deleted manually as doing so may result in the loss or corruption of your data.
To delete transaction logs, run SNAPSHOT on the database in which the logs are growing. This command forces the cleanup of log files, thereby allowing you to control how much disk space exists between snapshots. Once you issue this command, all the logs will be committed to the snapshot and deleted. (note that the engine always keeps two log files created ahead).
You can run the du
Linux command to determine which databases have the largest log files.
In the following situations, SNAPSHOT is run implicitly, in accordance with the value of the snapshots_to_keep
variable. This causes deletion of the transaction logs:
Transaction logs reach the size defined by the
snapshot_trigger_size
variable and, as a result, a snapshot is triggered.The BACKUP command is run and, as a result, a snapshot is triggered.
Node maintenance operations occur (such as cluster upgrades) and, as a result, a snapshot is triggered.
The transaction logs may not get immediately cleaned out because the garbage collector (GC) has to do the final cleanup to complete the deletion process. Since the transaction logs are a part of the normal operation they will regrow so ideally there should be enough disk space to completely fill the TX logs and snapshots.