Verifying Full Backup Files

This section applies to full backups only. It does not apply to incremental backups.

Each backup has a checksum associated with the files that comprise it. The checksum is a long string of hex characters in which each group of eight characters represents one subsection of the backup. To create this checksum, SingleStore uses CRC32C, a common variant of CRC32, to process the backup files.

Our implementation of CRC32C has been verified with the commonly used package crcmod, but any library that implements or uses CRC32C can be utilized.

To illustrate how a checksum is composed of the discrete parts of a database backup, consider the following example, where all backup files for the cluster are stored in an NFS share named backup_database:

backup_database/
    BACKUP_COMPLETE
    db.backup
    db_0.backup
    db_0.backup_columns0.tar
    db_0.backup_columns1.tar
    db_1.backup
    db_2.backup
    db_2.backup_columns0.tar
    db_2.backup_columns1.tar
    db_2.backup_columns2.tar
    db_3.backup
    db_3.backup_columns0.tar

This is a backup of a small, mixed columnstore/rowstore database with four partitions, written to an NFS drive. The checksum for this backup is 91b3d3353709baa1eff3ba5cfba6bac939b318f41652eac4 and is found in the BACKUP_COMPLETE sentinel file that was created on the master aggregator node in its relative /data/db_backup/ SingleStore directory.
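As a minimal sketch of reading the checksum programmatically: the verification script in this section parses BACKUP_COMPLETE as JSON and reads the Checksum, Database_Name, and Num_Partitions keys, so the example does the same. The helper name read_sentinel and the sample values are hypothetical, and the sentinel file may contain other fields that are not shown here.

```python
import json
import os
import tempfile


def read_sentinel(backup_directory):
    """Return (checksum, database name, partition count) from BACKUP_COMPLETE."""
    path = os.path.join(backup_directory, "BACKUP_COMPLETE")
    with open(path, "r") as f:
        sentinel = json.load(f)
    return (
        sentinel["Checksum"],
        sentinel["Database_Name"],
        int(sentinel["Num_Partitions"]),
    )


# Demonstration with a throwaway directory and made-up values (assumptions).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "BACKUP_COMPLETE"), "w") as f:
    json.dump(
        {"Checksum": "aaaaaaaabbbbbbbb", "Database_Name": "db", "Num_Partitions": "1"},
        f,
    )

checksum, db_name, num_partitions = read_sentinel(demo_dir)
print(checksum, db_name, num_partitions)
```

A KeyError from read_sentinel would suggest that the sentinel file was written by a backup version that does not record these fields.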

The first 8 characters are the CRC32C of the reference database, db.backup, on the master aggregator.

Next, each partition, in order, has either 8 or 16 characters, depending on whether it has columnstore segment tar files. A partition without segment tar files has 8 characters, which are for the partition's snapshot (for example, db_1.backup). A partition with segment tar files has 16 characters: the first 8 are for the partition's snapshot (for example, db_0.backup), and the second 8 are for the segment tar files.
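The layout rule can be sketched as a small helper that pairs each 8-character group with the file or files it covers. This is an illustration only: the function name label_checksum, its parameters, and the sample checksum are hypothetical, and the caller is assumed to already know which partitions have segment tar files.

```python
def label_checksum(checksum, db_name, partitions_with_tar):
    """Pair each 8-character group of `checksum` with the file(s) it covers.

    `partitions_with_tar` holds one boolean per partition, in order, that is
    True when the partition has columnstore segment tar files.
    """
    groups = [checksum[i:i + 8] for i in range(0, len(checksum), 8)]

    # The reference snapshot always comes first.
    labels = ["%s.backup" % db_name]
    for index, has_tar in enumerate(partitions_with_tar):
        # Every partition contributes a group for its snapshot.
        labels.append("%s_%d.backup" % (db_name, index))
        # Partitions with tar files contribute a second group.
        if has_tar:
            labels.append("%s_%d.backup_columns*.tar" % (db_name, index))

    if len(checksum) % 8 != 0 or len(groups) != len(labels):
        raise ValueError("checksum length does not match the backup layout")
    return list(zip(labels, groups))


# Made-up checksum for a two-partition backup where only partition 0 has tar files.
example = label_checksum("aaaaaaaabbbbbbbbccccccccdddddddd", "db", [True, False])
for label, group in example:
    print(label, group)
```

A mismatch between the number of groups and the expected layout raises a ValueError, which is a quick sanity check before running a full verification.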

If segment tar files have been created, the partition's segment tar files are concatenated in order, and then the CRC32C checksum is taken. This is equivalent to calculating the CRC32C of db_0.backup_columns0.tar and db_0.backup_columns1.tar concatenated together, or to feeding the first tar file, then the second, and so on into the same CRC32C calculation without finalizing the checksum between files.
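The equivalence can be checked with a small, dependency-free sketch. The crc32c function here is a slow bitwise reference implementation written for illustration, not SingleStore's implementation, and the byte strings are made-up stand-ins for tar file contents. Passing a previous result back in as crc continues the calculation, which has the same effect as not finalizing between files.

```python
def crc32c(data, crc=0):
    """Bitwise CRC32C of `data`; pass a previous result as `crc` to continue it."""
    crc ^= 0xFFFFFFFF  # undo the final XOR (or apply the initial value)
    for byte in bytearray(data):
        crc ^= byte
        for _ in range(8):
            # 0x82F63B78 is the bit-reversed CRC32C (Castagnoli) polynomial.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


# Standard CRC32C check value for the ASCII string "123456789".
assert crc32c(b"123456789") == 0xE3069283

# Stand-ins for the contents of two segment tar files (assumptions).
tar0 = b"contents of db_0.backup_columns0.tar"
tar1 = b"contents of db_0.backup_columns1.tar"

one_pass = crc32c(tar0 + tar1)        # checksum of the concatenation
running = crc32c(tar1, crc32c(tar0))  # feed the files one after another

assert one_pass == running
print("%08x" % one_pass)
```

For real backups, a library such as crcmod is a better fit, because a bitwise loop in pure Python is far too slow for large files.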

Note

Even if a database has a columnstore table, there may not be a corresponding tar file for a partition, because data may be cached in the hidden rowstore table for that columnstore table. In this case, the data would be contained inside the rowstore snapshot.

The following Python script can be used to verify the backup checksum. Copy and paste this script into a new file named verify_backup.py. Usage instructions are covered in the code comments for the script. After running the script, run echo $?. A result of 0 means the verification was successful, while a nonzero result means it was not (the script exits with -1, which most shells report as 255). The script is well commented and prints debugging output if validation fails.

Note

Script completion time is dependent on the size of the backup and may take hours to complete.

# verify_backup.py
#
# Given a directory, this script verifies that the CRC recorded in the
# BACKUP_COMPLETE sentinel file matches the CRC calculated from the backup files.
#
# REQUIRES: crcmod to be installed: https://pypi.org/project/crcmod/
#
# USAGE: python verify_backup.py /absolute/path/to/backup
#
# NOTE: This script needs read privileges on all files being verified.
#
import crcmod
import errno
import json
import sys

# Read files in fixed-size chunks so that large backups are not loaded into
# memory all at once.
CHUNK_SIZE = 1024 * 1024


# updateCrcFromFile:
# Feeds the contents of the file at path into crc without finalizing it.
#
# Param crc: crcmod.Crc object to update.
# Param path: path to the file to read.
#
def updateCrcFromFile(crc, path):
    # Open in binary mode so the bytes are checksummed exactly as stored.
    with open(path, "rb") as f:
        while True:
            buf = f.read(CHUNK_SIZE)
            if not buf:
                break
            crc.update(buf)


# VerifyBackup:
# Verifies the CRC located in the backup sentinel file (BACKUP_COMPLETE)
# matches the calculated CRC of files in backupDirectory.
#
# Param backupDirectory: absolute path to directory where backup exists.
# Return: 0 on success, -1 on failure.
#
def verifyBackup(backupDirectory):
    # Strip off trailing '/' if exists.
    #
    if backupDirectory.endswith('/'):
        backupDirectory = backupDirectory[:-1]

    with open("%s/BACKUP_COMPLETE" % backupDirectory, "r") as f:
        backupDictionary = json.load(f)

    try:
        finalCrc = backupDictionary["Checksum"]
        dbName = backupDictionary["Database_Name"]
        numPartitions = int(backupDictionary["Num_Partitions"])
    except KeyError as e:
        print(e)
        print("Sentinel File 'BACKUP_COMPLETE' is from unsupported version of backup.")
        return -1

    # These parameters are from the crc32c specification. crcmod also has
    # crc32c predefined, so either can be used.
    #
    crc = crcmod.Crc(0x11EDC6F41, rev=True, initCrc=0x00000000, xorOut=0xFFFFFFFF)
    crclist = ""

    # Process the reference snapshot.
    #
    updateCrcFromFile(crc, "%s/%s.backup" % (backupDirectory, dbName))
    crclist += crc.hexdigest()

    # Process each partition.
    #
    for i in range(numPartitions):
        crc = crc.new()
        # Process Partition snapshot.
        # Each Partition MUST have a snapshot.
        #
        updateCrcFromFile(crc, "%s/%s_%d.backup" % (backupDirectory, dbName, i))
        crclist += crc.hexdigest()

        # Snapshots and Columns are checksummed with separate CRC's.
        #
        crc = crc.new()
        # Set once at least one columnar blob is found, so that its crc is
        # appended to the list.
        columnCrc = False
        j = 0
        # Process all tarballed columnstore files.
        #
        # NOTE: Even if a database has a columnstore, this does not imply
        # each partition has columnar blobs. The data might exist in the
        # rowstore snapshot or might be skewed such that all rows exist
        # in other partitions.
        #
        while True:
            tarPath = "%s/%s_%d.backup_columns%d.tar" % (backupDirectory, dbName, i, j)
            try:
                updateCrcFromFile(crc, tarPath)
            except IOError as e:
                if e.errno == errno.ENOENT:
                    # No more tar files for this partition.
                    break
                # Any other I/O error (for example, a permissions problem)
                # is fatal.
                raise
            j += 1
            columnCrc = True
        if columnCrc:
            crclist += crc.hexdigest()

    # CRC's will be of different case, make both uppercase.
    #
    if crclist != finalCrc.upper():
        print("Crc calculated from directory:" + crclist)
        print("Crc in backup file           :" + finalCrc)
        print("Crcs do not match!")
        return -1
    return 0


if __name__ == '__main__':
    # Check the arguments before using them.
    if len(sys.argv) != 2:
        print("Incorrect usage: please include just the directory where the backup is located.")
        sys.exit(-1)
    sys.exit(verifyBackup(sys.argv[1]))

Last modified: April 3, 2023
