Remote Backup Script for Windows NTFS Partitions on LVM Volumes

Linux bash script to mount and back up / synchronize a Windows 10 partition inside an LVM volume to a remote backup server using rsync and SSH

Windows 10 partitions on LVM volume

I run the bash script below to back up my Windows NTFS partitions residing on LVM volumes to a remote backup server. It connects over SSH and authenticates at the remote side with public key authentication.

The script mounts an NTFS partition inside an LVM raw volume and performs a file-based backup using rsync. It is NOT suitable for system backups!

Please carefully read the “Requirements”, “How it Works”, and “Usage” sections before attempting to use it.

Disclaimer

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You are aware that using the software (script) may risk the loss of data, or may render your computer inoperable. Back up your computer!

Requirements

  • Both source (host) and target (backup server) must run Linux
  • Your host must support bash (which is usually the case)
  • The backup script only backs up from Logical Volumes (LV)
  • The Logical Volume must contain an NTFS file system
  • The remote backup server must run an SSH server
  • Public key authentication must be used to authenticate at the remote server, before using the script! See “How to Use SSH Public Key Authentication” or “Set up SSH public-key authentication to connect to a remote system” for more information.
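  • The script depends on the following packages: libnotify multipath-tools libvirt rsync. It also calls mount -t ntfs-3g, so the ntfs-3g driver must be installed. Package names vary between distributions; kpartx, for example, is packaged separately from multipath-tools on some of them.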
  • Sufficient storage space on the remote backup server to hold the backup.
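
If you have not set up public key authentication yet, the steps could look roughly like the sketch below. The key file location and the two helper function names are my assumptions for illustration; the user and server values are the placeholders used later in the script. Since the script runs rsync as your login user (not as root), the key pair should belong to that user.

```shell
#!/bin/bash
# Sketch only: all values below are placeholders, adjust them to your setup.
remote_user="remoteuser"          # same value as in the backup script
remote_server="192.168.1.100"     # same value as in the backup script
key_file="$HOME/.ssh/id_ed25519"  # assumed key location (hypothetical)

# Hypothetical helper: create a key pair (if needed) and install the public key.
setup_backup_key () {
    if [ ! -f "$key_file" ]; then
        ssh-keygen -t ed25519 -f "$key_file"
    fi
    # Asks for the remote password once, then appends the key to authorized_keys.
    ssh-copy-id -i "${key_file}.pub" "${remote_user}@${remote_server}"
}

# Hypothetical helper: verify that login works without a password prompt.
check_backup_login () {
    # BatchMode=yes makes ssh fail instead of asking for a password,
    # so a zero exit status means key authentication works.
    ssh -o BatchMode=yes "${remote_user}@${remote_server}" true
}
```

Run setup_backup_key once as your normal user, then check_backup_login; only if the latter succeeds is the backup script able to connect unattended.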

How it Works

This backup script is specifically written for backing up files in an NTFS file system that reside on LVM Logical Volumes. A Logical Volume (LV) can be conceived as a disk partition or disk. In the context of virtualization, the Logical Volume is typically created as an unformatted “raw volume”. During Windows installation, the Windows 10 installer will format the raw volume and create several partitions within that single volume (LV), including the main NTFS partition we know and cherish as “drive C:”.

Even data partitions that we create under Windows 10 usually contain 2 partitions, only one of which is usually visible to the user.

LVM volumes are considered a good storage choice for VMs because of their excellent performance and great flexibility.

The script requires “root” privileges but the volume is mounted under your user name and group. The same goes for the rsync backup utility – it is run as

sudo -u login_user -H rsync ...

Mounting a Windows NTFS Partition Inside an LVM Volume

First, the script checks whether the LV we specified is already mounted under Linux. If it is, the script aborts. This is a precaution: we don’t want another process or app to inadvertently force-unmount our source LV while we are in the middle of a backup.
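
The check can be sketched as follows. This is not the exact code from the script further down, which uses a grep pattern; mounted_devices is a hypothetical helper that takes the mounts table as an optional second argument so it can be tried out on a sample file.

```shell
#!/bin/bash
# Print every mounted device that is the volume itself or one of its
# kpartx sub-volumes (suffix such as "1", "p1" or "12").
# $1 = device path, $2 = mounts table (defaults to /proc/mounts)
mounted_devices () {
    local dev="$1" table="${2:-/proc/mounts}"
    awk -v dev="$dev" '
        index($1, dev) == 1 {
            suffix = substr($1, length(dev) + 1)
            # Accept an empty suffix or an optional "p" followed by digits.
            if (suffix ~ /^(p?[0-9]+)?$/) print $1
        }' "$table"
}

# Example use:
#   if [ -n "$(mounted_devices /dev/mapper/vm-win10)" ]; then
#       echo "Volume is already mounted. Abort."
#   fi
```

Comparing the suffix explicitly avoids a false alarm for an unrelated volume whose name merely starts with the same characters.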

The “NTFS partitions” that Windows has created inside the LV cannot be mounted directly under Linux. We need a trick.

kpartx is a device mapper able to read the partition layout inside the LV. It automagically creates sub-volumes of the LV equal to the number of partitions inside the LV. Here is a screenshot before mapping my win10 Windows VM volume:

My Windows 10 system disk as LVM volume vmvg-win10

And here the result of mapping the LV with kpartx -av /dev/…:

kpartx maps the “hidden” partitions inside win10 as win10p1 etc.

In the past I had 4 sub-volumes and the last one would hold “drive C:”. But surprise – the latest Windows release 2004 added another partition. Aside from “drive C:” there is an EFI boot partition (FAT32 format), the rest are recovery partitions.

Strangely enough, my data drives (D:, E:, etc.) have only numbers denoting the sub-volumes, without the “p” like in “vmvg-win10p1”.

After mapping the “sub-volumes”, we need to determine which of them to mount. The easiest way is by size – just mount the largest. I use

lsblk -lo NAME,SIZE

to get the size, then sort them by size and grab the biggest sub-volume. However, if you created and formatted the LV to NTFS while on Linux, you will not get any sub-volumes. In that case you could mount the LV without the need for the kpartx mapper.
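
As a rough sketch of the "pick the largest sub-volume" step: the hypothetical helper largest_subvolume below reads NAME/SIZE pairs from standard input and falls back to the volume itself when there are no sub-volumes. Unlike the script, it assumes sizes in bytes (lsblk option -b), so a plain numeric comparison is enough.

```shell
#!/bin/bash
# Read "NAME SIZE" lines (sizes in bytes) from stdin and print the name of
# the largest sub-volume of $1; print $1 itself if no sub-volume is found.
largest_subvolume () {
    local vol="$1" best
    best=$(awk -v vol="$vol" '
        index($1, vol) == 1 {
            suffix = substr($1, length(vol) + 1)
            # Sub-volumes end in digits, optionally preceded by "p".
            if (suffix ~ /^p?[0-9]+$/ && $2 + 0 > max) { max = $2 + 0; name = $1 }
        }
        END { if (name != "") print name }')
    printf '%s\n' "${best:-$vol}"
}

# Intended use on a real system (after kpartx -av has created the mappings):
#   lsblk -lbno NAME,SIZE | largest_subvolume "vm-win10"
```

With the sample volume from above this would print the name of the sub-volume holding “drive C:”, which is then mounted as /dev/mapper/NAME.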

The backup script should handle all three scenarios and mount the correct (sub-)volume using your login name and privileges:

mount -t ntfs-3g -o ro,nls=utf8,umask=000,dmask=027,fmask=137,uid=$(id -u $user),gid=$(id -g $user),windows_names,big_writes "$mount_dev" /mnt/vm-volume
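
To see what the dmask and fmask values in the mount command translate to, here is a small calculation. It relies on my understanding that ntfs-3g treats these options as masks of permission bits that are NOT granted, starting from full access (777).

```shell
#!/bin/bash
# Permission bits = 0777 with the mask bits cleared.
dmask=027   # directory mask from the mount command above
fmask=137   # file mask from the mount command above

dir_mode=$(printf '%o' $(( 0777 & ~0$dmask )))
file_mode=$(printf '%o' $(( 0777 & ~0$fmask )))

echo "directories: $dir_mode"   # 750 = rwxr-x---
echo "files:       $file_mode"  # 640 = rw-r-----
```

In other words, the mounting user gets full access to directories and read/write access to files, the group may read, and everybody else is locked out. Because the volume is mounted read-only (ro), nothing can be written anyway.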

Remote Sync using rsync

The script uses rsync, a powerful and versatile remote syncing utility, to copy files from the local computer to a remote backup computer. It uses SSH to connect to and authenticate with the remote backup server. The user must first set up public key authentication with the remote computer (see “Requirements” above for instructions). This is a one-time process.

rsync is invoked as follows (see “man rsync” for explanations):

rsync -avuh -zz ...

The backup script is written for unattended use, for example to back up a drive once a day/week/month using a cron job, or for use via an app launcher. There are no command line options or GUI to configure.

You need to edit the script and set the options before running it for the first time (see “Usage” section below). By default, the script runs in “dry run mode”. In that mode, it does not perform any backup or syncing. However, it writes a log file that shows what it would have done in a real run.

I strongly recommend that you test each configuration change in “dry run mode” BEFORE you actually execute the backup or synchronization!

As I’ve just mentioned, the script provides two different operational modes: backup and synchronization. The mode is set by the “backup” option inside the script (see highlighted section inside the script below).

Backup Mode

When in backup mode, the script copies all newer or modified directories and files of the source LVM volume to the remote destination (except the directories and/or files specified under “exclude” – see further down).

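The rsync utility compares the modification date/time and the file size to determine if a file has changed and needs to be updated on the remote backup server. Because the script passes the -u (--update) option, files that are newer on the backup server are left untouched.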

Directories and files that exist in the source volume, but not on the remote target, will be copied (unless they are excluded).

Important: Files that have been deleted on the source LV volume will NOT be deleted on the backup server! In “backup” mode, no files will be deleted on the remote backup server.

Synchronization Mode

Synchronization mode synchronizes the files between the source and the backup server. This means that new and changed files on the source volume will be copied to / updated on the remote backup server.

When using synchronization mode (as opposed to backup mode), a file that has been deleted on the source volume will also be deleted on the remote backup server.

Warning: If you inadvertently deleted files in your source volume and run the backup script in “synchronization mode”, you will be left without a backup for these files!

Use “synchronization mode” only for storage locations such as a work drive where files are constantly changed, moved or deleted, and you do not want to keep the deleted files in the backup.

Example: I copy the photos from my camera to a special “workdrive” LV on an NVMe drive. These files are then culled and processed, sometimes at a much later time. Once processed, I move them to my “photos” LV, which spans one or more HDDs. If I ran the “workdrive” LV in backup mode, the backup would rapidly grow into a huge disk hog. Since the “workdrive” is not a permanent storage location, synchronization is good enough to have a backup if things go bad.
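
The difference between the two modes, and between a dry run and a real run, boils down to two rsync options. The sketch below mirrors the option handling in the script but uses a bash array; rsync_mode_args is a hypothetical helper name.

```shell
#!/bin/bash
# Build the rsync option list for the chosen mode.
# $1 = dryrun ("yes"/"no"), $2 = backup ("yes" = backup, "no" = synchronize)
rsync_mode_args () {
    local dryrun="$1" backup="$2"
    local -a args=(-avuh -zz)
    # -n (--dry-run) only reports what would be transferred.
    case "$dryrun" in yes|YES|Yes) args+=(-n) ;; esac
    # --delete removes files on the receiver that no longer exist on the source.
    case "$backup" in no|NO|No) args+=(--delete) ;; esac
    printf '%s\n' "${args[*]}"
}

rsync_mode_args yes no    # dry run of a synchronization
rsync_mode_args no yes    # real backup
```

Only the combination dryrun="no" and backup="no" actually deletes anything on the backup server.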

Source Path Option

By default, the script mounts the LV at its root directory (‘/’), for example “C:\” for the Windows system disk. rsync operates recursively starting at the root of that disk to ensure that all files on that partition (LV) are backed up or synced.

If you do not want to backup or sync the entire volume, you can specify a folder that you wish to use:

source_path='/' # source path to backup

The default is ‘/’ to backup/sync the entire volume. If you wish to backup/sync a specific folder, specify it like this:

source_path='/Documents and Settings/' # source path to backup

This will back up/sync “C:\Documents and Settings\” with all its files and sub-folders.
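
The script simply appends source_path to the mount point to form the rsync source. The hypothetical helper below illustrates that and, as an extra safety net that is not part of the script, adds a missing leading or trailing slash. With a trailing slash rsync copies the contents of the folder rather than the folder itself, which matches the examples above.

```shell
#!/bin/bash
# Compose the rsync source from mount point and source path.
# $1 = mount point, $2 = source path inside the volume
rsync_source () {
    local mount_path="$1" source_path="$2"
    [[ "$source_path" == /* ]] || source_path="/$source_path"
    [[ "$source_path" == */ ]] || source_path="$source_path/"
    printf '%s\n' "${mount_path}${source_path}"
}

rsync_source "/mnt/vm-win10" "/"
rsync_source "/mnt/vm-win10" "Documents and Settings"
```

The second call prints /mnt/vm-win10/Documents and Settings/, so the path must always be quoted when it is passed on to rsync.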

Exclude Option

Sometimes we want to exclude directories or files. This is where the “exclude” option comes in. Here are some examples:

exclude='' # nothing to exclude
exclude='$RECYCLE.BIN/' # exclude the recycle bin - note the single quotes!
exclude='Pictures/' # exclude the Pictures folder
exclude='Lightroom/*Previews.lrdata/' # exclude all *Previews.lrdata folders
                                      # inside the Lightroom folder
exclude='Lightroom/*.lrcat' # exclude all *.lrcat files in Lightroom folder

Note that the exclude= option can only contain one exclude rule. But you can expand it as follows:

exclude='rule1 --exclude rule2 --exclude rule3'

For example:

exclude='$RECYCLE.BIN/ --exclude Pictures/ --exclude /Lightroom/*.lrcat'

Important: “If the pattern starts with a / then it is anchored to a particular spot in the hierarchy of files, otherwise it is matched against the end of the path name. This is similar to a leading ^ in regular expressions.” See the rsync man page.
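
The string trick above works as long as no pattern contains a space. As an alternative sketch (not what the script below does), the rules could be kept in a bash array, one pattern per element. The last entry is a made-up example to show a pattern with spaces.

```shell
#!/bin/bash
# One pattern per array element; single quotes keep $RECYCLE.BIN literal.
exclude_patterns=(
    '$RECYCLE.BIN/'
    'Pictures/'
    '/Lightroom/*.lrcat'
    'System Volume Information/'   # hypothetical entry containing spaces
)

# Turn the patterns into "--exclude=PATTERN" arguments, one per element.
exclude_args=()
for pattern in "${exclude_patterns[@]}"; do
    exclude_args+=("--exclude=${pattern}")
done

printf '%s\n' "${exclude_args[@]}"
```

To use this, the rsync call in the script would have to pass "${exclude_args[@]}" (with the quotes) instead of the ${exclude} string.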

Various

While the script runs, notification messages will appear on your screen. It will also tell you if an error occurred (red alert). These notification windows are useful when you create a launcher for the script, as you are not using a terminal window. The notifications should also keep you posted when running the script automatically using a cron job.

Notifications are limited to mounting/unmounting the volume and errors. There is no progress indicator, except for a final message that the backup or sync succeeded or failed.

The script creates a logfile in your home directory with the name of the VM volume you are backing up, for example: /home/username/vm_volume.log. You should run the script in dry-run mode first and carefully inspect the logfile.

Note that the logfile can rapidly grow large as each run of the script will append to the logfile. Depending on the number of files to process, expect to see 30-100MB or more. You should backup and delete the logfiles from time to time.
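
If you prefer not to clean up by hand, a simple size-based rotation could look like the sketch below. rotate_log is a hypothetical helper, the 50 MB default is an arbitrary choice, and only one previous generation of the log is kept.

```shell
#!/bin/bash
# Rotate a logfile once it exceeds a size limit.
# $1 = logfile, $2 = limit in bytes (default 50 MB, an arbitrary choice)
rotate_log () {
    local logfile="$1" limit="${2:-52428800}" size
    [ -f "$logfile" ] || return 0
    size=$(( $(wc -c < "$logfile") ))
    if [ "$size" -gt "$limit" ]; then
        # Keep exactly one previous generation: <logfile>.1
        mv -f "$logfile" "${logfile}.1"
        : > "$logfile"
    fi
}

# Example use before a backup run:
#   rotate_log "/home/username/vm_volume.log"
```

Note that this overwrites an older <logfile>.1, so copy it elsewhere first if you want to keep a longer history.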

Usage

Here is an abbreviated usage section, summarizing the steps to use the vm-backup.sh script. For detailed information, see “How it Works” above.

  1. If you haven’t done so already, create a key pair and copy the public key to the remote backup server. (See “Requirements” for some links to instructions.)
  2. Copy the bash script below and paste it into an empty text file. Save the file as “vm-backup.sh” or whatever name you like, with the “.sh” extension.
  3. Inspect the file: the first line reads #!/bin/bash, the last line #EOF.
  4. Edit the following VARIABLES inside the script file:
    • (vm_name="win10"
      Enter the name of the VM that the volume to back up is associated with, or leave as is. This is for future use.)
    • vm_volume="vm-win10"
      Enter the name of the Windows LVM volume you wish to back up, for example “win10”, “data”, “photos” etc. Use “ls /dev/mapper/” to get a list of volumes on your PC. The volume must contain an NTFS file system.
    • source_path='/'
      Leave as is ('/') to make a full backup, or add the path to the directory you wish to back up/sync. Always precede the name with a '/'.
    • remote_server="192.168.1.100"
      Enter the IP address of the remote backup server where you wish to create/store the backup.
    • remote_user="remoteuser"
      Enter the username to log in to the remote backup server via SSH and public key authentication.
    • backup_path="/media/backup/my-win10-backup/"
      Specify the location of the backup on the remote backup server. Be careful to include the trailing “/”; see man rsync for explanations.
    • exclude='$RECYCLE.BIN/'
      If you want to exclude a path/folder or a file from the backup, put it here. If not, leave empty. See “Exclude Option” above for more details. Simple wildcards such as “*” and “?” are allowed.
    • dryrun="yes"
      If “yes”, this option tells the script to do a dry run without performing an actual backup/synchronization. It will create or add to the logfile so you can check if everything is working as expected. Always do a dry run BEFORE an actual backup and check the logfile!
      You must change this field to “no” to actually back up or synchronize the data.
    • backup="yes"
      If “yes”, the script performs a backup (default). If “no”, the script synchronizes the remote backup server with the local source volume and deletes any file on the backup server that isn’t on the local volume. See “Synchronization Mode” above.
  5. After editing the options, save the script (preferably under a different, meaningful name).
  6. Make the script executable.
  7. Run the script with root privileges “sudo ./name-of-script.sh”.

Bash Script

Download the script below (cut and paste into an empty text file). Edit the variables at the beginning, then save the bash script to “filename.sh” (name of choice ending with .sh) and make it executable.

Note: Create a separate script file for each backup/sync task. Don’t forget to edit the VARIABLES.

#!/bin/bash
#
# vm-backup.sh - backup Windows NTFS partitions on LVM to remote server
# using rsync and ssh.
#
# Copyright (C) 2020 Heiko Sieger
# https://heiko-sieger.info
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <https://www.gnu.org/licenses/>.
# USAGE:
#
# EDIT THE VARIABLES BELOW:
vm_name="win10" # name of VM as listed by sudo virsh list --name
vm_volume="vm-win10" # name of the VM volume to backup as found in "/dev/mapper/"
source_path='/' # source path to backup - default '/' for entire volume
remote_server="192.168.1.100" # IP of remote backup server
remote_user="remoteuser" # user name to use with remote backup server
backup_path="/media/backup/my-win10-backup/" # backup path on remote backup server
exclude='$RECYCLE.BIN/' # exclude option of rsync - can be empty
dryrun="yes" # run dryrun to check - use "no" or "" (empty string) to disable dryrun
backup="yes" # "yes" to backup data, "no" to synchronize (using rsync --delete)
# Then run the script using:
# sudo vm-backup.sh

# Do NOT modify:
vm_path="/dev/mapper/${vm_volume}"
user=$(logname)
logfile="/home/${user}/${vm_volume}.log"
mount_path="/mnt/${vm_volume}"
rc=0
 
mount_vol ()
{
    rw_mode="$1"
    case $rw_mode in
        rw)
            notify-send "Mounting $vm_volume" "Mounting volume $vm_volume in read-write mode." -i dialog-information
            ;;
        ro)
            notify-send "Mounting $vm_volume" "Mounting volume $vm_volume in read-only mode." -i dialog-information
            ;;
        *)
            exit 105
            ;;
    esac
    sleep 1
    mount -t ntfs-3g -o "$rw_mode",nls=utf8,umask=000,dmask=027,fmask=137,uid=$(id -u $user),gid=$(id -g $user),windows_names,big_writes "$mount_dev" "$mount_path"
    rc=$?
}
 
 
if [ $UID -ne 0 ]; then
    notify-send "User error" "Script must be run as 'root'. Abort." -i dialog-error
    exit 101
fi
 
case $dryrun in
    yes|YES|Yes)
        options="-avuhn";;
    no|NO|No)
        options="-avuh";;
    *)
        options="-avuh";;
esac
 
case $backup in
    yes|YES|Yes)
        delete=""
        backup_type="Backup";;
    no|NO|No)
        delete=" --delete"
        backup_type="Synchronization";;
    *)
        delete=""
        backup_type="Backup";;
esac
     
if [ ! -z "$exclude" ]; then exclude=" --exclude ${exclude}"; fi
 
# Check if volume to backup is already mounted.
# $mount_dev returns the mounted /dev/mapper device, or an empty string
mount_dev="$(grep -E "${vm_path}[1-9p]?[1-9]?" /proc/mounts | awk '{ print $1 }')"
 
if [ -b "$mount_dev" ]; then
    notify-send "Error" "Volume $vm_volume is already mounted and perhaps in use. Abort." -i dialog-error
    exit 102
else
    # Create a mapper device for each partition in the vm_volume
    # Assign the largest (sub)partition to $dev 
    kpartx -av "$vm_path"
    dev="$(lsblk -lo NAME,SIZE | grep -E "${vm_volume}[1-9p]?[1-9]" | sort -h -k 2 | awk '{ print $(NF-1) }')"
    dev="$(echo $dev | awk '{ print $NF }')"
     
    if [ -z "${dev}" ]; then dev="$vm_volume"; fi
 
    mount_dev="/dev/mapper/$dev"
    mkdir -p "$mount_path"
 
    # Check if VM is shutoff - this is for later use (vm_restore).
    # For backup, mount "ro" so we don't care if the VM is running.
    if virsh list --name --state-shutoff | grep -xq $vm_name; then
        mount_vol ro
    else
        mount_vol ro
    fi
     
    if [ $rc -ne 0 ] ; then
        notify-send "Mount error" "Could not mount volume $vm_volume !!! Backup aborted." -i dialog-error
        # clean up
        kpartx -d "$vm_path"
        rmdir "$mount_path"
        exit 103
    else
        # Start the backup and write a $logfile
        echo "" >> $logfile
        echo "============================================" >> $logfile
        echo "VM path: ${vm_path}" >> $logfile
        echo "Source path: ${source_path}" >> $logfile
        echo "Remote user@server: ${remote_user}@${remote_server}" >> $logfile
        echo "Remote backup path: ${backup_path}" >> $logfile
        echo "Exclude: $exclude" >> $logfile
        echo "$(date) $backup_type start" >> $logfile
        # Run rsync as $user
        sudo -u "$user" -H rsync $options -zz${exclude}${delete} -e ssh "${mount_path}${source_path}" "${remote_user}@${remote_server}:${backup_path}" >> "$logfile" 2>&1
        rc=$?
        # Notify user about success or failure
        msg_type="dialog-error"
        heading="$backup_type failed"
        msg="Unknown error"
        case $rc in
            0) msg_type="dialog-information"; heading="$backup_type succeeded"
                msg="$backup_type of ${vm_volume} finished successfully";;
            1) msg="Syntax or usage error";;
            2) msg="Protocol incompatibility";;
            3) msg="Errors selecting input/output files, dirs";;
            4) msg="Requested  action  not supported";;
            5) msg="Error starting client-server protocol";;
            6) msg="Daemon unable to append to log-file";;
            10) msg="Error in socket I/O";;
            11) msg="Error in file I/O";;
            12) msg="Error in rsync protocol data stream";;
            13) msg="Errors with program diagnostics";;
            14) msg="Error in IPC code";;
            20) msg="Received SIGUSR1 or SIGINT";;
            21) msg="Some error returned by waitpid()";;
            22) msg="Error allocating core memory buffers";;
            23) msg="Partial transfer due to error";;
            24) msg="Partial transfer due to vanished source files";;
            25) msg="The --max-delete limit stopped deletions";;
            30) msg="Timeout in data send/receive";;
            35) msg="Timeout waiting for daemon connection";;
        esac
         
        # Log the backup success or failure (with rsync error code)
        if [ $rc -ne 0 ] ; then
            echo "$(date) $backup_type failed with rsync error $rc" >> $logfile
            echo "$msg" >> $logfile
        else
            echo "$(date) $backup_type finished" >> $logfile
        fi
 
        notify-send "${heading}" "${msg} - see log $logfile for more information" -i $msg_type
        echo "${heading}: ${msg} - see log $logfile for more information"
                 
        # Unmount and delete mapper device
        notify-send "Info" "Unmounting volume $vm_volume." -i dialog-information
        rsync_rc=$rc    # keep the rsync result; $rc is reused for the cleanup below
        umount "$mount_path"
        sleep 2
        kpartx -d "$vm_path"
        rc=$?
        if [ $rc -ne 0 ] ; then
            notify-send "Error" "Could not unmount volume $vm_volume !!!" -i dialog-error
            exit 104
        else
            rmdir "$mount_path"
            # Exit with the rsync status so a failed backup is visible to the caller
            exit $rsync_rc
        fi
    fi
fi
 
#EOF

Author: Heiko Sieger

The day has 24 hours. If that isn't enough, I also use the night.
