Backup solution
My previous backup strategy was to copy most of my home directory whenever someone I knew had computer problems. This clearly isn't a good solution, but doing anything better by hand is too much effort, and I'm too lazy.
Requirements
Hence, I decided to implement a system that would be automatic (without regular input from me), which means backups should actually happen! My criteria look something like:
- Scriptable (so I don't need to run it manually)
- Incremental (so it doesn't need to re-transfer all data each time)
- Off-site (as one of the main failure modes I'm concerned about is an earthquake destroying my house...)
- Regarding off-site, preferably not needing shell access on the remote end
- Encrypted (see off-site)
- Open source
The software I settled on was restic. There are lots of other options out there (e.g. bup, bacula, borg, duplicity, rsync, rclone, rdiff-backup), but I liked restic's support for Amazon S3 (for which I already had an account; other cloud providers are available) and its relative ease of configuration. However, I didn't try any of the other options, and I'm sure most of them are good too. See https://wiki.archlinux.org/index.php/Synchronization_and_backup_programs for a good collection.
I want this to just run in the background, so I am using systemd timers to run things automatically. My plan was to run a backup every day, so a daily timer seemed like a good idea. However, I found that the task would often fail (due to lack of a network connection) and so miss the daily backup. Hence I have moved to a half-hourly timer whose script checks when a backup was last run, and only backs up if the last one is more than a day old. This should ensure that backups are run sufficiently often.
Scripts
Here are the contents of a few files
~/scripts/restic-env-S3.sh
#!/bin/sh
export RESTIC_REPOSITORY="s3:https://s3.amazonaws.com/MY_BUCKET_NAME"
export AWS_ACCESS_KEY_ID="MY_AWS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="MY_AWS_ACCESS_KEY"
export RESTIC_PASSWORD="MY_RESTIC_PASSWORD"
This file defines parameters needed to access the repository. Obviously, if not using Amazon S3, the RESTIC_REPOSITORY format will be different. I have one of these files for S3, and one for my USB HDD.
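For the USB HDD, the equivalent file only needs a local path as the repository. A minimal sketch, assuming the drive is mounted at /mnt/backup-hdd (a hypothetical mount point, as is the file name restic-env-HDD.sh):

```shell
#!/bin/sh
# Hypothetical ~/scripts/restic-env-HDD.sh: local repository on a USB drive.
# /mnt/backup-hdd is an assumed mount point; adjust it to wherever the drive is mounted.
export RESTIC_REPOSITORY="/mnt/backup-hdd/restic-repo"
export RESTIC_PASSWORD="MY_RESTIC_PASSWORD"
```

A new repository has to be created once with restic init after sourcing the file. Since these files hold secrets in plain text, restricting them with chmod 600 is sensible.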
~/scripts/backup.sh
#!/bin/sh
# Usage: backup.sh ENV_FILE
# ENV_FILE is sourced and must export RESTIC_REPOSITORY and its credentials.
. "$1"
if [ -z "$RESTIC_REPOSITORY" ]; then
    echo "RESTIC_REPOSITORY must be set"
    exit 1
fi
FORCE_BACKUP=0
if [ "$OVERRIDE_TIMESTAMP_CHK" = "1" ]; then
    echo "Forcing backup [$RESTIC_REPOSITORY]"
    FORCE_BACKUP=1
fi
# One timestamp file per repository, named after a hash of the repository URL.
TOUCH_FILE="$HOME/backup_touches/$(echo "$RESTIC_REPOSITORY" | sha512sum - | cut -f1 -d' ')"
# 0 if the timestamp file exists, 1 otherwise.
FEXISTS=$(test -f "$TOUCH_FILE"; echo $?)
# 0 if the timestamp file was modified within the last day, 1 otherwise.
FRECENT=$(find "$(dirname "$TOUCH_FILE")" -mtime -1 -name "$(basename "$TOUCH_FILE")" 2>/dev/null | grep -q "."; echo $?)
if [ "$FEXISTS" -eq 1 ] || [ "$FRECENT" -eq 1 ] || [ "$FORCE_BACKUP" -eq 1 ]; then
    sleep 10
    echo "Backing up, as no backup made in last day [$RESTIC_REPOSITORY]"
    if ~/bin/restic backup --tag kronos /etc /home --exclude-file="$HOME/scripts/excludelist"; then
        echo "Backup succeeded [$RESTIC_REPOSITORY]"
        # Make sure the timestamp directory exists before recording the run.
        mkdir -p "$(dirname "$TOUCH_FILE")"
        touch "$TOUCH_FILE"
        # Pass the env file on so forget.sh sources the same repository settings.
        "$HOME/scripts/forget.sh" "$1"
    else
        echo "Problem with backup [$RESTIC_REPOSITORY]"
        exit 2
    fi
    exit 0
else
    echo "Not backing up, as there is a recent backup [$RESTIC_REPOSITORY]"
fi
This script takes the previous file (defining the repository parameters) as an argument, and actually runs the backup. It only runs the backup if the relevant timestamp file (under ~/backup_touches) is missing or older than 1 day. That could be adjusted to another period of time, if desired. Alternatively, if OVERRIDE_TIMESTAMP_CHK is 1, it runs the backup regardless.
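To use a period other than one day, the age test can work in minutes instead of days. The sketch below is one way to do that, not what the script above does: is_recent is a hypothetical helper, and it assumes a find that supports -mmin, as GNU find does.

```shell
#!/bin/sh
# is_recent FILE MINUTES
# Succeeds if FILE exists and was modified less than MINUTES minutes ago.
# Hypothetical helper; relies on find supporting -mmin (GNU find does).
is_recent() {
    [ -f "$1" ] || return 1
    # find prints the path only when the modification-time test matches.
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}
```

With this, is_recent "$TOUCH_FILE" 720 would allow a backup every 12 hours. To force a run regardless of the timestamp, the existing override can be set on the command line: OVERRIDE_TIMESTAMP_CHK=1 ~/scripts/backup.sh ~/scripts/restic-env-S3.sh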
~/scripts/forget.sh
#!/bin/sh
# Usage: forget.sh [ENV_FILE]
# ENV_FILE is sourced if given; otherwise the variables exported by the caller are used.
if [ -n "$1" ]; then
    . "$1"
fi
if [ -z "$RESTIC_REPOSITORY" ]; then
    echo "RESTIC_REPOSITORY must be set"
    exit 1
fi
echo "Starting to forget [$RESTIC_REPOSITORY]"
if restic forget --keep-yearly 100 --keep-monthly 12 --keep-weekly 5 --keep-daily 7; then
    echo "Forgotten; what was I doing again? [$RESTIC_REPOSITORY]"
else
    echo "Problem forgetting [$RESTIC_REPOSITORY]"
    exit 1
fi
This removes old snapshots, such that we keep 7 daily, 5 weekly, 12 monthly, and 100 yearly snapshots. However, no data is removed from the repository by this; a prune command has to be run periodically for that, and I haven't automated it.
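The manual prune step could look something like the following. This is only a sketch of a possible helper, not part of the setup described here: prune_repo and the RESTIC_BIN override are hypothetical names, and it assumes the repository variables have already been exported by sourcing one of the env files.

```shell
#!/bin/sh
# Hypothetical prune helper; not part of the original scripts.
# RESTIC_BIN is an assumed override for the restic binary (defaults to "restic").
prune_repo() {
    restic_bin="${RESTIC_BIN:-restic}"
    echo "Pruning [$RESTIC_REPOSITORY]"
    # prune deletes data that is no longer referenced by any snapshot.
    "$restic_bin" prune || return 1
    # check verifies the repository structure afterwards.
    "$restic_bin" check || return 1
}
```

Run by hand, that would be: . ~/scripts/restic-env-S3.sh, followed by prune_repo. Pruning can take a while on a remote repository, which is one reason to keep it separate from the half-hourly job.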
Those are the scripts necessary to run backups. I'm sure they could be made better, but they seem functional enough for now.
I am also using systemd to run them.
~/.config/systemd/user/backup-S3.timer
[Unit]
Description=Backup using restic on a timer
Wants=network-online.target
After=network.target network-online.target
[Timer]
OnCalendar=*:0/30:00
Persistent=true
[Install]
WantedBy=timers.target
~/.config/systemd/user/backup-S3.service
[Unit]
Description=restic backup to S3
After=systemd-networkd-wait-online.service
Wants=systemd-networkd-wait-online.service
[Service]
Type=simple
Nice=10
Restart=on-failure
RestartSec=60
Environment="HOME=%h"
ExecStart=%h/scripts/backup.sh %h/scripts/restic-env-S3.sh
[Install]
WantedBy=default.target
These systemd user service and timer files work for me, but this was the hardest part of the whole setup. Specifically, the service runs at startup if the computer was off (or asleep, etc.) when a run was scheduled. At that point the network is often not yet properly connected, so the backup fails. The After and Wants lines are meant to prevent that, but they don't work properly (or I don't understand exactly what they mean). As far as I can tell, one likely reason is that user units are handled by a separate per-user systemd instance, which cannot order itself against system units such as network-online.target. Hence I added Restart=on-failure with RestartSec=60, so a failed run is retried 60 seconds later. I assume there is a better way to do this.
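One alternative to relying on the unit ordering is to have the script itself wait for the network before starting. The sketch below is an assumption on my part rather than something from the setup above: retry is a hypothetical helper, and the getent lookup of s3.amazonaws.com in the usage note is just one possible connectivity test.

```shell
#!/bin/sh
# retry TRIES DELAY CMD...
# Runs CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Hypothetical helper; variables are global (POSIX sh has no local).
retry() {
    tries=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        # Give up once the allowed number of attempts has been used.
        [ "$n" -ge "$tries" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}
```

Near the top of backup.sh this could be used as: retry 10 6 getent hosts s3.amazonaws.com || exit 3, which waits up to about a minute for DNS to resolve before giving up. Restart=on-failure would still catch the case where the network never comes up.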
For backing up to USB HDD, I have replaced the last block with
[Install]
WantedBy=dev-disk-by\x2duuid-MY_UUID.device
and removed the After and Wants lines in backup-HDD.service (with a corresponding backup-HDD.timer). Thus, the script runs every 30 minutes, and also whenever the device is connected (which is preferable for an external drive).
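The escaped unit name in the WantedBy line can be generated rather than typed by hand. A small sketch, assuming systemd-escape is available (it ships with systemd) and keeping MY_UUID as the placeholder used above:

```shell
#!/bin/sh
# MY_UUID is a placeholder; substitute the UUID reported by lsblk -f or blkid.
# -p treats the argument as a file system path, --suffix appends the unit type.
unit=$(systemd-escape -p --suffix=device /dev/disk/by-uuid/MY_UUID)
echo "WantedBy=$unit"
```

The "-" inside "by-uuid" becomes \x2d because a plain "-" in a unit name stands for a path separator.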
The timer and service are enabled with
systemctl --user daemon-reload
systemctl --user enable backup-S3.service
systemctl --user enable --now backup-S3.timer
I am actually running these as a user restic, so I have also run
sudo loginctl enable-linger restic
(Note: I access a shell for user restic with sudo -u restic bash, but I also need to run export XDG_RUNTIME_DIR=/run/user/1002, where 1002 is the UID of restic, to be able to run the systemctl command.)
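The UID does not have to be hard-coded. A small sketch, assuming it is run inside the restic user's shell:

```shell
#!/bin/sh
# Derive the runtime directory from the current user's UID instead of writing 1002 by hand.
export XDG_RUNTIME_DIR="/run/user/$(id -u)"
```

After that, systemctl --user status backup-S3.timer should be able to reach that user's systemd instance.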
I installed a copy of restic for the user restic, and on the binary ran
sudo setcap cap_dac_read_search=+ep ~restic/bin/restic
so that it would have access to all files. This way, I can avoid running as root, yet still back up all files.
Resources
Some sites I found helpful in doing this:
- https://www.digitalocean.com/community/tutorials/how-to-back-up-data-to-an-object-storage-service-with-the-restic-backup-client
- https://jpmens.net/2017/08/22/my-backup-software-of-choice-restic/
- https://blog.filippo.io/restic-cryptography/
- https://restic.org
- https://restic.readthedocs.io/en/latest/080_examples.html