Building a Robust Linux Backup Strategy with Borg
We have all heard the mantra: backups are important, test your backups, follow the 3-2-1 rule. But actually implementing a backup strategy that works reliably, alerts you when something goes wrong, and survives reboots is another matter entirely. This is the story of how I built a comprehensive backup system for my homelab server.
Why I finally got serious about backups#
The catalyst was a failing hard drive. My backup disk, a Seagate ST2000DM001, started showing signs of degradation. It had been quietly handling daily backups for years, but SMART monitoring revealed it was on its way out. This was the wake-up call I needed to not just replace the disk, but to rethink the entire backup strategy.
The old setup was simple but fragile: a single script running rsync to an external disk. No monitoring, no offsite copies, no alerting when things went wrong. If a backup silently failed, I would not know until I needed to restore something.
Choosing the right tool: Borg Backup#
After researching modern backup solutions, I settled on Borg Backup. Borg offers several key advantages over traditional tools like rsync or tar:
Deduplication — Borg identifies duplicate chunks of data across all backups and stores them only once. This means that if you have 7 daily backups, you are not using 7 times the space. In practice, my 70GB system backups only consume about 2% more space per additional backup.
Compression — Built-in compression (lz4, zstd, or zlib) reduces storage requirements without significant CPU overhead.
Encryption — Optional client-side encryption, though I opted out since my backup targets are physically secure.
Mountable archives — You can mount any backup archive as a filesystem and browse it like a regular directory, making file-level recovery trivial.
Efficient pruning — Built-in retention policies let you keep 7 daily, 4 weekly, and 6 monthly backups with a single command.
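To make the "mountable archives" point concrete, here is roughly what file-level recovery looks like (the repository path and archive name are illustrative, matching the layout described later in this post; `borg mount` requires FUSE):

```shell
# List the archives available in a repository
borg list /mnt/backup/john-ai

# Mount one archive read-only and browse it like a normal directory
mkdir -p /tmp/restore
borg mount /mnt/backup/john-ai::daily-2026-03-25T02:00 /tmp/restore

# Recover a single file, then unmount
cp /tmp/restore/home/user/notes.txt ~/notes.txt
borg umount /tmp/restore
```

No full restore, no special tooling beyond `cp` once the archive is mounted.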
Alternatives I considered were restic (similar feature set, but I found Borg’s documentation clearer) and good old rsync (no deduplication, no built-in pruning, harder to manage multiple point-in-time backups).
Storage architecture#
The system I built backs up an Ubuntu server called john-ai that runs KVM virtual machines and systemd-nspawn containers. The storage layout looks like this:
Primary disk (NVMe, 932GB):
- / (92GB) — System root
- /home (366GB) — User data
- /var (92GB) — Application data
- /storage (1.3TB) — Bulk storage, intentionally excluded from backups
Backup targets:
- USB HDD (4.5TB) mounted at /mnt/backup — Daily backups
- NAS (SMB/CIFS share) mounted at /mnt/nas — Weekly offsite backups
The USB disk handles the frequent, fast backups. The NAS provides offsite protection against fire, theft, or catastrophic failure of the primary machine.
What gets backed up#
Not everything needs to be backed up. The key insight is to focus on irreplaceable data:
Included:
- System root (/) — For disaster recovery
- User data (/home) — Documents, code, configuration
- Application data (/var) — Databases, service data
- VM disk images — Virtual machine storage
- Container storage — systemd-nspawn machines
Excluded:
- /storage — Bulk storage that can be recreated
- /var/log — Transient log files that cause backup errors
- /proc, /sys, /dev, /run, /tmp — Virtual filesystems
- /mnt, /media — Mount points for other filesystems
Excluding /var/log was a practical decision. Log files are constantly being written to, which causes Borg to report “file changed during backup” warnings. Since logs are transient and not critical for recovery, excluding them eliminated these errors.
Retention strategy#
The 3-2-1 backup rule says you should have 3 copies of data, on 2 different media types, with 1 offsite. My strategy adapts this for a homelab environment:
| Backup Type | Daily | Weekly | Monthly |
|---|---|---|---|
| System (USB) | 7 | 1 | 1 |
| System (NAS) | — | 8 | 6 |
| VMs (USB) | 7 | 1 | 1 |
| Containers (USB) | 7 | 1 | 1 |
Daily backups to USB give me quick recovery points for the past week. Weekly backups to the NAS extend this to two months, with monthly archives going back half a year. This balances storage efficiency with recovery granularity.
Beyond system files: VMs and containers#
The server runs several virtual machines and containers that need their own backup strategy:
Virtual Machines (KVM):
- Home Assistant — Daily backup of the VM disk image
- Development VMs — As needed
Containers (systemd-nspawn):
- AdGuard DNS — Daily backup
- Minecraft server — Daily backup
VM disk images are backed up while the VMs are running; Borg's chunk-level deduplication keeps repeated full-image backups cheap, though an image captured from a live VM is only crash-consistent. For larger or busier VMs I could take live snapshots first, but the current approach works well for my modest VM sizes.
The system also integrates with my Proxmox backup infrastructure. The backup-status script monitors Proxmox VM backups stored on the NAS, giving me a unified view of all backup health across both my KVM host and the Proxmox server.
Monitoring and alerting#
A backup system that silently fails is worse than no backup system at all. I built several layers of monitoring:
Failure alerts — Every backup script sends an email if it exits with a non-zero code. The email includes the exit code, the full Borg output, and suggested remediation steps.
Mount checks — Before running, each script verifies that the backup target is mounted. If not, it attempts to mount it. If that fails, an alert is sent.
Staleness detection — A script runs every 6 hours to check that backups are fresh. If the daily backup is older than 30 hours, or the weekly backup older than 174 hours, an alert is sent. This catches situations where cron jobs silently stop running.
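The staleness check itself is simple. Here is a minimal sketch of the idea, assuming each backup script touches a marker file on success (my actual script inspects Borg archive timestamps instead; the marker path is illustrative):

```shell
#!/bin/sh
# Staleness check sketch: warn if the last daily backup is older than 30 hours.
# Assumes the daily backup script touches $MARKER on success (an assumption;
# the real script parses archive timestamps from `borg list` instead).
MARKER="${MARKER:-/var/lib/backup/last-daily}"
MAX_AGE_HOURS=30

now=$(date +%s)
# A missing marker counts as infinitely stale (epoch 0)
last=$(stat -c %Y "$MARKER" 2>/dev/null || echo 0)
age_hours=$(( (now - last) / 3600 ))

if [ "$age_hours" -gt "$MAX_AGE_HOURS" ]; then
    echo "STALE: last daily backup is ${age_hours}h old (limit: ${MAX_AGE_HOURS}h)"
    # The production script sends an alert email here
else
    echo "OK: last daily backup is ${age_hours}h old"
fi
```

The same logic with a 174-hour limit covers the weekly NAS backup.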
Weekly summary — Every Monday at 8 AM, I receive a comprehensive email showing the status of all backups, their sizes, and the current schedule. This serves as a regular health check and reminder that the system is working.
Status dashboard — A backup-status command displays the current state of all backups in a formatted terminal dashboard, including archive counts, sizes, freshness, and any issues detected.
Email notifications use msmtp, a lightweight SMTP client that sends mail through my email provider. The configuration stores credentials securely and handles TLS encryption.
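For reference, an msmtp configuration for this kind of setup looks roughly like the following (host, addresses, and the password file path are placeholders, not my actual values):

```ini
# /etc/msmtprc — values are illustrative placeholders
defaults
auth           on
tls            on
tls_trust_file /etc/ssl/certs/ca-certificates.crt
logfile        /var/log/msmtp.log

account        default
host           smtp.example.com
port           587
from           backups@example.com
user           backups@example.com
passwordeval   "cat /etc/msmtp-password"
```

Keeping the password in a separate root-only file (mode 600) read via passwordeval avoids embedding credentials in the config itself.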
The schedule#
All backup jobs are defined in a single cron file /etc/cron.d/john-ai-backups:
| Job | Schedule | Description |
|---|---|---|
| Daily backup | 02:00 | System, VMs, and containers to USB |
| Weekly NAS | 03:00 Sunday | System backup to NAS |
| Staleness check | Every 6 hours | Verify backups are fresh |
| Weekly summary | 08:00 Monday | Email status report |
Running backups at 2 AM minimizes impact on system performance during active hours. The staleness check runs frequently enough to catch problems quickly, but not so often that it becomes noisy.
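In cron.d syntax, the schedule above looks roughly like this (the script names are illustrative, not necessarily what I called them):

```shell
# /etc/cron.d/john-ai-backups — sketch of the schedule above
# m  h    dom mon dow  user  command
0    2    *   *   *    root  /usr/local/bin/backup-daily
0    3    *   *   0    root  /usr/local/bin/backup-nas-weekly
0    */6  *   *   *    root  /usr/local/bin/backup-staleness-check
0    8    *   *   1    root  /usr/local/bin/backup-weekly-summary
```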
Reboot survival#
A backup system is only reliable if it survives reboots. Here is what makes the configuration persistent:
| Component | Mechanism |
|---|---|
| USB mount | /etc/fstab entry with UUID and nofail option |
| NAS mount | /etc/fstab entry with _netdev option for network dependency |
| Backup scripts | Installed in /usr/local/bin/ |
| Cron jobs | Defined in /etc/cron.d/john-ai-backups |
| Email config | /etc/msmtprc with proper permissions |
The nofail option for the USB mount ensures the system boots even if the backup disk is disconnected. The _netdev option for the NAS ensures the mount happens after the network is available.
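The corresponding fstab entries look something like this (the UUID, NAS hostname, and credentials path are placeholders):

```shell
# /etc/fstab — backup mounts; UUID and share paths are placeholders
UUID=xxxx-xxxx-xxxx  /mnt/backup  ext4  defaults,nofail                           0  2
//nas/backups        /mnt/nas     cifs  credentials=/etc/nas-credentials,_netdev  0  0
```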
The result#
The final system provides:
- Daily backups of system, VMs, and containers to local USB storage
- Weekly offsite backups to NAS storage
- Automated retention with configurable policies
- Multi-layered alerting for failures, staleness, and weekly status
- Unified dashboard for checking backup health at a glance
The peace of mind is substantial. I know that if a disk fails, a file is accidentally deleted, or the entire system is lost, I have multiple recovery options ranging from hours old to months old.
Under the hood: script highlights#
The core of the system is the daily backup script. Here are the key parts:
Pre-flight checks ensure the backup disk is mounted before proceeding:
# Pre-flight check: verify backup disk is mounted
if ! mountpoint -q /mnt/backup; then
echo "ERROR: /mnt/backup is not mounted! Attempting to mount..."
mount /mnt/backup 2>&1
if ! mountpoint -q /mnt/backup; then
echo "=== Backup ABORTED: Cannot mount backup disk ==="
# Send alert email...
exit 1
fi
fi

System backup uses Borg with compression and exclusion patterns:
borg create \
--compression zstd \
--exclude-caches \
--exclude '/proc' \
--exclude '/sys' \
--exclude '/dev' \
--exclude '/run' \
--exclude '/tmp' \
--exclude '/mnt' \
--exclude '/media' \
--exclude '/storage' \
--exclude '/var/log' \
--stats \
"$REPO::daily-{now:%Y-%m-%dT%H:%M}" \
/ /home /var

VM backup includes the VM definition XML for easy restoration:
for VM_DIR in "$VM_BACKUP_DIR"/*; do
if [ -d "$VM_DIR" ]; then
VM_NAME=$(basename "$VM_DIR")
# Dump VM XML definition if VM exists
if virsh dominfo "$VM_NAME" &>/dev/null; then
virsh dumpxml "$VM_NAME" > "$VM_DIR/vm-definition.xml"
fi
borg create --compression zstd \
"$VM_REPO::$VM_NAME-daily-{now:%Y-%m-%dT%H:%M}" \
"$VM_DIR"
fi
done

Retention pruning keeps the archive count manageable:
borg prune "$REPO" \
--keep-daily=7 \
--keep-weekly=1 \
--keep-monthly=1

The backup-status dashboard#
Running sudo backup-status gives an at-a-glance view of the entire backup situation:
╔════════════════════════════════════════════════════════════════════════════════╗
║ JOHN-AI BACKUP STATUS DASHBOARD ║
╚════════════════════════════════════════════════════════════════════════════════╝
Generated: 2026-03-25 12:10:35
┌─ STORAGE STATUS ──────────────────────────────────────────────────────────────┐
│ USB Backup Disk (/mnt/backup): MOUNTED
│ Path: /mnt/backup
│ Size: 4.6T | Used: 245G (6%) | Available: 4.1T
│
│ NAS (/mnt/nas): MOUNTED
│ Path: /mnt/nas
│ Size: 1000G | Used: 472G (48%) | Available: 529G
└───────────────────────────────────────────────────────────────────────────────┘
┌─ DAILY BACKUP (USB) ───────────────────────────────────────────────────────────┐
│ Repository: /mnt/backup/john-ai
│ Archives: 9
│ Last backup: 10h
│ Recent System archives:
│ 2026-03-21 at 02:00 - System [3cf6d4ec]
│ 2026-03-22 at 02:00 - System [a4968d2a]
│ 2026-03-23 at 02:00 - System [19b91ca8]
│ 2026-03-24 at 19:32 - System [b9935541]
│ 2026-03-25 at 02:00 - System [abc58cee]
│
│ VM Archives:
│ 2026-03-25 at 02:03 - 100 homeassistant [e22e6e09]
│
│ Container Archives:
│ 2026-03-25 at 02:06 - 102 minecraft [9f19b8c4]
│
│ Repository stats:
│ Original size: 2.88 TB (total data across all archives)
│ Deduplicated: 1.98 TB (after removing duplicates)
│ Compressed: 132.77 GB (actual disk space used)
└───────────────────────────────────────────────────────────────────────────────┘
┌─ WEEKLY BACKUP (NAS) ──────────────────────────────────────────────────────────┐
│ Repository: /mnt/nas/backups/john-ai
│ Archives: 3
│ Last backup: 3d 9h
│ Recent archives:
│ 2026-03-10 at 21:57 - System [21b99aae]
│ 2026-03-15 at 03:00 - System [99942789]
│ 2026-03-22 at 03:00 - System [b9b210ae]
│
│ Repository stats:
│ Original size: 668.23 GB
│ Compressed: 86.16 GB
└───────────────────────────────────────────────────────────────────────────────┘
┌─ VM BACKUPS ───────────────────────────────────────────────────────────────────┐
│ 100 homeassistant: 10h (healthy)
└───────────────────────────────────────────────────────────────────────────────┘
┌─ CONTAINER BACKUPS ────────────────────────────────────────────────────────────┐
│ 101 adguard: 16h (healthy)
│ 102 minecraft: 10h (healthy)
└───────────────────────────────────────────────────────────────────────────────┘
┌─ SUMMARY ──────────────────────────────────────────────────────────────────────┐
│ All backups healthy! No issues detected.
└───────────────────────────────────────────────────────────────────────────────┘

The deduplication statistics tell an important story: 2.88 TB of original data deduplicated and compressed down to just 132 GB on disk. That is over 95% space savings, meaning I can keep far more backup history than a naive copy would allow.