Building a Robust Linux Backup Strategy with Borg
We have all heard the mantra: backups are important, test your backups, follow the 3-2-1 rule. But actually implementing a backup strategy that works reliably, alerts you when something goes wrong, and survives reboots is another matter entirely. This is the story of how I built a comprehensive backup system for my homelab server.
Why I finally got serious about backups#
The catalyst was a failing hard drive. My backup disk, a Seagate ST2000DM001, started showing signs of degradation. It had been quietly handling daily backups for years, but SMART monitoring revealed it was on its way out. This was the wake-up call I needed to not just replace the disk, but to rethink the entire backup strategy.
The old setup was simple but fragile: a single script running rsync to an external disk. No monitoring, no offsite copies, no alerting when things went wrong. If a backup silently failed, I would not know until I needed to restore something.
Choosing the right tool: Borg Backup#
After researching modern backup solutions, I settled on Borg Backup. Borg offers several key advantages over traditional tools like rsync or tar:
Deduplication — Borg identifies duplicate chunks of data across all backups and stores them only once. This means that if you have 7 daily backups, you are not using 7 times the space. In practice, my 70GB system backups only consume about 2% more space per additional backup.
Compression — Built-in compression (lz4, zstd, or zlib) reduces storage requirements without significant CPU overhead.
Encryption — Optional client-side encryption, though I opted out since my backup targets are physically secure.
Mountable archives — You can mount any backup archive as a filesystem and browse it like a regular directory, making file-level recovery trivial.
Efficient pruning — Built-in retention policies let you keep 7 daily, 4 weekly, and 6 monthly backups with a single command.
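To make the "mountable archives" point concrete, here is roughly what file-level recovery looks like (the repository path and archive name are illustrative, matching the layout described later in this post; `borg mount` requires FUSE):

```shell
# List the archives available in a repository
borg list /mnt/backup/john-ai

# Mount one archive read-only and browse it like a normal directory
mkdir -p /tmp/restore
borg mount /mnt/backup/john-ai::daily-2026-03-25T02:00 /tmp/restore

# Recover a single file, then unmount
cp /tmp/restore/home/user/notes.txt ~/notes.txt
borg umount /tmp/restore
```

No full restore, no special tooling beyond `cp` once the archive is mounted.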
Alternatives I considered were restic (similar feature set, but I found Borg’s documentation clearer) and good old rsync (no deduplication, no built-in pruning, harder to manage multiple point-in-time backups).
Storage architecture#
The system I built backs up an Ubuntu server called john-ai that runs KVM virtual machines and systemd-nspawn containers. The storage layout looks like this:
Primary disk (NVMe, 932GB):
- / (92GB) — System root
- /home (366GB) — User data
- /var (92GB) — Application data
- /storage (1.3TB) — Bulk storage, intentionally excluded from backups
Backup targets:
- USB HDD (4.5TB) mounted at /mnt/backup — Daily backups
- NAS (SMB/CIFS share) mounted at /mnt/nas — Weekly offsite backups
The USB disk handles the frequent, fast backups. The NAS provides offsite protection against fire, theft, or catastrophic failure of the primary machine.
What gets backed up#
Not everything needs to be backed up. The key insight is to focus on irreplaceable data:
Included:
- System root (/) — For disaster recovery
- User data (/home) — Documents, code, configuration
- Application data (/var) — Databases, service data
- VM disk images — Virtual machine storage
- Container storage — systemd-nspawn machines
Excluded:
- /storage — Bulk storage that can be recreated
- /var/log — Transient log files that cause backup errors
- /proc, /sys, /dev, /run, /tmp — Virtual filesystems
- /mnt, /media — Mount points for other filesystems
Excluding /var/log was a practical decision. Log files are constantly being written to, which causes Borg to report “file changed during backup” warnings. Since logs are transient and not critical for recovery, excluding them eliminated these errors.
Retention strategy#
The 3-2-1 backup rule says you should have 3 copies of data, on 2 different media types, with 1 offsite. My strategy adapts this for a homelab environment:
| Backup Type | Daily | Weekly | Monthly |
|---|---|---|---|
| System (USB) | 7 | 1 | 1 |
| System (NAS) | — | 8 | 6 |
| VMs (USB) | 7 | 1 | 1 |
| Containers (USB) | 7 | 1 | 1 |
Daily backups to USB give me quick recovery points for the past week. Weekly backups to the NAS extend this to two months, with monthly archives going back half a year. This balances storage efficiency with recovery granularity.
Beyond system files: VMs and containers#
The server runs several virtual machines and containers that need their own backup strategy:
Virtual Machines (KVM):
- Home Assistant — Daily backup of the VM disk image
- Development VMs — As needed
Containers (systemd-nspawn):
- AdGuard DNS — Daily backup
- Minecraft server — Daily backup
VM disk images are backed up while the VMs are running; Borg's chunk-level deduplication keeps repeated full-image backups cheap, though an image captured from a live VM is only crash-consistent. For larger or busier VMs I could take live snapshots first, but the current approach works well for my modest VM sizes.
The system also integrates with my Proxmox backup infrastructure. The backup-status script monitors Proxmox VM backups stored on the NAS, giving me a unified view of all backup health across both my KVM host and the Proxmox server.
Monitoring and alerting#
A backup system that silently fails is worse than no backup system at all. I built several layers of monitoring:
Failure alerts — Every backup script sends an email if it exits with a non-zero code. The email includes the exit code, the full Borg output, and suggested remediation steps.
Mount checks — Before running, each script verifies that the backup target is mounted. If not, it attempts to mount it. If that fails, an alert is sent.
Staleness detection — A script runs every 6 hours to check that backups are fresh. If the daily backup is older than 30 hours, or the weekly backup older than 174 hours, an alert is sent. This catches situations where cron jobs silently stop running.
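The staleness check itself is simple. Here is a minimal sketch of the idea, assuming each backup script touches a marker file on success (my actual script inspects Borg archive timestamps instead; the marker path is illustrative):

```shell
#!/bin/sh
# Staleness check sketch: warn if the last daily backup is older than 30 hours.
# Assumes the daily backup script touches $MARKER on success (an assumption;
# the real script parses archive timestamps from `borg list` instead).
MARKER="${MARKER:-/var/lib/backup/last-daily}"
MAX_AGE_HOURS=30

now=$(date +%s)
# A missing marker counts as infinitely stale (epoch 0)
last=$(stat -c %Y "$MARKER" 2>/dev/null || echo 0)
age_hours=$(( (now - last) / 3600 ))

if [ "$age_hours" -gt "$MAX_AGE_HOURS" ]; then
    echo "STALE: last daily backup is ${age_hours}h old (limit: ${MAX_AGE_HOURS}h)"
    # The production script sends an alert email here
else
    echo "OK: last daily backup is ${age_hours}h old"
fi
```

The same logic with a 174-hour limit covers the weekly NAS backup.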
Weekly summary — Every Monday at 8 AM, I receive a comprehensive email showing the status of all backups, their sizes, and the current schedule. This serves as a regular health check and reminder that the system is working.
Status dashboard — A backup-status command displays the current state of all backups in a formatted terminal dashboard, including archive counts, sizes, freshness, and any issues detected.
Email notifications use msmtp, a lightweight SMTP client that sends mail through my email provider. The configuration stores credentials securely and handles TLS encryption.
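For reference, an msmtp configuration for this kind of setup looks roughly like the following (host, addresses, and the password file path are placeholders, not my actual values):

```ini
# /etc/msmtprc — values are illustrative placeholders
defaults
auth           on
tls            on
tls_trust_file /etc/ssl/certs/ca-certificates.crt
logfile        /var/log/msmtp.log

account        default
host           smtp.example.com
port           587
from           backups@example.com
user           backups@example.com
passwordeval   "cat /etc/msmtp-password"
```

Keeping the password in a separate root-only file (mode 600) read via passwordeval avoids embedding credentials in the config itself.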
The schedule#
All backup jobs are defined in a single cron file /etc/cron.d/john-ai-backups:
| Job | Schedule | Description |
|---|---|---|
| Daily backup | 02:00 | System, VMs, and containers to USB |
| Weekly NAS | 03:00 Sunday | System backup to NAS |
| Staleness check | Every 6 hours | Verify backups are fresh |
| Weekly summary | 08:00 Monday | Email status report |
Running backups at 2 AM minimizes impact on system performance during active hours. The staleness check runs frequently enough to catch problems quickly, but not so often that it becomes noisy.
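In cron.d syntax, the schedule above looks roughly like this (the script names are illustrative, not necessarily what I called them):

```shell
# /etc/cron.d/john-ai-backups — sketch of the schedule above
# m  h    dom mon dow  user  command
0    2    *   *   *    root  /usr/local/bin/backup-daily
0    3    *   *   0    root  /usr/local/bin/backup-nas-weekly
0    */6  *   *   *    root  /usr/local/bin/backup-staleness-check
0    8    *   *   1    root  /usr/local/bin/backup-weekly-summary
```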
Reboot survival#
A backup system is only reliable if it survives reboots. Here is what makes the configuration persistent:
| Component | Mechanism |
|---|---|
| USB mount | /etc/fstab entry with UUID and nofail option |
| NAS mount | /etc/fstab entry with _netdev option for network dependency |
| Backup scripts | Installed in /usr/local/bin/ |
| Cron jobs | Defined in /etc/cron.d/john-ai-backups |
| Email config | /etc/msmtprc with proper permissions |
The nofail option for the USB mount ensures the system boots even if the backup disk is disconnected. The _netdev option for the NAS ensures the mount happens after the network is available.
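The corresponding fstab entries look something like this (the UUID, NAS hostname, and credentials path are placeholders):

```shell
# /etc/fstab — backup mounts; UUID and share paths are placeholders
UUID=xxxx-xxxx-xxxx  /mnt/backup  ext4  defaults,nofail                           0  2
//nas/backups        /mnt/nas     cifs  credentials=/etc/nas-credentials,_netdev  0  0
```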
The result#
The final system provides:
- Daily backups of system, VMs, and containers to local USB storage
- Weekly offsite backups to NAS storage
- Automated retention with configurable policies
- Multi-layered alerting for failures, staleness, and weekly status
- Unified dashboard for checking backup health at a glance
The peace of mind is substantial. I know that if a disk fails, a file is accidentally deleted, or the entire system is lost, I have multiple recovery options ranging from hours old to months old.
Under the hood: script highlights#
The core of the system is the daily backup script. Here are the key parts:
Pre-flight checks ensure the backup disk is mounted before proceeding:
# Pre-flight check: verify backup disk is mounted
if ! mountpoint -q /mnt/backup; then
echo "ERROR: /mnt/backup is not mounted! Attempting to mount..."
mount /mnt/backup 2>&1
if ! mountpoint -q /mnt/backup; then
echo "=== Backup ABORTED: Cannot mount backup disk ==="
# Send alert email...
exit 1
fi
fi

System backup uses Borg with compression and exclusion patterns:
borg create \
--compression zstd \
--exclude-caches \
--exclude '/proc' \
--exclude '/sys' \
--exclude '/dev' \
--exclude '/run' \
--exclude '/tmp' \
--exclude '/mnt' \
--exclude '/media' \
--exclude '/storage' \
--exclude '/var/log' \
--stats \
"$REPO::daily-{now:%Y-%m-%dT%H:%M}" \
/ /home /var

VM backup includes the VM definition XML for easy restoration:
for VM_DIR in "$VM_BACKUP_DIR"/*; do
if [ -d "$VM_DIR" ]; then
VM_NAME=$(basename "$VM_DIR")
# Dump VM XML definition if VM exists
if virsh dominfo "$VM_NAME" &>/dev/null; then
virsh dumpxml "$VM_NAME" > "$VM_DIR/vm-definition.xml"
fi
borg create --compression zstd \
"$VM_REPO::$VM_NAME-daily-{now:%Y-%m-%dT%H:%M}" \
"$VM_DIR"
fi
done

Retention pruning keeps the archive count manageable:
borg prune "$REPO" \
--keep-daily=7 \
--keep-weekly=1 \
--keep-monthly=1

The backup-status dashboard#
Running sudo backup-status gives an at-a-glance view of the entire backup situation:
╔════════════════════════════════════════════════════════════════════════════════╗
║ JOHN-AI BACKUP STATUS DASHBOARD ║
╚════════════════════════════════════════════════════════════════════════════════╝
Generated: 2026-03-25 12:10:35
┌─ STORAGE STATUS ──────────────────────────────────────────────────────────────┐
│ USB Backup Disk (/mnt/backup): MOUNTED
│ Path: /mnt/backup
│ Size: 4.6T | Used: 245G (6%) | Available: 4.1T
│
│ NAS (/mnt/nas): MOUNTED
│ Path: /mnt/nas
│ Size: 1000G | Used: 472G (48%) | Available: 529G
└───────────────────────────────────────────────────────────────────────────────┘
┌─ DAILY BACKUP (USB) ───────────────────────────────────────────────────────────┐
│ Repository: /mnt/backup/john-ai
│ Archives: 9
│ Last backup: 10h
│ Recent System archives:
│ 2026-03-21 at 02:00 - System [3cf6d4ec]
│ 2026-03-22 at 02:00 - System [a4968d2a]
│ 2026-03-23 at 02:00 - System [19b91ca8]
│ 2026-03-24 at 19:32 - System [b9935541]
│ 2026-03-25 at 02:00 - System [abc58cee]
│
│ VM Archives:
│ 2026-03-25 at 02:03 - 100 homeassistant [e22e6e09]
│
│ Container Archives:
│ 2026-03-25 at 02:06 - 102 minecraft [9f19b8c4]
│
│ Repository stats:
│ Original size: 2.88 TB (total data across all archives)
│ Deduplicated: 1.98 TB (after removing duplicates)
│ Compressed: 132.77 GB (actual disk space used)
└───────────────────────────────────────────────────────────────────────────────┘
┌─ WEEKLY BACKUP (NAS) ──────────────────────────────────────────────────────────┐
│ Repository: /mnt/nas/backups/john-ai
│ Archives: 3
│ Last backup: 3d 9h
│ Recent archives:
│ 2026-03-10 at 21:57 - System [21b99aae]
│ 2026-03-15 at 03:00 - System [99942789]
│ 2026-03-22 at 03:00 - System [b9b210ae]
│
│ Repository stats:
│ Original size: 668.23 GB
│ Compressed: 86.16 GB
└───────────────────────────────────────────────────────────────────────────────┘
┌─ VM BACKUPS ───────────────────────────────────────────────────────────────────┐
│ 100 homeassistant: 10h (healthy)
└───────────────────────────────────────────────────────────────────────────────┘
┌─ CONTAINER BACKUPS ────────────────────────────────────────────────────────────┐
│ 101 adguard: 16h (healthy)
│ 102 minecraft: 10h (healthy)
└───────────────────────────────────────────────────────────────────────────────┘
┌─ SUMMARY ──────────────────────────────────────────────────────────────────────┐
│ All backups healthy! No issues detected.
└───────────────────────────────────────────────────────────────────────────────┘

The deduplication statistics tell an important story: 2.88 TB of original data deduplicated and compressed down to just 132 GB on disk. That is over 95% space savings, meaning I can keep far more backup history than a naive copy would allow.