
Setting Up Automated ZFS Snapshots with Sanoid on Ubuntu Server

ZFS snapshots are one of the most powerful features of the ZFS filesystem, providing instant point-in-time copies of your data. Sanoid is an elegant snapshot management tool that automates ZFS snapshot creation, retention, and replication.

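This guide shows how to set up automated ZFS snapshots using Sanoid on Ubuntu Server, so that recent point-in-time copies of your data are always on hand. Snapshots live on the same pool as the data they protect, so pair them with off-host replication (Step 10) to survive a pool failure.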

Time Required: 30-40 minutes
Difficulty Level: Intermediate to Advanced
Prerequisites:

  • Ubuntu Server 20.04 or later with ZFS installed
  • Root or sudo access
  • Basic understanding of ZFS concepts
  • At least one configured ZFS pool

Why Sanoid for ZFS Snapshots?

Unlike manual snapshot scripts or basic cron jobs, Sanoid offers:

  • Flexible retention policies – Keep hourly, daily, weekly, monthly snapshots
  • Automatic snapshot pruning – Old snapshots are automatically deleted
  • Dataset templates – Apply consistent policies across multiple datasets
  • Syncoid integration – Replicate snapshots to remote systems
  • Minimal overhead – Lightweight and efficient
  • Battle-tested – Used in production environments worldwide

Step 1: Verify ZFS Installation

First, confirm ZFS is properly installed and working on your Ubuntu server:

zfs version

You should see output showing both zfs and zfs-kmod versions. If ZFS isn’t installed:

sudo apt update
sudo apt install zfsutils-linux -y

Check your existing ZFS pools:

zpool list
zpool status

For this guide, we’ll assume you have a pool named tank. Adjust commands according to your actual pool name.

Step 2: Install Sanoid and Dependencies

Sanoid requires several Perl modules and utilities. Install all dependencies:

sudo apt update
sudo apt install -y git perl libcapture-tiny-perl libconfig-inifiles-perl pv lzop mbuffer

Clone the Sanoid repository from GitHub:

cd /opt
sudo git clone https://github.com/jimsalterjrs/sanoid.git

Create symbolic links to make Sanoid accessible system-wide:

sudo ln -s /opt/sanoid/sanoid /usr/local/sbin/sanoid
sudo ln -s /opt/sanoid/syncoid /usr/local/sbin/syncoid
sudo ln -s /opt/sanoid/findoid /usr/local/sbin/findoid

Verify the installation:

sanoid --version

Step 3: Create Sanoid Configuration Directory

Create the necessary configuration directory:

sudo mkdir -p /etc/sanoid

Copy the default configuration template:

sudo cp /opt/sanoid/sanoid.defaults.conf /etc/sanoid/sanoid.defaults.conf

Create your main configuration file:

sudo nano /etc/sanoid/sanoid.conf

Step 4: Configure Basic Snapshot Policies

Add the following basic configuration to /etc/sanoid/sanoid.conf:

#############################
# Default Template Settings #
#############################

[template_production]
        frequently = 0
        hourly = 24
        daily = 7
        weekly = 4
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

[template_backup]
        frequently = 0
        hourly = 0
        daily = 14
        weekly = 8
        monthly = 6
        yearly = 0
        autosnap = yes
        autoprune = yes

#############################
# Dataset Configurations    #
#############################

[tank/important-data]
        use_template = production
        recursive = yes

[tank/backups]
        use_template = backup
        recursive = no

Configuration Breakdown:

  • frequently: Number of sub-hourly snapshots to retain (0 = disabled); frequent_period sets their interval in minutes (default 15)
  • hourly: Number of hourly snapshots to retain
  • daily: Number of daily snapshots to retain
  • weekly: Number of weekly snapshots to retain
  • monthly: Number of monthly snapshots to retain
  • yearly: Number of yearly snapshots to retain
  • autosnap: Automatically create new snapshots (yes/no)
  • autoprune: Automatically delete old snapshots (yes/no)
  • recursive: Apply policy to child datasets (yes/no)

Replace tank/important-data and tank/backups with your actual ZFS datasets.
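
As a rough sanity check on a template, the retention values can be summed to estimate how many snapshots each dataset carries once the policy reaches steady state. The sketch below does this for the production template above. The helper name retained_total and the inline sample file are illustrative assumptions, not part of Sanoid.

```shell
#!/bin/bash
# Sum the retention counts of one template in a sanoid.conf-style file.
# retained_total is a hypothetical helper, not a Sanoid command.
retained_total() {
    local template="$1" conf="$2"
    awk -v section="[template_${template}]" '
        /^\[/ { in_section = ($0 == section) }      # track the current section
        in_section && $1 ~ /^(frequently|hourly|daily|weekly|monthly|yearly)$/ {
            total += $3                             # fields are: key = value
        }
        END { print total + 0 }
    ' "$conf"
}

# Sample copied from the production template in this step.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[template_production]
        frequently = 0
        hourly = 24
        daily = 7
        weekly = 4
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes
EOF

echo "production keeps about $(retained_total production "$sample") snapshots per dataset"
rm -f "$sample"
```

For the production template that is 24 + 7 + 4 + 3 = 38 snapshots. With recursive = yes, each child dataset carries roughly the same number.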

Step 5: Configure Advanced Retention Policies

For more granular control, create custom templates:

sudo nano /etc/sanoid/sanoid.conf

Add advanced configurations:

[template_database]
        frequently = 4
        frequent_period = 15
        hourly = 48
        daily = 30
        weekly = 8
        monthly = 12
        yearly = 2
        autosnap = yes
        autoprune = yes

[template_media]
        frequently = 0
        hourly = 0
        daily = 7
        weekly = 0
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

[tank/databases/mysql]
        use_template = database
        recursive = yes
        process_children_only = no

[tank/media]
        use_template = media
        recursive = yes

Template Examples:

  • Database template: Frequent snapshots (every 15 min), long retention
  • Media template: Minimal snapshots since media files rarely change
  • process_children_only: When set to yes, only the child datasets are snapshotted, not the parent itself

Step 6: Test Sanoid Configuration

Before automating, test your configuration manually:

# Test without making changes (dry run); --readonly simulates the snapshot and prune actions
sudo sanoid --take-snapshots --prune-snapshots --readonly --verbose --debug

# Take actual snapshots
sudo sanoid --take-snapshots --verbose

# Prune old snapshots
sudo sanoid --prune-snapshots --verbose

Verify snapshots were created:

zfs list -t snapshot -r tank

You should see snapshots with names like:

tank/important-data@autosnap_2024-11-17_12:00:00_hourly
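
Because the snapshot name encodes the dataset, the creation time, and the retention tier, scripts can take it apart with plain parameter expansion. A minimal sketch, assuming the autosnap_DATE_TIME_TYPE layout shown above (parse_autosnap is a hypothetical helper name):

```shell
#!/bin/bash
# Split a Sanoid-style snapshot name into dataset, date, time, and type.
# parse_autosnap is a hypothetical helper, not part of Sanoid.
parse_autosnap() {
    local full="$1"
    local dataset="${full%%@*}"        # text before the @
    local snap="${full#*@}"            # text after the @
    local type="${snap##*_}"           # last underscore-separated field
    local stamp="${snap#autosnap_}"    # drop the fixed prefix
    stamp="${stamp%_*}"                # drop the trailing _type
    # stamp is now DATE_TIME; print the two halves separately
    printf '%s %s %s %s\n' "$dataset" "${stamp%%_*}" "${stamp#*_}" "$type"
}

parse_autosnap "tank/important-data@autosnap_2024-11-17_12:00:00_hourly"
```

For the example name this prints the four fields tank/important-data, 2024-11-17, 12:00:00, and hourly on one line.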

Step 7: Set Up Automated Snapshots with Cron

Create a cron job to run Sanoid automatically. As mentioned in our fail2ban SSH protection guide, automation is key to server security and reliability.

Edit the root crontab:

sudo crontab -e

Add these entries:

# Take snapshots every 15 minutes
*/15 * * * * /usr/local/sbin/sanoid --take-snapshots --quiet

# Prune old snapshots daily at 2 AM
0 2 * * * /usr/local/sbin/sanoid --prune-snapshots --quiet

Cron Schedule Explanation:

  • */15 * * * * – Every 15 minutes
  • 0 2 * * * – Every day at 2:00 AM

For better logging, redirect output:

*/15 * * * * /usr/local/sbin/sanoid --take-snapshots --quiet >> /var/log/sanoid.log 2>&1
0 2 * * * /usr/local/sbin/sanoid --prune-snapshots --quiet >> /var/log/sanoid.log 2>&1

Create the log file:

sudo touch /var/log/sanoid.log
sudo chmod 644 /var/log/sanoid.log

Step 8: Configure Systemd Timer (Alternative to Cron)

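For a more modern approach, use a systemd timer instead of cron. Pick one scheduler only: if you enable the timer, remove the Sanoid entries from the root crontab in Step 7 so the two do not run Sanoid at the same time.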

Create the service file:

sudo nano /etc/systemd/system/sanoid.service

Add this content:

[Unit]
Description=Sanoid ZFS Snapshot Management
Requires=zfs.target
After=zfs.target

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/sanoid --cron

Create the timer file:

sudo nano /etc/systemd/system/sanoid.timer

Add:

[Unit]
Description=Sanoid ZFS Snapshot Timer
Requires=sanoid.service

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target

Enable and start the timer:

sudo systemctl daemon-reload
sudo systemctl enable sanoid.timer
sudo systemctl start sanoid.timer

Check timer status:

sudo systemctl status sanoid.timer
sudo systemctl list-timers sanoid.timer

Step 9: Monitor Snapshot Creation

Create a monitoring script to track snapshot health:

sudo nano /usr/local/bin/check-sanoid-snapshots.sh

Add this content:

#!/bin/bash
# Warn when the newest snapshot in the pool is older than MAX_AGE.

POOL="tank"
MAX_AGE=3600  # 1 hour in seconds

# -H drops the header, -p prints creation as seconds since the epoch
TIMESTAMP=$(zfs list -H -p -t snapshot -o creation -s creation -r "$POOL" | tail -n 1)

if [ -z "$TIMESTAMP" ]; then
    echo "WARNING: No snapshots found in $POOL"
    exit 1
fi

CURRENT=$(date +%s)
AGE=$((CURRENT - TIMESTAMP))

if [ "$AGE" -gt "$MAX_AGE" ]; then
    echo "WARNING: Latest snapshot is $((AGE/60)) minutes old"
    # Send alert (email, Slack, etc.)
else
    echo "OK: Snapshots are current"
fi

# Show snapshot count (-H keeps the header line out of the count)
echo "Total snapshots: $(zfs list -H -t snapshot -o name -r "$POOL" | wc -l)"

Make it executable:

sudo chmod +x /usr/local/bin/check-sanoid-snapshots.sh

Run it:

sudo /usr/local/bin/check-sanoid-snapshots.sh
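
A pool-wide check can hide one dataset that has stopped snapshotting while others stay current. The sketch below flags stale datasets individually. It reads name and creation-time pairs on standard input, as printed by zfs list -H -p -t snapshot -o name,creation, so it can be tried on sample data first. The helper name stale_datasets is an assumption for illustration.

```shell
#!/bin/bash
# Print datasets whose newest snapshot is older than max_age seconds,
# followed by that snapshot's age in minutes.
# Input: "name<TAB>creation_epoch" lines, as produced by
#   zfs list -H -p -t snapshot -o name,creation
# stale_datasets is a hypothetical helper; "now" is passed in so the
# function can be exercised with fixed sample data.
stale_datasets() {
    local max_age="$1" now="$2"
    awk -F'\t' -v max="$max_age" -v now="$now" '
        {
            split($1, parts, "@")                 # parts[1] is the dataset name
            if ($2 > newest[parts[1]]) newest[parts[1]] = $2
        }
        END {
            for (ds in newest)
                if (now - newest[ds] > max)
                    printf "%s %d\n", ds, (now - newest[ds]) / 60
        }
    ' | sort
}

# Demo with made-up creation times relative to a fixed "now" of 10000.
printf 'tank/a@s1\t1000\ntank/a@s2\t9000\ntank/b@s1\t2000\n' |
    stale_datasets 3600 10000
```

On a live system the same function could be fed with: zfs list -H -p -t snapshot -o name,creation -r tank | stale_datasets 3600 "$(date +%s)"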

Step 10: Set Up Remote Replication with Syncoid

Syncoid (included with Sanoid) enables easy ZFS replication to remote servers. This is crucial for disaster recovery, because snapshots stored on the same pool as the live data are lost along with it if the pool fails.

Configure SSH Key Authentication

On the source server:

sudo ssh-keygen -t ed25519 -f /root/.ssh/syncoid_key -N ""

Copy the public key to the remote server (replace user@remote-host with your own remote account and hostname):

sudo ssh-copy-id -i /root/.ssh/syncoid_key.pub user@remote-host

Test Syncoid Replication

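Replicate a dataset to the remote server, pointing Syncoid at the key created above (replace user@remote-host with your own remote account and hostname):

sudo syncoid --sshkey=/root/.ssh/syncoid_key tank/important-data user@remote-host:backup-pool/important-data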

Automate Replication

Add to crontab:

sudo crontab -e

Add:

# Replicate to remote server every hour (replace user@remote-host with your own account and hostname)
0 * * * * /usr/local/sbin/syncoid --quiet --recursive --sshkey=/root/.ssh/syncoid_key tank/important-data user@remote-host:backup-pool/important-data
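
To see how far the target lags behind, one option is to compare snapshot listings from both sides. The sketch below assumes each listing was saved to a file with zfs list -H -o name -s creation -t snapshot -d 1 on the source and target datasets, and prints the source snapshots that have not arrived yet. The helper name pending_snapshots is hypothetical and not part of Syncoid.

```shell
#!/bin/bash
# List snapshot names present on the source but not on the target.
# Each argument is a file of full snapshot names (dataset@snap).
# pending_snapshots is a hypothetical helper, not part of Syncoid.
pending_snapshots() {
    local target_names
    target_names=$(mktemp)
    sed 's/^[^@]*@//' "$2" > "$target_names"          # target names without dataset
    # -v inverts the match, -x matches whole lines, -F treats names literally
    sed 's/^[^@]*@//' "$1" | grep -vxF -f "$target_names"
    rm -f "$target_names"
}

# Demo with made-up listings from a source and a target dataset.
src=$(mktemp); dst=$(mktemp)
printf '%s\n' 'tank/important-data@autosnap_2024-11-17_12:00:00_hourly' \
              'tank/important-data@autosnap_2024-11-17_13:00:00_hourly' > "$src"
printf '%s\n' 'backup-pool/important-data@autosnap_2024-11-17_12:00:00_hourly' > "$dst"
pending_snapshots "$src" "$dst"
rm -f "$src" "$dst"
```

Between hourly runs the source is normally ahead by a few snapshots, so an alert makes sense only when the pending list keeps growing.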

Step 11: Restore from Snapshots

List Available Snapshots

zfs list -t snapshot -r tank/important-data

Restore Individual Files

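Browse a snapshot's files (no manual mount is needed):

# Each snapshot appears as a read-only directory under .zfs/snapshot/
# (.zfs is hidden from ls by default, but you can still cd into it)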
cd /tank/important-data/.zfs/snapshot
ls -la

Copy files from snapshot:

cp /tank/important-data/.zfs/snapshot/autosnap_2024-11-17_12:00:00_hourly/myfile.txt /tank/important-data/
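
When it is unclear which snapshot holds the version you want, it can help to list the file as it appears in each one before copying. A minimal sketch, assuming GNU stat (the Ubuntu default) and a hypothetical helper named file_versions:

```shell
#!/bin/bash
# Show size and modification time of one file in every snapshot that has it.
# snapdir is the dataset's .zfs/snapshot directory; relpath is relative to
# the dataset root. file_versions is a hypothetical helper.
file_versions() {
    local snapdir="$1" relpath="$2" snap
    for snap in "$snapdir"/*/; do
        [ -f "${snap}${relpath}" ] || continue    # skip snapshots without the file
        # GNU stat: %s = size in bytes, %y = modification time
        printf '%s\t%s\n' "$(basename "$snap")" \
            "$(stat -c '%s bytes, modified %y' "${snap}${relpath}")"
    done
}

# Demo on a temporary directory that mimics the .zfs/snapshot layout.
demo=$(mktemp -d)
mkdir -p "$demo/autosnap_a" "$demo/autosnap_b"
echo "v1" > "$demo/autosnap_a/myfile.txt"
file_versions "$demo" myfile.txt
rm -rf "$demo"
```

Against a real dataset the call would look like: file_versions /tank/important-data/.zfs/snapshot myfile.txt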

Rollback Entire Dataset

WARNING: This destroys all changes made after the snapshot!

# Rollback to a specific older snapshot
# (-r is required when newer snapshots exist, and it destroys them)
sudo zfs rollback -r tank/important-data@autosnap_2024-11-17_12:00:00_hourly

# Rollback to most recent snapshot
sudo zfs rollback tank/important-data@autosnap_2024-11-17_14:00:00_hourly
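
Rolling back to anything other than the newest snapshot requires zfs rollback -r, which destroys every snapshot taken after the target. Before doing that, it is worth listing what would be lost. The sketch below reads a creation-ordered listing on standard input and prints everything newer than the target. The helper name newer_than is an assumption for illustration.

```shell
#!/bin/bash
# Print the snapshots that are newer than the rollback target, i.e. the ones
# "zfs rollback -r" would destroy. Input on stdin must be creation-ordered:
#   zfs list -H -o name -s creation -t snapshot -d 1 <dataset>
# newer_than is a hypothetical helper.
newer_than() {
    awk -v target="$1" '
        found { print }              # every line after the target
        $0 == target { found = 1 }   # start printing from the next line on
    '
}

# Demo with a made-up listing.
printf '%s\n' \
    'tank/important-data@autosnap_2024-11-17_12:00:00_hourly' \
    'tank/important-data@autosnap_2024-11-17_13:00:00_hourly' \
    'tank/important-data@autosnap_2024-11-17_14:00:00_hourly' |
    newer_than 'tank/important-data@autosnap_2024-11-17_12:00:00_hourly'
```

In the demo the 13:00 and 14:00 snapshots are printed. If anything in that list still matters, cloning the target snapshot instead of rolling back leaves the later snapshots intact.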

Clone a Snapshot

Create a writable clone for testing:

sudo zfs clone tank/important-data@autosnap_2024-11-17_12:00:00_hourly tank/test-restore

Step 12: Optimize Sanoid Performance

Enable Compression

sudo zfs set compression=lz4 tank/important-data

Adjust ARC Cache

For servers with limited RAM, tune ZFS cache:

sudo nano /etc/modprobe.d/zfs.conf

Add:

# Limit ARC to 4GB
options zfs zfs_arc_max=4294967296

Apply changes:

sudo update-initramfs -u
sudo reboot
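
The zfs_arc_max value is given in bytes. A small sketch of the arithmetic, assuming you think of the limit in GiB (gib_to_bytes is a hypothetical helper name):

```shell
#!/bin/bash
# Convert a size in GiB to the byte count that zfs_arc_max expects.
# gib_to_bytes is a hypothetical helper.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 4    # prints 4294967296, the value used above
```

On Linux the same number can usually be applied without a reboot by writing it to /sys/module/zfs/parameters/zfs_arc_max as root. The modprobe.d entry is still what makes it persist across reboots.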

Monitor ZFS Performance

# Check ARC statistics
sudo cat /proc/spl/kstat/zfs/arcstats

# Monitor pool I/O
sudo zpool iostat -v 5

# Check dataset space usage
zfs list -o name,used,avail,refer,mountpoint
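
The raw arcstats file is long, and the figure most people want from it is the cache hit ratio. A minimal sketch that derives it from the hits and misses counters, assuming the usual three-column name/type/data layout of that file (arc_hit_ratio is a hypothetical helper):

```shell
#!/bin/bash
# Print the ARC hit ratio as a whole-number percentage.
# The argument is a file in arcstats format: name, type, data columns.
# arc_hit_ratio is a hypothetical helper.
arc_hit_ratio() {
    awk '
        $1 == "hits"   { hits = $3 }
        $1 == "misses" { misses = $3 }
        END {
            total = hits + misses
            if (total > 0) printf "%d\n", 100 * hits / total
            else print "n/a"
        }
    ' "$1"
}

# Demo on made-up counters: 900 hits and 100 misses.
sample=$(mktemp)
printf 'name type data\nhits 4 900\nmisses 4 100\n' > "$sample"
arc_hit_ratio "$sample"
rm -f "$sample"
```

On a live system the call would be: arc_hit_ratio /proc/spl/kstat/zfs/arcstats. A ratio that drops sharply after lowering zfs_arc_max suggests the limit is too tight for the workload.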

Troubleshooting

Sanoid Not Creating Snapshots

  1. Check cron/systemd timer is running:
sudo systemctl status cron
# or
sudo systemctl status sanoid.timer
  2. Verify configuration syntax with a simulated run:
sudo sanoid --configdir=/etc/sanoid --take-snapshots --readonly --verbose
  3. Check permissions:
ls -la /etc/sanoid/sanoid.conf
sudo chmod 644 /etc/sanoid/sanoid.conf

Snapshots Not Being Pruned

Check autoprune setting:

grep autoprune /etc/sanoid/sanoid.conf

Manually force pruning:

sudo sanoid --prune-snapshots --verbose

Syncoid Replication Failing

  1. Test SSH connection (replace user@remote-host with your own remote account and hostname):
sudo ssh -i /root/.ssh/syncoid_key user@remote-host
  2. Check delegated ZFS permissions on the target pool (run this on the remote server):
sudo zfs allow backup-pool
  3. Increase verbosity:
sudo syncoid --debug tank/data user@remote-host:backup-pool/data

High Snapshot Count

List snapshot count per dataset:

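for dataset in $(zfs list -H -o name); do
    # -d 1 counts only this dataset's own snapshots; -H keeps the header out of the count
    count=$(zfs list -H -t snapshot -o name -d 1 "$dataset" 2>/dev/null | wc -l)
    echo "$dataset: $count snapshots"
done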

Adjust retention policy if too many snapshots exist.
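
On pools with many datasets, calling zfs list once per dataset gets slow. A single listing can be grouped instead. The sketch below counts snapshots per dataset from names read on standard input and sorts the busiest datasets first. The helper name count_by_dataset is an assumption for illustration.

```shell
#!/bin/bash
# Count snapshots per dataset from a list of snapshot names on stdin, e.g.
#   zfs list -H -t snapshot -o name
# Output: "count dataset", highest count first.
# count_by_dataset is a hypothetical helper.
count_by_dataset() {
    awk -F'@' '{ count[$1]++ } END { for (ds in count) print count[ds], ds }' |
        sort -rn
}

# Demo with made-up snapshot names.
printf '%s\n' 'tank/a@s1' 'tank/a@s2' 'tank/b@s1' | count_by_dataset
```

On a live system: zfs list -H -t snapshot -o name | count_by_dataset | head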

Best Practices

  1. Start Conservative – Begin with shorter retention periods, expand as needed
  2. Monitor Disk Space – Snapshots consume space; use zfs list -o space
  3. Test Restores Regularly – Verify your snapshots are actually usable
  4. Document Policies – Keep records of which datasets use which templates
  5. Use Descriptive Names – Sanoid’s snapshot naming is clear on its own, but give templates meaningful names and comment them in sanoid.conf
  6. Implement Monitoring – Set up alerts for failed snapshots
  7. Replicate Off-Site – Use Syncoid for geographic redundancy
  8. Review Logs – Check /var/log/sanoid.log regularly

Useful Commands Reference

# Sanoid Management
sudo sanoid --monitor-snapshots              # Check snapshot health
sudo sanoid --take-snapshots --verbose       # Manual snapshot creation
sudo sanoid --prune-snapshots --verbose      # Manual pruning
sudo sanoid --monitor-health                 # Pool health check

# ZFS Snapshot Operations
zfs list -t snapshot -r tank                 # List all snapshots
zfs destroy tank/data@snapshot-name          # Delete specific snapshot
zfs diff tank/data@snap1 tank/data@snap2     # Compare snapshots
zfs send tank/data@snap | gzip > backup.gz   # Export snapshot

# Syncoid Replication
syncoid --recursive tank/data backup:data    # Recursive replication
syncoid --no-sync-snap tank/data backup:data # Skip sync snapshot
syncoid --version                            # Show the installed Syncoid version

# Monitoring
zpool status -v                              # Pool health
zfs list -o space                            # Space usage with snapshots
arc_summary                                  # ARC cache statistics

Advanced Configuration Examples

High-Frequency Database Snapshots

[template_critical_db]
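        frequently = 8
        # frequent_period is in minutes; Sanoid must run at least this often
        # (every 5 minutes here) for these snapshots to be taken on schedule
        frequent_period = 5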
        hourly = 72
        daily = 30
        weekly = 12
        monthly = 24
        yearly = 3
        autosnap = yes
        autoprune = yes

[tank/databases/production]
        use_template = critical_db
        recursive = yes

Docker Volume Snapshots

[template_docker]
        frequently = 0
        hourly = 12
        daily = 7
        weekly = 4
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

[tank/docker/volumes]
        use_template = docker
        recursive = yes

Home Directory Snapshots

[template_home]
        frequently = 0
        hourly = 24
        daily = 14
        weekly = 8
        monthly = 6
        yearly = 1
        autosnap = yes
        autoprune = yes

[tank/home]
        use_template = home
        recursive = yes
        process_children_only = yes

Integration with Monitoring Systems

Prometheus Exporter

Create a simple exporter script:

sudo nano /usr/local/bin/sanoid-exporter.sh

Add this content:

#!/bin/bash
echo "# HELP sanoid_snapshots_total Total number of snapshots"
echo "# TYPE sanoid_snapshots_total gauge"
# -H keeps the header line out of the count
echo "sanoid_snapshots_total $(zfs list -H -t snapshot -o name | wc -l)"

Nagios Check

sudo nano /usr/lib/nagios/plugins/check_sanoid

Add this content:

#!/bin/bash
CRIT_AGE=7200
WARN_AGE=3600
# -H drops the header, -p prints creation as seconds since the epoch
LATEST=$(zfs list -H -p -t snapshot -o creation -s creation | tail -n 1)

if [ -z "$LATEST" ]; then
    echo "UNKNOWN: No snapshots found"
    exit 3
fi

AGE=$(($(date +%s) - LATEST))

if [ "$AGE" -gt "$CRIT_AGE" ]; then
    echo "CRITICAL: Latest snapshot is older than 2 hours"
    exit 2
elif [ "$AGE" -gt "$WARN_AGE" ]; then
    echo "WARNING: Latest snapshot is older than 1 hour"
    exit 1
else
    echo "OK: Snapshots current"
    exit 0
fi

Conclusion

You’ve successfully configured automated ZFS snapshots with Sanoid on Ubuntu Server. Your data now has comprehensive point-in-time protection with automatic retention management. Just as securing SSH with fail2ban protects against external threats, regular snapshots protect against data loss, accidental deletions, and ransomware.

Remember to regularly test your restore procedures and monitor snapshot creation to ensure your backup strategy remains effective.
