FreeBSD Manual Pages
ZREPL(1)                             zrepl                            ZREPL(1)

NAME
       zrepl - zrepl Documentation

       zrepl is a one-stop, integrated solution for ZFS replication.

GETTING STARTED
       The 10 minute quick-start guides give you a first impression.

MAIN FEATURES
       • Filesystem replication
         • [x] Pull & Push mode
         • [x] Multiple transport modes: TCP, TCP + TLS client auth, SSH
         • Advanced replication features
           • [x] Automatic retries for temporary network errors
           • [x] Automatic resumable send & receive
           • [x] Automatic ZFS holds during send & receive
           • [x] Automatic bookmark & hold management for guaranteed
             incremental send & recv
           • [x] Encrypted raw send & receive to untrusted receivers
             (OpenZFS native encryption)
           • [x] Properties send & receive
           • [x] Compressed send & receive
           • [x] Large blocks send & receive
           • [x] Embedded data send & receive
           • [x] Resume state send & receive
           • [x] Bandwidth limiting
       • Automatic snapshot management
         • [x] Periodic filesystem snapshots
         • [x] Support for pre- and post-snapshot hooks with builtins for
           MySQL & Postgres
         • [x] Flexible pruning rule system
           • [x] Age-based fading (grandfathering scheme)
           • [x] Bookmarks to avoid divergence between sender and receiver
       • Sophisticated Monitoring & Logging
         • [x] Live progress reporting via zrepl status subcommand
         • [x] Comprehensive, structured logging
           • human, logfmt and json formatting
           • stdout, syslog and TCP (+TLS client auth) outlets
         • [x] Prometheus monitoring endpoint
       • Maintainable implementation in Go
         • [x] Cross platform
         • [x] Dynamic feature checking
         • [x] Type safe & testable code

       ATTENTION:
          zrepl as well as this documentation is still under active
          development. There is no stability guarantee on the RPC protocol
          or configuration format, but we do our best to document breaking
          changes in the Changelog.

CONTRIBUTING
       We are happy about any help we can get!
       • Financial Support
       • Explore the codebase
         • These docs live in the docs/ subdirectory
       • Document any non-obvious / confusing / plain broken behavior you
         encounter when setting up zrepl for the first time
       • Check the Issues and Projects sections for things to do. The good
         first issues and docs are suitable starting points.

   Development Workflow
       The GitHub repository is where all development happens. Make sure to
       read the Developer Documentation section and open new issues or pull
       requests there.

TABLE OF CONTENTS
   Quick Start by Use Case
       The goal of this quick-start guide is to give you an impression of how
       zrepl can accommodate your use case.

   Install zrepl
       Follow the OS-specific installation instructions and come back here.

   Overview Of How zrepl Works
       Check out the overview section to get a rough idea of what you are
       going to configure in the next step, then come back here.

   Configuration Examples
       zrepl is configured through a YAML configuration file in
       /etc/zrepl/zrepl.yml. We have prepared example use cases that showcase
       typical deployments and different functionality of zrepl. We encourage
       you to read through all of the examples to get an idea of what zrepl
       has to offer, and how you can mix and match configurations for your
       use case. Keep the full config documentation handy if a config snippet
       is unclear.

   Example Use Cases
   Continuous Backup of a Server
       This config example shows how we can back up our ZFS-based server to
       another machine using a zrepl push job.

       • Production server prod with filesystems to back up:
         • The entire pool zroot
         • except zroot/var/tmp and all child datasets of it
         • and except zroot/usr/home/paranoid which belongs to a user doing
           backups themselves.
       • Backup server backups with a dataset sub-tree for use by zrepl:
         • In our example, that will be storage/zrepl/sink/prod.
       Our backup solution should fulfill the following requirements:

       • Periodically snapshot the filesystems on prod every 10 minutes
       • Incrementally replicate these snapshots to storage/zrepl/sink/prod/*
         on backups
       • Keep only very few snapshots on prod to save disk space
       • Keep a fading history (24 hourly, 30 daily, 6 monthly) of snapshots
         on backups
       • The network is untrusted - zrepl should use TLS to protect its
         communication and our data.

   Analysis
       We can model this situation as two jobs:

       • A push job on prod
         • Creates the snapshots
         • Keeps a short history of local snapshots to enable incremental
           replication to backups
         • Connects to the zrepl daemon process on backups
         • Pushes snapshots to backups
         • Prunes snapshots on backups after replication is complete
       • A sink job on backups
         • Accepts connections & responds to requests from prod
         • Limits client prod access to filesystem sub-tree
           storage/zrepl/sink/prod

   Generate TLS Certificates
       We use the TLS client authentication transport to protect our data on
       the wire. To get things going quickly, we skip setting up a CA and
       generate two self-signed certificates as described here. For
       convenience, we generate the key pairs on our local machine and
       distribute them using ssh:

           (name=backups; openssl req -x509 -sha256 -nodes \
            -newkey rsa:4096 \
            -days 365 \
            -keyout $name.key \
            -out $name.crt -addext "subjectAltName = DNS:$name" -subj "/CN=$name")

           (name=prod; openssl req -x509 -sha256 -nodes \
            -newkey rsa:4096 \
            -days 365 \
            -keyout $name.key \
            -out $name.crt -addext "subjectAltName = DNS:$name" -subj "/CN=$name")

           ssh root@backups "mkdir /etc/zrepl"
           scp backups.key backups.crt prod.crt root@backups:/etc/zrepl

           ssh root@prod "mkdir /etc/zrepl"
           scp prod.key prod.crt backups.crt root@prod:/etc/zrepl

       Note that alternative transports exist, e.g. via TCP without TLS or
       ssh.
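Before distributing the files, it can be worth checking that a certificate really carries the name its peer will verify. The following is a minimal sketch, not part of the official procedure: it assumes OpenSSL 1.1.1 or newer, generates a throwaway certificate in a scratch directory, and uses a 2048-bit key only to keep the demonstration fast.

```shell
#!/bin/sh
# Sketch: create a throwaway certificate the same way as above and check
# that it matches the host name the peer will verify.
# Assumptions: OpenSSL >= 1.1.1; scratch directory and 2048-bit key are
# illustrative only.
set -e
workdir=$(mktemp -d)
name=backups

openssl req -x509 -sha256 -nodes \
    -newkey rsa:2048 \
    -days 365 \
    -keyout "$workdir/$name.key" \
    -out "$workdir/$name.crt" \
    -addext "subjectAltName = DNS:$name" \
    -subj "/CN=$name" 2>/dev/null

# Print subject and expiry date for a quick visual check.
openssl x509 -in "$workdir/$name.crt" -noout -subject -enddate

# Ask OpenSSL whether the certificate is valid for the expected name.
check=$(openssl x509 -in "$workdir/$name.crt" -noout -checkhost "$name")
echo "$check"
```

For the real key pairs, the last two commands can be pointed at backups.crt and prod.crt instead of the scratch file.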
   Configure server prod
       We define a push job named prod_to_backups in /etc/zrepl/zrepl.yml on
       host prod:

           jobs:
           - name: prod_to_backups
             type: push
             connect:
               type: tls
               address: "backups.example.com:8888"
               ca: /etc/zrepl/backups.crt
               cert: /etc/zrepl/prod.crt
               key: /etc/zrepl/prod.key
               server_cn: "backups"
             filesystems: {
               "zroot<": true,
               "zroot/var/tmp<": false,
               "zroot/usr/home/paranoid": false
             }
             snapshotting:
               type: periodic
               prefix: zrepl_
               interval: 10m
             pruning:
               keep_sender:
               - type: not_replicated
               - type: last_n
                 count: 10
               keep_receiver:
               - type: grid
                 grid: 1x1h(keep=all) | 24x1h | 30x1d | 6x30d
                 regex: "^zrepl_"

   Configure server backups
       We define a corresponding sink job named sink in /etc/zrepl/zrepl.yml
       on host backups:

           jobs:
           - name: sink
             type: sink
             serve:
               type: tls
               listen: ":8888"
               ca: "/etc/zrepl/prod.crt"
               cert: "/etc/zrepl/backups.crt"
               key: "/etc/zrepl/backups.key"
               client_cns:
                 - "prod"
             root_fs: "storage/zrepl/sink"

   Local Snapshots + Offline Backup to an External Disk
       This config example shows how we can use zrepl to make periodic
       snapshots of our local workstation and back it up to a zpool on an
       external disk which we occasionally connect.

       The local snapshots should be taken every 15 minutes for pain-free
       recovery from CLI disasters (rm -rf / and the like). However, we do
       not want to keep the snapshots around for very long because our
       workstation is a little tight on disk space. Thus, we only keep one
       hour worth of high-resolution snapshots, then fade them out to one per
       hour for a day (24 hours), then one per day for 14 days.

       At the end of each work day, we connect our external disk that serves
       as our workstation's local offline backup. We want zrepl to inspect
       the filesystems and snapshots on the external pool, figure out which
       snapshots were created since the last time we connected the external
       disk, and use incremental replication to efficiently mirror our
       workstation to our backup disk.
       Afterwards, we want to clean up old snapshots on the backup pool: we
       want to keep all snapshots younger than one hour, 24 for each hour of
       the first day, then 360 daily backups.

       A few additional requirements:

       • Snapshot creation and pruning on our workstation should happen in
         the background, without interaction from our side.
       • However, we want to explicitly trigger replication via the command
         line.
       • We want to use OpenZFS native encryption to protect our data on the
         external disk. It is absolutely critical that only encrypted data
         leaves our workstation. zrepl should provide an easy config knob for
         this and prevent replication of unencrypted datasets to the external
         disk.
       • We want to be able to put off the backups for more than three weeks,
         i.e., longer than the lifetime of the automatically created
         snapshots on our workstation. zrepl should use bookmarks and holds
         to achieve this goal.
       • When we yank out the drive during replication and go on a long
         vacation, we do not want the partially replicated snapshot to stick
         around as it would hold on to too much disk space over time.
         Therefore, we want zrepl to deviate from its default behavior and
         sacrifice resumability, but nonetheless retain the ability to do
         incremental replication once we return from our vacation. zrepl
         should provide an easy config knob to disable step holds for
         incremental replication.

       The following config snippet implements the setup described above. You
       will likely want to customize some aspects mentioned in the top
       comment in the file.

           # This config serves as an example for a local zrepl installation that
           # backs up the entire zpool `system` to `backuppool/zrepl/sink`
           #
           # The requirements covered by this setup are described in the zrepl documentation's
           # quick start section which inlines this example.
           #
           # CUSTOMIZATIONS YOU WILL LIKELY WANT TO APPLY:
           # - adjust the name of the production pool `system` in the `filesystems` filter of jobs `snapjob` and `push_to_drive`
           # - adjust the name of the backup pool `backuppool` in the `backuppool_sink` job
           # - adjust the occurrences of `myhostname` to the name of the system you are backing up (cannot be easily changed once you start replicating)
           # - make sure the `zrepl_` prefix is not being used by any other zfs tools you might have installed (it likely isn't)

           jobs:

           # this job takes care of snapshot creation + pruning
           - name: snapjob
             type: snap
             filesystems: {
                 "system<": true,
             }
             # create snapshots with prefix `zrepl_` every 15 minutes
             snapshotting:
               type: periodic
               interval: 15m
               prefix: zrepl_
             pruning:
               keep:
               # fade-out scheme for snapshots starting with `zrepl_`
               # - keep all created in the last hour
               # - then destroy snapshots such that we keep 24 each 1 hour apart
               # - then destroy snapshots such that we keep 14 each 1 day apart
               # - then destroy all older snapshots
               - type: grid
                 grid: 1x1h(keep=all) | 24x1h | 14x1d
                 regex: "^zrepl_.*"
               # keep all snapshots that don't have the `zrepl_` prefix
               - type: regex
                 negate: true
                 regex: "^zrepl_.*"

           # This job pushes to the local sink defined in job `backuppool_sink`.
           # We trigger replication manually from the command line / udev rules using
           #  `zrepl signal wakeup push_to_drive`
           - type: push
             name: push_to_drive
             connect:
               type: local
               listener_name: backuppool_sink
               client_identity: myhostname
             filesystems: {
                 "system<": true
             }
             send:
               encrypted: true
             replication:
               protection:
                 initial: guarantee_resumability
                 # Downgrade protection to guarantee_incremental which uses zfs bookmarks instead of zfs holds.
                 # Thus, when we yank out the backup drive during replication
                 # - we might not be able to resume the interrupted replication step because the partially received `to` snapshot of a `from`->`to` step may be pruned any time
                 # - but in exchange we get back the disk space allocated by `to` when we prune it
                 # - and because we still have the bookmarks created by `guarantee_incremental`, we can still do incremental replication of `from`->`to2` in the future
                 incremental: guarantee_incremental
             snapshotting:
               type: manual
             pruning:
               # no-op prune rule on sender (keep all snapshots), job `snapjob` takes care of this
               keep_sender:
               - type: regex
                 regex: ".*"
               # retain
               keep_receiver:
               # longer retention on the backup drive, we have more space there
               - type: grid
                 grid: 1x1h(keep=all) | 24x1h | 360x1d
                 regex: "^zrepl_.*"
               # retain all non-zrepl snapshots on the backup drive
               - type: regex
                 negate: true
                 regex: "^zrepl_.*"

           # This job receives from job `push_to_drive` into `backuppool/zrepl/sink/myhostname`
           - type: sink
             name: backuppool_sink
             root_fs: "backuppool/zrepl/sink"
             serve:
               type: local
               listener_name: backuppool_sink

   Offline Backups with two (or more) External Disks
       It can be desirable to have multiple disk-based backups of the same
       machine. To accomplish this,

       • create one zpool per external HDD, each with a unique name, and
       • define a pair of push and sink jobs for each of these zpools, each
         with a unique name, listener_name, and root_fs.

       The unique names ensure that the jobs don't step on each other's toes
       when managing zrepl's ZFS abstractions.

   Fan-out replication
       This quick-start example demonstrates how to implement a fan-out
       replication setup where datasets on a server (A) are replicated to
       multiple targets (B, C, etc.). This example uses multiple source jobs
       on server A and pull jobs on the target servers.

       WARNING:
          Before implementing this setup, please see the caveats listed in
          the fan-out replication configuration overview.
   Overview
       On the source server (A), there should be:

       • A snap job
         • Creates the snapshots
         • Handles the pruning of snapshots
       • A source job for target B
         • Accepts connections from server B and B only
       • Further source jobs for each additional target (C, D, etc.)
         • Listens on a unique port
         • Only accepts connections from the specific target

       On each target server, there should be:

       • A pull job that connects to the corresponding source job on A
         • keep_sender should keep all snapshots since A's snap job handles
           the pruning
         • keep_receiver can be configured as appropriate on each target
           server

   Generate TLS Certificates
       Mutual TLS via the TLS client authentication transport can be used to
       secure the connections between the servers. In this example, a
       self-signed certificate is created for each server without setting up
       a CA.

           source=a.example.com
           targets=(
               b.example.com
               c.example.com
               # ...
           )

           for server in "${source}" "${targets[@]}"; do
               openssl req -x509 -sha256 -nodes \
                   -newkey rsa:4096 \
                   -days 365 \
                   -keyout "${server}.key" \
                   -out "${server}.crt" \
                   -addext "subjectAltName = DNS:${server}" \
                   -subj "/CN=${server}"
           done

           # Distribute each host's keypair
           for server in "${source}" "${targets[@]}"; do
               ssh root@"${server}" mkdir /etc/zrepl
               scp "${server}".{crt,key} root@"${server}":/etc/zrepl/
           done

           # Distribute target certificates to the source
           scp "${targets[@]/%/.crt}" root@"${source}":/etc/zrepl/

           # Distribute source certificate to the targets
           for server in "${targets[@]}"; do
               scp "${source}.crt" root@"${server}":/etc/zrepl/
           done

   Configure source server A
           jobs:
           # Separate job for snapshots and pruning
           - name: snapshots
             type: snap
             filesystems:
               'tank<': true # all filesystems
             snapshotting:
               type: periodic
               prefix: zrepl_
               interval: 10m
             pruning:
               keep:
               # Keep non-zrepl snapshots
               - type: regex
                 negate: true
                 regex: '^zrepl_'
               # Time-based snapshot retention
               - type: grid
                 grid: 1x1h(keep=all) | 24x1h | 30x1d | 12x30d
                 regex: '^zrepl_'

           # Source job for target B
           - name: target_b
             type: source
             serve:
               type: tls
               listen: :8888
               ca: /etc/zrepl/b.example.com.crt
               cert: /etc/zrepl/a.example.com.crt
               key: /etc/zrepl/a.example.com.key
               client_cns:
               - b.example.com
             filesystems:
               'tank<': true # all filesystems
             # Snapshots are handled by the separate snap job
             snapshotting:
               type: manual

           # Source job for target C
           - name: target_c
             type: source
             serve:
               type: tls
               listen: :8889
               ca: /etc/zrepl/c.example.com.crt
               cert: /etc/zrepl/a.example.com.crt
               key: /etc/zrepl/a.example.com.key
               client_cns:
               - c.example.com
             filesystems:
               'tank<': true # all filesystems
             # Snapshots are handled by the separate snap job
             snapshotting:
               type: manual

           # Source jobs for remaining targets. Each one should listen on a different port
           # and reference the correct certificate and client CN.
           # - name: target_d
           #   ...

   Configure each target server
           jobs:
           # Pull from source server A
           - name: source_a
             type: pull
             connect:
               type: tls
               # Use the correct port for this specific client (e.g. B is 8888, C is 8889, etc.)
               address: a.example.com:8888
               ca: /etc/zrepl/a.example.com.crt
               # Use the correct key pair for this specific client
               cert: /etc/zrepl/b.example.com.crt
               key: /etc/zrepl/b.example.com.key
               server_cn: a.example.com
             root_fs: pool0/backup
             interval: 10m
             pruning:
               keep_sender:
               # Source does the pruning in its snap job
               - type: regex
                 regex: '.*'
               # Receiver-side pruning can be configured as desired on each target server
               keep_receiver:
               # Keep non-zrepl snapshots
               - type: regex
                 negate: true
                 regex: '^zrepl_'
               # Time-based snapshot retention
               - type: grid
                 grid: 1x1h(keep=all) | 24x1h | 30x1d | 12x30d
                 regex: '^zrepl_'

       Use zrepl configcheck to validate your configuration. No output
       indicates that everything is fine.

       NOTE:
          Please open an issue on GitHub if your use case for zrepl is
          significantly different from those listed above. Or even better,
          write it up in the same style as above and open a PR!
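Because zrepl configcheck prints nothing and exits with code 0 on a valid configuration, its exit status can gate a daemon restart. The following is a minimal sketch under the assumption that zrepl is in $PATH; check_then_restart is a hypothetical helper name, and the restart command is passed in by the caller because it differs between operating systems.

```shell
#!/bin/sh
# Sketch: run the given restart command only if the configuration validates.
# check_then_restart is a hypothetical helper, not part of zrepl.
check_then_restart() {
    if zrepl configcheck; then
        echo "config ok"
        # Run whatever restart command the caller supplied (may be empty).
        "$@"
    else
        echo "config invalid, daemon not restarted" >&2
        return 1
    fi
}
```

Possible invocations would be `check_then_restart service zrepl restart` on FreeBSD or `check_then_restart systemctl restart zrepl` on a systemd-based Linux.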
   Apply Configuration Changes
       We hope that you have found a configuration that fits your use case.
       Use zrepl configcheck once again to make sure the config is correct
       (no output indicates that everything is fine). Then restart the zrepl
       daemon on all systems involved in the replication, likely using
       service zrepl restart or systemctl restart zrepl.

       WARNING:
          Please read up carefully on the pruning rules before applying the
          config. In particular, note that most example configs apply to all
          snapshots, not just zrepl-created snapshots. Use the following
          keep rule on sender and receiver to prevent this:

              - type: regex
                negate: true
                regex: "^zrepl_.*" # <- the 'prefix' specified in snapshotting.prefix

   Watch it Work
       Run zrepl status on the active side of the replication setup to
       monitor snapshotting, replication and pruning activity. To re-trigger
       replication (snapshots are separate!), use zrepl signal wakeup
       JOBNAME (refer to the example use case document if you are uncertain
       which job you want to wake up).

       You can also use basic UNIX tools to see what's going on. If you like
       tmux, here is a handy script that works on FreeBSD:

           pkg install gnu-watch tmux
           tmux new -s zrepl -d
           tmux split-window -t zrepl "tail -f /var/log/messages"
           tmux split-window -t zrepl "gnu-watch 'zfs list -t snapshot -o name,creation -s creation'"
           tmux split-window -t zrepl "zrepl status"
           tmux select-layout -t zrepl tiled
           tmux attach -t zrepl

       The Linux equivalent might look like this:

           # make sure tmux is installed & let's assume you use systemd + journald
           tmux new -s zrepl -d
           tmux split-window -t zrepl "journalctl -f -u zrepl.service"
           tmux split-window -t zrepl "watch 'zfs list -t snapshot -o name,creation -s creation'"
           tmux split-window -t zrepl "zrepl status"
           tmux select-layout -t zrepl tiled
           tmux attach -t zrepl

   What Next?
       • Read more about configuration format, options & job types
       • Configure logging & monitoring.
   Installation
       TIP:
          Check out the quick-start guides if you want a first impression of
          zrepl.

   User Privileges
       It is possible to run zrepl as an unprivileged user in combination
       with ZFS delegation. Also, there is the possibility to run it in a
       jail on FreeBSD by delegating a dataset to the jail.

       TIP:
          Check out the FreeBSD Jail With iocage section for FreeBSD jail
          setup instructions.

   Packages
       zrepl source releases are signed & tagged by the author in the git
       repository. Your OS vendor may provide binary packages of zrepl
       through the package manager. Additionally, binary releases are
       provided on GitHub. The following list may be incomplete, feel free
       to submit a PR with an update:

       +---------------------+--------------------+--------------------------------------------+
       | OS / Distro         | Install Command    | Link                                       |
       +---------------------+--------------------+--------------------------------------------+
       | FreeBSD             | pkg install zrepl  | https://www.freshports.org/sysutils/zrepl/ |
       |                     |                    |                                            |
       |                     |                    | FreeBSD Jail With iocage                   |
       +---------------------+--------------------+--------------------------------------------+
       | FreeNAS             |                    | FreeBSD Jail With iocage                   |
       +---------------------+--------------------+--------------------------------------------+
       | MacOS               | brew install zrepl | Available on homebrew                      |
       +---------------------+--------------------+--------------------------------------------+
       | Arch Linux          | yay install zrepl  | Available on AUR                           |
       +---------------------+--------------------+--------------------------------------------+
       | Fedora, CentOS,     | dnf install zrepl  | RPM repository config                      |
       | RHEL, OpenSUSE      |                    |                                            |
       +---------------------+--------------------+--------------------------------------------+
       | Debian + Ubuntu     | apt install zrepl  | APT repository config                      |
       +---------------------+--------------------+--------------------------------------------+
       | OmniOS              | pkg install zrepl  | Available since r151030                    |
       +---------------------+--------------------+--------------------------------------------+
       | Void Linux          | xbps-install zrepl | Available since a88a2a4                    |
       +---------------------+--------------------+--------------------------------------------+
       | Others              |                    | Use binary releases or build from source.  |
       +---------------------+--------------------+--------------------------------------------+

   Debian / Ubuntu APT repositories
       We maintain APT repositories for Debian, Ubuntu and derivatives. The
       fingerprint of the signing key is E101 418F D3D6 FBCB 9D65 A62D 7086
       99FC 5F2E BF16. It is available at
       https://zrepl.cschwarz.com/apt/apt-key.asc . Please open an issue on
       GitHub if you encounter any issues with the repository.

           (
           set -ex
           zrepl_apt_key_url=https://zrepl.cschwarz.com/apt/apt-key.asc
           zrepl_apt_key_dst=/usr/share/keyrings/zrepl.gpg
           zrepl_apt_repo_file=/etc/apt/sources.list.d/zrepl.list

           # Install dependencies for subsequent commands
           sudo apt update && sudo apt install curl gnupg lsb-release

           # Deploy the zrepl apt key.
           curl -fsSL "$zrepl_apt_key_url" | tee | gpg --dearmor | sudo tee "$zrepl_apt_key_dst" > /dev/null

           # Add the zrepl apt repo.
           ARCH="$(dpkg --print-architecture)"
           CODENAME="$(lsb_release -i -s | tr '[:upper:]' '[:lower:]') $(lsb_release -c -s | tr '[:upper:]' '[:lower:]')"
           echo "Using Distro and Codename: $CODENAME"
           echo "deb [arch=$ARCH signed-by=$zrepl_apt_key_dst] https://zrepl.cschwarz.com/apt/$CODENAME main" | sudo tee "$zrepl_apt_repo_file"

           # Update apt repos.
           sudo apt update
           )

       NOTE:
          Until zrepl reaches 1.0, the repositories will be updated to the
          latest zrepl release immediately. This includes breaking changes
          between zrepl versions. Use apt-mark hold zrepl to prevent
          upgrades of zrepl.

   RPM repositories
       We provide a single RPM repository for all RPM-based Linux distros.
       The zrepl binary in the repo is the same as the one published to
       GitHub. Since Go binaries are statically linked, the RPM should work
       about everywhere.
       The fingerprint of the signing key is F6F6 E8EA 6F2F 1462 2878 B5DE
       50E3 4417 826E 2CE6. It is available at
       https://zrepl.cschwarz.com/rpm/rpm-key.asc . Please open an issue on
       GitHub if you encounter any issues with the repository.

       Copy-paste the following snippet into your shell to set up the zrepl
       repository. Then run dnf install zrepl and make sure to confirm that
       the signing key matches the one shown above.

           cat > /etc/yum.repos.d/zrepl.repo <<EOF
           [zrepl]
           name = zrepl
           baseurl = https://zrepl.cschwarz.com/rpm/repo
           gpgkey = https://zrepl.cschwarz.com/rpm/rpm-key.asc
           EOF

       NOTE:
          Until zrepl reaches 1.0, the repository will be updated to the
          latest zrepl release immediately. This includes breaking changes
          between zrepl versions. If that bothers you, use the dnf
          versionlock plugin to pin the version of zrepl on your system.

   Compile From Source
       Producing a release requires Go 1.11 or newer and Python 3 + pip3 +
       docs/requirements.txt for the Sphinx documentation. A tutorial to
       install Go is available over at golang.org. Python and pip3 should
       probably be installed via your distro's package manager.

           cd to/your/zrepl/checkout
           python3 -m venv venv3
           source venv3/bin/activate
           ./lazy.sh devsetup
           make release
           # build artifacts are available in ./artifacts/release

       The Python venv is used for the documentation build dependencies. If
       you just want to build the zrepl binary, leave it out and use
       ./lazy.sh godep instead.
       Alternatively, you can use the Docker build process: it is used to
       produce the official zrepl binary releases and serves as a reference
       for build dependencies and procedure:

           cd to/your/zrepl/checkout
           # make sure your user has access to the docker socket
           make release-docker
           # if you want .deb or .rpm packages, invoke the following
           # targets _after_ you invoked release-docker
           make deb-docker
           make rpm-docker
           # build artifacts are available in ./artifacts/release
           # packages are available in ./artifacts

       NOTE:
          It is your job to install the built binary in the zrepl user's
          $PATH, e.g. /usr/local/bin/zrepl. Otherwise, the examples in the
          quick-start guides may need to be adjusted.

   FreeBSD Jail With iocage
       This tutorial shows how zrepl can be installed on FreeBSD or FreeNAS
       in a jail using iocage. While this tutorial focuses on using iocage,
       much of the setup would be similar using a different jail manager.

       NOTE:
          From a security perspective, keep in mind that zfs send/recv was
          never designed with jails in mind. An attacker who gained access
          to the jail could probably crash the receive-side kernel or, worse,
          induce stateful damage to the receive-side pool. The jail doesn't
          provide security benefits, only management ones.

   Requirements
       A dataset that will be delegated to the jail needs to be created if
       one does not already exist. For the tutorial tank/zrepl will be used.

           zfs create -o mountpoint=none tank/zrepl

       The only software requirement on the host system is iocage, which can
       be installed from ports or packages.

           pkg install py37-iocage

       NOTE:
          By default iocage will "activate" on first use which will set up
          some defaults such as which pool will be used. To activate iocage
          manually the iocage activate command can be used.

   Jail Creation
       There are two options for jail creation using FreeBSD.

       1. Manually set up the jail from scratch
       2. Create the jail using the zrepl plugin. On FreeNAS this is possible
          from the user interface using the community index.
   Manual Jail
       Create a jail, using the same release as the host, called zrepl that
       will be automatically started at boot. The jail will have tank/zrepl
       delegated into it.

           iocage create --release "$(freebsd-version -k | cut -d '-' -f '1,2')" --name zrepl \
                 boot=on nat=1 \
                 jail_zfs=on \
                 jail_zfs_dataset=zrepl \
                 jail_zfs_mountpoint='none'

       Enter the jail:

           iocage console zrepl

       Install zrepl:

           pkg update && pkg upgrade
           pkg install zrepl

       Create the log file /var/log/zrepl.log:

           touch /var/log/zrepl.log && service newsyslog restart

       Tell syslogd to redirect facility local0 to the zrepl.log file:

           service syslogd reload

       Enable the zrepl daemon to start automatically at boot:

           sysrc zrepl_enable="YES"

       Now jump to the summary below.

   Plugin
       When using the plugin, zrepl will be installed for you in a jail using
       the following iocage properties.

       • nat=1
       • jail_zfs=on
       • jail_zfs_mountpoint=none

       Additionally, the delegated dataset should be specified upon creation,
       and optionally start on boot can be set. This can also be done from
       the FreeNAS webui.

           fetch https://raw.githubusercontent.com/ix-plugin-hub/iocage-plugin-index/master/zrepl.json -o /tmp/zrepl.json
           iocage fetch -P /tmp/zrepl.json --name zrepl jail_zfs_dataset=zrepl boot=on

   Configuration
       Now zrepl can be configured.

       Enter the jail:

           iocage console zrepl

       Modify the /usr/local/etc/zrepl/zrepl.yml configuration file.

       TIP:
          Check out the quick-start guides for examples of a sink job.

       Now zrepl can be started:

           service zrepl start

       Now jump to the summary below.

   Summary
       Congratulations, you have a working jail!

       NOTE:
          With FreeBSD 13's transition to OpenZFS 2.0, please ensure that
          your jail's FreeBSD version matches the one in the kernel module.
          If you are getting cryptic errors such as cannot receive new
          filesystem stream: invalid backup stream the instructions posted
          here might help.

   What next?
       Read the configuration chapter and then continue with the usage
       chapter.
       Reminder: If you want a quick introduction, please read the
       quick-start guides.

   Configuration
   Overview & Terminology
       All work zrepl does is performed by the zrepl daemon which is
       configured in a single YAML configuration file loaded on startup. The
       following paths are searched, in this order:

       1. The path specified via the global --config flag
       2. /etc/zrepl/zrepl.yml
       3. /usr/local/etc/zrepl/zrepl.yml

       zrepl configcheck can be used to validate the configuration. If the
       configuration is valid, it will output nothing and exit with code 0.
       The error messages vary in quality and usefulness: please report
       confusing config errors to the tracking issue #155. Full example
       configs are available at quick-start guides and config/samples/.
       However, copy-pasting examples is no substitute for reading
       documentation!

   Config File Structure
           global: ...
           jobs:
           - name: backup
             type: push
           - ...

       A zrepl configuration file is divided into two main sections: global
       and jobs. global has sensible defaults. It is covered in logging,
       monitoring & miscellaneous.

   Jobs & How They Work Together
       A job is the unit of activity tracked by the zrepl daemon. The type of
       a job determines its role in a replication setup and in snapshot
       management. Jobs are identified by their name, both in log files and
       the zrepl status command.

       NOTE:
          The job name is persisted in several places on disk and thus
          cannot be changed easily.

       Replication always happens between a pair of jobs: one active side and
       one passive side. The active side connects to the passive side using a
       transport and starts executing the replication logic. The passive side
       responds to requests from the active side after checking its
       permissions.

       The following table shows how different job types can be combined to
       achieve both push and pull mode setups. Note that snapshot-creation
       denoted by "(snap)" is orthogonal to whether a job is active or
       passive.
       +-------------------+----------------------------+---------------+------------------------------------------+
       | Setup name        | active side                | passive side  | use case                                 |
       +-------------------+----------------------------+---------------+------------------------------------------+
       | Push mode         | push (snap)                | sink          | • Laptop backup                          |
       |                   |                            |               | • NAS behind NAT to offsite              |
       +-------------------+----------------------------+---------------+------------------------------------------+
       | Pull mode         | pull                       | source (snap) | • Central backup-server for many nodes   |
       |                   |                            |               | • Remote server to NAS behind NAT        |
       +-------------------+----------------------------+---------------+------------------------------------------+
       | Local replication | push + sink in one config with local       | • Backup to locally attached disk        |
       |                   | transport                                  | • Backup FreeBSD boot pool               |
       +-------------------+----------------------------+---------------+------------------------------------------+
       | Snap & prune-only | snap (snap)                | N/A           | • Snapshots & pruning but no replication |
       |                   |                            |               |   required                               |
       |                   |                            |               | • Workaround for source-side pruning     |
       +-------------------+----------------------------+---------------+------------------------------------------+

   How the Active Side Works
       The active side (push and pull job) executes the replication and
       pruning logic:

       1. Wakeup after snapshotting (push job) or pull interval ticker (pull
          job).
       2. Connect to the passive side and instantiate an RPC client.
       3. Replicate data from the sender to the receiver.
       4. Prune on sender & receiver.
TIP: The progress of the active side can be watched live using zrepl status.

How the Passive Side Works

The passive side (sink and source) waits for connections from the active side, on the transport specified with serve in the job configuration. The respective transport then performs authentication & authorization, resulting in a stable client identity. The passive side job uses this client identity as follows:

• In sink jobs, to map requests from different client identities to their respective sub-filesystem tree root_fs/${client_identity}.
• In the future, source might embed the client identity in zrepl's ZFS abstraction names, to support multi-host replication.

TIP: The use of the client identity in the sink job implies that it must be usable as a ZFS filesystem name component.

How Replication Works

One of the major design goals of the replication module is to avoid any duplication of the nontrivial logic. As such, the code works on abstract sender and receiver endpoints, where typically one will be implemented by a local program object and the other is an RPC client instance. Regardless of push- or pull-style setup, the logic executes on the active side, i.e. in the push or pull job.

The following high-level steps take place during replication and can be monitored using zrepl status:

• Plan the replication:
  • Compare sender and receiver filesystem snapshots
  • Build the replication plan
    • Per filesystem, compute a diff between sender and receiver snapshots
    • Build a list of replication steps
      • If possible, use incremental and resumable sends
      • Otherwise, use full send of most recent snapshot on sender
  • Retry on errors that are likely temporary (e.g., network failures).
  • Give up on filesystems where a permanent error was received over RPC.
• Execute the plan
  • Perform replication steps in the following order: Among all filesystems with pending replication steps, pick the filesystem whose next replication step's snapshot is the oldest.
  • Create placeholder filesystems on the receiving side to mirror the dataset paths on the sender to root_fs/${client_identity}.
  • Acquire send-side step-holds on the step's from and to snapshots.
  • Perform the replication step.
  • Move the replication cursor bookmark on the sending side (see below).
  • Move the last-received-hold on the receiving side (see below).
  • Release the send-side step-holds.

The idea behind the execution order of replication steps is that if the sender snapshots all filesystems simultaneously at fixed intervals, the receiver will have all filesystems snapshotted at time T1 before the first snapshot at T2 = T1 + $interval is replicated.

ZFS Background Knowledge

This section gives some background knowledge about ZFS features that zrepl uses to provide guarantees for a replicated filesystem. Specifically, zrepl guarantees by default that incremental replication is always possible and that started replication steps can always be resumed if they are interrupted.

ZFS Send Modes & Bookmarks

ZFS supports full sends (zfs send fs@to) and incremental sends (zfs send -i @from fs@to). Full sends are used to create a new filesystem on the receiver with the send-side state of fs@to. Incremental sends only transfer the delta between @from and @to. Incremental sends require that @from be present on the receiving side when receiving the incremental stream. Incremental sends can also use a ZFS bookmark as from on the sending side (zfs send -i #bm_from fs@to), where #bm_from was created using zfs bookmark fs@from fs#bm_from. The receiving side must always have the actual snapshot @from, regardless of whether the sending side uses @from or a bookmark of it.
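The send modes described above can be sketched as thin shell wrappers. This is only an illustration, not how zrepl invokes ZFS internally: the function names and the ZFS variable are hypothetical helpers introduced here, and only the zfs send and zfs bookmark invocations are taken from the text. Note that an argument beginning with # must be quoted, because the shell would otherwise treat it as the start of a comment.

```shell
#!/bin/sh
# Hypothetical wrappers around the commands named in the text above.
# ZFS defaults to the real binary; set ZFS="echo zfs" to print the
# command lines instead of running them.
: "${ZFS:=zfs}"

# Full send: stream the complete send-side state of FS@TO.
# usage: full_send FS TO
full_send() {
    $ZFS send "$1@$2"
}

# Incremental send using the snapshot @FROM as the incremental source.
# usage: incr_send_from_snapshot FS FROM TO
incr_send_from_snapshot() {
    $ZFS send -i "@$2" "$1@$3"
}

# Create bookmark FS#BOOKMARK from snapshot FS@FROM.
# usage: bookmark_snapshot FS FROM BOOKMARK
bookmark_snapshot() {
    $ZFS bookmark "$1@$2" "$1#$3"
}

# Incremental send using a bookmark as the incremental source.
# The "#..." argument is quoted so the shell does not read it as a comment.
# usage: incr_send_from_bookmark FS BOOKMARK TO
incr_send_from_bookmark() {
    $ZFS send -i "#$2" "$1@$3"
}
```

With ZFS="echo zfs", calling incr_send_from_bookmark tank/data bm_from to only prints the command line, which is a safe way to check the quoting before touching a real pool.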
Plain and raw sends

By default, zfs send sends the most generic, backwards-compatible data stream format (so-called 'plain send'). If the sent filesystem uses newer features, e.g. compression or encryption, zfs send has to undo these operations on the fly to produce the plain send stream. If the receiver uses newer features (e.g. compression or encryption inherited from the parent FS), it applies the necessary transformations again on the fly during zfs recv.

Flags such as -e, -c and -L tell ZFS to produce a send stream that is closer to how the data is stored on disk. Sending with those flags removes computational overhead from sender and receiver. However, the receiver will not apply certain transformations, e.g., it will not compress with the receive-side compression algorithm.

The -w (--raw) flag produces a send stream that is as raw as possible. For unencrypted datasets, its current effect is the same as -Lce.

Encrypted datasets can only be sent plain (unencrypted) or raw (encrypted) using the -w flag.

Resumable Send & Recv

The -s flag for zfs recv tells ZFS to save the partially received send stream in case it is interrupted. To resume the replication, the receiving side filesystem's receive_resume_token must be passed to a new zfs send -t <value> | zfs recv command. A full send can only be resumed if @to still exists. An incremental send can only be resumed if @to still exists and either @from still exists or a bookmark #fbm of @from still exists.

ZFS Holds

ZFS holds prevent a snapshot from being deleted through zfs destroy, letting the destroy fail with a "dataset is busy" error. Holds are created and referred to by a tag. They can be thought of as a named, persistent lock on the snapshot.

ZFS Abstractions Managed By zrepl

With the background knowledge from the previous section, we now summarize the different on-disk ZFS objects that zrepl manages to provide its functionality.
Placeholder filesystems on the receiving side are regular ZFS filesystems with the ZFS property zrepl:placeholder=on. Placeholders allow the receiving side to mirror the sender's ZFS dataset hierarchy without replicating every filesystem at every intermediary dataset path component. Consider the following example: S/H/J shall be replicated to R/sink/job/S/H/J, but neither S/H nor S shall be replicated. ZFS requires the existence of R/sink/job/S and R/sink/job/S/H in order to receive into R/sink/job/S/H/J. Thus, zrepl creates the parent filesystems as placeholders on the receiving side. If at some point S/H and S shall be replicated, the receiving side invalidates the placeholder flag automatically. The zrepl test placeholder command can be used to check whether a filesystem is a placeholder.

The replication cursor bookmark and last-received-hold are managed by zrepl to ensure that future replications can always be done incrementally. The replication cursor is a send-side bookmark of the most recent successfully replicated snapshot, and the last-received-hold is a hold of that snapshot on the receiving side. Both are moved atomically after the receiving side has confirmed that a replication step is complete.

The replication cursor has the format #zrepl_CURSOR_G_<GUID>_J_<JOBNAME>. The last-received-hold tag has the format zrepl_last_received_J_<JOBNAME>. Encoding the job name in the names ensures that multiple sending jobs can replicate the same filesystem to different receivers without interference.

Tentative replication cursor bookmarks are short-lived bookmarks that protect the atomic moving-forward of the replication cursor and last-received-hold (see this issue). They are only necessary if step holds are not used as per the replication.protection setting. The tentative replication cursor has the format #zrepl_CURSORTENTATIVE_G_<GUID>_J_<JOBNAME>.
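The bookmarks and holds described above are ordinary ZFS objects, so they can also be inspected with plain zfs commands. The following is a minimal sketch under stated assumptions: the function names and the ZFS variable are hypothetical, and the filter assumes that every zrepl-managed bookmark name starts with zrepl_, as the formats quoted above suggest. The zrepl zfs-abstraction list command remains the authoritative listing.

```shell
#!/bin/sh
# Hypothetical helpers; set ZFS="echo zfs" to print the commands only.
: "${ZFS:=zfs}"

# List all bookmarks of a dataset and its children, one full name per line.
# usage: list_bookmarks DATASET
list_bookmarks() {
    $ZFS list -H -r -t bookmark -o name "$1"
}

# Read bookmark names on stdin and keep those whose bookmark part starts
# with "zrepl_" (assumption: all zrepl-managed bookmarks use that prefix).
only_zrepl_bookmarks() {
    grep '#zrepl_'
}

# List the holds (and their tags) on a snapshot.
# usage: list_holds SNAPSHOT
list_holds() {
    $ZFS holds -H "$1"
}
```

On a sending host, list_bookmarks tank/data | only_zrepl_bookmarks would show the replication cursors of all jobs that replicate tank/data (the dataset name is only an example).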
The zrepl zfs-abstraction list command provides a listing of all bookmarks and holds managed by zrepl.

Step holds are ZFS holds managed by zrepl to ensure that a replication step can always be resumed if it is interrupted, e.g., due to network outage. zrepl creates step holds before it attempts a replication step and releases them after the receiver confirms that the replication step is complete. For an initial replication full @initial_snap, zrepl puts a ZFS hold on @initial_snap. For an incremental send @from -> @to, zrepl puts a ZFS hold on both @from and @to. Note that @from is not strictly necessary for resumability (a bookmark on the sending side would be sufficient), but size estimation in currently used OpenZFS versions only works if @from is a snapshot.

The hold tag has the format zrepl_STEP_J_<JOBNAME>. A job only ever has one active send per filesystem. Thus, there are never more than two step holds for a given pair of (job, filesystem).

Step bookmarks are zrepl's equivalent for holds on bookmarks (ZFS does not support putting holds on bookmarks). They are intended for a situation where a replication step uses a bookmark #bm as incremental from where #bm is not managed by zrepl. To ensure resumability, zrepl copies #bm to step bookmark #zrepl_STEP_G_<GUID>_J_<JOBNAME>. If the replication is interrupted and #bm is deleted by the user, the step bookmark remains as an incremental source for the resumable send. Note that zrepl does not yet support creating step bookmarks because the corresponding ZFS feature for copying bookmarks is not yet widely available. Subscribe to zrepl issue #326 for details.

NOTE: More details can be found in the design document replication/design.md.

Caveats With Complex Setups (More Than 2 Jobs or Machines)

Most users are served well with a single sender and a single receiver job.
This section documents considerations for more complex setups.

ATTENTION: Before you continue, make sure you have a working understanding of how zrepl works and what zrepl does to ensure that replication between sender and receiver is always possible without conflicts. This will help you understand why certain kinds of multi-machine setups do not (yet) work.

NOTE: If you can't find your desired configuration, have questions or would like to see improvements to multi-job setups, please open an issue on GitHub.

Multiple Jobs on One Machine

As a general rule, multiple jobs configured on one machine must operate on disjoint sets of filesystems. Otherwise, concurrently running jobs might interfere when operating on the same filesystem.

On your setup, ensure that

• all filesystems filter specifications are disjoint
• no root_fs is a prefix or equal to another root_fs
• no filesystems filter matches any root_fs

Exceptions to the rule:

• A snap and push job on the same machine can match the same filesystems. To avoid interference, only one of the jobs should be pruning snapshots on the sender, the other one should keep all snapshots. Since the jobs won't coordinate, errors in the log are to be expected, but zrepl's ZFS abstractions ensure that push and sink can always replicate incrementally. This scenario is detailed in one of the quick-start guides.

Two Or More Machines

This section might be relevant to users who wish to fan-in (N machines replicate to 1) or fan-out (replicate 1 machine to N machines).

Working setups:

• Fan-in: N servers replicated to one receiver, disjoint dataset trees.
  • This is the common use case of a centralized backup server.
  • Implementation:
    • N push jobs (one per sender server), 1 sink (as long as the different push jobs have a different client identity)
    • N source jobs (one per sender server), N pull on the receiver server (unique names, disjoint root_fs)
  • The sink job automatically constrains each client to a disjoint sub-tree of the sink-side dataset hierarchy ${root_fs}/${client_identity}. Therefore, the different clients cannot interfere.
  • The pull job only pulls from one host, so it's up to the zrepl user to ensure that the different pull jobs don't interfere.
• Fan-out: 1 server replicated to N receivers
  • Can be implemented either in a pull or push fashion.
    • pull setup: 1 pull job on each receiver server, each with a corresponding unique source job on the sender server.
    • push setup: 1 sink job on each receiver server, each with a corresponding unique push job on the sender server.
  • It is critical that we have one sending-side job (source, push) per receiver. The reason is that zrepl's ZFS abstractions (zrepl zfs-abstraction list) include the name of the source/push job, but not the receive-side job name or client identity (see issue #380). As a counter-example, suppose we used multiple pull jobs with only one source job. All pull jobs would share the same replication cursor bookmark and trip over each other, breaking incremental replication guarantees quickly. The analogous problem exists for 1 push to N sink jobs.
  • The filesystems matched by the sending side jobs (source, push) need not necessarily be disjoint. For this to work, we need to avoid interference between snapshotting and pruning of the different sending jobs. The solution is to centralize sender-side snapshot management in a separate snap job. Snapshotting in the source/push job should then be disabled (type: manual). Sender-side pruning (keep_sender) also needs to be disabled in the active side (pull / push), since that will be done by the snap job.
  • Restore limitations: when restoring from one of the pull targets (e.g., using zfs send -R), the replication cursor bookmarks don't exist on the restored system. This can break incremental replication to all other receive-sides after restore.
  • See the fan-out replication quick-start guide for an example of this setup.

Setups that do not work:

• N pull identities, 1 source job. Tracking issue #380.

Job Types in Detail

Job Type push

+---------------------+------------------------------------------------+
| Parameter           | Comment                                        |
+---------------------+------------------------------------------------+
| type                | = push                                         |
+---------------------+------------------------------------------------+
| name                | unique name of the job (must not change)       |
+---------------------+------------------------------------------------+
| connect             | connect specification                          |
+---------------------+------------------------------------------------+
| filesystems         | filter specification for filesystems to be     |
|                     | snapshotted and pushed to the sink             |
+---------------------+------------------------------------------------+
| send                | send options, e.g. for encrypted sends         |
+---------------------+------------------------------------------------+
| snapshotting        | snapshotting specification                     |
+---------------------+------------------------------------------------+
| pruning             | pruning specification                          |
+---------------------+------------------------------------------------+
| replication         | replication options                            |
+---------------------+------------------------------------------------+
| conflict_resolution | conflict resolution options                    |
+---------------------+------------------------------------------------+

Example config: config/samples/push.yml

Job Type sink

+---------------------+------------------------------------------------+
| Parameter           | Comment                                        |
+---------------------+------------------------------------------------+
| type                | = sink                                         |
+---------------------+------------------------------------------------+
| name                | unique name of the job (must not change)       |
+---------------------+------------------------------------------------+
| serve               | serve specification                            |
+---------------------+------------------------------------------------+
| root_fs             | ZFS filesystems are received to                |
|                     | $root_fs/$client_identity/$source_path         |
+---------------------+------------------------------------------------+

Example config: config/samples/sink.yml

Job Type pull

+---------------------+------------------------------------------------+
| Parameter           | Comment                                        |
+---------------------+------------------------------------------------+
| type                | = pull                                         |
+---------------------+------------------------------------------------+
| name                | unique name of the job (must not change)       |
+---------------------+------------------------------------------------+
| connect             | connect specification                          |
+---------------------+------------------------------------------------+
| root_fs             | ZFS filesystems are received to                |
|                     | $root_fs/$source_path                          |
+---------------------+------------------------------------------------+
| interval            | Interval at which to pull from the source job  |
|                     | (e.g. 10m). manual disables periodic pulling,  |
|                     | replication then only happens on wakeup.       |
+---------------------+------------------------------------------------+
| pruning             | pruning specification                          |
+---------------------+------------------------------------------------+
| replication         | replication options                            |
+---------------------+------------------------------------------------+
| conflict_resolution | conflict resolution options                    |
+---------------------+------------------------------------------------+

Example config: config/samples/pull.yml

Job Type source

+---------------------+------------------------------------------------+
| Parameter           | Comment                                        |
+---------------------+------------------------------------------------+
| type                | = source                                       |
+---------------------+------------------------------------------------+
| name                | unique name of the job (must not change)       |
+---------------------+------------------------------------------------+
| serve               | serve specification                            |
+---------------------+------------------------------------------------+
| filesystems         | filter specification for filesystems to be     |
|                     | snapshotted and exposed to connecting clients  |
+---------------------+------------------------------------------------+
| send                | send options, e.g. for encrypted sends         |
+---------------------+------------------------------------------------+
| snapshotting        | snapshotting specification                     |
+---------------------+------------------------------------------------+

Example config: config/samples/source.yml

Local replication

If you have the need for local replication (most likely between two local storage pools), you can use the local transport type to connect a local push job to a local sink job.

Example config: config/samples/local.yml.

Job Type snap (snapshot & prune only)

Job type that only takes snapshots and performs pruning on the local machine.
+---------------------+------------------------------------------------+
| Parameter           | Comment                                        |
+---------------------+------------------------------------------------+
| type                | = snap                                         |
+---------------------+------------------------------------------------+
| name                | unique name of the job (must not change)       |
+---------------------+------------------------------------------------+
| filesystems         | filter specification for filesystems to be     |
|                     | snapshotted                                    |
+---------------------+------------------------------------------------+
| snapshotting        | snapshotting specification                     |
+---------------------+------------------------------------------------+
| pruning             | pruning specification                          |
+---------------------+------------------------------------------------+

Example config: config/samples/snap.yml

Transports

The zrepl RPC layer uses transports to establish a single, bidirectional data stream between an active and passive job. On the passive (serving) side, the transport also provides the client identity to the upper layers: this string is used for access control and separation of filesystem sub-trees in sink jobs. Transports are specified in the connect or serve section of a job definition.

Contents

• Transports
  • tcp Transport
    • Serve
    • Connect
  • tls Transport
    • Serve
    • Connect
    • Mutual-TLS between Two Machines
    • Certificate Authority using EasyRSA
  • ssh+stdinserver Transport
    • Serve
    • Connect
  • local Transport

ATTENTION: The client identities must be valid ZFS dataset path components because the sink job uses ${root_fs}/${client_identity} to determine the client's subtree.

tcp Transport

The tcp transport uses plain TCP, which means that the data is not encrypted on the wire. Clients are identified by their IPv4 or IPv6 addresses, and the client identity is established through a mapping on the server.

This transport may also be used in conjunction with network-layer encryption and/or VPN tunnels to provide encryption on the wire. To make the IP-based client authentication effective, such solutions should provide authenticated IP addresses.
Some options to consider:

• WireGuard: Linux-focussed, in-kernel VPN
• OpenVPN: Cross-platform VPN, uses tun on *nix
• IPSec: Properly standardized, in-kernel network-layer VPN
• spiped: think of it as an encrypted pipe between two servers
• SSH
  • sshuttle: VPN-like solution, but using SSH
  • SSH port forwarding: run it as a systemd user unit and make it start before the zrepl service.

Serve

    jobs:
    - type: sink
      serve:
        type: tcp
        listen: ":8888"
        listen_freebind: true # optional, default false
        clients: {
          "192.168.122.123" : "mysql01",
          "192.168.122.42" : "mx01",
          "2001:0db8:85a3::8a2e:0370:7334": "gateway",
          # CIDR masks require a '*' in the client identity string
          # that is expanded to the client's IP address
          "10.23.42.0/24": "cluster-*",
          "fde4:8dba:82e1::/64": "san-*"
        }
      ...

listen_freebind controls whether the socket is allowed to bind to non-local or unconfigured IP addresses (Linux IP_FREEBIND, FreeBSD IP_BINDANY). Enable this option if you want to listen on a specific IP address that might not yet be configured when the zrepl daemon starts.

Connect

    jobs:
    - type: push
      connect:
        type: tcp
        address: "10.23.42.23:8888"
        dial_timeout: # optional, default 10s
      ...

tls Transport

The tls transport uses TCP + TLS with client authentication using client certificates. The client identity is the common name (CN) presented in the client certificate.

It is recommended to set up a dedicated CA infrastructure for this transport, e.g. using OpenVPN's EasyRSA. For a simple 2-machine setup, mutual TLS might also be sufficient. We provide copy-pastable instructions to generate the certificates below.

The implementation uses Go's TLS library. Since Go binaries are statically linked, you or your distribution need to recompile zrepl when vulnerabilities in that library are disclosed.

All file paths are resolved relative to the zrepl daemon's working directory. Specify absolute paths if you are unsure what directory that is (or find out from your init system).
If intermediate CAs are used, the full chain must be present either in the ca file or in the individual cert files. Regardless, the client's certificate must be first in the cert file, with each following certificate directly certifying the one preceding it (see the TLS specification). This is the common default when using a CA management tool.

NOTE: As of Go 1.15 (zrepl 0.3.0 and newer), the Go TLS / x509 library requires Subject Alternative Names to be present in certificates. You might need to re-generate your certificates using one of the two alternatives provided below.

Note further that zrepl continues to use the CommonName field to assign client identities. Hence, we recommend keeping the Subject Alternative Name and the CommonName in sync.

Serve

    jobs:
    - type: sink
      root_fs: "pool2/backup_laptops"
      serve:
        type: tls
        listen: ":8888"
        listen_freebind: true # optional, default false
        ca: /etc/zrepl/ca.crt
        cert: /etc/zrepl/prod.fullchain
        key: /etc/zrepl/prod.key
        client_cns:
          - "laptop1"
          - "homeserver"

The ca field specifies the certificate authority used to validate client certificates. The client_cns list specifies a list of accepted client common names (which are also the client identities for this transport). The listen_freebind field is explained in the tcp Transport section.

Connect

    jobs:
    - type: pull
      connect:
        type: tls
        address: "server1.foo.bar:8888"
        ca: /etc/zrepl/ca.crt
        cert: /etc/zrepl/backupserver.fullchain
        key: /etc/zrepl/backupserver.key
        server_cn: "server1"
        dial_timeout: # optional, default 10s

The ca field specifies the CA which signed the server's certificate (serve.cert). The server_cn specifies the expected common name (CN) of the server's certificate. It overrides the hostname specified in address. The connection fails if either does not match.
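Because the client identity is taken from the CommonName while Go's TLS library validates the Subject Alternative Name, it can help to inspect both fields of a certificate before deploying it. The sketch below is an assumption-laden illustration: the function names and the OPENSSL variable are hypothetical, the -ext option needs OpenSSL 1.1.1 or newer, and the CN extraction assumes the RFC 2253 one-line subject format without escaped commas.

```shell
#!/bin/sh
# Hypothetical helpers; set OPENSSL="echo openssl" to print the commands only.
: "${OPENSSL:=openssl}"

# Print the certificate subject on one line, e.g. "subject=CN=laptop1".
# usage: cert_subject CERT_FILE
cert_subject() {
    $OPENSSL x509 -in "$1" -noout -subject -nameopt RFC2253
}

# Print the Subject Alternative Name extension of a certificate.
# usage: cert_san CERT_FILE
cert_san() {
    $OPENSSL x509 -in "$1" -noout -ext subjectAltName
}

# Read a subject line on stdin and print the value of its CN attribute.
subject_cn() {
    sed -n 's/^subject=.*CN=\([^,]*\).*/\1/p'
}
```

For example, cert_subject /etc/zrepl/prod.fullchain | subject_cn prints the name that a serving job would have to list in client_cns, and cert_san shows whether a matching DNS entry is present.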
Mutual-TLS between Two Machines

For a two-machine setup, self-signed certificates distributed using an out-of-band mechanism will also work just fine:

Suppose you have a push-mode setup, with backups.example.com running the sink job, and prod.example.com running the push job. Run the following OpenSSL command on each host, substituting HOSTNAME with that host's name:

    (name=HOSTNAME; openssl req -x509 -sha256 -nodes \
     -newkey rsa:4096 \
     -days 365 \
     -keyout $name.key \
     -out $name.crt -addext "subjectAltName = DNS:$name" -subj "/CN=$name")

Now copy each machine's HOSTNAME.crt to the other machine's /etc/zrepl/HOSTNAME.crt, for example using scp. The serve & connect configuration will thus look like the following:

    # on backups.example.com
    - type: sink
      serve:
        type: tls
        listen: ":8888"
        ca: "/etc/zrepl/prod.example.com.crt"
        cert: "/etc/zrepl/backups.example.com.crt"
        key: "/etc/zrepl/backups.example.com.key"
        client_cns:
          - "prod.example.com"
      ...

    # on prod.example.com
    - type: push
      connect:
        type: tls
        address: "backups.example.com:8888"
        ca: /etc/zrepl/backups.example.com.crt
        cert: /etc/zrepl/prod.example.com.crt
        key: /etc/zrepl/prod.example.com.key
        server_cn: "backups.example.com"
      ...

Certificate Authority using EasyRSA

For more than two machines, it might make sense to set up a CA infrastructure.
Tools like EasyRSA make this very easy:

    #!/usr/bin/env bash
    set -euo pipefail

    HOSTS=(backupserver prod1 prod2 prod3)

    curl -L https://github.com/OpenVPN/easy-rsa/releases/download/v3.0.7/EasyRSA-3.0.7.tgz > EasyRSA-3.0.7.tgz
    echo "157d2e8c115c3ad070c1b2641a4c9191e06a32a8e50971847a718251eeb510a8  EasyRSA-3.0.7.tgz" | sha256sum -c
    rm -rf EasyRSA-3.0.7
    tar -xf EasyRSA-3.0.7.tgz
    cd EasyRSA-3.0.7
    ./easyrsa
    ./easyrsa init-pki
    ./easyrsa build-ca nopass
    for host in "${HOSTS[@]}"; do
        ./easyrsa build-serverClient-full "$host" nopass
        echo "cert for host $host available at pki/issued/$host.crt"
        echo "key for host $host available at pki/private/$host.key"
    done
    echo "ca cert available at pki/ca.crt"

ssh+stdinserver Transport

ssh+stdinserver uses the ssh command and some features of the server-side SSH authorized_keys file. It is less efficient than other transports because the data passes through two more pipes. However, it is fairly convenient to set up and allows the zrepl daemon to not be directly exposed to the internet, because all traffic passes through the system's SSH server.

The concept is inspired by git shell and Borg Backup. The implementation is provided by the Go package github.com/problame/go-netssh.

NOTE: ssh+stdinserver generally provides inferior error detection and handling compared to the tcp and tls transports. When encountering such problems, consider using tcp or tls transports, or help improve package go-netssh.

Serve

    jobs:
    - type: source
      serve:
        type: stdinserver
        client_identities:
        - "client1"
        - "client2"
      ...

First of all, note that type=stdinserver in this case: Currently, only connect.type=ssh+stdinserver can connect to a serve.type=stdinserver, but we want to keep that option open for future extensions.

The serving job opens a UNIX socket named after client_identity in the runtime directory. In our example above, that is /var/run/zrepl/stdinserver/client1 and /var/run/zrepl/stdinserver/client2.
On the same machine, the zrepl stdinserver $client_identity command connects to /var/run/zrepl/stdinserver/$client_identity. It then passes its stdin and stdout file descriptors to the zrepl daemon via cmsg(3). The zrepl daemon in turn combines them into an object implementing net.Conn: a Write() turns into a write to stdout, a Read() turns into a read from stdin.

Interactive use of the stdinserver subcommand does not make much sense. However, we can force its execution when a user with a particular SSH pubkey connects via SSH. This can be achieved with an entry in the authorized_keys file of the serving zrepl daemon.

    # for OpenSSH >= 7.2
    command="zrepl stdinserver CLIENT_IDENTITY",restrict CLIENT_SSH_KEY
    # for older OpenSSH versions
    command="zrepl stdinserver CLIENT_IDENTITY",no-port-forwarding,no-X11-forwarding,no-pty,no-agent-forwarding,no-user-rc CLIENT_SSH_KEY

• CLIENT_IDENTITY is substituted with an entry from client_identities in our example
• CLIENT_SSH_KEY is substituted with the public part of the SSH keypair specified in the connect.identity_file directive on the connecting host.

NOTE: You may need to adjust the PermitRootLogin option in /etc/ssh/sshd_config to forced-commands-only or higher for this to work. Refer to sshd_config(5) for details.

To recap, this is how client authentication works with the ssh+stdinserver transport:

• Connections to the /var/run/zrepl/stdinserver/${client_identity} UNIX socket are blindly trusted by the zrepl daemon. The connection client identity is the name of the socket, i.e. ${client_identity}.
• Thus, the runtime directory must be private to the zrepl user (this is checked by the zrepl daemon).
• The admin of the host with the serving zrepl daemon controls the authorized_keys file.
• Thus, the administrator controls the mapping PUBKEY -> CLIENT_IDENTITY.
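The PUBKEY -> CLIENT_IDENTITY mapping described above boils down to one authorized_keys line per client. A minimal sketch of generating such a line for OpenSSH >= 7.2 follows. The function name is hypothetical, and the character check is a deliberately conservative assumption about what is safe as a ZFS dataset path component, not zrepl's own validation rule.

```shell
#!/bin/sh
# Hypothetical helper: print the authorized_keys entry that forces
# "zrepl stdinserver CLIENT_IDENTITY" for the given public key.
# usage: authorized_keys_entry CLIENT_IDENTITY PUBKEY_FILE
authorized_keys_entry() {
    identity=$1
    # Conservative assumption: only allow characters that are safe in a
    # ZFS dataset path component and in an authorized_keys option string.
    case $identity in
        ''|*[!A-Za-z0-9_.:-]*)
            echo "invalid client identity: $identity" >&2
            return 1
            ;;
    esac
    # Read the public key; command substitution strips the trailing newline.
    pubkey=$(cat "$2") || return 1
    printf 'command="zrepl stdinserver %s",restrict %s\n' "$identity" "$pubkey"
}
```

The output can be appended to the authorized_keys file of the user that runs the serving zrepl daemon, e.g. authorized_keys_entry client1 client1.pub >> ~/.ssh/authorized_keys (the file names here are examples).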
Connect

    jobs:
    - type: pull
      connect:
        type: ssh+stdinserver
        host: prod.example.com
        user: root
        port: 22
        identity_file: /etc/zrepl/ssh/identity
        # options: # optional, default [], `-o` arguments passed to ssh
        # - "Compression=yes"
        # dial_timeout: 10s # optional, default 10s, max time.Duration until initial handshake is completed

The connecting zrepl daemon

1. Creates a pipe
2. Forks
3. In the forked process
   1. Replaces forked stdin and stdout with the corresponding pipe ends
   2. Executes the ssh binary found in $PATH.
      1. The identity file (-i) is set to $identity_file.
      2. The remote user, host and port correspond to those configured.
      3. Further options can be specified using the options field, which appends each entry in the list to the command line using -o $entry.
4. Wraps the pipe ends in a net.Conn and returns it to the RPC layer.

As discussed in the section above, the connecting zrepl daemon expects that zrepl stdinserver $client_identity is executed automatically via an authorized_keys file entry.

The known_hosts file used by the ssh command must contain an entry for connect.host prior to starting zrepl. Thus, run the following on the pulling host's command line (substituting connect.host):

    ssh -i /etc/zrepl/ssh/identity root@prod.example.com

NOTE: The environment variables of the underlying SSH process are cleared. $SSH_AUTH_SOCK will not be available. It is suggested to create a separate, unencrypted SSH key solely for that purpose.

local Transport

The local transport can be used to implement local replication, i.e., push replication between a push and sink job defined in the same configuration file.

The listener_name is analogous to a hostname and must match between serve and connect. The client_identity is used by the sink as documented above.

    jobs:
    - type: sink
      serve:
        type: local
        listener_name: localsink
      ...

    - type: push
      connect:
        type: local
        listener_name: localsink
        client_identity: local_backup
        dial_timeout: 2s # optional, 0 for no timeout
      ...
Filter Syntax

For source, push and snap jobs, a filesystem filter must be defined (field filesystems). A filter takes a filesystem path (in the ZFS filesystem hierarchy) as parameter and returns true (pass) or false (block).

A filter is specified as a YAML dictionary with patterns as keys and booleans as values. The following rules determine which result is chosen for a given filesystem path:

• More specific path patterns win over less specific ones
• Non-wildcard patterns (full path patterns) win over subtree wildcards (< at end of pattern)
• If the path in question does not match any pattern, the result is false.

The subtree wildcard < means "the dataset left of < and all its children".

TIP: You can try out patterns for a configured job using the zrepl test filesystems subcommand for push and source jobs.

Examples

Full Access

The following configuration will allow access to all filesystems.

    jobs:
    - type: source
      filesystems: {
        "<": true,
      }
      ...

Fine-grained

The following configuration demonstrates all rules presented above.

    jobs:
    - type: source
      filesystems: {
        "tank<": true,          # rule 1
        "tank/foo<": false,     # rule 2
        "tank/foo/bar": true,   # rule 3
      }
      ...

Which rule applies to a given path, and what is the result?

    tank/foo/bar/loo => 2    false
    tank/bar         => 1    true
    tank/foo/bar     => 3    true
    zroot            => NONE false
    tank/var/log     => 1    true

Send & Recv Options

Send Options

Source and push jobs have an optional send configuration section.

    jobs:
    - type: push
      filesystems: ...
      send:
        # flags from the table below go here
      ...

The following table specifies the list of (boolean) options. Flags with an entry in the zfs send column map directly to the zfs send CLI flags. zrepl does not perform feature checks for these flags. If you enable a flag that is not supported by the installed version of ZFS, the zfs error will show up at runtime in the logs and zrepl status. See the upstream man page (man zfs-send) for their semantics.

+-------------------+----------+----------------------------------------------+
| send.             | zfs send | Comment                                      |
+-------------------+----------+----------------------------------------------+
| encrypted         |          | Specific to zrepl, see below.                |
+-------------------+----------+----------------------------------------------+
| bandwidth_limit   |          | Specific to zrepl, see below.                |
+-------------------+----------+----------------------------------------------+
| raw               | -w       | Use encrypted to only allow encrypted sends. |
|                   |          | Mixed sends are not supported.               |
+-------------------+----------+----------------------------------------------+
| send_properties   | -p       | Be careful, read the note on property        |
|                   |          | replication below.                           |
+-------------------+----------+----------------------------------------------+
| backup_properties | -b       | Be careful, read the note on property        |
|                   |          | replication below.                           |
+-------------------+----------+----------------------------------------------+
| large_blocks      | -L       | Potential data loss on OpenZFS < 2.0, see    |
|                   |          | warning below.                               |
+-------------------+----------+----------------------------------------------+
| compressed        | -c       |                                              |
+-------------------+----------+----------------------------------------------+
| embedded_data     | -e       |                                              |
+-------------------+----------+----------------------------------------------+
| saved             | -S       |                                              |
+-------------------+----------+----------------------------------------------+

encrypted

The encrypted option controls whether the matched filesystems are sent as OpenZFS native encryption raw sends. More specifically, if encrypted=true, zrepl

• checks, for each of the filesystems matched by filesystems, whether the ZFS encryption property indicates that the filesystem is actually encrypted with ZFS native encryption and
• invokes the zfs send subcommand with the -w option (raw sends) and
• expects the receiving side to support OpenZFS native encryption (recv will fail otherwise)

Filesystems matched by filesystems that are not encrypted are not sent and will cause error log messages.

If encrypted=false, zrepl expects that filesystems matching filesystems are not encrypted or have loaded encryption keys.
NOTE: Use encrypted instead of raw to make your intent clear that zrepl must only replicate filesystems that are actually encrypted by Open- ZFS native encryption. It is meant as a safeguard to prevent unin- tended sends of unencrypted filesystems in raw mode. properties Sends the dataset properties along with snapshots. Please be careful with this option and read the note on property replication below. backup_properties When properties are modified on a filesystem that was received from a send stream with send.properties=true, ZFS archives the original re- ceived value internally. This also applies to inheriting or overriding properties during zfs receive. When sending those received filesystems another hop, the backup_proper- ties flag instructs ZFS to send the original property values rather than the current locally set values. This is useful for replicating properties across multiple levels of backup machines. Example: Suppose we want to flow snapshots from Ma- chine A to B, then from B to C. A will enable the properties send op- tion. B will want to override critical properties such as mountpoint or canmount. But the job that replicates from B to C should be sending the original property values received from A. Thus, B sets the backup_properties option. Please be careful with this option and read the note on property repli- cation below. large_blocks This flag should not be changed after initial replication. Prior to OpenZFS commit 7bcb7f08 it was possible to change this setting which resulted in data loss on the receiver. The commit in question is in- cluded in OpenZFS 2.0 and works around the problem by prohibiting re- ceives of incremental streams with a flipped setting. WARNING: This bug has not been fixed in the OpenZFS 0.8 releases which means that changing this flag after initial replication might cause data loss on the receiver. 
Recv Options Sink and pull jobs have an optional recv configuration section: jobs: - type: pull recv: properties: inherit: - "mountpoint" override: { "org.openzfs.systemd:ignore": "on" } bandwidth_limit: ... placeholder: encryption: unspecified | off | inherit ... Jump to properties , bandwidth_limit , and placeholder. properties override maps directly to the zfs recv -o flag. Property name-value pairs specified in this map will apply to all received filesystems, re- gardless of whether the send stream contains properties or not. inherit maps directly to the zfs recv -x flag. Property names speci- fied in this list will be inherited from the receiving side's parent filesystem (e.g. root_fs). With both options, the sending side's property value is still stored on the receiver, but the local override or inherit is the one that takes effect. You can send the original properties from the first receiver to another receiver using send.backup_properties. A Note on Property Replication If a send stream contains properties, as per send.properties or send.backup_properties, the default ZFS behavior is to use those prop- erties on the receiving side, verbatim. In many use cases for zrepl, this can have devastating consequences. For example, when backing up a filesystem that has mountpoint=/ to a storage server, that storage server's root filesystem will be shadowed by the received file system on some platforms. Also, many scripts and tools use ZFS user properties for configuration and do not check the property source (local vs. received). If they are installed on the re- ceiving side as well as the sending side, property replication could have unintended effects. zrepl currently does not provide any automatic safe-guards for property replication: • Make sure to read the entire man page on zfs recv (man zfs recv) be- fore enabling this feature. • Use recv.properties.override whenever possible, e.g. for mount- point=none or canmount=off. 
• Use recv.properties.inherit if that makes more sense to you.

Below is a non-exhaustive list of problematic properties, both with regard to core ZFS tools and to other software in the broader ecosystem. Please open a pull request if you find a property that is missing from this list.

Mount behaviour
• mountpoint
• canmount
• overlay

Note: Before OpenZFS 2.0.5, inheriting or overriding the mountpoint property on ZVOLs fails in zfs recv. If you are on such an older version, consider creating separate zrepl jobs for your ZVOL and filesystem datasets.

Systemd
With systemd, you should also consider the properties processed by the zfs-mount-generator. Most notably:
• org.openzfs.systemd:ignore
• org.openzfs.systemd:wanted-by
• org.openzfs.systemd:required-by

Encryption
If the sender filesystems are encrypted but the sender does plain sends and property replication is enabled, the receiver must inherit the following properties:
• keylocation
• keyformat
• encryption

Placeholders

    placeholder:
      encryption: unspecified | off | inherit

During replication, zrepl creates placeholder datasets on the receiving side if the sending side's filesystems filter creates gaps in the dataset hierarchy. This is generally fully transparent to the user. However, with OpenZFS Native Encryption, placeholders require user attention. Specifically, when zrepl attempts to create the placeholder dataset on the receiver and that placeholder's parent dataset is encrypted, ZFS wants to inherit encryption to the placeholder. This is relevant to two use cases that zrepl supports:

1. encrypted-send-to-untrusted-receiver: The sender sends an encrypted send stream and the receiver doesn't have the key loaded.

2. send-plain-encrypt-on-receive: The receive-side root_fs dataset is encrypted, and the senders are unencrypted. The key of root_fs is loaded, and the goal is that the plain sends (e.g., from production) are encrypted on-the-fly during receive, with root_fs's key.

For encrypted-send-to-untrusted-receiver, the placeholder datasets need to be created with -o encryption=off. Without it, creation would fail with an error indicating that the key of the placeholder's parent dataset needs to be loaded. But we don't trust the receiver, so we can't expect that to ever happen.

However, for send-plain-encrypt-on-receive, we cannot set -o encryption=off. If we did, any of the (non-placeholder) child datasets below the placeholder would inherit encryption=off, thereby silently breaking our encrypt-on-receive use case. So, to cover this use case, we need to create placeholders without specifying -o encryption. This will make zfs create inherit the encryption mode from the parent dataset, and thereby transitively from root_fs.

The zrepl config provides the recv.placeholder.encryption knob to control this behavior. In unspecified mode (default), placeholder creation bails out and asks the user to configure a behavior. In off mode, the placeholder is created with encryption=off, i.e., the encrypted-send-to-untrusted-receiver use case. In inherit mode, the placeholder is created without specifying -o encryption at all, i.e., the send-plain-encrypt-on-receive use case.

Common Options

Bandwidth Limit (send & recv)

    bandwidth_limit:
      max: 23.5 MiB # -1 is the default and disables rate limiting
      bucket_capacity: # token bucket capacity in bytes; defaults to 128KiB

Both send and recv can be limited to a maximum bandwidth through bandwidth_limit. For most users, it should be sufficient to just set bandwidth_limit.max. The bandwidth_limit.bucket_capacity refers to the token bucket size.

The bandwidth limit only applies to the payload data, i.e., the ZFS send stream. It does not account for transport protocol overheads.
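To give a rough intuition for how a rate (max) and a bucket size (bucket_capacity) interact, here is a minimal token-bucket sketch in Python. It is not zrepl's implementation: the class name, the method names and the simulated clock are all hypothetical and chosen only for illustration.

```python
class TokenBucket:
    """Illustrative token bucket with a simulated clock (hypothetical, not zrepl code)."""

    def __init__(self, rate_bytes_per_s, capacity_bytes):
        self.rate = rate_bytes_per_s     # plays the role of bandwidth_limit.max
        self.capacity = capacity_bytes   # plays the role of bucket_capacity
        self.tokens = capacity_bytes     # the bucket starts full
        self.now = 0.0                   # simulated time in seconds

    def advance(self, seconds):
        """Let time pass: the bucket refills at `rate`, capped at `capacity`."""
        self.now += seconds
        self.tokens = min(self.capacity, self.tokens + seconds * self.rate)

    def wait_time(self, nbytes):
        """Return how long a sender has to wait before `nbytes` may pass,
        advance the simulated clock by that time and consume the tokens."""
        if nbytes > self.capacity:
            raise ValueError("chunk is larger than the bucket capacity")
        deficit = nbytes - self.tokens
        delay = max(0.0, deficit / self.rate)
        self.advance(delay)
        self.tokens -= nbytes
        return delay


bucket = TokenBucket(rate_bytes_per_s=1024 * 1024, capacity_bytes=128 * 1024)
first = bucket.wait_time(128 * 1024)   # a full bucket absorbs the first burst
second = bucket.wait_time(64 * 1024)   # the next chunk has to wait for a refill
```

In this model a larger bucket allows longer bursts above the configured rate, while the long-term average stays bounded by the rate.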
The scope is the job level, i.e., all concurrent sends or incoming re- ceives of a job share the bandwidth limit. Replication Options jobs: - type: push filesystems: ... replication: protection: initial: guarantee_resumability # guarantee_{resumability,incremental,nothing} incremental: guarantee_resumability # guarantee_{resumability,incremental,nothing} concurrency: size_estimates: 4 steps: 1 ... protection option The protection variable controls the degree to which a replicated filesystem is protected from getting out of sync through a zrepl pruner or external tools that destroy snapshots. zrepl can guarantee resumability or just incremental replication. guarantee_resumability is the default value and guarantees that a replication step is always resumable and that incremental replication will always be possible. The implementation uses replication cursors, last-received-hold and step holds. guarantee_incremental only guarantees that incremental replication will always be possible. If a step from -> to is interrupted and its to snapshot is destroyed, zrepl will remove the half-received to's resume state and start a new step from -> to2. The implementation uses repli- cation cursors, tentative replication cursors and last-received-hold. guarantee_nothing does not make any guarantees with regards to keeping sending and receiving side in sync. No bookmarks or holds are created to protect sender and receiver from diverging. Tradeoffs Using guarantee_incremental instead of guarantee_resumability obviously removes the resumability guarantee. This means that replication progress is no longer monotonic which might lead to a replication setup that never makes progress if mid-step interruptions are too frequent (e.g. frequent network outages). 
However, the advantage and reason for existence of the incremental mode is that it allows the pruner to delete snapshots of interrupted replication steps which is useful if replication happens so rarely (or fails so frequently) that the amount of disk space exclusively referenced by the step's snapshots becomes intolerable. NOTE: When changing this flag, obsoleted zrepl-managed bookmarks and holds will be destroyed on the next replication step that is attempted for each filesystem. concurrency option The concurrency options control the maximum amount of concurrency dur- ing replication. The default values allow some concurrency during size estimation but no parallelism for the actual replication. • concurrency.steps (default = 1) controls the maximum number of con- currently executed replication steps. The planning step for each file system is counted as a single step. • concurrency.size_estimates (default = 4) controls the maximum number of concurrent step size estimations done by the job. Note that initial replication cannot start replicating child filesys- tems before the parent filesystem's initial replication step has com- pleted. Some notes on tuning these values: • Disk: Size estimation is less I/O intensive than step execution be- cause it does not need to access the data blocks. • CPU: Size estimation is usually a dense CPU burst whereas step execu- tion CPU utilization is stretched out over time because of disk IO. Faster disks, sending a compressed dataset in plain mode and the zrepl transport mode all contribute to higher CPU requirements. • Network bandwidth: Size estimation does not consume meaningful amounts of bandwidth, step execution does. • zrepl ZFS abstractions: for each replication step zrepl needs to up- date its ZFS abstractions through the zfs command which often waits multiple seconds for the zpool to sync. 
Thus, if the actual send & recv time of a step is small compared to the time spent on zrepl ZFS abstractions, then increasing step execution concurrency will result in a lower overall turnaround time.

Conflict Resolution Options

    jobs:
    - type: push
      filesystems: ...
      conflict_resolution:
        initial_replication: most_recent | all | fail # default: most_recent
      ...

initial_replication option

The initial_replication option determines how many snapshots zrepl replicates if the filesystem has not been replicated before. If most_recent (the default), the initial replication will only transfer the most recent snapshot and ignore previous snapshots. If all snapshots should be replicated, specify all. Use fail to make replication of the filesystem fail in case there is no corresponding filesystem on the receiver.

For example, suppose there are snapshots tank@1, tank@2, tank@3 on a sender. Then most_recent will replicate just @3, but all will replicate @1, @2, and @3.

If initial replication is interrupted, and there is at least one (possibly partial) snapshot on the receiver, zrepl will always resume in incremental mode, regardless of where the initial replication was interrupted.

For example, if initial_replication: all is configured and the transfer of @1 is interrupted, zrepl will retry/resume at @1. Even if the user changes the config to initial_replication: most_recent before resuming, incremental mode will still resume at @1.

Taking Snapshots

You can configure zrepl to take snapshots of the filesystems in the filesystems field specified in push, source and snap jobs.

The following snapshotting types are supported:
+-------------------+----------------------------+ | snapshotting.type | Comment | +-------------------+----------------------------+ | periodic | Ensure that snapshots are | | | taken at a particular in- | | | terval. | +-------------------+----------------------------+ | cron | Use cron spec to take | | | snapshots at particular | | | points in time.
| +-------------------+----------------------------+ | manual | zrepl does not take any | | | snapshots by itself. | +-------------------+----------------------------+ The periodic and cron snapshotting types share some common options and behavior: • Naming: The snapshot names are composed of a user-defined prefix fol- lowed by a UTC date formatted like 20060102_150405_000. We use UTC because it will avoid name conflicts when switching time zones or be- tween summer and winter time. • Hooks: You can configure hooks to run before or after zrepl takes the snapshots. See below for details. • Push replication: After creating all snapshots, the snapshotter will wake up the replication part of the job, if it's a push job. Note that snapshotting is decoupled from replication, i.e., if it is down or takes too long, snapshots will still be taken. Note further that other jobs are not woken up by snapshotting. NOTE: There is no concept of ownership of the snapshots that are cre- ated by periodic or cron. Thus, there is no distinction between zrepl-created snapshots and user-created snapshots during repli- cation or pruning. In particular, pruning will take all snapshots into considera- tion by default. To constrain pruning to just zrepl-created snapshots: 1. Assign a unique prefix to the snapshotter and 2. Use the regex functionality of the various pruning keep rules to just consider snapshots with that prefix. There is currently no way to constrain replication to just zrepl-created snapshots. Follow and comment at issue #403 if you need this functionality. NOTE: The zrepl signal wakeup JOB subcommand does not trigger snapshot- ting. periodic Snapshotting jobs: - ... filesystems: { ... } snapshotting: type: periodic prefix: zrepl_ interval: 10m # Timestamp format that is used as snapshot suffix. # Can be any of "dense" (default), "human", "iso-8601", "unix-seconds" or a custom Go time format (see https://go.dev/src/time/format.go) timestamp_format: dense hooks: ... 
pruning: ...

The periodic snapshotter ensures that snapshots are taken in the specified interval. If you use zrepl for backup, this translates into your recovery point objective (RPO). To meet your RPO, you still need to monitor that replication, which happens asynchronously to snapshotting, actually works.

It is desirable to get all filesystems snapshotted simultaneously because it results in a more consistent backup. To accomplish this while still maintaining the interval, the periodic snapshotter attempts to get the snapshotting rhythms in sync. To find that sync point, the most recent snapshot, created by the snapshotter, in any of the matched filesystems is used. A filesystem that does not have snapshots by the snapshotter has lower priority than filesystems that do, and thus might not be snapshotted (and replicated) until the next sync point. The snapshotter uses the prefix to identify which snapshots it created.

cron Snapshotting

    jobs:
    - type: snap
      filesystems: { ... }
      snapshotting:
        type: cron
        prefix: zrepl_
        # (second, optional) minute hour day-of-month month day-of-week
        # This example takes snapshots daily at 3:00.
        cron: "0 3 * * *"
        # Timestamp format that is used as snapshot suffix.
        # Can be any of "dense" (default), "human", "iso-8601", "unix-seconds" or a custom Go time format (see https://go.dev/src/time/format.go)
        timestamp_format: dense
      pruning: ...

In cron mode, the snapshotter takes snapshots at fixed points in time. See https://en.wikipedia.org/wiki/Cron for details on the syntax. zrepl uses the github.com/robfig/cron/v3 Go package for parsing. An optional field for "seconds" is supported to take snapshots at sub-minute frequencies.

Timestamp Format

The cron and periodic snapshotter support configuring a custom timestamp format that is used as suffix for the snapshot name. It can be used by setting timestamp_format to any of the following values:
• dense (default) looks like 20060102_150405_000
• human looks like 2006-01-02_15:04:05
• iso-8601 looks like 2006-01-02T15:04:05.000Z
• unix-seconds looks like 1136214245
• Any custom Go time format accepted by time.Time#Format.

manual Snapshotting

    jobs:
    - type: push
      snapshotting:
        type: manual
      ...

In manual mode, zrepl does not take snapshots by itself. Manual snapshotting is most useful if you have existing infrastructure for snapshot management, or if you want to decouple snapshot management from replication using a zrepl snap job. See this quickstart guide for an example.

To trigger replication after taking snapshots, use the zrepl signal wakeup JOB command.

Pre- and Post-Snapshot Hooks

Jobs with periodic snapshots can run hooks, specified in snapshotting.hooks, before and/or after taking the snapshot.

Hooks are called per filesystem before and after the snapshot is taken (pre- and post-edge). Pre-edge invocations are in configuration order, post-edge invocations in reverse order, i.e. like a stack. If a pre-snapshot invocation fails, err_is_fatal=true cuts off subsequent hooks, does not take a snapshot, and only invokes post-edges corresponding to previous successful pre-edges. err_is_fatal=false logs the failed pre-edge invocation but affects neither subsequent hooks nor snapshotting itself. Post-edges are only invoked for hooks whose pre-edges ran without error. Note that hook failures for one filesystem never affect other filesystems.

The optional timeout parameter specifies a period after which zrepl will kill the hook process and report an error. The default is 30 seconds; the value may be specified in any unit understood by Go's time.ParseDuration.

The optional filesystems filter limits the filesystems the hook runs for. It uses the same filter specification as jobs.

Most hook types take additional parameters; please refer to the respective subsections below.
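The stack-like ordering and the effect of err_is_fatal described above can be traced with a small simulation. This is only a sketch of the documented rules for a single filesystem, not zrepl's code: the function name and the tuple layout (name, pre_succeeds, err_is_fatal) are hypothetical, and failures of post-edge invocations are not modelled.

```python
def simulate_hooks(hooks):
    """Trace hook invocations for one filesystem.

    hooks: list of (name, pre_succeeds, err_is_fatal) in configuration order.
    Returns the list of events in the order they would happen.
    """
    events = []
    ran_ok = []            # hooks whose pre-edge ran without error
    take_snapshot = True
    for name, pre_succeeds, err_is_fatal in hooks:
        if pre_succeeds:
            events.append(f"pre {name}")
            ran_ok.append(name)
            continue
        events.append(f"pre {name} FAILED")
        if err_is_fatal:
            # cut off subsequent hooks and skip the snapshot
            take_snapshot = False
            break
    if take_snapshot:
        events.append("snapshot")
    # post-edges run in reverse order, only for successful pre-edges
    for name in reversed(ran_ok):
        events.append(f"post {name}")
    return events


# hook "b" fails fatally: "c" never runs, no snapshot is taken, "a" is unwound
trace = simulate_hooks([("a", True, False), ("b", False, True), ("c", True, False)])
```

Running the same function with err_is_fatal=False for the failing hook shows that the snapshot is still taken and only the failed hook's post-edge is skipped.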
+---------------------+---------+---------------------+ | Hook type | Details | Description | +---------------------+---------+---------------------+ | command | Details | Arbitrary pre- and | | | | post snapshot | | | | scripts. | +---------------------+---------+---------------------+ | postgres-checkpoint | Details | Execute Postgres | | | | CHECKPOINT SQL com- | | | | mand before snap- | | | | shot. | +---------------------+---------+---------------------+ | mysql-lock-tables | Details | Flush and read-Lock | | | | MySQL tables while | | | | taking the snap- | | | | shot. | +---------------------+---------+---------------------+ command Hooks jobs: - type: push filesystems: { "<": true, "tmp": false } snapshotting: type: periodic prefix: zrepl_ interval: 10m hooks: - type: command path: /etc/zrepl/hooks/zrepl-notify.sh timeout: 30s err_is_fatal: false - type: command path: /etc/zrepl/hooks/special-snapshot.sh filesystems: { "tank/special": true } ... command hooks take a path to an executable script or binary to be exe- cuted before and after the snapshot. path must be absolute (e.g. /etc/zrepl/hooks/zrepl-notify.sh). No arguments may be specified; cre- ate a wrapper script if zrepl must call an executable that requires ar- guments. The process standard output is logged at level INFO. Standard error is logged at level WARN. The following environment variables are set: • ZREPL_HOOKTYPE: either "pre_snapshot" or "post_snapshot" • ZREPL_FS: the ZFS filesystem name being snapshotted • ZREPL_SNAPNAME: the zrepl-generated snapshot name (e.g. zrepl_20380119_031407_000) • ZREPL_DRYRUN: set to "true" if a dry run is in progress so scripts can print, but not run, their commands An empty template hook can be found in config/samples/hooks/template.sh. postgres-checkpoint Hook Connects to a Postgres server and executes the CHECKPOINT statement pre-snapshot. Checkpointing applies the WAL contents to all data files and syncs the data files to disk. 
This is not required for a consistent database backup: it merely forward-pays the "cost" of WAL replay to the time of snapshotting instead of at restore. However, the Postgres manual recommends against checkpointing during normal operation. Further, the operation requires Postgres superuser privileges. zrepl users must decide on their own whether this hook is useful for them (it likely isn't).

ATTENTION: Note that WALs and the Postgres data directory (with all database data files) must be on the same filesystem to guarantee a correct point-in-time backup with the ZFS snapshot.

DSN syntax documented here: https://godoc.org/github.com/lib/pq

    CREATE USER zrepl_checkpoint PASSWORD 'yourpasswordhere';
    ALTER ROLE zrepl_checkpoint SUPERUSER;

    - type: postgres-checkpoint
      dsn: "host=localhost port=5432 user=postgres password=yourpasswordhere sslmode=disable"
      filesystems: { "p1/postgres/data11": true }

mysql-lock-tables Hook

Connects to MySQL and executes
• pre-snapshot: FLUSH TABLES WITH READ LOCK to lock all tables in all databases in the MySQL server we connect to
• post-snapshot: UNLOCK TABLES to reverse the above operation.

The above procedure is documented in the MySQL manual as a means to produce a consistent backup of a MySQL DBMS installation (i.e., all databases).

DSN syntax: [username[:password]@][protocol[(address)]]/dbname[?param1=value1&...&paramN=valueN]

ATTENTION: All MySQL databases must be on the same ZFS filesystem to guarantee a consistent point-in-time backup with the ZFS snapshot.

    CREATE USER zrepl_lock_tables IDENTIFIED BY 'yourpasswordhere';
    GRANT RELOAD ON *.* TO zrepl_lock_tables;
    FLUSH PRIVILEGES;

    - type: mysql-lock-tables
      dsn: "zrepl_lock_tables:yourpasswordhere@tcp(localhost)/"
      filesystems: { "tank/mysql": true }

Pruning Policies

In zrepl, pruning means destroying snapshots. Pruning must happen on both sides of a replication or the systems would inevitably run out of disk space at some point.
Typically, the requirements to temporal resolution and maximum reten- tion time differ per side. For example, when using zrepl to back up a busy database server, you will want high temporal resolution (snapshots every 10 min) for the last 24h in case of administrative disasters, but cannot afford to store them for much longer because you might have high turnover volume in the database. On the receiving side, you may have more disk space available, or need to comply with other backup reten- tion policies. zrepl uses a set of keep rules per sending and receiving side to de- termine which snapshots shall be kept per filesystem. A snapshot that is not kept by any rule is destroyed. The keep rules are evaluated on the active side (push or pull job) of the replication setup, for both active and passive side, after replication completed or was determined to have failed permanently. Example Configuration: jobs: - type: push name: ... connect: ... filesystems: { "<": true, "tmp": false } snapshotting: type: periodic prefix: zrepl_ interval: 10m pruning: keep_sender: - type: not_replicated # make sure manually created snapshots by the administrator are kept - type: regex regex: "^manual_.*" - type: grid grid: 1x1h(keep=all) | 24x1h | 14x1d regex: "^zrepl_.*" keep_receiver: - type: grid grid: 1x1h(keep=all) | 24x1h | 35x1d | 6x30d regex: "^zrepl_.*" # manually created snapshots will be kept forever on receiver - type: regex regex: "^manual_.*" DANGER: You might have existing snapshots of filesystems affected by pruning which you want to keep, i.e. not be destroyed by zrepl. Make sure to actually add the necessary regex keep rules on both sides, like with manual in the example above. Policy not_replicated jobs: - type: push pruning: keep_sender: - type: not_replicated ... not_replicated keeps all snapshots that have not been replicated to the receiving side. It only makes sense to specify this rule for the keep_sender. 
The reason is that, by definition, all snapshots on the receiver have already been replicated there from the sender. To determine whether a sender-side snapshot has already been replicated, zrepl uses the replication cursor bookmark which corresponds to the most recent successfully replicated snapshot.

Policy grid

    jobs:
    - type: pull
      pruning:
        keep_receiver:
        - type: grid
          regex: "^zrepl_.*"
          grid: 1x1h(keep=all) | 24x1h | 35x1d | 6x30d
          #     1x1h(keep=all): 1 repetition of a one-hour interval with keep=all
          #     24x1h:          24 repetitions of a one-hour interval with keep=1
          #     35x1d:          35 repetitions of a one-day interval with keep=1
          #     6x30d:          6 repetitions of a 30-day interval with keep=1
      ...

The retention grid can be thought of as a time-based sieve that thins out snapshots as they get older.

The grid field specifies a list of adjacent time intervals. Each interval is a bucket with a maximum capacity of keep snapshots. The following procedure happens during pruning:

1. The list of snapshots is filtered by the regular expression in regex. Only snapshots whose names match the regex are considered for this rule; all others will be pruned unless another rule keeps them.
2. The snapshots that match regex are placed onto a time axis according to their creation date. The youngest snapshot is on the left, the oldest on the right.
3. The first bucket is placed "under" that axis so that its left edge aligns with the youngest snapshot.
4. All subsequent buckets are placed adjacent to their predecessor bucket.
5. Now each snapshot on the axis either falls into one bucket or it is older than our rightmost bucket. Buckets are left-inclusive and right-exclusive, which means that a snapshot on the edge between two buckets always falls into the right (older) one.
6. Snapshots older than the rightmost bucket are not kept by the grid specification.
7. For each bucket, only the keep oldest snapshots are kept.

The syntax to describe the bucket list is as follows:

    Repeat x Duration (keep=all)

• The duration specifies the length of the interval.
• The keep count specifies the number of snapshots that fit into the bucket. It can be either a positive integer or all (all snapshots are kept).
• The repeat count repeats the bucket definition for the specified number of times.

Example:

    Assume the following grid specification:

       grid: 1x1h(keep=all) | 2x2h | 1x3h

    This grid specification produces the following constellation of buckets:

    0h        1h        2h        3h        4h        5h        6h        7h        8h        9h
    |         |         |         |         |         |         |         |         |         |
    |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------|
    | keep=all|      keep=1       |       keep=1      |            keep=1           |

    Now assume that we have a set of snapshots @a, @b, ..., @D.
    Snapshot @a is the most recent snapshot.
    Snapshot @D is the oldest snapshot, it is almost 9 hours older than snapshot @a.
    We place the snapshots on the same timeline as the buckets:

    0h        1h        2h        3h        4h        5h        6h        7h        8h        9h
    |         |         |         |         |         |         |         |         |         |
    |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------|
    | keep=all|      keep=1       |       keep=1      |            keep=1           |
    |         |                   |                   |                             |
    | a  b  c | d  e  f  g  h  i  |j  k  l  m  n  o  p|q  r  s  t  u  v  w  x  y  z |A B C D

    We obtain the following mapping of snapshots to buckets:

    Bucket1:   a,b,c
    Bucket2:   d,e,f,g,h,i
    Bucket3:   j,k,l,m,n,o,p
    Bucket4:   q,r,s,t,u,v,w,x,y,z
    No bucket: A,B,C,D

    For each bucket, we now prune snapshots until it only contains `keep` snapshots.
    Newer snapshots are destroyed first.
    Snapshots that do not fall into a bucket are always destroyed.

    Result after pruning:

    0h        1h        2h        3h        4h        5h        6h        7h        8h        9h
    |         |         |         |         |         |         |         |         |         |
    |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------|
    |         |                   |                   |                             |
    | a  b  c |                i  |                  p|                            z |

Policy last_n

    jobs:
    - type: push
      pruning:
        keep_receiver:
        - type: last_n
          count: 10
          regex: ^zrepl_.*$ # optional
      ...

last_n filters the snapshot list by regex, then keeps the last count snapshots in that list (last = youngest = most recent creation date). All snapshots that don't match regex or exceed count in the filtered list are destroyed unless matched by other rules.
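The grid example above, and the last_n rule, can be reproduced with a short Python sketch. It follows the documented procedure only and is not zrepl's implementation: the function names, the restriction to the h and d units, the use of minutes as the age unit and the concrete snapshot ages are all assumptions made for illustration. The regex filtering step is left out.

```python
import re

UNIT_MINUTES = {"h": 60, "d": 24 * 60}   # only the units needed for this sketch


def parse_grid(spec):
    """Expand 'RepeatxDuration(keep=K) | ...' into a list of (minutes, keep).
    keep is None for keep=all and defaults to 1 when omitted."""
    buckets = []
    for part in spec.split("|"):
        m = re.fullmatch(r"\s*(\d+)x(\d+)([hd])(?:\(keep=(all|\d+)\))?\s*", part)
        if m is None:
            raise ValueError(f"cannot parse grid interval: {part!r}")
        repeat, length, unit, keep = m.groups()
        keep = None if keep == "all" else int(keep or 1)
        buckets += [(int(length) * UNIT_MINUTES[unit], keep)] * int(repeat)
    return buckets


def grid_keep(spec, ages):
    """ages maps snapshot name -> age in minutes relative to the youngest
    snapshot. Returns the set of snapshot names kept by the grid."""
    kept = set()
    left = 0
    for length, keep in parse_grid(spec):
        right = left + length
        # buckets are left-inclusive and right-exclusive; sort oldest first
        members = sorted((n for n, a in ages.items() if left <= a < right),
                         key=lambda n: ages[n], reverse=True)
        kept.update(members if keep is None else members[:keep])
        left = right
    return kept


def last_n_keep(count, ages):
    """Keep the `count` youngest snapshots."""
    return set(sorted(ages, key=lambda n: ages[n])[:count])


# Hypothetical ages (in minutes) chosen to match the example timeline.
ages = {}
for names, start, step in [("abc", 0, 20), ("defghi", 60, 20), ("jklmnop", 180, 17),
                           ("qrstuvwxyz", 300, 18), ("ABCD", 485, 15)]:
    for i, name in enumerate(names):
        ages[name] = start + i * step

kept_by_grid = grid_keep("1x1h(keep=all) | 2x2h | 1x3h", ages)
kept_by_last_n = last_n_keep(3, ages)
```

With these ages, kept_by_grid contains a, b, c, i, p and z, which matches the "Result after pruning" diagram, and kept_by_last_n contains the three youngest snapshots a, b and c.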
Policy regex

    jobs:
    - type: push
      pruning:
        keep_receiver:
        # keep all snapshots with prefix zrepl_ or manual_
        - type: regex
          regex: "^(zrepl|manual)_.*"

    - type: push
      snapshotting:
        prefix: zrepl_
      pruning:
        keep_sender:
        # keep all snapshots that were not created by zrepl
        - type: regex
          negate: true
          regex: "^zrepl_.*"

regex keeps all snapshots whose names are matched by the regular expression in regex. Like all other regular expression fields in prune policies, it is interpreted by Go's regexp package, whose RE2 syntax is largely, but not fully, Perl-compatible (Syntax). The optional negate boolean field inverts the semantics: Use it if you want to keep all snapshots that do not match the given regex.

Source-side snapshot pruning

A source job takes snapshots on the system it runs on. The corresponding pull job on the replication target connects to the source job and replicates the snapshots. Afterwards, the pull job coordinates pruning on both sender (the source job side) and receiver (the pull job side).

There is no built-in way to define and execute pruning on the source side independently of the pull side. The source job will continue taking snapshots, which will not be pruned until the pull side connects. This means that extended replication downtime will fill up the source's zpool with snapshots.

If the above is a conceivable situation for you, consider using push mode, where pruning happens on the same side where snapshots are taken.

Workaround using snap job

As a workaround (see GitHub issue #102 for development progress), a pruning-only snap job can be defined on the source side: The snap job is in charge of snapshot creation & destruction, whereas the source job's role is reduced to just serving snapshots. However, since jobs are run independently, it is possible that the snap job will prune snapshots that are queued for replication / destruction by the remote pull job that connects to the source job. Symptoms of such race conditions are spurious replication and destroy errors.
Example configuration:

    # source side
    jobs:
    - type: snap
      snapshotting:
        type: periodic
      pruning:
        keep:
          # source side pruning rules go here
      ...

    - type: source
      snapshotting:
        type: manual
      root_fs: ...

    # pull side
    jobs:
    - type: pull
      pruning:
        keep_sender:
          # let the source-side snap job do the pruning
          - type: regex
            regex: ".*"
          ...
        keep_receiver:
          # feel free to prune on the pull side as desired
          ...

Logging

zrepl uses structured logging to provide users with easily processable log messages.

Logging outlets are configured in the global section of the config file.

    global:
      logging:

        - type: OUTLET_TYPE
          level: MINIMUM_LEVEL
          format: FORMAT

        - type: OUTLET_TYPE
          level: MINIMUM_LEVEL
          format: FORMAT

        ...

    jobs: ...

ATTENTION:
    The first outlet is special: if an error writing to any outlet occurs, the first outlet receives the error and can print it. Thus, the first outlet must be the one that always works and does not block, e.g. stdout, which is the default.

Default Configuration

By default, the following logging configuration is used:

    global:
      logging:

        - type: "stdout"
          level: "warn"
          format: "human"

Building Blocks

The following sections document the semantics of the different log levels, formats and outlet types.

Levels

+-------+-------+--------------------------------------------------------------+
| Level | SHORT | Description                                                  |
+-------+-------+--------------------------------------------------------------+
| error | ERRO  | immediate action required                                    |
+-------+-------+--------------------------------------------------------------+
| warn  | WARN  | symptoms for misconfiguration, soon expected failure, etc.   |
+-------+-------+--------------------------------------------------------------+
| info  | INFO  | explains what happens without too much detail                |
+-------+-------+--------------------------------------------------------------+
| debug | DEBG  | tracing information, state dumps, etc. useful for debugging. |
+-------+-------+--------------------------------------------------------------+

Incorrectly classified messages are considered a bug and should be reported.
Formats

+--------+--------------------------------------------------------------------------+
| Format | Description                                                              |
+--------+--------------------------------------------------------------------------+
| human  | prints job and subsystem into brackets before the actual message,        |
|        | followed by remaining fields in logfmt style                             |
+--------+--------------------------------------------------------------------------+
| logfmt | logfmt output. zrepl uses this Go package.                               |
+--------+--------------------------------------------------------------------------+
| json   | JSON formatted output. Each line is a valid JSON document. Fields are    |
|        | marshaled by encoding/json.Marshal(), which is particularly useful for   |
|        | processing in log aggregation or when processing state dumps.            |
+--------+--------------------------------------------------------------------------+

Outlets

Outlets are the destination for log entries.

stdout Outlet

+-----------+--------------------------------------------------------+
| Parameter | Comment                                                |
+-----------+--------------------------------------------------------+
| type      | stdout                                                 |
+-----------+--------------------------------------------------------+
| level     | minimum log level                                      |
+-----------+--------------------------------------------------------+
| format    | output format                                          |
+-----------+--------------------------------------------------------+
| time      | always include time in output (true or false)          |
+-----------+--------------------------------------------------------+
| color     | colorize output according to log level (true or false) |
+-----------+--------------------------------------------------------+

Writes all log entries with minimum level level formatted by format to stdout. If stdout is a tty, interactive usage is assumed and both time and color are set to true.

Can only be specified once.
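As a minimal sketch of how the stdout parameters above fit together, the fragment below configures a single stdout outlet for a daemon that runs under an init system rather than on a tty. Only keys from the table are used; the chosen values (info, human, and the explicit time and color settings) are illustrative assumptions, not recommendations.

```yaml
global:
  logging:
    # The first outlet also receives errors about other outlets,
    # so it should be one that always works, such as stdout.
    - type: stdout
      level: info      # minimum level: log info, warn and error
      format: human
      time: true       # always include timestamps
      color: false     # no colors when output goes to a log file
```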
syslog Outlet

+----------------+----------------------------------------------------------------+
| Parameter      | Comment                                                        |
+----------------+----------------------------------------------------------------+
| type           | syslog                                                         |
+----------------+----------------------------------------------------------------+
| level          | minimum log level                                              |
+----------------+----------------------------------------------------------------+
| format         | output format                                                  |
+----------------+----------------------------------------------------------------+
| facility       | Which syslog facility to use (default = local0)                |
+----------------+----------------------------------------------------------------+
| retry_interval | Interval between reconnection attempts to syslog (default = 0) |
+----------------+----------------------------------------------------------------+

Writes all log entries formatted by format to syslog. On normal setups, you should not need to change the retry_interval.

Can only be specified once.

tcp Outlet

+----------------+---------------------------------------------------+
| Parameter      | Comment                                           |
+----------------+---------------------------------------------------+
| type           | tcp                                               |
+----------------+---------------------------------------------------+
| level          | minimum log level                                 |
+----------------+---------------------------------------------------+
| format         | output format                                     |
+----------------+---------------------------------------------------+
| net            | tcp in most cases                                 |
+----------------+---------------------------------------------------+
| address        | remote network, e.g. logs.example.com:10202       |
+----------------+---------------------------------------------------+
| retry_interval | Interval between reconnection attempts to address |
+----------------+---------------------------------------------------+
| tls            | TLS config (see below)                            |
+----------------+---------------------------------------------------+

Establishes a TCP connection to address and sends log messages with minimum level level formatted by format.

If tls is not specified, an unencrypted connection is established.

If tls is specified, the TCP connection is secured with TLS + Client Authentication. The latter is particularly useful in combination with log aggregation services.
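The following fragment is a hedged sketch of a tcp outlet with TLS, placed after a stdout outlet so that stdout remains the first (error-reporting) outlet. The keys come from the tables in this section; the tls sub-keys ca, cert and key are the ones described in the table that follows. All values are assumptions made for illustration: the file paths are hypothetical, the sketch assumes that the tls fields take paths to PEM files, and the 10s retry interval simply follows the duration syntax used elsewhere in the config.

```yaml
global:
  logging:
    # First outlet: always works, reports problems with other outlets.
    - type: stdout
      level: warn
      format: human

    # Second outlet: ship JSON log entries to a log aggregation service.
    - type: tcp
      level: debug
      format: json
      net: tcp
      address: logs.example.com:10202
      retry_interval: 10s               # assumed value
      tls:
        ca: /etc/zrepl/log/ca.crt       # hypothetical path
        cert: /etc/zrepl/log/client.crt # hypothetical path
        key: /etc/zrepl/log/client.key  # hypothetical path
```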
+-----------+---------------------------------------------------------------------+
| Parameter | Description                                                         |
+-----------+---------------------------------------------------------------------+
| ca        | PEM-encoded certificate authority that signed the remote server's   |
|           | TLS certificate                                                     |
+-----------+---------------------------------------------------------------------+
| cert      | PEM-encoded client certificate identifying this zrepl daemon toward |
|           | the remote server                                                   |
+-----------+---------------------------------------------------------------------+
| key       | PEM-encoded, unencrypted client private key identifying this zrepl  |
|           | daemon toward the remote server                                     |
+-----------+---------------------------------------------------------------------+

WARNING:
    zrepl drops log messages to the TCP outlet if the underlying connection is not fast enough. Note that TCP buffering in the kernel must first run full before messages are dropped.

    Make sure to always configure a stdout outlet as the special error outlet to be informed about problems with the TCP outlet (see above).

NOTE:
    zrepl uses Go's crypto/tls and crypto/x509 packages and leaves all but the required fields in tls.Config at their default values. In case of a security defect in these packages, zrepl has to be rebuilt because Go binaries are statically linked.

Monitoring

Monitoring endpoints are configured in the global.monitoring section of the config file.

Prometheus & Grafana

zrepl can expose Prometheus metrics via HTTP. The listen attribute is a net.Listen string for tcp, e.g. :9811 or 127.0.0.1:9811 (port 9811 was reserved to zrepl on the official list). The listen_freebind attribute is explained here. The Prometheus monitoring job appears in the zrepl control job list and may be specified at most once.

zrepl also ships with an importable Grafana dashboard that consumes the Prometheus metrics: see dist/grafana. The dashboard also contains some advice on which metrics are important to monitor.

NOTE:
    At the time of writing, there is no stability guarantee on the exported metrics.
    global:
      monitoring:
        - type: prometheus
          listen: ':9811'
          listen_freebind: true # optional, default false

Miscellaneous

Runtime Directories & UNIX Sockets

The zrepl daemon needs to open various UNIX sockets in a runtime directory:

• a control socket that the CLI commands use to interact with the daemon

• the ssh+stdinserver Transport listener opens one socket per configured client, named after the client_identity parameter

There is no authentication on these sockets except the UNIX permissions. The zrepl daemon will refuse to bind any of the above sockets in a directory that is world-accessible.

The following sections of the global config show the default paths. The shell script below shows how the default runtime directory can be created.

    global:
      control:
        sockpath: /var/run/zrepl/control
      serve:
        stdinserver:
          sockdir: /var/run/zrepl/stdinserver

    mkdir -p /var/run/zrepl/stdinserver
    chmod -R 0700 /var/run/zrepl

Durations & Intervals

Interval & duration fields in job definitions, pruning configurations, etc. must match the following regex:

    var durationStringRegex *regexp.Regexp = regexp.MustCompile(`^\s*(\d+)\s*(s|m|h|d|w)\s*$`)
    // s = second, m = minute, h = hour, d = day, w = week (7 days)

Usage

CLI Overview

NOTE:
    The zrepl binary is self-documenting: run zrepl help for an overview of the available subcommands or zrepl SUBCOMMAND --help for information on available flags, etc.
+-------------------------+-------------------------------------------------------+
| Subcommand              | Description                                           |
+-------------------------+-------------------------------------------------------+
| zrepl help              | show subcommand overview                              |
+-------------------------+-------------------------------------------------------+
| zrepl daemon            | run the daemon, required for all zrepl functionality  |
+-------------------------+-------------------------------------------------------+
| zrepl status            | show job activity, or with --mode raw for JSON output |
+-------------------------+-------------------------------------------------------+
| zrepl stdinserver       | see ssh+stdinserver Transport                         |
+-------------------------+-------------------------------------------------------+
| zrepl signal wakeup JOB | manually trigger replication + pruning of JOB         |
+-------------------------+-------------------------------------------------------+
| zrepl signal reset JOB  | manually abort current replication + pruning of JOB   |
+-------------------------+-------------------------------------------------------+
| zrepl configcheck       | check if config can be parsed without errors          |
+-------------------------+-------------------------------------------------------+
| zrepl migrate           | perform on-disk state / ZFS property migrations       |
|                         | (see changelog for details)                           |
+-------------------------+-------------------------------------------------------+
| zrepl zfs-abstraction   | list and remove zrepl's abstractions on top of ZFS,   |
|                         | e.g. holds and step bookmarks (see overview)          |
+-------------------------+-------------------------------------------------------+

zrepl daemon

All actual work zrepl does is performed by a daemon process. The daemon supports structured logging and provides monitoring endpoints.

When installing from a package, the package maintainer should have provided an init script / systemd.service file. You should thus be able to start zrepl daemon using your init system.
Alternatively, or for running zrepl in the foreground, simply execute zrepl daemon. Note that you won't see much output with the default logging configuration:

ATTENTION:
    Make sure to actually monitor the error level output of zrepl: some configuration errors will not make the daemon exit.

    Example: if the daemon cannot create the ssh+stdinserver Transport sockets in the runtime directory, it will emit an error message but not exit because other tasks such as periodic snapshots & pruning are of equal importance.

Restarting

The daemon handles SIGINT and SIGTERM for graceful shutdown. Graceful shutdown means at worst that a job will not be rescheduled for the next interval. The daemon exits as soon as all jobs have reported shut down.

Systemd Unit File

A systemd service definition template is available in dist/systemd. Note that some of the options only work on recent versions of systemd. Any help & improvements are very welcome, see issue #145.

Ops Runbooks

Migrating Sending Side

Objective: Move the sending-side zpool to new hardware. Make the move fully transparent to the sending-side jobs. After the move is done, all sending-side zrepl jobs should continue to work as if the move had not happened. In particular, incremental replication should be able to pick up where it left off before the move.

Suppose we want to migrate all data from one zpool oldpool to another zpool newpool. A possible reason might be that we want to change RAID levels, ashift, or just migrate over to next-gen hardware.

If the pool names are different, zrepl's matching between sender and receiver dataset will break because the receive-side dataset names contain oldpool. To avoid this, we will need the name of the new pool to match that of the old pool. The following steps will accomplish this:

1. Stop zrepl.

2. Create the new pool: zpool create newpool ...

3. Take a snapshot of the old pool so that you have something that you can zfs send.
   For example, run zfs snapshot -r oldpool@migration_oldpool_newpool.

4. Send all of the oldpool's datasets to the new pool: zfs send -R oldpool@migration_oldpool_newpool | zfs recv -F newpool

5. Export the old pool: zpool export oldpool

6. Export the new pool: zpool export newpool

7. (Optional) Change the name of the old pool to something that does not conflict with the new pool. We are going to use the name oldoldpool in this example. Use zpool import with no arguments to see the pool id. Then zpool import <id> oldoldpool && zpool export oldoldpool.

8. Import the new pool, while changing the name to match the old pool: zpool import newpool oldpool

9. Start zrepl again and wake up the relevant jobs.

10. Use zrepl status or your monitoring to ensure that replication works. The best test is an end-to-end test where you write some junk data on a sender dataset and wait until a snapshot with that data appears on the receiving side.

11. Once you are confident that replication is working, you may dispose of the old pool.

Note that, depending on pruning rules, it will not be possible to switch back to the old pool seamlessly, i.e., without a full re-replication.

Platform Tests

Along with the main zrepl binary, we release the platformtest binaries. The zrepl platform tests are an integration test suite that is complementary to the pure Go unit tests. Any test that needs to interact with ZFS is a platform test.

The platform tests need to run as root. For each test, we create a fresh dummy zpool backed by a file-based vdev. The file path, and a root mountpoint for the dummy zpool, must be specified on the command line:

    mkdir -p /tmp/zreplplatformtest
    # -poolname:   name must contain zreplplatformtest
    # -imagepath:  zrepl will create the file
    # -mountpoint: must exist
    ./platformtest \
        -poolname 'zreplplatformtest' \
        -imagepath /tmp/zreplplatformtest.img \
        -mountpoint /tmp/zreplplatformtest

WARNING:
    platformtest will unconditionally overwrite the file at imagepath and unconditionally zpool destroy $poolname.
    So, don't use a production poolname, and consider running the test in a VM. It'll be a lot faster as well because the underlying operations, zfs list in particular, will be faster.

While the platformtests are running, there will be a lot of log output. After all tests have run, it prints a summary with a list of tests, grouped by result type (success, failure, skipped):

    PASSING TESTS:
      github.com/zrepl/zrepl/platformtest/tests.BatchDestroy
      github.com/zrepl/zrepl/platformtest/tests.CreateReplicationCursor
      github.com/zrepl/zrepl/platformtest/tests.GetNonexistent
      github.com/zrepl/zrepl/platformtest/tests.HoldsWork
      ...
      github.com/zrepl/zrepl/platformtest/tests.SendStreamNonEOFReadErrorHandling
      github.com/zrepl/zrepl/platformtest/tests.UndestroyableSnapshotParsing
    SKIPPED TESTS:
      github.com/zrepl/zrepl/platformtest/tests.SendArgsValidationEncryptedSendOfUnencryptedDatasetForbidden__EncryptionSupported_false
    FAILED TESTS: []

If there is a failure, or a skipped test that you believe should be passing, re-run the test suite, capture stderr & stdout to a text file, and create an issue on GitHub.

To run a specific test case, or a subset of tests matched by regex, use the -run REGEX command line flag.

To stop test execution at the first failing test, and prevent cleanup of the dummy zpool, use the -failure.stop-and-keep-pool flag.

To build the platformtests yourself, use make test-platform-bin. There's also the make test-platform target to run the platform tests with a default command line.

Talks & Presentations

• Talk at OpenZFS Developer Summit 2018 of pre-release 0.1 (25min Recording, Slides, Event)

• Talk at EuroBSDCon2017 FreeBSD DevSummit with live demo of zrepl 0.0.3 (55min Recording, Slides, Event)

  • Note: The remarks on keep_bookmarks are irrelevant as of zrepl 0.1, which introduced the zrepl-managed replication cursor bookmark. Read the Overview section to learn more.
Changelog

The changelog summarizes bugfixes that are deemed relevant for users and package maintainers. Developers should consult the git commit log or GitHub issue tracker.

Next Release

The plan for the next release is to revisit how zrepl does snapshot management. High-level goals:

• Make it easy to decouple snapshot management (snapshotting, pruning) from replication.

• Ability to include/exclude snapshots from replication. This is useful for the aforementioned decoupling, e.g., separate snapshot prefixes for local & remote replication. Also, it makes explicit that by default, zrepl replicates all snapshots, and that replication has no concept of "zrepl-created snapshots", which is a common misconception.

• Use of zfs snapshot comma syntax or channel programs to take snapshots of multiple datasets atomically.

• Provide an alternative to the grid pruning policy. Most likely something based on hourly/daily/weekly/monthly "trains" plus a count.

• Ability to prune at the granularity of the group of snapshots created at a given time, as opposed to the individual snapshots within a dataset. Maybe this will be addressed by the alternative to the grid pruning policy, as it will likely be more predictable.

Those changes will likely come with some breakage in the config. However, I want to avoid breaking use cases that are satisfied by the current design. There will be beta/RC releases to give users a chance to evaluate.

0.6.1

• [FEATURE] add metric to detect filesystem rules that don't match any local dataset (thanks, @gmekicaxcient).

• [BUG] zrepl status: hide progress bar once all filesystems reach terminal state (thanks, @0x3333).

• [BUG] handling of tentative cursor presence if protection strategy doesn't use it (issue #714).

• [DOCS] address setup with two or more external disks (thanks, @se-jaeger).

• [DOCS] document replication and conflict_resolution options (thanks, @InsanePrawn).
• [DOCS] docs: talks: add note on keep_bookmarks option (thanks, @skirmess).

• [MAINT] dist: add openrc service file (thanks, @gramosg).

• [MAINT] grafana: update dashboard to Grafana 9.3.6.

• [MAINT] run platform tests as part of CI.

• [MAINT] build: upgrade to Go 1.21 and update golangci-lint; minimum Go version for builds is now 1.20.

NOTE:
    zrepl is a spare-time project primarily developed by Christian Schwarz. You can support maintenance and feature development through one of the following services:

    • Donate via Patreon
    • Donate via GitHub Sponsors
    • Donate via Liberapay
    • Donate via PayPal

    Note that PayPal processing fees are relatively high for small donations. For SEPA wire transfer and commercial support, please contact Christian directly.

0.6

• [FEATURE] Schedule-based snapshotting using cron syntax instead of an interval.

• [FEATURE] Configurable initial replication policy. When a filesystem is first replicated to a receiver, this controls whether just the newest snapshot will be replicated vs. all existing snapshots. Learn more in the docs.

• [FEATURE] Configurable timestamp format for snapshot names via timestamp_format (Thanks, @ydylla).

• [FEATURE] Add ZREPL_DESTROY_MAX_BATCH_SIZE env var (default 0=unlimited) (Thanks, @3nprob).

• [FEATURE] Add zrepl configcheck --skip-cert-check flag (Thanks, @cole-h).

• [BUG] Fix resuming from interrupted replications that use send.raw on unencrypted datasets.

  • The send options introduced in zrepl 0.4 allow users to specify additional zfs send flags for zrepl to use. Before this fix, when setting send.raw=true on a job that replicates unencrypted datasets, zrepl would not allow an interrupted replication to resume. The reason was overly cautious checks to support the send.encrypted option.

  • This bugfix removes these checks from the replication planner. This makes send.encrypted a sender-side-only concern, much like all other send.* flags.
  • However, this means that the zrepl status UI no longer indicates whether a replication step uses encrypted sends or not. The setting is still effective though.

• [BREAK] convert Prometheus metric zrepl_version_daemon to zrepl_start_time metric

  • The metric still reports the zrepl version in a label. But the metric value is now the Unix timestamp at the time the daemon was started. The Grafana dashboard in dist/grafana has been updated.

• [BUG] transient zrepl status error: Post "http://unix/status": EOF

• [BUG] don't treat receive-side bookmarks as a replication conflict. This facilitates chaining of replication jobs. See issue #490.

• [BUG] workaround for Go/gRPC problem on Illumos where zrepl would crash when using the local transport type (issue #598).

• [BUG] fix active child tasks panic that could occur during replication planning (issue #193abbe)

• [BUG] zrepl status off-by-one error in display of completed step count (commit ce6701f)

• [BUG] Allow using day & week units for snapshotting.interval (commit ffb1d89)

• [DOCS] docs/overview improvements (Thanks, @jtagcat).

• [MAINT] Update to Go 1.19.

0.5

• [FEATURE] Bandwidth limiting (Thanks, Prominic.NET, Inc.)

• [FEATURE] zrepl status: use a * to indicate which filesystem is currently replicating

• [FEATURE] include daemon environment variables in zrepl status (currently only in --raw)

• [BUG] fix encrypt-on-receive + placeholders use case (issue #504)

  • Before this fix, plain sends to a receiver with an encrypted root_fs could be received unencrypted if zrepl needed to create placeholders on the receiver.

  • Existing zrepl users should read the docs and check zfs get -r encryption,zrepl:placeholder PATH_TO_ROOTFS on the receiver.

  • Thanks to @mologie and @razielgn for reporting and testing!

• [BUG] Rename mis-spelled send option embbeded_data to embedded_data.

• [BUG] zrepl status: replication step numbers should start at 1

• [BUG] incorrect bandwidth averaging in zrepl status.
• [BUG] FreeBSD with OpenZFS 2.0: zrepl would wait indefinitely for zfs send to exit on timeouts.

• [BUG] fix strconv.ParseInt: value out of range bug (and use the control RPCs).

• [DOCS] improve description of multiple pruning rules.

• [DOCS] document platform tests.

• [DOCS] quickstart: make users aware that prune rules apply to all snapshots.

• [MAINT] some platformtests were broken.

• [MAINT] FreeBSD: release armv7 and arm64 binaries.

• [MAINT] apt repo: update instructions due to apt-key deprecation.

Note to all users: please read up on the following OpenZFS bugs, as you might be affected:

• ZFS send/recv with ashift 9->12 leads to data corruption.

• Various bugs with encrypted send/recv (Leadership meeting notes)

Finally, I'd like to point you to the GitHub discussion about which bugfixes and features should be prioritized in zrepl 0.6 and beyond!

0.4.0

• [FEATURE] support setting zfs send / recv flags in the config (send: -wLcepbS, recv: -ox). Config docs here and here.

• [FEATURE] parallel replication is now configurable (disabled by default, config docs here).

• [FEATURE] New zrepl status UI:

  • Interactive job selection.

  • Interactively zrepl signal jobs.

  • Filter filesystems in the job view by name.

  • An approximation of the old UI is still included as --mode legacy but will be removed in a future release of zrepl.

• [BUG] Actually use concurrency when listing zrepl abstractions & doing size estimation. These operations were accidentally made sequential in zrepl 0.3.

• [BUG] Job hang-up during second replication attempt.

• [BUG] Data race conditions in the dataconn rpc stack.

• [MAINT] Update to protobuf v1.25 and grpc 1.35.

For users who skipped the 0.3.1 update: please make sure your pruning grid config is correct. The following bugfix in 0.3.1 caused problems for some users:

• [BUG] pruning: grid: add all snapshots that do not match the regex to the rule's destroy list.

0.3.1

Mostly a bugfix release for zrepl 0.3.
• [FEATURE] pruning: add optional regex field to last_n rule

• [DOCS] pruning: grid: improve documentation and add an example

• [BUG] pruning: grid: add all snapshots that do not match the regex to the rule's destroy list. This brings the implementation in line with the docs.

• [BUG] easyrsa script in docs

• [BUG] platformtest: fix skipping encryption-only tests on systems that don't support encryption

• [BUG] replication: report AttemptDone if no filesystems are replicated

• [FEATURE] status + replication: warning if replication succeeded without any filesystem being replicated

• [DOCS] update multi-job & multi-host setup section

• RPM Packaging

• CI infrastructure rework

• Continuous deployment of that new stable branch to zrepl.github.io.

0.3

This is a big one! Headlining features:

• Resumable Send & Recv Support: no knobs required, automatically used where supported.

• Encrypted Send & Recv Support for OpenZFS native encryption, configurable at the job level, i.e., for all filesystems a job is responsible for.

• Replication Guarantees: automatic use of ZFS holds and bookmarks to protect a replicated filesystem from losing synchronization between sender and receiver. By default, zrepl guarantees that incremental replication will always be possible and interrupted steps will always be resumable.

TIP:
    We highly recommend studying the updated overview section of the configuration chapter to understand how replication works.

TIP:
    Go 1.15 changed the default TLS validation policy to require Subject Alternative Names (SAN) in certificates. The openssl commands we provided in the quick-start guides up to and including the zrepl 0.3 docs seem not to work properly. If you encounter certificate validation errors regarding SAN and wish to continue to use your old certificates, start the zrepl daemon with env var GODEBUG=x509ignoreCN=0. Alternatively, generate new certificates with SANs (see both options in the TLS transport docs).
Quick-start guides:

• We have added another quick-start guide for a typical workstation use case for zrepl. Check it out to learn how you can use zrepl to back up your workstation's OpenZFS natively-encrypted root filesystem to an external disk.

Additional changelog:

• [BREAK] Go 1.15 TLS changes mentioned above.

• [BREAK] [CONFIG] more restrictive job names than in prior zrepl versions. Starting with this version, job names are going to be embedded into ZFS holds and bookmark names (see this section for details). Therefore you might need to adjust your job names. Note that jobs cannot be renamed easily once you start using zrepl 0.3.

• [BREAK] [MIGRATION] replication cursor representation changed

  • zrepl now manages the replication cursor bookmark per job-filesystem tuple instead of a single replication cursor per filesystem. In the future, this will permit multiple sending jobs to send from the same filesystems.

  • ZFS does not allow bookmark renaming, thus we cannot migrate the old replication cursors.

  • zrepl 0.3 will automatically create cursors in the new format for new replications, and warn if it still finds ones in the old format.

  • Run zrepl migrate replication-cursor:v1-v2 to safely destroy old-format cursors. The migration will ensure that only those old-format cursors are destroyed that have been superseded by new-format cursors.
• [FEATURE] New option listen_freebind (tcp, tls, prometheus listener)

• [FEATURE] issue #341 Prometheus metric for failing replications + corresponding Grafana panel

• [FEATURE] issue #265 transport/tcp: support for CIDR masks in client IP whitelist

• [FEATURE] documented subcommand to generate bash and zsh completions

• [FEATURE] issue #307 chrome://trace-compatible activity tracing of zrepl daemon activity

• [FEATURE] logging: trace IDs for better log entry correlation with concurrent replication jobs

• [FEATURE] experimental environment variable for parallel replication (see issue #306)

• [BUG] missing logger context vars in control connection handlers

• [BUG] improved error messages on zfs send errors

• [BUG] [DOCS] snapshotting: clarify sync-up behavior and warn about filesystems that will not be snapshotted until the sync-up phase is over

• [BUG] transport/ssh: do not leak zombie ssh process on connection failures

• [DOCS] Installation: FreeBSD jail with iocage

• [DOCS] Document new replication features in the config overview and replication/design.md.

• [MAINTAINER NOTICE] New platform tests in this version, please make sure you run them for your distro!

• [MAINTAINER NOTICE] Please add the shell completions to the zrepl packages.

0.2.1

• [FEATURE] Illumos (and Solaris) compatibility and binary builds (thanks, MNX.io)

• [FEATURE] 32bit binaries for Linux and FreeBSD (untested, though)

• [BUG] better error messages in ssh+stdinserver transport

• [BUG] systemd + ssh+stdinserver: automatically create /var/run/zrepl/stdinserver

• [BUG] crash if Prometheus listening socket cannot be opened

• [MAINTAINER NOTICE] Makefile refactoring, see commit 080f2c0

0.2

• [FEATURE] Pre- and Post-Snapshot Hooks with built-in support for MySQL and Postgres checkpointing as well as custom scripts (thanks, @overhacked!)

• [FEATURE] Use zfs destroy pool/fs@snap1,snap2,...
  CLI feature if available

• [FEATURE] Linux ARM64 Docker build support & binary builds

• [FEATURE] zrepl status now displays snapshotting reports

• [FEATURE] zrepl status --job <JOBNAME> filter flag

• [BUG] i386 build

• [BUG] early validation of host:port tuples in config

• [BUG] zrepl status now supports TERM=screen (tmux on FreeBSD / FreeNAS)

• [BUG] ignore connection reset by peer errors when shutting down connections

• [BUG] correct error messages when receive-side pool or root_fs dataset is not imported

• [BUG] fail fast for misconfigured local transport

• [BUG] race condition in replication report generation would crash the daemon when running zrepl status

• [BUG] rpc goroutine leak in push mode if zfs recv fails on the sink side

• [MAINTAINER NOTICE] Go modules for dependency management both inside and outside of GOPATH (lazy.sh and Makefile force GO111MODULE=on)

• [MAINTAINER NOTICE] make platformtest target to check zrepl's ZFS abstractions (screen scraping, etc.). These tests only work on a system with ZFS installed, and must be run as root because they create a file-backed pool for each test case. The pool name zreplplatformtest is reserved for this use case. Only run make platformtest on test systems, e.g. a FreeBSD VM image.

0.1.1

• [BUG] issue #162 commit d6304f4: fix I/O timeout errors on variable receive rate

  • A significant reduction or sudden stall of the receive rate (e.g. recv pool has other I/O to do) would cause a writev I/O timeout error after approximately ten seconds.

0.1

This release is a milestone for zrepl and required significant refactoring if not rewrites of substantial parts of the application. It breaks both configuration and transport format, and thus requires manual intervention and updates on both sides of a replication setup.

DANGER:
    The changes in the pruning system for this release require you to explicitly define keep rules: for any snapshot that you want to keep, at least one rule must match.
    This is different from previous releases where pruning only affected snapshots with the configured snapshotting prefix. Make sure that snapshots to be kept or ignored by zrepl are covered, e.g. by using the regex keep rule. Learn more in the config docs...

Notes to Package Maintainers

• Notify users about config changes and migrations (see changes attributed with [BREAK] and [MIGRATION] below)

• If the daemon crashes, the stack trace produced by the Go runtime and possibly diagnostic output of zrepl will be written to stderr. This behavior is independent from the stdout outlet type. Please make sure the stderr output of the daemon is captured somewhere. To conserve precious stack traces, make sure that multiple service restarts do not directly discard previous stderr output.

• Make it obvious for users how to set the GOTRACEBACK environment variable to GOTRACEBACK=crash. This functionality will cause SIGABRT on panics and can be used to capture a coredump of the panicking process. To that end, make sure that your package build system, your OS's coredump collection and the Go delve debugger work together. Use your build system to package the Go program in this tutorial on Go coredumps and the delve debugger, and make sure the symbol resolution etc. work on coredumps captured from the binary produced by your build system. (Special focus on symbol stripping, etc.)

• Consider using the zrepl configcheck subcommand in startup scripts to abort a restart that would fail due to an invalid config.

Changes

• [BREAK] [MIGRATION] Placeholder property representation changed

  • The placeholder property now uses on|off as values instead of hashes of the dataset path. This permits renames of the sink filesystem without updating all placeholder properties.
  • Relevant for 0.0.X-0.1-rc* to 0.1 migrations
  • Make sure your config is valid with zrepl configcheck
  • Run zrepl migrate 0.0.X:0.1:placeholder
• [FEATURE] issue #55: Push replication (see push job and sink job)
• [FEATURE] TCP Transport
• [FEATURE] TCP + TLS client authentication transport
• [FEATURE] issue #111: RPC protocol rewrite
  • [BREAK] Protocol breakage; Update and restart of all zrepl daemons is required.
  • Use gRPC for control RPCs and a custom protocol for bulk data transfer.
  • Automatic retries for network-temporary errors
    • Limited to errors during replication for this release. Addresses the common problem of ISP-forced reconnection at night, but will become way more useful with resumable send & recv support. Pruning errors are handled per FS, i.e., a prune RPC is attempted at least once per FS.
• [FEATURE] Proper timeout handling for the SSH transport
• [BREAK] Requires Go 1.11 or later.
• [BREAK] [CONFIG]: mappings are no longer supported
  • Receiving sides (pull and sink job) specify a single root_fs. Received filesystems are then stored per client in ${root_fs}/${client_identity}. See Jobs & How They Work Together for details.
• [FEATURE] [BREAK] [CONFIG] Manual snapshotting + triggering of replication
  • [FEATURE] issue #69: include manually created snapshots in replication
  • [CONFIG] manual and periodic snapshotting types
  • [FEATURE] zrepl signal wakeup JOB subcommand to trigger replication + pruning
  • [FEATURE] zrepl signal reset JOB subcommand to abort current replication + pruning
• [FEATURE] [BREAK] [CONFIG] New pruning system
  • The active side of a replication (pull or push) decides what to prune for both sender and receiver. The RPC protocol is used to execute the destroy operations on the remote side.
  • New pruning policies (see configuration documentation)
    • The decision which snapshots to prune is now based on keep rules
    • [FEATURE] issue #68: keep rule not_replicated prevents divergence of sender and receiver
• [FEATURE] [BREAK] Bookmark pruning is no longer necessary
  • Per filesystem, zrepl creates a single bookmark (#zrepl_replication_cursor) and moves it forward with the most recent successfully replicated snapshot on the receiving side.
  • Old bookmarks created by prior versions of zrepl (named like their corresponding snapshot) must be deleted manually.
  • [CONFIG] keep_bookmarks parameter of the grid keep rule has been removed
• [FEATURE] zrepl status for live-updating replication progress (it's really cool!)
• [FEATURE] Snapshot- & pruning-only job type (for local snapshot management)
• [FEATURE] issue #67: Expose Prometheus metrics via HTTP (config docs)
  • Compatible Grafana dashboard shipping in dist/grafana
• [CONFIG] Logging outlet types must be specified using the type instead of outlet key
• [BREAK] issue #53: CLI: zrepl control * subcommands have been made direct subcommands of zrepl *
• [BUG] Goroutine leak on ssh transport connection timeouts
• [BUG] issue #81, issue #77: handle failed accepts correctly (source job)
• [BUG] issue #100: fix incompatibility with ZoL 0.8
• [FEATURE] issue #115: logging: configurable syslog facility
• [FEATURE] Systemd unit file in dist/systemd

Previous Releases
NOTE: Due to limitations in our documentation system, we only show the changelog between the last release and the time this documentation was built. For the changelog of previous releases, use the version selection in the hosted version of these docs at zrepl.github.io.

Donate via Patreon Donate via GitHub Sponsors Donate via Liberapay Donate via PayPal
zrepl is a spare-time project primarily developed by Christian Schwarz. You can support maintenance and feature development through one of the services listed above.
For SEPA wire transfer and commercial support, please contact Christian directly. Thanks for your support!

NOTE: PayPal takes a relatively high fixed processing fee plus a percentage of the donation. Larger, less frequent donations make more sense there.

Supporters
We would like to thank the following people and organizations for supporting zrepl through monetary and other means:
• Max Christian Pohle
• Prominic.NET, Inc.
• Torsten Blum
• Cyberiada GmbH
• Gordon Schulz
• @jwittlincohen
• Michael D. Schmitt
• Hans Schulz
• Henning Kessler
• John Ramsden
• DrLuke
• Mateusz Kwiatkowski (runhyve.app)
• Gaelan D'costa
• Tenzin Lhakhang
• Lapo Luchini
• F. Schmid
• MNX.io
• Marshall Clyburn
• Ross Williams
• Mike T.
• Justin Scholz
• InsanePrawn
• Ben Woods
• Janis Streib
• Anton Schirg

AUTHOR
Christian Schwarz

COPYRIGHT
2017-2023, Christian Schwarz

Apr 14, 2025 ZREPL(1)
