ZREPL(1)			     zrepl			      ZREPL(1)

NAME
       zrepl - zrepl Documentation

       zrepl is	a one-stop, integrated solution	for ZFS	replication.

GETTING	STARTED
       The 10 minute quick-start guides	give you a first impression.

MAIN FEATURES
        Filesystem replication

	  [x] Pull & Push mode

	  [x] Multiple	transport modes: TCP, TCP + TLS	client auth, SSH

	  Advanced replication	features

	    [x] Automatic retries for temporary network errors

	    [x] Automatic resumable send & receive

	    [x] Automatic ZFS holds during send & receive

	    [x] Automatic bookmark & hold management for guaranteed incremen-
	     tal send &	recv

	    [x]  Encrypted raw	send & receive to untrusted receivers (OpenZFS
	     native encryption)

	    [x] Properties send & receive

	    [x] Compressed send & receive

	    [x] Large blocks send & receive

	    [x] Embedded data send & receive

	    [x] Resume	state send & receive

	    [x] Bandwidth limiting

        Automatic snapshot management

	  [x] Periodic	filesystem snapshots

	  [x] Support for pre-	and  post-snapshot  hooks  with	 builtins  for
	   MySQL & Postgres

	  [x] Flexible	pruning	rule system

	    [x] Age-based fading (grandfathering scheme)

	    [x] Bookmarks to avoid divergence between sender and receiver

        Sophisticated Monitoring & Logging

	  [x] Live progress reporting via zrepl status	subcommand

	  [x] Comprehensive, structured logging

	    human, logfmt and json formatting

	    stdout, syslog and	TCP (+TLS client auth) outlets

	  [x] Prometheus monitoring endpoint

        Maintainable implementation in	Go

	  [x] Cross platform

	  [x] Dynamic feature checking

	  [x] Type safe & testable code

       ATTENTION:
	  zrepl	 as  well as this documentation	is still under active develop-
	  ment.	 There is no stability guarantee on the	RPC protocol  or  con-
	  figuration  format,  but we do our best to document breaking changes
	  in the Changelog.

CONTRIBUTING
       We are happy about any help we can get!

        Financial Support

        Explore the codebase

	  These docs live in the docs/	subdirectory

        Document any non-obvious / confusing /	plain broken behavior you  en-
	 counter when setting up zrepl for the first time

        Check	the  Issues  and Projects sections for things to do.  The good
	 first issues and docs are suitable starting points.

	  Development Workflow

		 The GitHub repository is where	all development	happens.  Make
		 sure to read the Developer Documentation section and open new
		 issues	or pull	requests there.

TABLE OF CONTENTS
   Quick Start by Use Case
       The goal of this quick-start guide is to give you an impression of how
       zrepl can accommodate your use case.

   Install zrepl
       Follow the OS-specific installation instructions	and come back here.

   Overview Of How zrepl Works
       Check  out the overview section to get a	rough idea of what you are go-
       ing to configure	in the next step, then come back here.

   Configuration Examples
       zrepl  is   configured	through	  a   YAML   configuration   file   in
       /etc/zrepl/zrepl.yml.	We   have  prepared  example  use  cases  that
       show-case typical deployments and different functionality of zrepl.  We
       encourage you to	read through all of the	examples to  get  an  idea  of
       what  zrepl  has	to offer, and how you can mix-and-match	configurations
       for your	use case.  Keep	the full config	documentation handy if a  con-
       fig snippet is unclear.

       Example Use Cases

   Continuous Backup of	a Server
       This config example shows how we can back up our ZFS-based server to
       another machine using a zrepl push job.

        Production server prod	with filesystems to back up:

	  The entire pool zroot

	  except zroot/var/tmp	and all	child datasets of it

	  and	except	zroot/usr/home/paranoid	 which belongs to a user doing
	   backups themselves.

        Backup	server backups with a dataset sub-tree for use by zrepl:

	  In our example, that	will be	storage/zrepl/sink/prod.

       Our backup solution should fulfill the following	requirements:

        Periodically snapshot the filesystems on prod every 10	minutes

        Incrementally replicate these snapshots to  storage/zrepl/sink/prod/*
	 on backups

        Keep only very	few snapshots on prod to save disk space

        Keep  a  fading history (24 hourly, 30	daily, 6 monthly) of snapshots
	 on backups

        The network is	untrusted - zrepl should use TLS to protect its	commu-
	 nication and our data.

   Analysis
       We can model this situation as two jobs:

        A push	job on prod

	  Creates the snapshots

	  Keeps a short history of  local  snapshots  to  enable  incremental
	   replication to backups

	  Connects to the zrepl daemon	process	on backups

	  Pushes snapshots to backups

	  Prunes snapshots on backups after replication is complete

        A sink	job on backups

	  Accepts connections & responds to requests from prod

	  Limits client prod access to the filesystem sub-tree
	   storage/zrepl/sink/prod

   Generate TLS	Certificates
       We use the TLS client authentication transport to protect our  data  on
       the  wire.   To	get  things going quickly, we skip setting up a	CA and
       generate	two self-signed	certificates as	described  here.   For	conve-
       nience,	we  generate the key pairs on our local	machine	and distribute
       them using ssh:

	  (name=backups; openssl req -x509 -sha256 -nodes \
	   -newkey rsa:4096 \
	   -days 365 \
	   -keyout $name.key \
	   -out	$name.crt -addext "subjectAltName = DNS:$name" -subj "/CN=$name")

	  (name=prod; openssl req -x509	-sha256	-nodes \
	   -newkey rsa:4096 \
	   -days 365 \
	   -keyout $name.key \
	   -out	$name.crt -addext "subjectAltName = DNS:$name" -subj "/CN=$name")

	  ssh root@backups "mkdir /etc/zrepl"
	  scp  backups.key backups.crt prod.crt	root@backups:/etc/zrepl

	  ssh root@prod	"mkdir /etc/zrepl"
	  scp  prod.key	prod.crt backups.crt root@prod:/etc/zrepl

       Note that alternative transports	exist, e.g. via	 TCP  without  TLS  or
       ssh.
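
       To double-check a generated certificate before distributing it, you can
       inspect it with openssl (a quick sketch; file names as generated above):

	  openssl x509 -in backups.crt -noout -subject -dates
	  openssl x509 -in backups.crt -noout -text | grep -A1 "Subject Alternative Name"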

   Configure server prod
       We  define  a push job named prod_to_backups in /etc/zrepl/zrepl.yml on
       host prod :

	  jobs:
	  - name: prod_to_backups
	    type: push
	    connect:
	      type: tls
	      address: "backups.example.com:8888"
	      ca: /etc/zrepl/backups.crt
	      cert: /etc/zrepl/prod.crt
	      key:  /etc/zrepl/prod.key
	      server_cn: "backups"
	    filesystems: {
	      "zroot<":	true,
	      "zroot/var/tmp<":	false,
	      "zroot/usr/home/paranoid": false
	    }
	    snapshotting:
	      type: periodic
	      prefix: zrepl_
	      interval:	10m
	    pruning:
	      keep_sender:
	      -	type: not_replicated
	      -	type: last_n
		count: 10
	      keep_receiver:
	      -	type: grid
		grid: 1x1h(keep=all) | 24x1h | 30x1d | 6x30d
		regex: "^zrepl_"

   Configure server backups
       We define a corresponding sink job named	sink  in  /etc/zrepl/zrepl.yml
       on host backups :

	  jobs:
	  - name: sink
	    type: sink
	    serve:
		type: tls
		listen:	":8888"
		ca: "/etc/zrepl/prod.crt"
		cert: "/etc/zrepl/backups.crt"
		key: "/etc/zrepl/backups.key"
		client_cns:
		  - "prod"
	    root_fs: "storage/zrepl/sink"

   Go Back To Quickstart Guide
       Click here to go	back to	the quickstart guide.

   Local Snapshots + Offline Backup to an External Disk
       This config example shows how we can use zrepl to make periodic snap-
       shots of our local workstation and back it up to a zpool on an exter-
       nal disk which we occasionally connect.

       The local snapshots should be taken every 15 minutes for	pain-free  re-
       covery  from CLI	disasters (rm -rf / and	the like).  However, we	do not
       want to keep the	snapshots around for very long because our workstation
       is a little tight on disk space.	 Thus, we only keep one	hour worth  of
       high-resolution snapshots, then fade them out to	one per	hour for a day
       (24 hours), then	one per	day for	14 days.

       At  the	end of each work day, we connect our external disk that	serves
       as our workstation's local offline backup.  We want  zrepl  to  inspect
       the  filesystems	 and  snapshots	on the external	pool, figure out which
       snapshots were created since the	last time we  connected	 the  external
       disk,  and  use incremental replication to efficiently mirror our work-
       station to our backup disk.  Afterwards,	we want	to clean up old	 snap-
       shots  on  the  backup pool: we want to keep all	snapshots younger than
       one hour, 24 for	each hour of the first day, then 360 daily backups.

       A few additional	requirements:

        Snapshot creation and pruning on our workstation should happen	in the
	 background, without interaction from our side.

        However, we want to explicitly	trigger	replication  via  the  command
	 line.

        We  want  to use OpenZFS native encryption to protect our data	on the
	 external disk.	 It is absolutely critical that	 only  encrypted  data
	 leaves	our workstation.  zrepl	should provide an easy config knob for
	 this  and prevent replication of unencrypted datasets to the external
	 disk.

        We want to be able to put off the backups for more than three	weeks,
	 i.e., longer than the lifetime	of the automatically created snapshots
	 on  our workstation.  zrepl should use	bookmarks and holds to achieve
	 this goal.

        When we yank out the drive during replication and go on a long	 vaca-
	 tion,	we  do	not  want  the	partially replicated snapshot to stick
	 around	as it would hold on to too much	disk space over	time.	There-
	 fore,	we  want zrepl to deviate from its default behavior and	sacri-
	 fice resumability, but	nonetheless retain the ability to do incremen-
	 tal replication once we return	from our vacation.  zrepl should  pro-
	 vide an easy config knob to disable step holds	for incremental	repli-
	 cation.

       The following config snippet implements the setup described above.  You
       will likely want	to customize some aspects mentioned in the top comment
       in the file.

	  # This config	serves as an example for a local zrepl installation that
	  # backs up the entire zpool `system` to `backuppool/zrepl/sink`
	  #
	  # The	requirements covered by	this setup are described in the	zrepl documentation's
	  # quick start	section	which inlines this example.
	  #
	  # CUSTOMIZATIONS YOU WILL LIKELY WANT	TO APPLY:
	  # - adjust the name of the production	pool `system` in the `filesystems` filter of jobs `snapjob` and	`push_to_drive`
	  # - adjust the name of the backup pool `backuppool` in the `backuppool_sink` job
	  # - adjust the occurrences of `myhostname` to the name of the system you are backing up (cannot be easily changed once you start replicating)
	  # - make sure	the `zrepl_` prefix is not being used by any other zfs tools you might have installed (it likely isn't)

	  jobs:

	  # this job takes care	of snapshot creation + pruning
	  - name: snapjob
	    type: snap
	    filesystems: {
		"system<": true,
	    }
	    # create snapshots with prefix `zrepl_` every 15 minutes
	    snapshotting:
	      type: periodic
	      interval:	15m
	      prefix: zrepl_
	    pruning:
	      keep:
	      #	fade-out scheme	for snapshots starting with `zrepl_`
	      #	- keep all created in the last hour
	      #	- then destroy snapshots such that we keep 24 each 1 hour apart
	      #	- then destroy snapshots such that we keep 14 each 1 day apart
	      #	- then destroy all older snapshots
	      -	type: grid
		grid: 1x1h(keep=all) | 24x1h | 14x1d
		regex: "^zrepl_.*"
	      #	keep all snapshots that	don't have the `zrepl_`	prefix
	      -	type: regex
		negate:	true
		regex: "^zrepl_.*"

	  # This job pushes to the local sink defined in job `backuppool_sink`.
	  # We trigger replication manually from the command line / udev rules using
	  #  `zrepl signal wakeup push_to_drive`
	  - type: push
	    name: push_to_drive
	    connect:
	      type: local
	      listener_name: backuppool_sink
	      client_identity: myhostname
	    filesystems: {
		"system<": true
	    }
	    send:
	      encrypted: true
	    replication:
	      protection:
		initial: guarantee_resumability
		# Downgrade protection to guarantee_incremental	which uses zfs bookmarks instead of zfs	holds.
		# Thus,	when we	yank out the backup drive during replication
		# - we might not be able to resume the interrupted replication step because the	partially received `to`	snapshot of a `from`->`to` step	may be pruned any time
		# - but	in exchange we get back	the disk space allocated by `to` when we prune it
		# - and	because	we still have the bookmarks created by `guarantee_incremental`,	we can still do	incremental replication	of `from`->`to2` in the	future
		incremental: guarantee_incremental
	    snapshotting:
	      type: manual
	    pruning:
	      # no-op prune rule on sender (keep all snapshots); job `snapjob` takes care of this
	      keep_sender:
	      -	type: regex
		regex: ".*"
	      #	retain
	      keep_receiver:
	      #	longer retention on the	backup drive, we have more space there
	      -	type: grid
		grid: 1x1h(keep=all) | 24x1h | 360x1d
		regex: "^zrepl_.*"
	      #	retain all non-zrepl snapshots on the backup drive
	      -	type: regex
		negate:	true
		regex: "^zrepl_.*"

	  # This job receives from job `push_to_drive` into `backuppool/zrepl/sink/myhostname`
	  - type: sink
	    name: backuppool_sink
	    root_fs: "backuppool/zrepl/sink"
	    serve:
	      type: local
	      listener_name: backuppool_sink
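
       With this config in place, a manual backup run after attaching the ex-
       ternal disk might look like the following sketch (pool and job names
       as in the config above):

	  zpool import backuppool            # make the backup pool available
	  zrepl signal wakeup push_to_drive  # trigger replication + pruning
	  zrepl status                       # watch progress until the job is done
	  zpool export backuppool            # cleanly detach the disk afterwards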

   Offline Backups with	two (or	more) External Disks
       It can be desirable to have multiple disk-based backups of the same ma-
       chine.  To accomplish this,

        create	one zpool per external HDD, each with a	unique name, and

        define	 a  pair  of  push and sink job	for each of these zpools, each
	 with a	unique name, listener_name, and	root_fs.

       The unique names ensure that the jobs don't step on each other's toes
       when managing zrepl's ZFS abstractions.
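
       For instance, a second disk with pool backuppool2 could get the follow-
       ing job pair (a sketch; job, pool, and listener names are hypothetical,
       and the pruning rules are simplified to "keep everything" for brevity):

	  jobs:
	  # ... jobs for the first disk as shown above ...

	  # second external disk: unique name, listener_name and root_fs
	  - type: push
	    name: push_to_drive2
	    connect:
	      type: local
	      listener_name: backuppool2_sink
	      client_identity: myhostname
	    filesystems: {
	      "system<": true
	    }
	    send:
	      encrypted: true
	    snapshotting:
	      type: manual
	    pruning:
	      keep_sender:
	      - type: regex
	        regex: ".*"
	      keep_receiver:
	      - type: regex
	        regex: ".*"

	  - type: sink
	    name: backuppool2_sink
	    root_fs: "backuppool2/zrepl/sink"
	    serve:
	      type: local
	      listener_name: backuppool2_sink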

       Click here to go	back to	the quickstart guide.

   Fan-out replication
       This quick-start	example	demonstrates how to implement a	fan-out	repli-
       cation  setup where datasets on a server	(A) are	replicated to multiple
       targets (B, C, etc.).

       This example uses multiple source jobs on server	A and pull jobs	on the
       target servers.

       WARNING:
	  Before implementing this setup, please see the caveats listed	in the
	  fan-out replication configuration overview.

   Overview
       On the source server (A), there should be:

        A snap	job

	  Creates the snapshots

	  Handles the pruning of snapshots

        A source job for target B

	  Accepts connections from server B and B only

        Further source	jobs for each additional target	(C, D, etc.)

	  Listens on a	unique port

	  Only	accepts	connections from the specific target

       On each target server, there should be:

        A pull	job that connects to the corresponding source job on A

	  prune_sender	should keep all	snapshots since	A's snap  job  handles
	   the pruning

	  prune_receiver  can	be  configured	as  appropriate	on each	target
	   server

   Generate TLS	Certificates
       Mutual TLS via the TLS client authentication transport can be  used  to
       secure  the  connections	 between  the  servers.	 In  this  example,  a
       self-signed certificate is created for each server without setting up a
       CA.

	  source=a.example.com
	  targets=(
	      b.example.com
	      c.example.com
	      #	...
	  )

	  for server in	"${source}" "${targets[@]}"; do
	      openssl req -x509	-sha256	-nodes \
		  -newkey rsa:4096 \
		  -days	365 \
		  -keyout "${server}.key" \
		  -out "${server}.crt" \
		  -addext "subjectAltName = DNS:${server}" \
		  -subj	"/CN=${server}"
	  done

	  # Distribute each host's keypair
	  for server in	"${source}" "${targets[@]}"; do
	      ssh root@"${server}" mkdir /etc/zrepl
	      scp "${server}".{crt,key}	root@"${server}":/etc/zrepl/
	  done

	  # Distribute target certificates to the source
	  scp "${targets[@]/%/.crt}" root@"${source}":/etc/zrepl/

	  # Distribute source certificate to the targets
	  for server in	"${targets[@]}"; do
	      scp "${source}.crt" root@"${server}":/etc/zrepl/
	  done

   Configure source server A
	  jobs:
	  # Separate job for snapshots and pruning
	  - name: snapshots
	    type: snap
	    filesystems:
	      'tank<': true # all filesystems
	    snapshotting:
	      type: periodic
	      prefix: zrepl_
	      interval:	10m
	    pruning:
	      keep:
		# Keep non-zrepl snapshots
		- type:	regex
		  negate: true
		  regex: '^zrepl_'
		# Time-based snapshot retention
		- type:	grid
		  grid:	1x1h(keep=all) | 24x1h | 30x1d | 12x30d
		  regex: '^zrepl_'

	  # Source job for target B
	  - name: target_b
	    type: source
	    serve:
	      type: tls
	      listen: :8888
	      ca: /etc/zrepl/b.example.com.crt
	      cert: /etc/zrepl/a.example.com.crt
	      key: /etc/zrepl/a.example.com.key
	      client_cns:
		- b.example.com
	    filesystems:
	      'tank<': true # all filesystems
	    # Snapshots	are handled by the separate snap job
	    snapshotting:
	      type: manual

	  # Source job for target C
	  - name: target_c
	    type: source
	    serve:
	      type: tls
	      listen: :8889
	      ca: /etc/zrepl/c.example.com.crt
	      cert: /etc/zrepl/a.example.com.crt
	      key: /etc/zrepl/a.example.com.key
	      client_cns:
		- c.example.com
	    filesystems:
	      'tank<': true # all filesystems
	    # Snapshots	are handled by the separate snap job
	    snapshotting:
	      type: manual

	  # Source jobs	for remaining targets. Each one	should listen on a different port
	  # and	reference the correct certificate and client CN.
	  # - name: target_c
	  #   ...

   Configure each target server
	  jobs:
	  # Pull from source server A
	  - name: source_a
	    type: pull
	    connect:
	      type: tls
	      # Use the correct port for this specific client (e.g. B is 8888, C is 8889, etc.)
	      address: a.example.com:8888
	      ca: /etc/zrepl/a.example.com.crt
	      #	Use the	correct	key pair for this specific client
	      cert: /etc/zrepl/b.example.com.crt
	      key: /etc/zrepl/b.example.com.key
	      server_cn: a.example.com
	    root_fs: pool0/backup
	    interval: 10m
	    pruning:
	      keep_sender:
		# Source does the pruning in its snap job
		- type:	regex
		  regex: '.*'
	      #	Receiver-side pruning can be configured	as desired on each target server
	      keep_receiver:
		# Keep non-zrepl snapshots
		- type:	regex
		  negate: true
		  regex: '^zrepl_'
		# Time-based snapshot retention
		- type:	grid
		  grid:	1x1h(keep=all) | 24x1h | 30x1d | 12x30d
		  regex: '^zrepl_'

   Go Back To Quickstart Guide
       Click here to go	back to	the quickstart guide.

       Use zrepl configcheck to	validate your configuration.  No output	 indi-
       cates that everything is	fine.

       NOTE:
	  Please open an issue on GitHub if your use case for zrepl is signif-
	  icantly different from those listed above.  Or even better, write it
	  up in	the same style as above	and open a PR!

   Apply Configuration Changes
       We  hope	 that  you have	found a	configuration that fits	your use case.
       Use zrepl configcheck once again to make sure the config is correct (no
       output indicates that everything is fine).  Then restart the zrepl
       daemon on all systems involved in the replication, likely using service
       zrepl restart or	systemctl restart zrepl.
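
       A typical apply sequence looks like this (a sketch; use whichever ser-
       vice manager your platform provides):

	  zrepl configcheck         # no output & exit code 0: config parses fine
	  systemctl restart zrepl   # Linux (systemd)
	  # or, on FreeBSD:
	  # service zrepl restart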

       WARNING:
	  Please read up carefully on the pruning rules	 before	 applying  the
	  config.   In particular, note	that most example configs apply	to all
	  snapshots, not just zrepl-created snapshots.	Use the	following keep
	  rule on sender and receiver to prevent this:

	      -	type: regex
		negate:	true
		regex: "^zrepl_.*" # <-	the 'prefix' specified in snapshotting.prefix

   Watch it Work
       Run zrepl status	on the active side of the replication setup to monitor
       snapshotting, replication and pruning activity.  To re-trigger replica-
       tion  (snapshots	are separate!),	use zrepl signal wakeup	JOBNAME.  (re-
       fer to the example use case document if you are uncertain which job you
       want to wake up).

       You can also use basic UNIX tools to inspect what's going on.  If
       you like	tmux, here is a	handy script that works	on FreeBSD:

	  pkg install gnu-watch	tmux
	  tmux new -s zrepl -d
	  tmux split-window -t zrepl "tail -f /var/log/messages"
	  tmux split-window -t zrepl "gnu-watch	'zfs list -t snapshot -o name,creation -s creation'"
	  tmux split-window -t zrepl "zrepl status"
	  tmux select-layout -t	zrepl tiled
	  tmux attach -t zrepl

       The Linux equivalent might look like this:

	  # make sure tmux is installed	& let's	assume you use systemd + journald
	  tmux new -s zrepl -d
	  tmux split-window -t zrepl  "journalctl -f -u	zrepl.service"
	  tmux split-window -t zrepl "watch 'zfs list -t snapshot -o name,creation -s creation'"
	  tmux split-window -t zrepl "zrepl status"
	  tmux select-layout -t	zrepl tiled
	  tmux attach -t zrepl

   What	Next?
        Read more about configuration format, options & job types

        Configure logging & monitoring.

   Installation
       TIP:
	  Note:	 check	out the	quick-start guides if you want a first impres-
	  sion of zrepl.

   User	Privileges
       It is possible to run zrepl as an unprivileged user in combination with
       ZFS delegation.	Also, there is the possibility to run it in a jail  on
       FreeBSD by delegating a dataset to the jail.

       TIP:
	  Note:	 check out the FreeBSD Jail With iocage	for FreeBSD jail setup
	  instructions.

   Packages
       zrepl source releases are signed	& tagged by  the  author  in  the  git
       repository.   Your  OS  vendor  may  provide  binary  packages of zrepl
       through the package manager.  Additionally, binary  releases  are  pro-
       vided on GitHub.  The following list may be incomplete; feel free to
       submit a	PR with	an update:
+---------------------+--------------------+--------------------------------------------+
| OS / Distro	      |	Install	Command	   | Link					|
+---------------------+--------------------+--------------------------------------------+
| FreeBSD	      |	pkg install zrepl  | https://www.freshports.org/sysutils/zrepl/	|
|		      |			   |						|
|		      |			   | FreeBSD  Jail  With			|
|		      |			   | iocage					|
+---------------------+--------------------+--------------------------------------------+
| FreeNAS	      |			   | FreeBSD Jail With iocage			|
+---------------------+--------------------+--------------------------------------------+
| MacOS		      |	brew install zrepl | Available on homebrew			|
+---------------------+--------------------+--------------------------------------------+
| Arch Linux	      |	yay install zrepl  | Available on AUR				|
+---------------------+--------------------+--------------------------------------------+
| Fedora,     CentOS, |	dnf install zrepl  | RPM repository config			|
| RHEL,	OpenSUSE      |			   |						|
+---------------------+--------------------+--------------------------------------------+
| Debian + Ubuntu     |	apt install zrepl  | APT repository config			|
+---------------------+--------------------+--------------------------------------------+
| OmniOS	      |	pkg install zrepl  | Available since r151030			|
+---------------------+--------------------+--------------------------------------------+
| Void Linux	      |	xbps-install zrepl | Available since a88a2a4			|
+---------------------+--------------------+--------------------------------------------+
| Others	      |			   | Use binary	releases or build from source.	|
+---------------------+--------------------+--------------------------------------------+

   Debian / Ubuntu APT repositories
       We maintain APT repositories for	Debian,	Ubuntu and  derivatives.   The
       fingerprint  of	the signing key	is E101	418F D3D6 FBCB 9D65  A62D 7086
       99FC	  5F2E	     BF16.	  It	   is	    available	    at
       https://zrepl.cschwarz.com/apt/apt-key.asc.  Please open an issue on
       GitHub if you encounter any issues with the repository.

	  (
	  set -ex
	  zrepl_apt_key_url=https://zrepl.cschwarz.com/apt/apt-key.asc
	  zrepl_apt_key_dst=/usr/share/keyrings/zrepl.gpg
	  zrepl_apt_repo_file=/etc/apt/sources.list.d/zrepl.list

	  # Install dependencies for subsequent	commands
	  sudo apt update && sudo apt install curl gnupg lsb-release

	  # Deploy the zrepl apt key.
	  curl -fsSL "$zrepl_apt_key_url" | tee	| gpg --dearmor	| sudo tee "$zrepl_apt_key_dst"	> /dev/null

	  # Add	the zrepl apt repo.
	  ARCH="$(dpkg --print-architecture)"
	  CODENAME="$(lsb_release -i -s	| tr '[:upper:]' '[:lower:]') $(lsb_release -c -s | tr '[:upper:]' '[:lower:]')"
	  echo "Using Distro and Codename: $CODENAME"
	  echo "deb [arch=$ARCH	signed-by=$zrepl_apt_key_dst] https://zrepl.cschwarz.com/apt/$CODENAME main" | sudo tee	/etc/apt/sources.list.d/zrepl.list

	  # Update apt repos.
	  sudo apt update
	  )

       NOTE:
	  Until	zrepl reaches 1.0, the repositories will  be  updated  to  the
	  latest  zrepl	 release  immediately.	This includes breaking changes
	  between zrepl	versions.  Use apt-mark	hold zrepl to prevent upgrades
	  of zrepl.

   RPM repositories
       We provide a single RPM repository for  all  RPM-based  Linux  distros.
       The  zrepl  binary  in  the  repo  is  the same as the one published to
       GitHub.	Since Go binaries are statically linked, the RPM  should  work
       about everywhere.

       The  fingerprint	 of  the  signing key is F6F6 E8EA 6F2F	1462 2878 B5DE
       50E3	4417	 826E	  2CE6.	     It	     is	     available	    at
       https://zrepl.cschwarz.com/rpm/rpm-key.asc.  Please open an issue on
       GitHub if you encounter any issues with the repository.

       Copy-paste the following	snippet	into your shell	to set	up  the	 zrepl
       repository.   Then  dnf install zrepl and make sure to confirm that the
       signing key matches the one shown above.

	  cat >	/etc/yum.repos.d/zrepl.repo <<EOF
	  [zrepl]
	  name = zrepl
	  baseurl = https://zrepl.cschwarz.com/rpm/repo
	  gpgkey = https://zrepl.cschwarz.com/rpm/rpm-key.asc
	  EOF

       NOTE:
	  Until	zrepl reaches 1.0, the repository will be updated to the  lat-
	  est  zrepl  release immediately.  This includes breaking changes be-
	  tween	zrepl versions.	 If that bothers you, use the dnf  versionlock
	  plugin to pin	the version of zrepl on	your system.

   Compile From	Source
       Producing  a  release  requires	Go 1.11	or newer and Python 3 +	pip3 +
       docs/requirements.txt for the Sphinx documentation.  A tutorial to  in-
       stall Go	is available over at golang.org.  Python and pip3 should prob-
       ably be installed via your distro's package manager.

	  cd to/your/zrepl/checkout
	  python3 -m venv venv3
	  source venv3/bin/activate
	  ./lazy.sh devsetup
	  make release
	  # build artifacts are available in ./artifacts/release

       The  Python  venv is used for the documentation build dependencies.  If
       you just	want to	build the zrepl	binary,	leave it out and use ./lazy.sh
       godep instead.

       Alternatively, you can use the Docker build process: it is used to pro-
       duce the	official zrepl binary releases and serves as a	reference  for
       build dependencies and procedure:

	  cd to/your/zrepl/checkout
	  # make sure your user	has access to the docker socket
	  make release-docker
	  # if you want .deb or .rpm packages, invoke the following
	  # targets _after_ you	invoked	release-docker
	  make deb-docker
	  make rpm-docker
	  # build artifacts are	available in ./artifacts/release
	  # packages are available in ./artifacts

       NOTE:
	  It is your job to install the built binary in the zrepl user's
	  $PATH, e.g. /usr/local/bin/zrepl.  Otherwise,	the  examples  in  the
	  quick-start guides may need to be adjusted.

   FreeBSD Jail	With iocage
       This  tutorial  shows how zrepl can be installed	on FreeBSD, or FreeNAS
       in a jail using iocage.	While this tutorial focuses on	using  iocage,
       much of the setup would be similar using	a different jail manager.

       NOTE:
	  From a security perspective, just keep in mind that zfs send/recv
	  was never designed with jails in mind; an attacker could probably
	  crash the receive-side kernel or, worse, induce stateful damage to
	  the receive-side pool if they were able to get access to the jail.

	  The  jail  doesn't  provide  security	 benefits, but only management
	  ones.

   Requirements
       A dataset that will be delegated	to the jail needs to be	created	if one
       does not	already	exist.	For the	tutorial tank/zrepl will be used.

	  zfs create -o	mountpoint=none	tank/zrepl

       The only software requirement on the host system is iocage, which can
       be installed from ports or packages.

	  pkg install py37-iocage

       NOTE:
	  By default iocage will "activate" on first use  which	 will  set  up
	  some	defaults  such	as which pool will be used. To activate	iocage
	  manually the iocage activate command can be used.

   Jail	Creation
       There are two options for jail creation using iocage.

       1. Manually set up the jail from	scratch

       2. Create the jail using	the zrepl plugin. On FreeNAS this is  possible
	  from the user	interface using	the community index.

   Manual Jail
       Create  a  jail,	 using the same	release	as the host, called zrepl that
       will be automatically started at	boot.  The jail	will  have  tank/zrepl
       delegated into it.

	  iocage create	--release "$(freebsd-version -k	| cut -d '-' -f	'1,2')"	--name zrepl \
		 boot=on nat=1 \
		 jail_zfs=on \
		 jail_zfs_dataset=zrepl	\
		 jail_zfs_mountpoint='none'

       Enter the jail:

	  iocage console zrepl

       Install zrepl:

	  pkg update &&	pkg upgrade
	  pkg install zrepl

       Create the log file /var/log/zrepl.log:

	  touch	/var/log/zrepl.log && service newsyslog	restart

       Tell syslogd to redirect facility local0 to the zrepl.log file.
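
       The redirect is a one-line addition to /etc/syslog.conf.  The following
       sketch assumes the stock zrepl config logs to syslog facility local0;
       place it above the line that reads "!*":

	  local0.*					/var/log/zrepl.log

       Then reload syslogd: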

	  service syslogd reload

       Enable the zrepl	daemon to start	automatically at boot:

	  sysrc	zrepl_enable="YES"

       Now jump	to the summary below.

   Plugin
       When  using the plugin, zrepl will be installed for you in a jail using
       the following iocage properties.

        nat=1

        jail_zfs=on

        jail_zfs_mountpoint=none

       Additionally, the delegated dataset should be specified upon creation,
       and starting on boot can optionally be enabled.  This can also be done
       from the FreeNAS webui.

	  fetch	https://raw.githubusercontent.com/ix-plugin-hub/iocage-plugin-index/master/zrepl.json -o /tmp/zrepl.json
	  iocage fetch -P /tmp/zrepl.json --name zrepl jail_zfs_dataset=zrepl boot=on

   Configuration
       Now zrepl can be	configured.

       Enter the jail.

	  iocage console zrepl

       Modify the /usr/local/etc/zrepl/zrepl.yml configuration file.

       TIP:
	  Note:	check out the quick-start guides for examples of a sink	job.

       Now zrepl can be	started.

	  service zrepl	start

       Now jump	to the summary below.

   Summary
       Congratulations,	you have a working jail!

       NOTE:
	  With FreeBSD 13's transition to OpenZFS 2.0, please ensure that your
	  jail's FreeBSD version matches the one in the	kernel module.	If you
	  are getting cryptic errors such as cannot receive new filesystem
	  stream: invalid backup stream, the instructions posted here might
	  help.

   What	next?
       Read the	configuration chapter and then continue	with the  usage	 chap-
       ter.

       Reminder: If you	want a quick introduction, please read the quick-start
       guides.

   Configuration
   Overview & Terminology
       All  work  zrepl	does is	performed by the zrepl daemon which is config-
       ured in a single	YAML configuration file	loaded on startup.   The  fol-
       lowing paths are	searched, in this order:

       1. The path specified via the global --config flag

       2. /etc/zrepl/zrepl.yml

       3. /usr/local/etc/zrepl/zrepl.yml

       zrepl  configcheck  can	be used	to validate the	configuration.	If the
       configuration is	valid, it will output nothing and exit	with  code  0.
       The  error  messages vary in quality and	usefulness: please report con-
       fusing config errors to the tracking issue #155.
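
       For example, a quick way to check the exit code (sketch):

	  zrepl configcheck; echo "exit code: $?"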

       Full  example  configs  are  available  at   quick-start	  guides   and
       config/samples/.	  However,  copy-pasting examples is no	substitute for
       reading documentation!

   Config File Structure
	  global: ...
	  jobs:
	  - name: backup
	    type: push
	  - ...

       A zrepl configuration file is divided into two main sections:  global
       and  jobs.   global  has	 sensible  defaults. It	is covered in logging,
       monitoring & miscellaneous.

   Jobs	& How They Work	Together
       A job is	the unit of activity tracked by	the zrepl daemon.  The type of
       a job determines	its role in a replication setup	and in	snapshot  man-
       agement.	  Jobs are identified by their name, both in log files and the
       zrepl status command.

       NOTE:
	  The job name is persisted in several places on disk and thus	cannot
	  be changed easily.

       Replication  always happens between a pair of jobs: one active side and
       one passive side.  The active side connects to the passive side using a
       transport and starts executing the replication logic.  The passive side
       responds	to requests from the active side after	checking  its  permis-
       sions.

       The  following  table  shows how	different job types can	be combined to
       achieve both push and pull mode setups.	 Note  that  snapshot-creation
       denoted	by  "(snap)"  is orthogonal to whether a job is	active or pas-
       sive.
+-------------------+-----------------------------+---------------+---------------------------------------------+
| Setup name        | active side                 | passive side  | use case                                    |
+-------------------+-----------------------------+---------------+---------------------------------------------+
| Push mode         | push (snap)                 | sink          | Laptop backup; NAS behind NAT to offsite    |
+-------------------+-----------------------------+---------------+---------------------------------------------+
| Pull mode         | pull                        | source (snap) | Central backup server for many nodes;       |
|                   |                             |               | remote server to NAS behind NAT             |
+-------------------+-----------------------------+---------------+---------------------------------------------+
| Local replication | push + sink in one config   |               | Backup to locally attached disk;            |
|                   | with local transport        |               | backup of FreeBSD boot pool                 |
+-------------------+-----------------------------+---------------+---------------------------------------------+
| Snap & prune-only | snap (snap)                 | N/A           | Snapshots & pruning but no replication      |
|                   |                             |               | required; workaround for source-side        |
|                   |                             |               | pruning                                     |
+-------------------+-----------------------------+---------------+---------------------------------------------+

   How the Active Side Works
       The active side (push and pull job) executes the	replication and	 prun-
       ing logic:

       1. Wakeup  after	 snapshotting (push job) or pull interval ticker (pull
	  job).

       2. Connect to the passive side and instantiate an RPC client.

       3. Replicate data from the sender to the	receiver.

       4. Prune	on sender & receiver.

       TIP:
	  The progress of the active side can be watched live using zrepl sta-
	  tus.

   How the Passive Side	Works
       The passive side	(sink and source) waits	for connections	from  the  ac-
       tive  side, on the transport specified with serve in the	job configura-
       tion.  The respective transport then performs authentication & autho-
       rization,  resulting in a stable	client identity.  The passive side job
       uses this client	identity as follows:

	   In sink jobs, to map requests from different client	identities  to
	    their respective sub-filesystem tree root_fs/${client_identity}.

	   In the future, source might embed the client identity in zrepl's
	    ZFS abstraction names, to support multi-host replication.

       TIP:
	  The use of the client identity in the sink job implies that it must
	  be usable as a ZFS filesystem name component.
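
       As a sketch of the resulting mapping (names borrowed from the quick-
       start example):

	  # sink job:        root_fs: "storage/zrepl/sink"
	  # client identity: "prod"  (established by the transport, e.g. the TLS client cert CN)
	  # sender dataset:  zroot/usr/home
	  # received as:     storage/zrepl/sink/prod/zroot/usr/home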

   How Replication Works
       One of the major	design goals of	the replication	module is to avoid any
       duplication of the nontrivial logic.  As such, the code works on ab-
       stract sender and receiver endpoints, where typically one will be im-
       plemented by a local program object and the other is an RPC client  in-
       stance.	Regardless of push- or pull-style setup, the logic executes on
       the active side,	i.e. in	the push or pull job.

       The following high-level	steps take place during	replication and	can be
       monitored using zrepl status:

        Plan the replication:

	  Compare sender and receiver filesystem snapshots

	  Build the replication plan

	    Per  filesystem, compute a	diff between sender and	receiver snap-
	     shots

	    Build a list of replication steps

	      If possible, use	incremental and	resumable sends

	      Otherwise, use full send	of most	recent snapshot	on sender

	  Retry on errors that are likely temporary (e.g. network failures).

	  Give	up on filesystems where	a permanent error  was	received  over
	   RPC.

        Execute the plan

	  Perform  replication	 steps	in  the	 following  order:  Among  all
	   filesystems with pending replication	 steps,	 pick  the  filesystem
	   whose next replication step's snapshot is the oldest.

	  Create  placeholder filesystems on the receiving side to mirror the
	   dataset paths on the	sender to root_fs/${client_identity}.

	  Acquire send-side step-holds	on the step's from and to snapshots.

	  Perform the replication step.

	  Move	the replication	cursor bookmark	on the sending side  (see  be-
	   low).

	  Move	the last-received-hold on the receiving	side (see below).

	  Release the send-side step-holds.

       The idea	behind the execution order of replication steps	is that	if the
       sender snapshots	all filesystems	simultaneously at fixed	intervals, the
       receiver	 will  have  all filesystems snapshotted at time T1 before the
       first snapshot at T2 = T1 + $interval is	replicated.

   ZFS Background Knowledge
       This section gives some background knowledge about  ZFS	features  that
       zrepl uses to provide guarantees	for a replication filesystem.  Specif-
       ically, zrepl guarantees	by default that	incremental replication	is al-
       ways  possible and that started replication steps can always be resumed
       if they are interrupted.

       ZFS Send	Modes &	Bookmarks ZFS supports full sends (zfs send fs@to) and
       incremental sends (zfs send -i @from fs@to).  Full sends	 are  used  to
       create  a  new  filesystem  on the receiver with	the send-side state of
       fs@to.  Incremental sends only transfer the  delta  between  @from  and
       @to.   Incremental sends	require	that @from be present on the receiving
       side when receiving the incremental stream.  Incremental	sends can also
       use a ZFS bookmark as from on the sending side (zfs  send  -i  #bm_from
       fs@to),	 where	 #bm_from  was	created	 using	zfs  bookmark  fs@from
       fs#bm_from.  The	receiving side must always have	 the  actual  snapshot
       @from,  regardless of whether the sending side uses @from or a bookmark
       of it.
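
       In terms of plain zfs commands, these send modes look roughly like this
       (a sketch; dataset and bookmark names are hypothetical):

	  # full send: creates backup/data on the receiver with the state of tank/data@to
	  zfs send tank/data@to | zfs recv backup/data
	  # incremental send: only transfers the delta between @from and @to
	  zfs send -i @from tank/data@to | zfs recv backup/data
	  # a bookmark can replace @from on the *sending* side
	  zfs bookmark tank/data@from 'tank/data#bm_from'
	  zfs send -i 'tank/data#bm_from' tank/data@to | zfs recv backup/data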

       Plain and raw sends By default, zfs send	sends the most generic,	 back-
       wards-compatible	 data  stream format (so-called	'plain send').	If the
       sent filesystem uses newer features, e.g. compression or encryption,
       zfs send has to un-do these operations on the fly to produce the plain
       send stream.
       If the receiver uses newer features (e.g. compression or	encryption in-
       herited from the	parent FS), it applies the  necessary  transformations
       again on	the fly	during zfs recv.

       Flags  such as -e, -c and -L  tell ZFS to produce a send	stream that is
       closer to how the data is stored	on disk.  Sending with those flags re-
       moves computational overhead from sender	and  receiver.	 However,  the
       receiver	will not apply certain transformations,	e.g., it will not com-
       press with the receive-side compression algorithm.

       The  -w (--raw) flag produces a send stream that	is as raw as possible.
       For unencrypted datasets, its current effect is the same	as -Lce.

       Encrypted datasets can only be sent plain  (unencrypted)	 or  raw  (en-
       crypted)	using the -w flag.
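
       For example, a raw send of an encrypted dataset (hypothetical names; the
       receiver stores the data without ever seeing the encryption key):

	  zfs send -w tank/secret@snap | ssh backups zfs recv pool/sink/secret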

       Resumable  Send	&  Recv	The -s flag for	zfs recv tells zfs to save the
       partially received send stream in case it is  interrupted.   To	resume
       the  replication,  the receiving	side filesystem's receive_resume_token
       must be passed to a new zfs send	-t <value> | zfs recv command.	A full
       send can	only be	resumed	if @to still exists.  An incremental send  can
       only  be	resumed	if @to still exists and	either @from still exists or a
       bookmark	#fbm of	@from still exists.
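
       In terms of plain zfs commands, a resumed receive looks roughly like
       this (a sketch with hypothetical names):

	  # initial attempt; -s saves partial state if the stream is interrupted
	  zfs send tank/data@to | zfs recv -s backup/data
	  # after the interruption, read the token and resume the send
	  token="$(zfs get -H -o value receive_resume_token backup/data)"
	  zfs send -t "$token" | zfs recv -s backup/data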

       ZFS Holds ZFS holds prevent a snapshot from being deleted  through  zfs
       destroy, letting the destroy fail with a dataset is busy error.  Holds
       are created and referred	to by a	tag. They  can	be  thought  of	 as  a
       named, persistent lock on the snapshot.
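
       For illustration (hypothetical tag and dataset names):

	  zfs hold keep_me tank/data@snap     # place a hold with tag "keep_me"
	  zfs destroy tank/data@snap          # now fails: dataset is busy
	  zfs holds tank/data@snap            # list holds on the snapshot
	  zfs release keep_me tank/data@snap  # drop the hold again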

   ZFS Abstractions Managed By zrepl
       With  the background knowledge from the previous	paragraph, we now sum-
       marize the different on-disk ZFS	objects	that zrepl manages to  provide
       its functionality.

       Placeholder  filesystems	on the receiving side are regular ZFS filesys-
       tems with the ZFS property  zrepl:placeholder=on.   Placeholders	 allow
       the receiving side to mirror the	sender's ZFS dataset hierarchy without
       replicating  every filesystem at	every intermediary dataset path	compo-
       nent.  Consider the following example: S/H/J  shall  be	replicated  to
       R/sink/job/S/H/J,  but  neither S/H nor S shall be replicated.  ZFS re-
       quires the existence of R/sink/job/S and	R/sink/job/S/H in order	to re-
       ceive into R/sink/job/S/H/J.  Thus, zrepl creates the  parent  filesys-
       tems as placeholders on the receiving side.  If at some point S/H and S
       shall  be  replicated,  the  receiving side invalidates the placeholder
       flag automatically.  The	zrepl test placeholder command can be used  to
       check whether a filesystem is a placeholder.
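
       Since placeholders are ordinary filesystems with a user property set,
       they can also be inspected with plain zfs tooling (sketch, reusing the
       example hierarchy above):

	  zfs get -r -t filesystem zrepl:placeholder R/sink/job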

       The  replication	 cursor	bookmark and last-received-hold	are managed by
       zrepl to	ensure that future replications	can always be  done  incremen-
       tally.	The replication	cursor is a send-side bookmark of the most re-
       cent successfully replicated snapshot, and the last-received-hold is  a
       hold of that snapshot on	the receiving side.  Both are moved atomically
       after  the receiving side has confirmed that a replication step is com-
       plete.

       The replication cursor has the format
       #zrepl_CURSOR_G_<GUID>_J_<JOBNAME>.  The last-received-hold tag has the
       format zrepl_last_received_J_<JOBNAME>.  Encoding the job name ensures that
       multiple	 sending  jobs	can replicate the same filesystem to different
       receivers without interference.

       Tentative replication cursor bookmarks are short-lived  bookmarks  that
       protect	the  atomic  moving-forward  of	 the  replication  cursor  and
       last-received-hold (see this issue).  They are only necessary  if  step
       holds are not used as per the replication.protection setting.  The ten-
       tative replication cursor has the format
       #zrepl_CURSORTENTATIVE_G_<GUID>_J_<JOBNAME>.  The zrepl
       zfs-abstraction list command pro-
       vides a listing of all bookmarks	and holds managed by zrepl.
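
       For example (a sketch):

	  # list all bookmarks and holds that zrepl manages (cursors, step holds, ...)
	  zrepl zfs-abstraction list
	  # the cursor bookmarks also show up in plain zfs tooling
	  zfs list -t bookmark -o name,creation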

       Step holds are zfs holds	managed	by zrepl to ensure that	a  replication
       step  can  always be resumed if it is interrupted, e.g.,	due to network
       outage.	zrepl creates step holds before	it attempts a replication step
       and releases them after the receiver confirms that the replication step
       is complete.  For an initial replication	full @initial_snap, zrepl puts
       a zfs hold on @initial_snap.  For an incremental	 send  @from  ->  @to,
       zrepl  puts  a  zfs hold	on both	@from and @to.	Note that @from	is not
       strictly	necessary for resumability -- a	bookmark on the	 sending  side
       would be sufficient -- but size-estimation in currently used OpenZFS
       versions	only works if @from is a snapshot.  The	hold tag has the  for-
       mat  zrepl_STEP_J_<JOBNAME>.   A	 job only ever has one active send per
       filesystem.  Thus, there	are never more than two	step holds for a given
       pair of (job,filesystem).

       Step bookmarks are zrepl's equivalent for holds on bookmarks (ZFS  does
       not support putting holds on bookmarks).	 They are intended for a situ-
       ation  where a replication step uses a bookmark #bm as incremental from
       where #bm is not	managed	 by  zrepl.   To  ensure  resumability,	 zrepl
       copies  #bm  to step bookmark #zrepl_STEP_G_<GUID>_J_<JOBNAME>.	If the
       replication is interrupted and #bm is deleted by	 the  user,  the  step
       bookmark	remains	as an incremental source for the resumable send.  Note
       that  zrepl  does  not  yet support creating step bookmarks because the
       corresponding ZFS feature for  copying  bookmarks  is  not  yet	widely
       available.  Subscribe to zrepl issue #326 for details.

       The  zrepl zfs-abstraction list command provides	a listing of all book-
       marks and holds managed by zrepl.

       NOTE:
	  More	 details   can	 be   found    in    the    design    document
	  replication/design.md.

   Caveats With	Complex	Setups (More Than 2 Jobs or Machines)
       Most  users  are	served well with a single sender and a single receiver
       job.  This section documents considerations for more complex setups.

       ATTENTION:
	  Before you continue, make sure you have a working  understanding  of
	  how  zrepl  works and	what zrepl does	to ensure that replication be-
	  tween	sender and receiver  is	 always	 possible  without  conflicts.
	  This will help you understand	why certain kinds of multi-machine se-
	  tups do not (yet) work.

       NOTE:
	  If  you  can't  find	your  desired configuration, have questions or
	  would	like to	see improvements to multi-job setups, please  open  an
	  issue	on GitHub.

   Multiple Jobs on One	Machine
       As a general rule, multiple jobs	configured on one machine must operate
       on  disjoint sets of filesystems.  Otherwise, concurrently running jobs
       might interfere when operating on the same filesystem.

       On your setup, ensure that

        all filesystems filter	specifications are disjoint

        no root_fs is a prefix	or equal to another root_fs

        no filesystems	filter matches any root_fs

       Exceptions to the rule:

        A snap	and push job on	the same machine can match the	same  filesys-
	 tems.	 To avoid interference,	only one of the	jobs should be pruning
	 snapshots on the sender, the other one	 should	 keep  all  snapshots.
	 Since	the  jobs  won't  coordinate,  errors in the log are to	be ex-
	 pected, but zrepl's ZFS abstractions ensure that push	and  sink  can
	 always	 replicate incrementally.  This	scenario is detailed in	one of
	 the quick-start guides.

   Two Or More Machines
       This section might be relevant to users who wish	to fan-in (N  machines
       replicate to 1) or fan-out (replicate 1 machine to N machines).

       Working setups:

        Fan-in: N servers replicated to one receiver, disjoint	dataset	trees.

	  This	is the common use case of a centralized	backup server.

	  Implementation:

	    N	push jobs (one per sender server), 1 sink (as long as the dif-
	     ferent push jobs have a different client identity)

	    N source jobs (one	per sender server), N  pull  on	 the  receiver
	     server (unique names, disjoint  root_fs)

	  The	sink  job  automatically  constrains each client to a disjoint
	   sub-tree	of	the	 sink-side	dataset	     hierarchy
	   ${root_fs}/${client_identity}.   Therefore,	the  different clients
	   cannot interfere.

	  The pull job	only pulls from	one host, so it's up to	the zrepl user
	   to ensure that the different	pull jobs don't	interfere.

        Fan-out: 1 server replicated to N receivers

	  Can be implemented either in	a pull or push fashion.

	    pull setup: 1 pull	job on each receiver server, each with a  cor-
	     responding	unique source job on the sender	server.

	    push  setup: 1 sink job on	each receiver server, each with	a cor-
	     responding	unique push job	on the sender server.

	  It is critical that we have one sending-side	job (source, push) per
	   receiver.  The reason  is  that  zrepl's  ZFS  abstractions	(zrepl
	   zfs-abstraction  list) include the name of the source/push job, but
	   not the receive-side	job name or client identity (see issue	#380).
	   As  a counter-example, suppose we used multiple pull	jobs with only
	   one source job.  All	pull jobs would	 share	the  same  replication
	   cursor  bookmark  and  trip	over  each other, breaking incremental
	   replication guarantees quickly.  The analogous problem exists for 1
	   push	to N sink jobs.

	  The	filesystems  matched  by  the sending side jobs	(source, push)
	   need	not necessarily	be disjoint.  For this to  work,  we  need  to
	   avoid  interference between snapshotting and	pruning	of the differ-
	   ent sending jobs.  The solution is to centralize sender-side	 snap-
	   shot	 management  in	 a  separate  snap  job.   Snapshotting	in the
	   source/push job  should  then  be  disabled	(type:	manual).   And
	   sender-side	pruning	 (keep_sender) needs to	be disabled in the ac-
	   tive	side (pull / push), since that'll be done by the snap job.

	  Restore limitations:	when restoring from one	of  the	 pull  targets
	   (e.g.,  using  zfs send -R),	the replication	cursor bookmarks don't
	   exist on the	restored system.  This can break incremental  replica-
	   tion	to all other receive-sides after restore.

	  See	the  fan-out  replication  quick-start guide for an example of
	   this	setup.

       Setups that do not work:

        N pull	identities, 1 source job. Tracking issue #380.

   Job Types in	Detail
   Job Type push
		+---------------------+----------------------------+
		| Parameter	      |	Comment			   |
		+---------------------+----------------------------+
		| type		      |	= push			   |
		+---------------------+----------------------------+
		| name		      |	unique	name  of  the  job |
		|		      |	(must not change)	   |
		+---------------------+----------------------------+
		| connect	      |	connect	specification	   |
		+---------------------+----------------------------+
		| filesystems	      |	filter	specification  for |
		|		      |	filesystems  to	 be  snap- |
		|		      |	shotted	 and pushed to the |
		|		      |	sink			   |
		+---------------------+----------------------------+
		| send		      |	send options, e.g. for en- |
		|		      |	crypted	sends		   |
		+---------------------+----------------------------+
		| snapshotting	      |	snapshotting specification |
		+---------------------+----------------------------+
		| pruning	      |	pruning	specification	   |
		+---------------------+----------------------------+
		| replication	      |	replication options	   |
		+---------------------+----------------------------+
		| conflict_resolution |	conflict  resolution   op- |
		|		      |	tions			   |
		+---------------------+----------------------------+

       Example config: config/samples/push.yml

   Job Type sink
		     +-----------+----------------------------+
		     | Parameter | Comment		      |
		     +-----------+----------------------------+
		     | type	 | = sink		      |
		     +-----------+----------------------------+
		     | name	 | unique  name	 of  the  job |
		     |		 | (must not change)	      |
		     +-----------+----------------------------+
		     | serve	 | serve specification	      |
		     +-----------+----------------------------+
		     | root_fs	 | ZFS	filesystems  are  re- |
		     |		 | ceived		   to |
		     |		 | $root_fs/$client_iden-     |
		     |		 | tity/$source_path	      |
		     +-----------+----------------------------+

       Example config: config/samples/sink.yml

   Job Type pull
+---------------------+----------------------------------------------------------------------------+
| Parameter	      |	Comment									   |
+---------------------+----------------------------------------------------------------------------+
| type		      |	= pull									   |
+---------------------+----------------------------------------------------------------------------+
| name		      |	unique	name  of  the  job						   |
|		      |	(must not change)							   |
+---------------------+----------------------------------------------------------------------------+
| connect	      |	connect	specification							   |
+---------------------+----------------------------------------------------------------------------+
| root_fs	      |	ZFS  filesystems  are  re-						   |
|		      |	ceived			to						   |
|		      |	$root_fs/$source_path							   |
+---------------------+----------------------------------------------------------------------------+
| interval	      |	Interval at which to pull from the source job (e.g. 10m).		   |
|		      |	manual disables	periodic pulling, replication then only	happens	on wakeup. |
+---------------------+----------------------------------------------------------------------------+
| pruning	      |	pruning	specification							   |
+---------------------+----------------------------------------------------------------------------+
| replication	      |	replication options							   |
+---------------------+----------------------------------------------------------------------------+
| conflict_resolution |	conflict resolution options						   |
+---------------------+----------------------------------------------------------------------------+

       Example config: config/samples/pull.yml

   Job Type source
		   +--------------+----------------------------+
		   | Parameter	  | Comment		       |
		   +--------------+----------------------------+
		   | type	  | = source		       |
		   +--------------+----------------------------+
		   | name	  | unique  name  of  the  job |
		   |		  | (must not change)	       |
		   +--------------+----------------------------+
		   | serve	  | serve specification	       |
		   +--------------+----------------------------+
		   | filesystems  | filter  specification  for |
		   |		  | filesystems	 to  be	 snap- |
		   |		  | shotted  and  exposed   to |
		   |		  | connecting clients	       |
		   +--------------+----------------------------+
		   | send	  | send options, e.g. for en- |
		   |		  | crypted sends	       |
		   +--------------+----------------------------+
		   | snapshotting | snapshotting specification |
		   +--------------+----------------------------+

       Example config: config/samples/source.yml

   Local replication
       If you have the need for	local replication (most	likely between two lo-
       cal  storage  pools), you can use the local transport type to connect a
       local push job to a local sink job.

       Example config: config/samples/local.yml.

   Job Type snap (snapshot & prune only)
       Job type	that only takes	snapshots and performs pruning	on  the	 local
       machine.
		   +--------------+----------------------------+
		   | Parameter	  | Comment		       |
		   +--------------+----------------------------+
		   | type	  | = snap		       |
		   +--------------+----------------------------+
		   | name	  | unique  name  of  the  job |
		   |		  | (must not change)	       |
		   +--------------+----------------------------+
		   | filesystems  | filter  specification  for |
		   |		  | filesystems	 to  be	 snap- |
		   |		  | shotted		       |
		   +--------------+----------------------------+
		   | snapshotting | snapshotting specification |
		   +--------------+----------------------------+
		   | pruning	  | pruning specification      |
		   +--------------+----------------------------+

       Example config: config/samples/snap.yml
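
       A self-contained snap job could look like this (a sketch; the dataset
       names and the retention grid are illustrative):

          jobs:
          - type: snap
            name: local_snapshots           # unique name, must not change
            filesystems: {
              "tank<": true,
              "tank/tmp": false
            }
            snapshotting:
              type: periodic
              prefix: zrepl_
              interval: 10m
            pruning:
              keep:
              - type: grid
                grid: 1x1h(keep=all) | 24x1h | 14x1d
                regex: "^zrepl_.*"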

   Transports
       The zrepl RPC layer uses	transports to  establish  a  single,  bidirec-
       tional  data  stream between an active and passive job.	On the passive
       (serving) side, the transport also provides the client identity to  the
       upper  layers: this string is used for access control and separation of
       filesystem sub-trees in sink jobs.  Transports  are  specified  in  the
       connect or serve	section	of a job definition.

   Contents
        Transports

	  tcp Transport

	    Serve

	    Connect

	  tls Transport

	    Serve

	    Connect

	    Mutual-TLS	between	Two Machines

	    Certificate Authority using EasyRSA

	  ssh+stdinserver Transport

	    Serve

	    Connect

	  local Transport

       ATTENTION:
	  The  client identities must be valid ZFS dataset path	components be-
	  cause	the sink job uses ${root_fs}/${client_identity}	 to  determine
	  the client's subtree.

   tcp Transport
       The  tcp	transport uses plain TCP, which	means that the data is not en-
       crypted on the wire.  Clients are identified by their IPv4 or IPv6  ad-
       dresses,	 and  the  client identity is established through a mapping on
       the server.

       This transport may also be used in conjunction with  network-layer  en-
       cryption	and/or VPN tunnels to provide encryption on the	wire.  To make
       the  IP-based  client  authentication  effective, such solutions	should
       provide authenticated IP	addresses.  Some options to consider:

        WireGuard: Linux-focussed, in-kernel VPN

        OpenVPN: Cross-platform VPN, uses tun on *nix

        IPSec:	Properly standardized, in-kernel network-layer VPN

        spiped: think of it as	an encrypted pipe between two servers

        SSH

	  sshuttle: VPN-like solution,	but using SSH

	  SSH port forwarding:	Systemd	user unit & make it start  before  the
	   zrepl service.

   Serve
	  jobs:
	  - type: sink
	    serve:
	      type: tcp
	      listen: ":8888"
	      listen_freebind: true # optional,	default	false
	      clients: {
		"192.168.122.123" :		  "mysql01",
		"192.168.122.42" :		  "mx01",
		"2001:0db8:85a3::8a2e:0370:7334": "gateway",

                # CIDR masks require a '*' in the client identity string
                # that is expanded to the client's IP address

                "10.23.42.0/24":       "cluster-*",
                "fde4:8dba:82e1::/64": "san-*"
              }
	    ...

       listen_freebind	controls  whether  the	socket	is  allowed to bind to
       non-local or unconfigured IP addresses  (Linux  IP_FREEBIND  ,  FreeBSD
       IP_BINDANY).  Enable this option	if you want to listen on a specific IP
       address that might not yet be configured	when the zrepl daemon starts.

   Connect
	  jobs:
	   - type: push
	     connect:
	       type: tcp
	       address:	"10.23.42.23:8888"
	       dial_timeout: # optional, default 10s
	     ...

   tls Transport
       The  tls	 transport  uses  TCP  +  TLS with client authentication using
       client certificates.  The client	identity is the	common name (CN)  pre-
       sented in the client certificate.

       It  is  recommended  to	set  up	a dedicated CA infrastructure for this
       transport, e.g. using OpenVPN's EasyRSA.	 For a simple 2-machine	setup,
       mutual TLS might	also be	sufficient.  We	provide	copy-pastable instruc-
       tions to	generate the certificates below.

       The implementation uses Go's TLS	library.  Since	Go binaries are	stati-
       cally linked, you or your distribution need  to	recompile  zrepl  when
       vulnerabilities in that library are disclosed.

       All  file paths are resolved relative to	the zrepl daemon's working di-
       rectory.	 Specify absolute paths	if you are unsure what directory  that
       is (or find out from your init system).

       If intermediate CAs are used, the full chain must be present either
       in the ca file or in the individual cert files.  Regardless, the client's
       certificate  must  be  first in the cert	file, with each	following cer-
       tificate	directly certifying the	one preceding it (see TLS's specifica-
       tion).  This is the common default when using a CA management tool.

       NOTE:
          As of Go 1.15 (zrepl 0.3.0 and newer), the Go TLS / x509 library
          requires Subject Alternative Names to be present in certificates.  You
	  might	need to	re-generate your certificates using one	of the two al-
	  ternatives provided below.

	  Note further that zrepl continues to use the CommonName field	to as-
          sign client identities.  Hence, we recommend keeping the Subject Al-
	  ternative Name and the CommonName in sync.

   Serve
	  jobs:
	    - type: sink
	      root_fs: "pool2/backup_laptops"
	      serve:
		type: tls
		listen:	":8888"
		listen_freebind: true #	optional, default false
		ca:   /etc/zrepl/ca.crt
		cert: /etc/zrepl/prod.fullchain
		key:  /etc/zrepl/prod.key
		client_cns:
		  - "laptop1"
		  - "homeserver"

       The ca field specifies the certificate authority used to validate
       client certificates.  The client_cns list specifies a list of  accepted
       client  common  names  (which  are  also	the client identities for this
       transport).  The	listen_freebind	field is explained here.

   Connect
	  jobs:
	  - type: pull
	    connect:
	      type: tls
	      address: "server1.foo.bar:8888"
	      ca:   /etc/zrepl/ca.crt
	      cert: /etc/zrepl/backupserver.fullchain
	      key:  /etc/zrepl/backupserver.key
	      server_cn: "server1"
	      dial_timeout: # optional,	default	10s

       The ca field specifies the CA which  signed  the	 server's  certificate
       (serve.cert).  The server_cn specifies the expected common name (CN) of
       the  server's  certificate.  It overrides the hostname specified	in ad-
       dress.  The connection fails if either does not match.

   Mutual-TLS between Two Machines
       However,	for a two-machine setup, self-signed certificates  distributed
       using an	out-of-band mechanism will also	work just fine:

       Suppose	you  have  a push-mode setup, with backups.example.com running
       the sink	job, and prod.example.com running the push job.	 Run the  fol-
       lowing  OpenSSL	commands  on  each host, substituting HOSTNAME in both
       filenames and the interactive input prompt by OpenSSL:

	  (name=HOSTNAME; openssl req -x509 -sha256 -nodes \
	   -newkey rsa:4096 \
	   -days 365 \
	   -keyout $name.key \
	   -out	$name.crt -addext "subjectAltName = DNS:$name" -subj "/CN=$name")

       Now  copy  each	machine's  HOSTNAME.crt	  to   the   other   machine's
       /etc/zrepl/HOSTNAME.crt,	 for  example  using scp.  The serve & connect
       configuration will thus look like the following:

	  # on backups.example.com
	  - type: sink
	    serve:
	      type: tls
	      listen: ":8888"
	      ca: "/etc/zrepl/prod.example.com.crt"
	      cert: "/etc/zrepl/backups.example.com.crt"
	      key: "/etc/zrepl/backups.example.com.key"
	      client_cns:
		- "prod.example.com"
	    ...

	  # on prod.example.com
	  - type: push
	    connect:
	      type: tls
              address: "backups.example.com:8888"
	      ca: /etc/zrepl/backups.example.com.crt
	      cert: /etc/zrepl/prod.example.com.crt
	      key:  /etc/zrepl/prod.example.com.key
	      server_cn: "backups.example.com"
	    ...

   Certificate Authority using EasyRSA
       For more	than two machines, it might make sense to set up a  CA	infra-
       structure.  Tools like EasyRSA make this	very easy:

	  #!/usr/bin/env bash
	  set -euo pipefail

	  HOSTS=(backupserver prod1 prod2 prod3)

	  curl -L https://github.com/OpenVPN/easy-rsa/releases/download/v3.0.7/EasyRSA-3.0.7.tgz > EasyRSA-3.0.7.tgz
	  echo "157d2e8c115c3ad070c1b2641a4c9191e06a32a8e50971847a718251eeb510a8  EasyRSA-3.0.7.tgz" | sha256sum -c
	  rm -rf EasyRSA-3.0.7
	  tar -xf EasyRSA-3.0.7.tgz
	  cd EasyRSA-3.0.7
	  ./easyrsa
	  ./easyrsa init-pki
	  ./easyrsa build-ca nopass

	  for host in "${HOSTS[@]}"; do
	      ./easyrsa	build-serverClient-full	$host nopass
	      echo cert	for host $host available at pki/issued/$host.crt
	      echo key for host	$host available	at pki/private/$host.key
	  done
	  echo ca cert available at pki/ca.crt

   ssh+stdinserver Transport
       ssh+stdinserver	 uses  the  ssh	 command  and  some  features  of  the
       server-side SSH authorized_keys file.  It is less efficient than	 other
       transports because the data passes through two more pipes.  However, it
       is  fairly  convenient  to set up and allows the	zrepl daemon to	not be
       directly	exposed	to the internet, because all  traffic  passes  through
       the system's SSH	server.

       The  concept is inspired	by git shell and Borg Backup.  The implementa-
       tion is provided	by the Go package github.com/problame/go-netssh.

       NOTE:
	  ssh+stdinserver generally provides inferior error detection and han-
	  dling	compared to the	tcp and	 tls  transports.   When  encountering
	  such	problems,  consider  using  tcp	or tls transports, or help im-
	  prove	package	go-netssh.

   Serve
	  jobs:
	  - type: source
	    serve:
	      type: stdinserver
	      client_identities:
	      -	"client1"
	      -	"client2"
	    ...

       First of	all, note that type=stdinserver	in this	case: Currently,  only
       connect.type=ssh+stdinserver  can  connect to a serve.type=stdinserver,
       but we want to keep that	option open for	future extensions.

       The serving job opens a UNIX socket named after client_identity in  the
       runtime directory.  In our example above, that is /var/run/zrepl/stdin-
       server/client1 and /var/run/zrepl/stdinserver/client2.

       On  the	same  machine,	the zrepl stdinserver $client_identity command
       connects	 to  /var/run/zrepl/stdinserver/$client_identity.    It	  then
       passes  its  stdin  and stdout file descriptors to the zrepl daemon via
       cmsg(3).	 zrepl daemon in turn combines them into an object  implement-
       ing  net.Conn:  a  Write() turns	into a write to	stdout,	a Read() turns
       into a read from	stdin.

       Interactive use of the stdinserver subcommand does not make much	sense.
       However,	we can force its execution when	a user with a  particular  SSH
       pubkey connects via SSH.	 This can be achieved with an entry in the au-
       thorized_keys file of the serving zrepl daemon.

	  # for	OpenSSH	>= 7.2
	  command="zrepl stdinserver CLIENT_IDENTITY",restrict CLIENT_SSH_KEY
	  # for	older OpenSSH versions
	  command="zrepl stdinserver CLIENT_IDENTITY",no-port-forwarding,no-X11-forwarding,no-pty,no-agent-forwarding,no-user-rc CLIENT_SSH_KEY

        CLIENT_IDENTITY  is  substituted with an entry	from client_identities
	 in our	example

        CLIENT_SSH_KEY	is substituted with the	public part of the SSH keypair
	 specified in the connect.identity_file	directive  on  the  connecting
	 host.

       NOTE:
	  You	may   need   to	  adjust   the	 PermitRootLogin   option   in
	  /etc/ssh/sshd_config to forced-commands-only or higher for  this  to
	  work.	 Refer to sshd_config(5) for details.

       To recap, this is how client authentication works with the
       ssh+stdinserver transport:

        Connections to	the /var/run/zrepl/stdinserver/${client_identity} UNIX
	 socket	are blindly trusted by zrepl daemon.   The  connection	client
	 identity is the name of the socket, i.e. ${client_identity}.

        Thus,	the  runtime directory must be private to the zrepl user (this
	 is checked by zrepl daemon)

        The admin of the host with the	serving	zrepl daemon controls the  au-
	 thorized_keys file.

        Thus,	the  administrator controls the	mapping	PUBKEY -> CLIENT_IDEN-
	 TITY.

   Connect
	  jobs:
	  - type: pull
	    connect:
	      type: ssh+stdinserver
	      host: prod.example.com
	      user: root
	      port: 22
	      identity_file: /etc/zrepl/ssh/identity
	      #	options: # optional, default [], `-o` arguments	passed to ssh
	      #	- "Compression=yes"
	      #	dial_timeout: 10s # optional, default 10s, max time.Duration until initial handshake is	completed

       The connecting zrepl daemon

       1. Creates a pipe

       2. Forks

       3. In the forked	process

	  1. Replaces forked stdin and stdout with the corresponding pipe ends

	  2. Executes the ssh binary found in $PATH.

	     1.	The identity file (-i) is set to $identity_file.

	     2.	The remote user, host and port correspond to those configured.

	     3.	Further	options	can be	specified  using  the  options	field,
		which appends each entry in the	list to	the command line using
		-o $entry.

       4. Wraps	the pipe ends in a net.Conn and	returns	it to the RPC layer.

       As  discussed in	the section above, the connecting zrepl	daemon expects
       that zrepl stdinserver $client_identity is  executed automatically  via
       an authorized_keys file entry.

       The  known_hosts	file used by the ssh command must contain an entry for
       connect.host prior to starting zrepl.  Thus, run	the following  on  the
       pulling host's command line (substituting connect.host):

	  ssh -i /etc/zrepl/ssh/identity root@prod.example.com

       NOTE:
	  The environment variables of the underlying SSH process are cleared.
	  $SSH_AUTH_SOCK  will	not be available.  It is suggested to create a
	  separate, unencrypted	SSH key	solely for that	purpose.

   local Transport
       The local transport can be used to implement local  replication,	 i.e.,
       push  replication  between a push and sink job defined in the same con-
       figuration file.

       The listener_name is analogous to a hostname  and  must	match  between
       serve  and  connect.   The client_identity is used by the sink as docu-
       mented above.

	  jobs:
	  - type: sink
	    serve:
	      type: local
	      listener_name: localsink
	    ...

	  - type: push
	    connect:
	      type: local
	      listener_name: localsink
	      client_identity: local_backup
	      dial_timeout: 2s # optional, 0 for no timeout
	    ...

   Filter Syntax
       For source, push	and snap jobs, a filesystem  filter  must  be  defined
       (field  filesystems).   A  filter  takes	 a filesystem path (in the ZFS
       filesystem hierarchy) as	parameter and returns  true  (pass)  or	 false
       (block).

       A  filter  is  specified	as a YAML dictionary with patterns as keys and
       booleans	as values.  The	following rules	determine which	result is cho-
       sen for a given filesystem path:

        More specific path patterns win over less specific ones

        Non-wildcard patterns (full path patterns) win	over subtree wildcards
	 (< at end of pattern)

        If the	path in	question does not match	any  pattern,  the  result  is
	 false.

       The  subtree  wildcard <	means "the dataset left	of < and all its chil-
       dren".

       TIP:
	  You can try out patterns for a configured job	using the  zrepl  test
	  filesystems subcommand for push and source jobs.

   Examples
   Full	Access
       The following configuration will	allow access to	all filesystems.

	  jobs:
	  - type: source
	    filesystems: {
	      "<": true,
	    }
	    ...

   Fine-grained
       The following configuration demonstrates	all rules presented above.

	  jobs:
	  - type: source
	    filesystems: {
	      "tank<": true,	      #	rule 1
	      "tank/foo<": false,     #	rule 2
	      "tank/foo/bar": true,  # rule 3
	    }
	    ...

       Which rule applies to a given path, and what is the result?

	  tank/foo/bar/loo => 2	   false
	  tank/bar	   => 1	   true
	  tank/foo/bar	   => 3	   true
	  zroot		   => NONE false
	  tank/var/log	   => 1	   true

   Send	& Recv Options
   Send	Options
       Source and push jobs have an optional send configuration	section.

	  jobs:
	  - type: push
	    filesystems: ...
	    send:
	      #	flags from the table below go here
	    ...

       The  following  table  specifies	 the list of (boolean) options.	 Flags
       with an entry in	the zfs	send column map	directly to the	zfs  send  CLI
       flags.	zrepl does not perform feature checks for these	flags.	If you
       enable a	flag that is not supported by the installed  version  of  ZFS,
       the  zfs	 error	will  show up at runtime in the	logs and zrepl status.
       See the upstream	man page (man zfs-send)	for their semantics.
	       +-------------------+----------+---------------------+
	       | send.		   | zfs send |	Comment		    |
	       +-------------------+----------+---------------------+
	       | encrypted	   |	      |	Specific to  zrepl, |
	       |		   |	      |	see below.	    |
	       +-------------------+----------+---------------------+
	       | bandwidth_limit   |	      |	Specific  to zrepl, |
	       |		   |	      |	see below.	    |
	       +-------------------+----------+---------------------+
	       | raw		   | -w	      |	Use  encrypted	 to |
	       |		   |	      |	only	allow	en- |
	       |		   |	      |	crypted	     sends. |
	       |		   |	      |	Mixed sends are	not |
	       |		   |	      |	supported.	    |
	       +-------------------+----------+---------------------+
	       | send_properties   | -p	      |	Be   careful,  read |
	       |		   |	      |	the note  on  prop- |
	       |		   |	      |	erty	replication |
	       |		   |	      |	below.		    |
	       +-------------------+----------+---------------------+
	       | backup_properties | -b	      |	Be  careful,   read |
	       |		   |	      |	the  note  on prop- |
	       |		   |	      |	erty	replication |
	       |		   |	      |	below.		    |
	       +-------------------+----------+---------------------+
	       | large_blocks	   | -L	      |	Potential data loss |
	       |		   |	      |	on  OpenZFS  < 2.0, |
	       |		   |	      |	see warning below.  |
	       +-------------------+----------+---------------------+
	       | compressed	   | -c	      |			    |
	       +-------------------+----------+---------------------+
	       | embedded_data	   | -e	      |			    |
	       +-------------------+----------+---------------------+
	       | saved		   | -S	      |			    |
	       +-------------------+----------+---------------------+

   encrypted
       The encrypted option controls whether the matched filesystems are  sent
       as  OpenZFS  native  encryption	raw  sends.  More specifically,	if en-
       crypted=true, zrepl

        checks	for any	of the filesystems matched by filesystems whether  the
	 ZFS encryption	property indicates that	the filesystem is actually en-
	 crypted with ZFS native encryption and

        invokes the zfs send subcommand with the -w option (raw sends)	and

        expects the receiving side to support OpenZFS native encryption (recv
	 will fail otherwise)

       Filesystems  matched by filesystems that	are not	encrypted are not sent
       and will	cause error log	messages.

       If encrypted=false, zrepl expects that filesystems matching filesystems
       are not encrypted or have loaded	encryption keys.

       NOTE:
	  Use encrypted	instead	of raw to make your intent  clear  that	 zrepl
	  must only replicate filesystems that are actually encrypted by Open-
	  ZFS  native encryption.  It is meant as a safeguard to prevent unin-
	  tended sends of unencrypted filesystems in raw mode.
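
       For example, a push job that only replicates natively encrypted
       filesystems as raw sends could use the following send section (a
       sketch; the dataset name is illustrative):

          jobs:
          - type: push
            filesystems: {
              "tank/secret<": true   # must be encrypted with OpenZFS native encryption
            }
            send:
              encrypted: true
            ...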

   properties
       Sends the dataset properties along with snapshots.  Please  be  careful
       with this option	and read the note on property replication below.

   backup_properties
       When  properties	 are modified on a filesystem that was received	from a
       send stream with	send.properties=true, ZFS archives  the	 original  re-
       ceived value internally.	 This also applies to inheriting or overriding
       properties during zfs receive.

       When sending those received filesystems another hop, the	backup_proper-
       ties  flag  instructs  ZFS  to send the original	property values	rather
       than the	current	locally	set values.

       This is useful for replicating properties  across  multiple  levels  of
       backup  machines.   Example: Suppose we want to flow snapshots from Ma-
       chine A to B, then from B to C.	A will enable the properties send  op-
       tion.   B  will want to override	critical properties such as mountpoint
       or canmount.  But the job that replicates from B	to C should be sending
       the original property  values  received	from  A.   Thus,  B  sets  the
       backup_properties option.

       Please be careful with this option and read the note on property	repli-
       cation below.
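
       A sketch of machine B's configuration in the example above (the job
       layout, dataset names and overridden properties are illustrative):

          # on machine B
          jobs:
          - type: pull                      # replicates A -> B
            root_fs: "pool/from_A"
            recv:
              properties:
                override: {
                  "mountpoint": "none",
                  "canmount": "off"
                }
            ...
          - type: push                      # replicates B -> C
            filesystems: { "pool/from_A<": true }
            send:
              backup_properties: true       # send A's original property values to C
            ...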

   large_blocks
       This  flag  should  not be changed after	initial	replication.  Prior to
       OpenZFS commit 7bcb7f08 it was possible to change  this	setting	 which
       resulted	 in  data loss on the receiver.	 The commit in question	is in-
       cluded in OpenZFS 2.0 and works around the problem by  prohibiting  re-
       ceives of incremental streams with a flipped setting.

       WARNING:
	  This	bug has	not been fixed in the OpenZFS 0.8 releases which means
	  that changing	this flag after	initial	replication might  cause  data
	  loss on the receiver.

   Recv	Options
       Sink and	pull jobs have an optional recv	configuration section:

	  jobs:
	  - type: pull
	    recv:
	      properties:
		inherit:
		  - "mountpoint"
		override: {
		  "org.openzfs.systemd:ignore":	"on"
		}
	      bandwidth_limit: ...
	      placeholder:
		encryption: unspecified	| off |	inherit
	    ...

       Jump to properties , bandwidth_limit , and placeholder.

   properties
       override	 maps  directly	 to the	zfs recv -o flag.  Property name-value
       pairs specified in this map will	apply to all received filesystems, re-
       gardless	of whether the send stream contains properties or not.

       inherit maps directly to	the zfs	recv -x	flag.  Property	 names	speci-
       fied  in	 this  list will be inherited from the receiving side's	parent
       filesystem (e.g.	root_fs).

       With both options, the sending side's property value is still stored on
       the receiver, but the local override or inherit is the one  that	 takes
       effect.	 You  can send the original properties from the	first receiver
       to another receiver using send.backup_properties.

   A Note on Property Replication
       If a  send  stream  contains  properties,  as  per  send.properties  or
       send.backup_properties,	the default ZFS	behavior is to use those prop-
       erties on the receiving side, verbatim.

       In many use cases for zrepl, this can  have  devastating	 consequences.
       For  example,  when  backing up a filesystem that has mountpoint=/ to a
       storage server, that storage server's root filesystem will be  shadowed
       by  the received	file system on some platforms.	Also, many scripts and
       tools use ZFS user properties for configuration and do  not  check  the
       property	source (local vs. received).  If they are installed on the re-
       ceiving	side  as  well as the sending side, property replication could
       have unintended effects.

       zrepl currently does not	provide	any automatic safe-guards for property
       replication:

        Make sure to read the entire man page on zfs recv (man	zfs recv)  be-
	 fore enabling this feature.

        Use  recv.properties.override	whenever  possible,  e.g.  for	mount-
	 point=none or canmount=off.

        Use recv.properties.inherit if	that makes more	sense to you.

       Below is a non-exhaustive list of problematic properties.  Please open
       a pull request if you find a property that is missing from  this	 list.
       (Both  with regards to core ZFS tools and other software	in the broader
       ecosystem.)

   Mount behaviour
        mountpoint

        canmount

        overlay

       Note: Before OpenZFS 2.0.5, inheriting  or  overriding  the  mountpoint
       property	 on ZVOLs fails	in zfs recv.  If you are on such an older ver-
       sion, consider creating separate	zrepl jobs for your ZVOL and  filesys-
       tem datasets.

   Systemd
       With  systemd, you should also consider the properties processed	by the
       zfs-mount-generator .

       Most notably:

        org.openzfs.systemd:ignore

        org.openzfs.systemd:wanted-by

        org.openzfs.systemd:required-by

   Encryption
       If the sender filesystems are encrypted but the sender does plain sends
       and property replication	is enabled, the	receiver must inherit the fol-
       lowing properties:

        keylocation

        keyformat

        encryption

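       For example, a receiving job could inherit these properties from the
       receive-side parent filesystem (a sketch; the job layout is illustra-
       tive):

          jobs:
          - type: sink
            root_fs: "storage/zrepl/sink"
            recv:
              properties:
                inherit:
                  - "keylocation"
                  - "keyformat"
                  - "encryption"
            ...
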
   Placeholders
	  placeholder:
	    encryption:	unspecified | off | inherit

       During replication, zrepl creates placeholder datasets on the receiving
       side if the sending side's  filesystems	filter	creates	 gaps  in  the
       dataset	hierarchy.   This  is generally	fully transparent to the user.
       However, with OpenZFS Native Encryption, placeholders require attention
       from the zrepl user.  Specifically, the problem is that, when zrepl attempts
       to  create  the	placeholder  dataset  on the receiver, and that	place-
       holder's	parent dataset is encrypted, ZFS wants to  inherit  encryption
       to  the placeholder.  This is relevant to two use cases that zrepl sup-
       ports:

       1. encrypted-send-to-untrusted-receiver In this use  case,  the	sender
	  sends	an encrypted send stream and the receiver doesn't have the key
	  loaded.

       2. send-plain-encrypt-on-receive	 The  receive-side  root_fs dataset is
	  encrypted, and the senders are unencrypted.  The key of  root_fs  is
	  loaded, and the goal is that the plain sends (e.g., from production)
	  are encrypted	on-the-fly during receive, with	root_fs's key.

       For encrypted-send-to-untrusted-receiver, the placeholder datasets need
       to  be created with -o encryption=off.  Without it, creation would fail
       with an error, indicating that the placeholder's	parent	dataset's  key
       needs  to  be loaded.  But we don't trust the receiver, so we can't ex-
       pect that to ever happen.

       However,	for send-plain-encrypt-on-receive, we cannot  set  -o  encryp-
       tion=off.   The	reason is that if we did, any of the (non-placeholder)
       child datasets below  the  placeholder  would  inherit  encryption=off,
       thereby	silently  breaking  our	 encrypt-on-receive  use case.	So, to
       cover this use case, we need to create placeholders without  specifying
       -o  encryption.	 This will make	zfs create inherit the encryption mode
       from the	parent dataset,	and thereby transitively from root_fs.

       The zrepl config	provides the recv.placeholder.encryption knob to  con-
       trol this behavior.  In unspecified mode (default), placeholder creation
       bails out and asks the user to configure a behavior.  In off mode, the
       placeholder is created with encryption=off, i.e., the encrypted-send-to-un-
       trusted-receiver use case.  In inherit mode, the placeholder is created
       without specifying -o  encryption  at  all,  i.e.,  the	send-plain-en-
       crypt-on-receive	use case.
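
       For example (a sketch; dataset names are illustrative, and the off
       value is quoted merely so that YAML treats it as a string):

          # encrypted-send-to-untrusted-receiver
          - type: sink
            root_fs: "storage/zrepl/sink"
            recv:
              placeholder:
                encryption: "off"      # create placeholders with -o encryption=off

          # send-plain-encrypt-on-receive; root_fs is encrypted and its key is loaded
          - type: sink
            root_fs: "secure/backups"
            recv:
              placeholder:
                encryption: inherit    # placeholders inherit encryption from their parent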

   Common Options
   Bandwidth Limit (send & recv)
	  bandwidth_limit:
            max: 23.5 MiB # -1 is the default and disables rate limiting
	    bucket_capacity: # token bucket capacity in	bytes; defaults	to 128KiB

       Both  send and recv can be limited to a maximum bandwidth through band-
       width_limit.  For most users, it	should be sufficient to	just set band-
       width_limit.max.	 The  bandwidth_limit.bucket_capacity  refers  to  the
       token bucket size.

       The  bandwidth  limit  only  applies to the payload data, i.e., the ZFS
       send stream.  It	does not account  for  transport  protocol  overheads.
       The  scope is the job level, i.e., all concurrent sends or incoming re-
       ceives of a job share the bandwidth limit.
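
       For example, to cap a push job's outgoing send streams (a sketch; the
       limit value is illustrative):

          jobs:
          - type: push
            send:
              bandwidth_limit:
                max: 23.5 MiB
            ...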

   Replication Options
	  jobs:
	  - type: push
	    filesystems: ...
	    replication:
	      protection:
		initial:     guarantee_resumability # guarantee_{resumability,incremental,nothing}
		incremental: guarantee_resumability # guarantee_{resumability,incremental,nothing}
	      concurrency:
		size_estimates:	4
		steps: 1

	    ...

   protection option
       The protection variable controls	 the  degree  to  which	 a  replicated
       filesystem is protected from getting out	of sync	through	a zrepl	pruner
       or   external  tools  that  destroy  snapshots.	 zrepl	can  guarantee
       resumability or just incremental	replication.

       guarantee_resumability is the  default  value  and  guarantees  that  a
       replication  step  is always resumable and that incremental replication
       will always be possible.	 The implementation uses replication  cursors,
       last-received-hold and step holds.

       guarantee_incremental only guarantees that incremental replication will
       always  be  possible.   If  a step from -> to is	interrupted and	its to
       snapshot	is destroyed, zrepl will remove	the half-received to's	resume
       state and start a new step from -> to2.	The implementation uses	repli-
       cation cursors, tentative replication cursors and last-received-hold.

       guarantee_nothing  does not make	any guarantees with regards to keeping
       sending and receiving side in sync.  No bookmarks or holds are  created
       to protect sender and receiver from diverging.

       Tradeoffs

       Using guarantee_incremental instead of guarantee_resumability obviously
       removes	the  resumability  guarantee.	This  means  that  replication
       progress	is no longer monotonic which might lead	to a replication setup
       that never makes	progress if mid-step interruptions  are	 too  frequent
       (e.g. frequent network outages).	 However, the advantage	and reason for
       existence  of  the  incremental	mode  is  that it allows the pruner to
       delete snapshots	of interrupted replication steps which	is  useful  if
       replication  happens so rarely (or fails	so frequently) that the	amount
       of disk space exclusively referenced by the  step's  snapshots  becomes
       intolerable.

       NOTE:
	  When changing	this flag, obsoleted zrepl-managed bookmarks and holds
	  will be destroyed on the next	replication step that is attempted for
	  each filesystem.

   concurrency option
       The  concurrency	options	control	the maximum amount of concurrency dur-
       ing replication.	 The default values allow some concurrency during size
       estimation but no parallelism for the actual replication.

        concurrency.steps (default = 1) controls the maximum number  of  con-
	 currently  executed  replication  steps.   The	planning step for each
	 file system is	counted	as a single step.

        concurrency.size_estimates (default = 4) controls the maximum	number
	 of concurrent step size estimations done by the job.

       Note  that  initial replication cannot start replicating	child filesys-
       tems before the parent filesystem's initial replication step  has  com-
       pleted.

       Some notes on tuning these values:

        Disk:	Size  estimation is less I/O intensive than step execution be-
	 cause it does not need	to access the data blocks.

        CPU: Size estimation is usually a dense CPU burst whereas step	execu-
	 tion CPU utilization is stretched out over time because of  disk  IO.
	 Faster	 disks,	 sending  a  compressed	 dataset in plain mode and the
	 zrepl transport mode all contribute to	higher CPU requirements.

        Network  bandwidth:  Size  estimation	does  not  consume  meaningful
	 amounts of bandwidth, step execution does.

        zrepl	ZFS abstractions: for each replication step zrepl needs	to up-
	 date its ZFS abstractions through the zfs command which  often	 waits
	 multiple  seconds  for	the zpool to sync.  Thus, if the actual	send &
	 recv time of a	step is	small compared to the time spent on zrepl  ZFS
	 abstractions  then  increasing	step execution concurrency will	result
	 in a lower overall turnaround time.

   Conflict Resolution Options
	  jobs:
	  - type: push
	    filesystems: ...
	    conflict_resolution:
	      initial_replication: most_recent | all | fail # default: most_recent

	    ...

   initial_replication option
       The initial_replication option  determines  how	many  snapshots	 zrepl
       replicates  if  the  filesystem	has  not  been	replicated before.  If
       most_recent (the	default), the initial replication will	only  transfer
       the  most  recent  snapshot, while ignoring previous snapshots.	If all
       snapshots should	be replicated, specify all.  Use fail to make replica-
       tion of the filesystem fail in case there is no corresponding filesystem
       on the receiver.

       For example, suppose there are snapshots tank@1, tank@2, tank@3 on a
       sender.	 Then  most_recent will	replicate just @3, but all will	repli-
       cate @1,	@2, and	@3.

       If initial replication is interrupted, and there	is at least one	(maybe
       partial)	snapshot on the	receiver, zrepl	will always resume  in	incre-
       mental  mode.   And that	is regardless of where the initial replication
       was interrupted.

       For example, if initial_replication: all	and the	transfer of @1 is  in-
       terrupted,  zrepl  would	 retry/resume  at  @1.	 And  even if the user
       changes the config to initial_replication: most_recent before resuming,
       incremental mode	will still resume at @1.
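
       For example (a sketch; the filesystem filter is illustrative):

          jobs:
          - type: push
            filesystems: { "tank<": true }
            conflict_resolution:
              initial_replication: all   # replicate @1, @2 and @3 on the first run
            ...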

   Taking Snapshots
       You can configure zrepl to take snapshots of  the  filesystems  in  the
       filesystems field specified in push, source and snap jobs.

       The following snapshotting types	are supported:
		 +-------------------+----------------------------+
		 | snapshotting.type | Comment			  |
		 +-------------------+----------------------------+
		 | periodic	     | Ensure  that snapshots are |
		 |		     | taken at	a particular  in- |
		 |		     | terval.			  |
		 +-------------------+----------------------------+
		 | cron		     | Use   cron  spec	 to  take |
		 |		     | snapshots  at   particular |
		 |		     | points in time.		  |
		 +-------------------+----------------------------+
		 | manual	     | zrepl  does  not	 take any |
		 |		     | snapshots by itself.	  |
		 +-------------------+----------------------------+

       The periodic and	cron snapshotting types	share some common options  and
       behavior:

        Naming: The snapshot names are	composed of a user-defined prefix fol-
	 lowed	by  a UTC date formatted like 20060102_150405_000.  We use UTC
	 because it will avoid name conflicts when switching time zones	or be-
	 tween summer and winter time.

        Hooks:	You can	configure hooks	to run before or after zrepl takes the
	 snapshots. See	below for details.

        Push replication: After creating all snapshots, the snapshotter  will
	 wake  up  the	replication part of the	job, if	it's a push job.  Note
	 that snapshotting is decoupled	from replication, i.e.,	if it is  down
	 or  takes too long, snapshots will still be taken.  Note further that
	 other jobs are	not woken up by	snapshotting.

       NOTE:
	      There is no concept of ownership of the snapshots	that are  cre-
	      ated by periodic or cron.	 Thus, there is	no distinction between
	      zrepl-created snapshots and user-created snapshots during	repli-
	      cation or	pruning.

	      In  particular,  pruning will take all snapshots into considera-
	      tion by default.	To constrain  pruning  to  just	 zrepl-created
	      snapshots:

		 1. Assign a unique prefix to the snapshotter and

		 2. Use	 the  regex  functionality of the various pruning keep
		    rules to just consider snapshots with that prefix.

	  There	 is  currently	no  way	 to  constrain	replication  to	  just
	  zrepl-created	 snapshots.   Follow  and comment at issue #403	if you
	  need this functionality.

       NOTE:
	  The zrepl signal wakeup JOB subcommand does  not  trigger  snapshot-
	  ting.

   periodic Snapshotting
	  jobs:
	  - ...
	    filesystems: { ... }
	    snapshotting:
	      type: periodic
	      prefix: zrepl_
	      interval:	10m
	      #	Timestamp format that is used as snapshot suffix.
	      #	Can be any of "dense" (default), "human", "iso-8601", "unix-seconds" or	a custom Go time format	(see https://go.dev/src/time/format.go)
	      timestamp_format:	dense
	      hooks: ...
	   pruning: ...

       The periodic snapshotter	ensures	that snapshots are taken in the	speci-
       fied  interval.	If you use zrepl for backup, this translates into your
       recovery	point objective	(RPO).	To meet	your RPO, you  still  need  to
       monitor that replication, which happens asynchronously to snapshotting,
       actually	works.

       It  is  desirable to get	all filesystems	snapshotted simultaneously be-
       cause it	results	in a more consistent backup.  To accomplish this while
       still maintaining the interval, the periodic  snapshotter  attempts  to
       get  the	 snapshotting  rhythms	in sync.  To find that sync point, the
       most recent snapshot, created by	the snapshotter, in any	of the matched
       filesystems is used.  A filesystem that does not	have snapshots by  the
       snapshotter has lower priority than filesystems that do, and thus might
       not be snapshotted (and replicated) until it is snapshotted at the next
       sync point.  The	snapshotter uses the prefix to	identify  which	 snap-
       shots it	created.

   cron	Snapshotting
	  jobs:
	  - type: snap
	    filesystems: { ... }
	    snapshotting:
	      type: cron
	      prefix: zrepl_
	      #	(second, optional) minute hour day-of-month month day-of-week
	      #	This example takes snapshots daily at 3:00.
	      cron: "0 3 * * *"
	      #	Timestamp format that is used as snapshot suffix.
	      #	Can be any of "dense" (default), "human", "iso-8601", "unix-seconds" or	a custom Go time format	(see https://go.dev/src/time/format.go)
	      timestamp_format:	dense
	    pruning: ...

       In cron mode, the snapshotter takes snapshots at fixed points in time.
       See https://en.wikipedia.org/wiki/Cron for details on the syntax.
       zrepl uses the github.com/robfig/cron/v3 Go package for parsing.
       An optional field for "seconds"	is  supported  to  take	 snapshots  at
       sub-minute frequencies.

   Timestamp Format
       The  cron  and  periodic	snapshotter support configuring	a custom time-
       stamp format that is used as suffix for the snapshot name.  It  can  be
       used by setting timestamp_format	to any of the following	values:

        dense (default) looks like 20060102_150405_000

        human looks like 2006-01-02_15:04:05

        iso-8601 looks	like 2006-01-02T15:04:05.000Z

        unix-seconds looks like 1136214245

        Any custom Go time format accepted by time.Time#Format.

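       For example, a custom Go reference-time layout can be configured as
       follows (a sketch; the layout string is illustrative):

          snapshotting:
            type: periodic
            prefix: zrepl_
            interval: 10m
            # custom layout based on Go's reference time 2006-01-02 15:04:05
            timestamp_format: "2006-01-02_15-04-05"
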
   manual Snapshotting
	  jobs:
	  - type: push
	    snapshotting:
	      type: manual
	    ...

       In  manual mode,	zrepl does not take snapshots by itself.  Manual snap-
       shotting	is most	useful if you have existing infrastructure  for	 snap-
       shot management, or if you want to decouple snapshot management from
       replication using a zrepl snap job.  See this quickstart guide for an
       example.

       To  trigger  replication	 after	taking snapshots, use the zrepl	signal
       wakeup JOB command.

   Pre-	and Post-Snapshot Hooks
       Jobs with periodic snapshots can	run hooks before and/or	 after	taking
       the  snapshot  specified	 in  snapshotting.hooks:  Hooks	are called per
       filesystem before and after the snapshot	is taken (pre- and post-edge).
       Pre-edge	invocations are	in configuration order,	post-edge  invocations
       in  reverse  order,  i.e.  like	a stack.  If a pre-snapshot invocation
       fails, err_is_fatal=true	cuts off subsequent hooks,  does  not  take  a
       snapshot,  and  only  invokes post-edges	corresponding to previous suc-
       cessful pre-edges.  err_is_fatal=false logs the failed pre-edge invoca-
       tion but	does not affect	 subsequent  hooks  nor	 snapshotting  itself.
       Post-edges  are	only invoked for hooks whose pre-edges ran without er-
       ror.  Note that hook failures for one  filesystem  never	 affect	 other
       filesystems.

       The  optional  timeout  parameter  specifies a period after which zrepl
       will kill the hook process and report an error.  The default is 30
       seconds; the timeout may be specified in any unit understood by
       time.ParseDuration.

       The optional filesystems filter limits the filesystems the hook runs
       for.  It uses the same filter specification as jobs.

       Most hook types take additional parameters, please refer	to the respec-
       tive subsections	below.
	      +---------------------+---------+---------------------+
	      |	Hook type	    | Details |	Description	    |
	      +---------------------+---------+---------------------+
	      |	command		    | Details |	Arbitrary pre-	and |
	      |			    |	      |	post	   snapshot |
	      |			    |	      |	scripts.	    |
	      +---------------------+---------+---------------------+
	      |	postgres-checkpoint | Details |	Execute	   Postgres |
	      |			    |	      |	CHECKPOINT SQL com- |
	      |			    |	      |	mand  before  snap- |
	      |			    |	      |	shot.		    |
	      +---------------------+---------+---------------------+
	      |	mysql-lock-tables   | Details |	Flush and read-Lock |
	      |			    |	      |	MySQL tables  while |
	      |			    |	      |	taking	 the  snap- |
	      |			    |	      |	shot.		    |
	      +---------------------+---------+---------------------+

   command Hooks
	  jobs:
	  - type: push
	    filesystems: {
	      "<": true,
	      "tmp": false
	    }
	    snapshotting:
	      type: periodic
	      prefix: zrepl_
	      interval:	10m
	      hooks:
	      -	type: command
		path: /etc/zrepl/hooks/zrepl-notify.sh
		timeout: 30s
		err_is_fatal: false
	      -	type: command
		path: /etc/zrepl/hooks/special-snapshot.sh
		filesystems: {
		  "tank/special": true
		}
	    ...

       command hooks take a path to an executable script or binary to be  exe-
       cuted  before  and  after  the  snapshot.   path	must be	absolute (e.g.
       /etc/zrepl/hooks/zrepl-notify.sh).  No arguments	may be specified; cre-
       ate a wrapper script if zrepl must call an executable that requires ar-
       guments.	 The process standard output is	logged at level	INFO. Standard
       error is	logged at level	WARN.  The following environment variables are
       set:

        ZREPL_HOOKTYPE: either	"pre_snapshot" or "post_snapshot"

        ZREPL_FS: the ZFS filesystem name being snapshotted

        ZREPL_SNAPNAME:   the	  zrepl-generated    snapshot	 name	 (e.g.
	 zrepl_20380119_031407_000)

        ZREPL_DRYRUN:	set  to	 "true"	if a dry run is	in progress so scripts
	 can print, but	not run, their commands

       An     empty	template      hook	can	 be	 found	    in
       config/samples/hooks/template.sh.

   postgres-checkpoint Hook
       Connects	 to  a	Postgres  server and executes the CHECKPOINT statement
       pre-snapshot.  Checkpointing applies the	WAL contents to	all data files
       and syncs the data files	to disk.  This is not required for  a  consis-
       tent  database  backup: it merely forward-pays the "cost" of WAL	replay
       to the time of snapshotting instead of at restore.  However, the	 Post-
       gres  manual  recommends	against	checkpointing during normal operation.
       Further,	the operation requires Postgres	superuser  privileges.	 zrepl
       users must decide on their own whether this hook	is useful for them (it
       likely isn't).

       ATTENTION:
	  Note	that  WALs and Postgres	data directory (with all database data
	  files) must be  on  the  same	 filesystem  to	 guarantee  a  correct
	  point-in-time	backup with the	ZFS snapshot.

       DSN syntax documented here: https://godoc.org/github.com/lib/pq

	  CREATE USER zrepl_checkpoint PASSWORD	yourpasswordhere;
	  ALTER	ROLE zrepl_checkpoint SUPERUSER;

	  - type: postgres-checkpoint
	    dsn: "host=localhost port=5432 user=postgres password=yourpasswordhere sslmode=disable"
	    filesystems: {
		"p1/postgres/data11": true
	    }

   mysql-lock-tables Hook
       Connects	to MySQL and executes

        pre-snapshot  FLUSH  TABLES  WITH READ	LOCK to	lock all tables	in all
	 databases in the MySQL	server we connect to (docs)

        post-snapshot UNLOCK TABLES to reverse the above operation.

       The above procedure is documented in the MySQL manual as a means to produce
       a consistent backup of a	MySQL DBMS installation	(i.e., all databases).

       DSN	 syntax:       [username[:password]@][protocol[(address)]]/db-
       name[?param1=value1&...&paramN=valueN]

       ATTENTION:
	  All  MySQL databases must be on the same ZFS filesystem to guarantee
	  a consistent point-in-time backup with the ZFS snapshot.

	  CREATE USER zrepl_lock_tables	IDENTIFIED BY 'yourpasswordhere';
	  GRANT	RELOAD ON *.* TO zrepl_lock_tables;
	  FLUSH	PRIVILEGES;

	  - type: mysql-lock-tables
	    dsn: "zrepl_lock_tables:yourpasswordhere@tcp(localhost)/"
	    filesystems: {
	      "tank/mysql": true
	    }

   Pruning Policies
       In zrepl, pruning means destroying snapshots.  Pruning must  happen  on
       both  sides of a	replication or the systems would inevitably run	out of
       disk space at some point.

       Typically, the requirements to temporal resolution and  maximum	reten-
       tion  time differ per side.  For	example, when using zrepl to back up a
       busy database server, you will want high	temporal resolution (snapshots
       every 10	min) for the last 24h in case of administrative	disasters, but
       cannot afford to	store them for much longer because you might have high
       turnover	volume in the database.	 On the	receiving side,	you  may  have
       more  disk  space available, or need to comply with other backup	reten-
       tion policies.

       zrepl uses a set	of  keep rules per sending and receiving side  to  de-
       termine	which snapshots	shall be kept per filesystem.  A snapshot that
       is not kept by any rule is destroyed.  The keep rules are evaluated  on
       the  active  side (push or pull job) of the replication setup, for both
       active and passive side,	after replication completed or was  determined
       to have failed permanently.

       Example Configuration:

	  jobs:
	    - type: push
	      name: ...
	      connect: ...
	      filesystems: {
		"<": true,
		"tmp": false
	      }
	      snapshotting:
		type: periodic
		prefix:	zrepl_
		interval: 10m
	      pruning:
		keep_sender:
		  - type: not_replicated
		  # make sure manually created snapshots by the	administrator are kept
		  - type: regex
		    regex: "^manual_.*"
		  - type: grid
		    grid: 1x1h(keep=all) | 24x1h | 14x1d
		    regex: "^zrepl_.*"
		keep_receiver:
		  - type: grid
		    grid: 1x1h(keep=all) | 24x1h | 35x1d | 6x30d
		    regex: "^zrepl_.*"
		  # manually created snapshots will be kept forever on receiver
		  - type: regex
		    regex: "^manual_.*"

       DANGER:
	  You might have existing snapshots of filesystems affected by pruning
	  which	 you  want to keep, i.e. not be	destroyed by zrepl.  Make sure
	  to actually add the necessary	regex keep rules on both  sides,  like
	  with manual in the example above.

   Policy not_replicated
	  jobs:
	  - type: push
	    pruning:
	      keep_sender:
	      -	type: not_replicated
	    ...

       not_replicated keeps all	snapshots that have not	been replicated	to the
       receiving  side.	  It  only  makes  sense  to specify this rule for the
       keep_sender.  The reason	is that, by definition,	all snapshots  on  the
       receiver	have already been replicated to	there from the sender.	To de-
       termine	whether	 a  sender-side	 snapshot has already been replicated,
       zrepl uses the replication cursor bookmark  which  corresponds  to  the
       most recent successfully	replicated snapshot.

   Policy grid
	  jobs:
	  - type: pull
	    pruning:
	      keep_receiver:
	      -	type: grid
		regex: "^zrepl_.*"
		grid: 1x1h(keep=all) | 24x1h | 35x1d | 6x30d

                # 1x1h(keep=all): 1 repetition of a one-hour interval with keep=all
                # 24x1h:          24 repetitions of a one-hour interval with keep=1
                # 35x1d:          35 repetitions of a one-day interval with keep=1
                # 6x30d:          6 repetitions of a 30-day interval with keep=1
	    ...

       The  retention  grid can	be thought of as a time-based sieve that thins
       out snapshots as	they get older.

       The grid	field specifies	a list of adjacent time	intervals.   Each  in-
       terval is a bucket with a maximum capacity of keep snapshots.  The fol-
       lowing procedure	happens	during pruning:

       1. The  list  of	 snapshots  is	filtered  by the regular expression in
          regex.  Only snapshot names that match the regex are considered for
	  this rule, all others	will be	pruned unless another rule keeps them.

       2. The snapshots	that match regex are placed onto a time	axis according
	  to their creation date.  The youngest	snapshot is on the  left,  the
	  oldest on the	right.

       3. The first bucket is placed "under" that axis so that its left edge
          aligns with the youngest snapshot.

       4. All subsequent buckets are  placed  adjacent	to  their  predecessor
	  bucket.

       5. Now  each snapshot on	the axis either	falls into one bucket or it is
	  older	than our rightmost bucket.   Buckets  are  left-inclusive  and
          right-exclusive which means that a snapshot on the edge of a bucket
	  will always 'fall into the right one'.

       6. Snapshots older than the rightmost bucket are	not kept by  the  grid
	  specification.

       7. For each bucket, we only keep	the keep oldest	snapshots.

       The syntax to describe the bucket list is as follows:

	  Repeat x Duration (keep=all)

        The duration specifies	the length of the interval.

        The  keep  count  specifies the number	of snapshots that fit into the
	 bucket.  It can be either a positive integer or  all  (all  snapshots
	 are kept).

        The repeat count repeats the bucket definition	for the	specified num-
	 ber of	times.

       Example:

	  Assume the following grid specification:

	     grid: 1x1h(keep=all) | 2x2h | 1x3h

	  This grid specification produces the following constellation of buckets:

	  0h	    1h	      2h	3h	  4h	    5h	      6h	7h	  8h	    9h
	  |	    |	      |		|	  |	    |	      |		|	  |	    |
	  |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------|
	  | keep=all|	   keep=1	|	keep=1	    |		 keep=1		  |

	  Now assume that we have a set	of snapshots @a, @b, ..., @D.
	  Snapshot @a is the most recent snapshot.
	  Snapshot @D is the oldest snapshot, it is almost 9 hours older than snapshot @a.
	  We place the snapshots on the	same timeline as the buckets:

	  0h	    1h	      2h	3h	  4h	    5h	      6h	7h	  8h	    9h
	  |	    |	      |		|	  |	    |	      |		|	  |	    |
	  |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------|
	  | keep=all|	   keep=1	|	keep=1	    |		 keep=1		  |
	  |	    |			|		    |				  |
	  | a  b  c | d	 e  f  g  h  i	j  k  l	 m  n  o  p |q	r  s  t	 u  v  w  x  y	z |A  B	 C  D

	  We obtain the	following mapping of snapshots to buckets:

	  Bucket1:   a,b,c
	  Bucket2:   d,e,f,g,h,i
	  Bucket3:   j,k,l,m,n,o,p
	  Bucket4:   q,r,s,t,u,v,w,x,y,z
	  No bucket: A,B,C,D

	  For each bucket, we now prune	snapshots until	it only	contains `keep`	snapshots.
	  Newer	snapshots are destroyed	first.
	  Snapshots that do not	fall into a bucket are always destroyed.

	  Result after pruning:

	  0h	    1h	      2h	3h	  4h	    5h	      6h	7h	  8h	    9h
	  |	    |	      |		|	  |	    |	      |		|	  |	    |
	  |-Bucket1-|-----Bucket2-------|------Bucket3------|-----------Bucket4-----------|
	  |	    |			|		    |				  |
	  | a  b  c |		     i	|		  p |				z |

   Policy last_n
	  jobs:
	    - type: push
	      pruning:
		keep_receiver:
		- type:	last_n
		  count: 10
		  regex: ^zrepl_.*$ # optional
	    ...

       last_n  filters	the  snapshot list by regex, then keeps	the last count
       snapshots in that list (last = youngest = most recent creation date).
       All snapshots that don't match regex or exceed count in the filtered
       list are	destroyed unless matched by other rules.

   Policy regex
	  jobs:
	    - type: push
	      pruning:
		keep_receiver:
		# keep all snapshots with prefix zrepl_	or manual_
		- type:	regex
		  regex: "^(zrepl|manual)_.*"

	    - type: push
	      snapshotting:
		prefix:	zrepl_
	      pruning:
		keep_sender:
		# keep all snapshots that were not created by zrepl
		- type:	regex
		  negate: true
		  regex: "^zrepl_.*"

       regex keeps all snapshots whose names are matched by  the  regular  ex-
       pression	 in  regex.  Like all other regular expression fields in prune
       policies, zrepl uses Go's regexp.Regexp Perl-compatible regular expres-
       sions (Syntax).	The optional negate boolean field inverts  the	seman-
       tics:  Use  it  if you want to keep all snapshots that do not match the
       given regex.

   Source-side snapshot	pruning
       A source job takes snapshots on the system it runs on.  The corre-
       sponding	 pull job on the replication target connects to	the source job
       and replicates the snapshots.  Afterwards,  the	pull  job  coordinates
       pruning on both sender (the source job side) and	receiver (the pull job
       side).

       There  is  no  built-in way to define and execute pruning on the	source
       side independently of the pull side.  The source	job will continue tak-
       ing snapshots which will	not be pruned until the	 pull  side  connects.
       This means that extended	replication downtime will fill up the source's
       zpool with snapshots.

       If  the	above  is a conceivable	situation for you, consider using push
       mode, where pruning happens on the same side where snapshots are	taken.

   Workaround using snap job
       As a workaround (see GitHub issue #102  for  development	 progress),  a
       pruning-only  snap  job can be defined on the source side: The snap job
       is in charge of snapshot	creation &  destruction,  whereas  the	source
       job's role is reduced to just serving snapshots.  However, since jobs
       are run independently, it is possible that  the	snap  job  will	 prune
       snapshots  that	are queued for replication / destruction by the	remote
       pull job	that connects to the source job.  Symptoms of such race	condi-
       tions are spurious replication and destroy errors.

       Example configuration:

	  # source side
	  jobs:
	  - type: snap
	    snapshotting:
	      type: periodic
	    pruning:
	      keep:
		# source side pruning rules go here
	    ...

	  - type: source
	    snapshotting:
	      type: manual
	    root_fs: ...

	  # pull side
	  jobs:
	  - type: pull
	    pruning:
	      keep_sender:
		# let the source-side snap job do the pruning
		- type:	regex
		  regex: ".*"
		...
	      keep_receiver:
		# feel free to prune on	the pull side as desired
		...

   Logging
       zrepl uses structured logging to	provide	users with easily  processable
       log messages.

       Logging	outlets	 are  configured  in  the global section of the	config
       file.

	  global:
	    logging:

	      -	type: OUTLET_TYPE
		level: MINIMUM_LEVEL
		format:	FORMAT

	      -	type: OUTLET_TYPE
		level: MINIMUM_LEVEL
		format:	FORMAT

	      ...

	  jobs:	...

       ATTENTION:
	  The first outlet is special: if an error writing to any  outlet  oc-
	  curs,	 the  first outlet receives the	error and can print it.	 Thus,
	  the first outlet must	be the one that	 always	 works	and  does  not
	  block, e.g. stdout, which is the default.

   Default Configuration
       By default, the following logging configuration is used:

	  global:
	    logging:

	      -	type: "stdout"
		level:	"warn"
		format:	"human"

   Building Blocks
       The following sections document the semantics of	the different log lev-
       els, formats and	outlet types.

   Levels
		      +-------+-------+---------------------+
		      |	Level |	SHORT |	Description	    |
		      +-------+-------+---------------------+
		      |	error |	ERRO  |	immediate    action |
		      |	      |	      |	required	    |
		      +-------+-------+---------------------+
		      |	warn  |	WARN  |	symptoms  for  mis- |
		      |	      |	      |	configuration, soon |
		      |	      |	      |	expected   failure, |
		      |	      |	      |	etc.		    |
		      +-------+-------+---------------------+
		      |	info  |	INFO  |	explains what  hap- |
		      |	      |	      |	pens   without	too |
		      |	      |	      |	much detail	    |
		      +-------+-------+---------------------+
		      |	debug |	DEBG  |	tracing	   informa- |
		      |	      |	      |	tion,  state dumps, |
		      |	      |	      |	etc. useful for	de- |
		      |	      |	      |	bugging.	    |
		      +-------+-------+---------------------+

       Incorrectly classified messages are considered a	bug and	should be  re-
       ported.

   Formats
		      +--------+----------------------------+
		      |	Format | Description		    |
		      +--------+----------------------------+
		      |	human  | prints	 job  and subsystem |
		      |	       | into brackets	before	the |
		      |	       | actual	 message,  followed |
		      |	       | by  remaining	fields	 in |
		      |	       | logfmt	style		    |
		      +--------+----------------------------+
		      |	logfmt | logfmt	 output. zrepl uses |
		      |	       | this Go package.	    |
		      +--------+----------------------------+
		      |	json   | JSON	formatted   output. |
		      |	       | Each  line is a valid JSON |
		      |	       | document. Fields are  mar- |
		      |	       | shaled	     by	     encod- |
		      |	       | ing/json.Marshal(),  which |
		      |	       | is particularly useful	for |
		      |	       | processing in log aggrega- |
		      |	       | tion  or  when	 processing |
		      |	       | state dumps.		    |
		      +--------+----------------------------+

   Outlets
       Outlets are the destination for log entries.

   stdout Outlet
		     +-----------+----------------------------+
		     | Parameter | Comment		      |
		     +-----------+----------------------------+
		     | type	 | stdout		      |
		     +-----------+----------------------------+
		     | level	 | minimum  log	level	      |
		     +-----------+----------------------------+
		     | format	 | output format	      |
		     +-----------+----------------------------+
		     | time	 | always  include  time   in |
		     |		 | output (true	or false)     |
		     +-----------+----------------------------+
		     | color	 | colorize  output according |
		     |		 | to  log  level  (true   or |
		     |		 | false)		      |
		     +-----------+----------------------------+

       Writes  all log entries with minimum level level	formatted by format to
       stdout.	If stdout is a tty, interactive	usage is assumed and both time
       and color are set to true.

       Can only	be specified once.
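
       For example, a stdout outlet that uses the parameters from the table
       above could look like the following sketch; the chosen level and
       format are just one sensible combination, not a recommendation:

	  global:
	    logging:
	      - type: stdout
	        level: warn
	        format: human
	        time: true
	        color: true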

   syslog Outlet
		  +----------------+----------------------------+
		  | Parameter	   | Comment			|
		  +----------------+----------------------------+
		  | type	   | syslog			|
		  +----------------+----------------------------+
		  | level	   | minimum  log level		|
		  +----------------+----------------------------+
		  | format	   | output format		|
		  +----------------+----------------------------+
		  | facility	   | Which syslog  facility  to	|
		  |		   | use (default = local0)	|
		  +----------------+----------------------------+
		  | retry_interval | Interval between reconnec-	|
		  |		   | tion  attempts  to	 syslog	|
		  |		   | (default =	0)		|
		  +----------------+----------------------------+

       Writes all log entries formatted	by format to syslog.   On  normal  se-
       tups, you should	not need to change the retry_interval.

       Can only	be specified once.
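
       A sketch of a syslog outlet, placed after a stdout outlet so that the
       first-outlet rule from above still holds; the facility and
       retry_interval values are examples only:

	  global:
	    logging:
	      - type: stdout
	        level: warn
	        format: human
	      - type: syslog
	        level: info
	        format: logfmt
	        facility: local0
	        retry_interval: 10s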

   tcp Outlet
		  +----------------+----------------------------+
		  | Parameter	   | Comment			|
		  +----------------+----------------------------+
		  | type	   | tcp			|
		  +----------------+----------------------------+
		  | level	   | minimum  log level		|
		  +----------------+----------------------------+
		  | format	   | output format		|
		  +----------------+----------------------------+
		  | net		   | tcp in most cases		|
		  +----------------+----------------------------+
		  | address	   | remote    network,	   e.g.	|
		  |		   | logs.example.com:10202	|
		  +----------------+----------------------------+
		  | retry_interval | Interval between reconnec-	|
		  |		   | tion attempts to address	|
		  +----------------+----------------------------+
		  | tls		   | TLS config	(see below)	|
		  +----------------+----------------------------+

       Establishes a TCP connection to address and  sends  log	messages  with
       minimum	level  level formatted by format.  If tls is not specified, an
       unencrypted connection is established.  If tls is  specified,  the  TCP
       connection  is secured with TLS + Client	Authentication.	 The latter is
       particularly useful in combination with log aggregation services.
		     +-----------+----------------------------+
		     | Parameter | Description		      |
		     +-----------+----------------------------+
		     | ca	 | PEM-encoded	  certificate |
		     |		 | authority  that signed the |
		     |		 | remote server's  TLS	 cer- |
		     |		 | tificate		      |
		     +-----------+----------------------------+
		     | cert	 | PEM-encoded	 client	 cer- |
		     |		 | tificate identifying	 this |
		     |		 | zrepl  daemon  toward  the |
		     |		 | remote server	      |
		     +-----------+----------------------------+
		     | key	 | PEM-encoded,	  unencrypted |
		     |		 | client private key identi- |
		     |		 | fying  this	zrepl  daemon |
		     |		 | toward the remote server   |
		     +-----------+----------------------------+

       WARNING:
	  zrepl	drops log messages to the TCP outlet if	the underlying connec-
	  tion is not fast enough.  Note that TCP buffering in the kernel must
	  first	run full before	messages are dropped.

	  Make sure to always configure a stdout outlet as the special error
	  outlet to be informed about problems with the TCP outlet (see
	  above).

       NOTE:
	  zrepl	uses Go's crypto/tls and crypto/x509 packages and  leaves  all
	  but  the  required fields in tls.Config at their default values.  In
	  case of a security defect in these packages, zrepl has to be rebuilt
	  because Go binaries are statically linked.
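
       Putting the parameters from the tables above together, a TLS-secured
       tcp outlet could look like the following sketch.  The certificate and
       key paths as well as the retry_interval are placeholders, and the
       stdout outlet is listed first as recommended above:

	  global:
	    logging:
	      - type: stdout
	        level: warn
	        format: human
	      - type: tcp
	        level: debug
	        format: json
	        net: tcp
	        address: logs.example.com:10202
	        retry_interval: 10s
	        tls:
	          ca: /etc/zrepl/log-ca.crt
	          cert: /etc/zrepl/log-client.crt
	          key: /etc/zrepl/log-client.key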

   Monitoring
       Monitoring endpoints are	configured in the global.monitoring section of
       the config file.

   Prometheus &	Grafana
       zrepl can expose	Prometheus metrics via HTTP.  The listen attribute  is
       a  net.Listen   string for tcp, e.g. :9811 or 127.0.0.1:9811 (port 9811
       was reserved for zrepl on the official list).  The listen_freebind at-
       tribute	is  explained  here.  The Prometheus monitoring	job appears in
       the zrepl control job list and may be specified at most once.

       zrepl also ships	with an	importable Grafana dashboard that consumes the
       Prometheus metrics: see dist/grafana.  The dashboard also contains some
       advice on which metrics are important to	monitor.

       NOTE:
	  At the time of writing, there	is no stability	guarantee on  the  ex-
	  ported metrics.

	  global:
	    monitoring:
	      -	type: prometheus
		listen:	':9811'
		listen_freebind: true #	optional, default false
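
       On the monitoring host, a hypothetical Prometheus scrape configuration
       for this endpoint could look as follows; the job name and target are
       placeholders for your environment:

	  # prometheus.yml on the monitoring host (not part of zrepl's config)
	  scrape_configs:
	    - job_name: 'zrepl'
	      static_configs:
	        - targets: ['zrepl-host.example.com:9811']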

   Miscellaneous
   Runtime Directories & UNIX Sockets
       The zrepl daemon	needs to open various UNIX sockets in a	runtime	direc-
       tory:

        a  control socket that	the CLI	commands use to	interact with the dae-
	 mon

        the ssh+stdinserver Transport listener	opens one socket  per  config-
	 ured client, named after client_identity parameter

       There  is  no  authentication  on these sockets except the UNIX permis-
       sions.  The zrepl daemon	will refuse to bind any	of the	above  sockets
       in a directory that is world-accessible.

       The following sections of the global config show the default paths.
       The shell script	below shows how	the default runtime directory  can  be
       created.

	  global:
	    control:
	      sockpath:	/var/run/zrepl/control
	    serve:
	      stdinserver:
		sockdir: /var/run/zrepl/stdinserver

	  mkdir	-p /var/run/zrepl/stdinserver
	  chmod	-R 0700	/var/run/zrepl

   Durations & Intervals
       Interval	 & duration fields in job definitions, pruning configurations,
       etc. must match the following regex:

	  var durationStringRegex *regexp.Regexp = regexp.MustCompile(`^\s*(\d+)\s*(s|m|h|d|w)\s*$`)
	  // s = second, m = minute, h = hour, d = day,	w = week (7 days)
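
       For example, the following abbreviated snapshotting stanza uses values
       that match this pattern (the concrete numbers are arbitrary):

	  snapshotting:
	    type: periodic
	    prefix: zrepl_
	    interval: 10m   # other valid values: 30s, 1h, 1d, 2w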

   Usage
   CLI Overview
       NOTE:
	  The zrepl binary is self-documenting:	run zrepl help for an overview
	  of the available subcommands or zrepl	SUBCOMMAND --help for informa-
	  tion on available flags, etc.
    +-------------------------+-------------------------------------------------+
    | Subcommand	      |	Description					|
    +-------------------------+-------------------------------------------------+
    | zrepl help	      |	show subcommand	overview			|
    +-------------------------+-------------------------------------------------+
    | zrepl daemon	      |	run the	 daemon,  required			|
    |			      |	for  all zrepl functional-			|
    |			      |	ity						|
    +-------------------------+-------------------------------------------------+
    | zrepl status	      |	show job activity, or with			|
    |			      |	--mode raw for JSON output			|
    +-------------------------+-------------------------------------------------+
    | zrepl stdinserver	      |	see ssh+stdinserver Trans-			|
    |			      |	port						|
    +-------------------------+-------------------------------------------------+
    | zrepl signal wakeup JOB |	manually trigger  replica-			|
    |			      |	tion + pruning of JOB				|
    +-------------------------+-------------------------------------------------+
    | zrepl signal reset JOB  |	manually   abort   current			|
    |			      |	replication +  pruning	of			|
    |			      |	JOB						|
    +-------------------------+-------------------------------------------------+
    | zrepl configcheck	      |	check  if  config  can	be			|
    |			      |	parsed without errors				|
    +-------------------------+-------------------------------------------------+
    | zrepl migrate	      |	perform	on-disk	state /	ZFS property migrations	|
    |			      |	(see changelog for details)			|
    +-------------------------+-------------------------------------------------+
    | zrepl zfs-abstraction   |	list and remove	zrepl's	abstractions on	top  of	|
    |			      |	ZFS,   e.g.   holds  and  step	bookmarks  (see	|
    |			      |	overview )					|
    +-------------------------+-------------------------------------------------+

   zrepl daemon
       All actual work zrepl does is performed by a daemon process.  The  dae-
       mon supports structured logging and provides monitoring endpoints.

       When installing from a package, the package maintainer should have pro-
       vided  an  init script /	systemd.service	file.  You should thus be able
       to start	zrepl daemon using your	init system.

       Alternatively, or for running zrepl in the foreground,  simply  execute
       zrepl  daemon.	Note  that  you	won't see much output with the default
       logging configuration:

       ATTENTION:
	  Make sure to actually	monitor	the error level	output of zrepl:  some
	  configuration	errors will not	make the daemon	exit.

	  Example:  if	the daemon cannot create the ssh+stdinserver Transport
	  sockets in the runtime directory, it will emit an error message  but
	  not  exit  because  other tasks such as periodic snapshots & pruning
	  are of equal importance.

   Restarting
       The daemon handles SIGINT and SIGTERM for graceful shutdown.   Graceful
       shutdown	means at worst that a job will not be rescheduled for the next
       interval.   The	daemon	exits  as  soon	as all jobs have reported shut
       down.

   Systemd Unit	File
       A systemd service definition template  is  available  in	 dist/systemd.
       Note  that some of the options only work	on recent versions of systemd.
       Any help	& improvements are very	welcome, see issue #145.

   Ops Runbooks
   Migrating Sending Side
       Objective: Move sending-side zpool to  new  hardware.   Make  the  move
       fully  transparent  to  the sending-side	jobs.  After the move is done,
       all sending-side	zrepl jobs should continue to work as if the move  had
       not happened.  In particular, incremental replication should be able to
       pick up where it	left before the	move.

       Suppose	we  want to migrate all	data from one zpool oldpool to another
       zpool newpool.  A possible reason might be that we want to change  RAID
       levels, ashift, or just migrate over to next-gen	hardware.

       If  the	pool  names are	different, zrepl's matching between sender and
       receiver datasets will break because the receive-side dataset names con-
       tain  oldpool.  To avoid	this, we will need the name of the new pool to
       match that of the old pool.  The	following steps	will accomplish	this:

       1.  Stop	zrepl.

       2.  Create the new pool:	zpool create newpool ...

       3.  Take	a snapshot of the old pool so that you have something that you
	   can zfs send.  For example,	run  zfs  snapshot  -r	oldpool@migra-
	   tion_oldpool_newpool.

       4.  Send	 all  of  the  oldpool's datasets to the new pool: zfs send -R
	   oldpool@migration_oldpool_newpool | zfs recv	-F newpool

       5.  Export the old pool:	zpool export oldpool

       6.  Export the new pool:	zpool export newpool

       7.  (Optional) Change the name of the old pool to something  that  does
	   not	conflict  with	the  new  pool.	  We are going to use the name
	   oldoldpool in this example.	Use zpool import with no arguments  to
	   see the pool	id.  Then zpool	import <id> oldoldpool && zpool	export
	   oldoldpool.

       8.  Import the new pool,	while changing the name	to match the old pool:
	   zpool import	newpool	oldpool

       9.  Start zrepl again and wake up the relevant jobs.

       10. Use zrepl status or your monitoring to ensure that replication
	   works.  The best test is an end-to-end test where  you  write  some
	   junk	 data  on a sender dataset and wait until a snapshot with that
	   data	appears	on the receiving side.

       11. Once	you are	confident that replication is working, you may dispose
	   of the old pool.

       Note that, depending on pruning rules,  it  will	 not  be  possible  to
       switch  back to the old pool seamlessly,	i.e., without a	full re-repli-
       cation.

   Platform Tests
       Along with the main zrepl binary, we release the	platformtest binaries.
       The zrepl platform tests	are an integration test	suite that is  comple-
       mentary	to  the	 pure  Go unit tests.  Any test	that needs to interact
       with ZFS	is a platform test.

       The platform tests need to run as root.  For each test, we create a fresh
       dummy  zpool  backed  by	 a file-based vdev.  The file path, and	a root
       mountpoint for the dummy	zpool, must be specified on the	command	line:

	  mkdir	-p /tmp/zreplplatformtest
	  ./platformtest \
	      -poolname	'zreplplatformtest' \  # <- name must contain zreplplatformtest
	      -imagepath /tmp/zreplplatformtest.img \ #	<- zrepl will create the file
	      -mountpoint /tmp/zreplplatformtest # <- must exist

       WARNING:
	  platformtest will unconditionally overwrite the  file	 at  imagepath
	  and  unconditionally	zpool destroy $poolname.  So, don't use	a pro-
	  duction poolname, and	consider running the test in a VM.  It'll be a
	  lot faster as	well because the underlying operations,	 zfs  list  in
	  particular, will be faster.

       While the platformtests are running, there will be a lot of log output.
       After all tests have run, the test runner prints a summary with a list
       of tests, grouped by result type (success, failure, skipped):

	  PASSING TESTS:
	    github.com/zrepl/zrepl/platformtest/tests.BatchDestroy
	    github.com/zrepl/zrepl/platformtest/tests.CreateReplicationCursor
	    github.com/zrepl/zrepl/platformtest/tests.GetNonexistent
	    github.com/zrepl/zrepl/platformtest/tests.HoldsWork
	    ...
	    github.com/zrepl/zrepl/platformtest/tests.SendStreamNonEOFReadErrorHandling
	    github.com/zrepl/zrepl/platformtest/tests.UndestroyableSnapshotParsing
	  SKIPPED TESTS:
	    github.com/zrepl/zrepl/platformtest/tests.SendArgsValidationEncryptedSendOfUnencryptedDatasetForbidden__EncryptionSupported_false
	  FAILED TESTS:	[]

       If there	is a failure, or a skipped test	that  you  believe  should  be
       passing,	re-run the test	suite, capture stderr &	stdout to a text file,
       and create an issue on GitHub.

       To run a	specific test case, or a subset	of tests matched by regex, use
       the -run	REGEX command line flag.

       To  stop	 test execution	at the first failing test, and prevent cleanup
       of the dummy zpool, use the -failure.stop-and-keep-pool flag.

       To  build  the  platformtests  yourself,	 use  make  test-platform-bin.
       There's	also  the  make	test-platform target to	run the	platform tests
       with a default command line.

   Talks & Presentations
        Talk at OpenZFS Developer Summit 2018 of pre-release 0.1 (25min
	 Recording, Slides, Event)

        Talk at EuroBSDCon2017 FreeBSD DevSummit with live demo of zrepl
	 0.0.3 (55min Recording, Slides, Event)

	  Note: The remarks on	keep_bookmarks are irrelevant as of zrepl  0.1
	   which  introduced  the  zrepl-managed  replication cursor bookmark.
	   Read	the Overview section to	learn more.

   Changelog
       The changelog summarizes	bugfixes that are deemed  relevant  for	 users
       and  package maintainers.  Developers should consult the	git commit log
       or GitHub issue tracker.

   Next	Release
       The plan	for the	next release is	to revisit  how	 zrepl	does  snapshot
       management.  High-level goals:

        Make  it easy to decouple snapshot management (snapshotting, pruning)
	 from replication.

        Ability to include/exclude snapshots from replication.	 This is  use-
	 ful  for  aforementioned decoupling, e.g., separate snapshot prefixes
	 for local & remote replication.  Also,	it makes explicit that by  de-
	 fault,	 zrepl	replicates  all	snapshots, and that replication	has no
	 concept of "zrepl-created snapshots", which is	 a  common  misconcep-
	 tion.

        Use  of  zfs  snapshot	comma syntax or	channel	programs to take snap-
	 shots of multiple datasets atomically.

        Provide an alternative	to the grid pruning policy.  Most likely some-
	 thing based on	hourly/daily/weekly/monthly "trains" plus a count.

        Ability to prune at the granularity of	the group of snapshots created
	 at a given time, as opposed to	 the  individual  snapshots  within  a
	 dataset.  Maybe this will be addressed	by the alternative to the grid
	 pruning policy, as it will likely be more predictable.

       Those  changes will likely come with some breakage in the config.  How-
       ever, I want to avoid breaking use cases	that are satisfied by the cur-
       rent design.  There will	be beta/RC releases to give users a chance  to
       evaluate.

   0.6.1
        [FEATURE] add metric to detect filesystem rules that don't match any
	 local dataset (thanks, @gmekicaxcient).

        [BUG] zrepl status: hide progress bar once all	filesystems reach ter-
	 minal state (thanks, @0x3333).

        [BUG] handling of tentative cursor presence if protection strategy
	 doesn't use it	(issue #714).

        [DOCS]	address	 setup	with  two  or  more  external  disks  (thanks,
	 @se-jaeger).

        [DOCS]	 document replication and conflict_resolution options (thanks,
	 @InsanePrawn).

        [DOCS]	docs:  talks:  add  note  on  keep_bookmarks  option  (thanks,
	 @skirmess).

        [MAINT] dist: add openrc service file (thanks,	@gramosg).

        [MAINT] grafana: update dashboard to Grafana 9.3.6.

        [MAINT] run platform tests as part of CI.

        [MAINT]  build:  upgrade to Go	1.21 and update	golangci-lint; minimum
	 Go version for	builds is now 1.20

       NOTE:
	  zrepl	is a spare-time	project	primarily developed by Christian Schwarz.
	  You can support maintenance and feature development through one of the following services:
	  Donate via Patreon Donate via	GitHub Sponsors	Donate via Liberapay Donate via	PayPal
	  Note that PayPal processing fees are relatively high for small donations.
	  For SEPA wire	transfer and commercial	support, please	contact	Christian directly.

   0.6
        [FEATURE] Schedule-based snapshotting using cron syntax instead of an
	 interval.

        [FEATURE] Configurable	initial	replication policy.  When a filesystem
	 is first replicated to a receiver, this controls whether just the
	 newest	 snapshot will be replicated vs. all existing snapshots. Learn
	 more in the docs.

        [FEATURE]  Configurable  timestamp  format  for  snapshot  names  via
	 timestamp_format (Thanks, @ydylla).

        [FEATURE]  Add	ZREPL_DESTROY_MAX_BATCH_SIZE env var (default 0=unlim-
	 ited) (Thanks,	@3nprob).

        [FEATURE]  Add	 zrepl	configcheck  --skip-cert-check	flag  (Thanks,
	 @cole-h).

        [BUG] Fix resuming from interrupted replications that use send.raw on
	 unencrypted datasets.

	  The send options introduced in zrepl	0.4 allow users	to specify ad-
	   ditional  zfs  send	flags for zrepl	to use.	 Before	this fix, when
	   setting  send.raw=true  on  a  job  that   replicates   unencrypted
	   datasets,  zrepl  would not allow an	interrupted replication	to re-
	   sume.  The reason  were  overly  cautious  checks  to  support  the
	   send.encrypted option.

	  This	 bugfix	 removes  these	 checks	 from the replication planner.
	   This	makes send.encrypted a sender-side-only	concern, much like all
	   other send.*	flags.

	  However, this means that the	zrepl status UI	 no  longer  indicates
	   whether  a  replication step	uses encrypted sends or	not.  The set-
	   ting	is still effective though.

        [BREAK]   convert   Prometheus	  metric    zrepl_version_daemon    to
	 zrepl_start_time metric

	  The	metric	still  reports	the zrepl version in a label.  But the
	   metric value	is now the Unix	timestamp at the time the  daemon  was
	   started.  The Grafana dashboard in dist/grafana has been updated.

        [BUG] transient zrepl status error: Post "http://unix/status":	EOF

        [BUG]	don't  treat receive-side bookmarks as a replication conflict.
	 This facilitates chaining of replication jobs.	See issue #490.

        [BUG] workaround for Go/gRPC problem on  Illumos  where  zrepl	 would
	 crash when using the local transport type (issue #598).

        [BUG] fix active child tasks panic that could occur during replication
	 planning (issue #193abbe)

        [BUG]	zrepl  status  off-by-one  error  in display of	completed step
	 count (commit ce6701f)

        [BUG] Allow using day & week units for	snapshotting.interval  (commit
	 ffb1d89)

        [DOCS]	docs/overview improvements (Thanks, @jtagcat).

        [MAINT] Update	to Go 1.19.

   0.5
        [FEATURE] Bandwidth limiting (Thanks, Prominic.NET, Inc.)

        [FEATURE]  zrepl status: use a	* to indicate which filesystem is cur-
	 rently	replicating

        [FEATURE] include daemon environment variables	in zrepl status	 (cur-
	 rently	only in	--raw)

        [BUG] fix encrypt-on-receive +	placeholders use case (issue #504)

	  Before  this	 fix,  plain  sends  to	 a  receiver with an encrypted
	   root_fs could be received unencrypted if  zrepl  needed  to	create
	   placeholders	on the receiver.

	  Existing  zrepl users should	read the docs and check	zfs get	-r en-
	   cryption,zrepl:placeholder PATH_TO_ROOTFS on	the receiver.

	  Thanks to @mologie and @razielgn for	reporting and testing!

        [BUG] Rename mis-spelled send option embbeded_data to embedded_data.

        [BUG] zrepl status: replication step numbers should start at 1

        [BUG] incorrect bandwidth averaging in	zrepl status.

        [BUG] FreeBSD with OpenZFS 2.0: zrepl would wait indefinitely for zfs
	 send to exit on timeouts.

        [BUG] fix strconv.ParseInt: value out of range	bug (and use the  con-
	 trol RPCs).

        [DOCS]	improve	description of multiple	pruning	rules.

        [DOCS]	document platform tests.

        [DOCS]	 quickstart:  make  users  aware that prune rules apply	to all
	 snapshots.

        [MAINT] some platformtests were broken.

        [MAINT] FreeBSD: release armv7	and arm64 binaries.

        [MAINT] apt repo: update instructions due to apt-key deprecation.

       Note to all users: please read up on the	following OpenZFS bugs,	as you
       might be	affected:

        ZFS send/recv with ashift 9->12 leads to data corruption.

        Various bugs with encrypted send/recv (Leadership meeting notes)

       Finally,	I'd like to point you to the  GitHub  discussion  about	 which
       bugfixes	and features should be prioritized in zrepl 0.6	and beyond!

   0.4.0
        [FEATURE]  support setting zfs	send / recv flags in the config	(send:
	 -wLcepbS , recv: -ox ).  Config docs here and here .

        [FEATURE] parallel replication	is now configurable (disabled  by  de-
	 fault,	config docs here ).

        [FEATURE] New zrepl status UI:

	  Interactive job selection.

	  Interactively zrepl signal jobs.

	  Filter filesystems in the job view by name.

	  An  approximation  of the old UI is still included as --mode	legacy
	   but will be removed in a future release of zrepl.

        [BUG] Actually	use concurrency	when listing zrepl abstractions	&  do-
	 ing size estimation.  These operations	were accidentally made sequen-
	 tial in zrepl 0.3.

        [BUG] Job hang-up during second replication attempt.

        [BUG] Data race conditions in the dataconn rpc stack.

        [MAINT] Update	to protobuf v1.25 and grpc 1.35.

       For  users  who skipped the 0.3.1 update: please	make sure your pruning
       grid config is correct.	The following bugfix in	0.3.1 caused  problems
       for some	users:

        [BUG]	pruning:  grid:	 add all snapshots that	do not match the regex
	 to the	rule's destroy list.

   0.3.1
       Mostly a	bugfix release for zrepl 0.3.

        [FEATURE] pruning: add	optional regex field to	last_n rule

        [DOCS]	pruning: grid :	improve	documentation and add an example

        [BUG] pruning:	grid:  add all snapshots that do not match  the	 regex
	 to  the  rule's destroy list.	This brings the	implementation in line
	 with the docs.

        [BUG] easyrsa script in docs

        [BUG] platformtest: fix skipping  encryption-only  tests  on  systems
	 that don't support encryption

        [BUG]	replication:  report  AttemptDone if no	filesystems are	repli-
	 cated

        [FEATURE] status + replication: warning if replication succeeded
	 without any filesystem	being replicated

        [DOCS]	update multi-job & multi-host setup section

        RPM Packaging

        CI infrastructure rework

        Continuous deployment of that new stable branch to zrepl.github.io.

   0.3
       This is a big one! Headlining features:

        Resumable Send & Recv Support: no knobs required, automatically used
	 where supported.

        Encrypted Send & Recv Support: for OpenZFS native encryption, config-
	 urable at the job level, i.e., for all filesystems a job is respon-
	 sible for.

        Replication Guarantees: automatic use of ZFS holds and bookmarks to
	 protect a replicated filesystem from losing synchronization between
	 sender and receiver.  By default, zrepl guarantees that incremental
	 replication will always be possible and interrupted steps will always
	 be resumable.

       TIP:
	  We highly recommend studying the updated  overview  section  of  the
	  configuration	chapter	to understand how replication works.

       TIP:
	  Go 1.15 changed the default TLS validation policy to require Subject
	  Alternative  Names  (SAN)  in	certificates.  The openssl commands we
	  provided in the quick-start guides up	to and including the zrepl 0.3
	  docs seem not	to work	properly.  If you encounter certificate	 vali-
	  dation  errors  regarding  SAN  and wish to continue to use your old
	  certificates,	start the zrepl	daemon with  env  var  GODEBUG=x509ig-
	  noreCN=0.   Alternatively,  generate new certificates	with SANs (see
	  both options in the TLS transport docs).

       Quick-start guides:

        We have added another quick-start guide for a typical workstation use
	 case for zrepl.  Check	it out to learn	how you	can use	zrepl to  back
	 up  your  workstation's OpenZFS natively-encrypted root filesystem to
	 an external disk.

       Additional changelog:

        [BREAK] Go 1.15 TLS changes mentioned above.

        [BREAK] [CONFIG] more restrictive job names than in prior zrepl  ver-
	 sions.  Starting with this version, job names are going to be embedded
	 into ZFS holds	and bookmark names (see	 this  section	for  details).
	 Therefore  you	 might	need to	adjust your job	names.	Note that jobs
	 cannot	be renamed easily once you start using zrepl 0.3.

        [BREAK] [MIGRATION] replication cursor	representation changed

	  zrepl now manages the replication cursor bookmark per  job-filesys-
	   tem	tuple  instead	of a single replication	cursor per filesystem.
	   In the future, this will permit multiple sending jobs to send  from
	   the same filesystems.

	  ZFS	does  not  allow bookmark renaming, thus we cannot migrate the
	   old replication cursors.

	  zrepl 0.3 will automatically	create cursors in the new  format  for
	   new	replications,  and warn	if it still finds ones in the old for-
	   mat.

	  Run	zrepl  migrate	replication-cursor:v1-v2  to  safely   destroy
	   old-format  cursors.	  The  migration  will	ensure that only those
	   old-format cursors are destroyed that have been superseded by
	   new-format cursors.

        [FEATURE] New option listen_freebind (tcp, tls, prometheus listener)

        [FEATURE]  issue  #341	 Prometheus  metric for	failing	replications +
	 corresponding Grafana panel

        [FEATURE] issue #265 transport/tcp: support for CIDR masks in	client
	 IP whitelist

        [FEATURE] documented subcommand to generate bash and zsh completions

        [FEATURE]  issue  #307	chrome://trace -compatible activity tracing of
	 zrepl daemon activity

        [FEATURE] logging: trace IDs for better log  entry  correlation  with
	 concurrent replication	jobs

        [FEATURE]  experimental environment variable for parallel replication
	 (see issue #306 )

        [BUG] missing logger context vars in control connection handlers

        [BUG] improved	error messages on zfs send errors

        [BUG] [DOCS] snapshotting: clarify sync-up behavior and warn about
	 filesystems that will not be snapshotted until the sync-up phase is
	 over

        [BUG] transport/ssh: do not leak zombie ssh process on connection
	 failures

        [DOCS]	Installation: FreeBSD jail with	iocage

        [DOCS]	Document new replication features in the config	 overview  and
	 replication/design.md.

        [MAINTAINER  NOTICE]  New platform tests in this version, please make
	 sure you run them for your distro!

        [MAINTAINER NOTICE] Please add	the shell  completions	to  the	 zrepl
	 packages.

   0.2.1
        [FEATURE]  Illumos  (and  Solaris)  compatibility  and	 binary	builds
	 (thanks, MNX.io )

        [FEATURE] 32bit binaries for Linux and	FreeBSD	(untested, though)

        [BUG] better error messages in	ssh+stdinserver	transport

        [BUG]	  systemd    +	  ssh+stdinserver:    automatically	create
	 /var/run/zrepl/stdinserver

        [BUG] crash if	Prometheus listening socket cannot be opened

        [MAINTAINER NOTICE] Makefile refactoring, see commit 080f2c0

   0.2
        [FEATURE]  Pre-  and  Post-Snapshot  Hooks  with built-in support for
	 MySQL and Postgres checkpointing as well as custom  scripts  (thanks,
	 @overhacked!)

        [FEATURE]  Use	 zfs  destroy  pool/fs@snap1,snap2,...	CLI feature if
	 available

        [FEATURE] Linux ARM64 Docker build support & binary builds

        [FEATURE] zrepl status	now displays snapshotting reports

        [FEATURE] zrepl status	--job <JOBNAME>	filter flag

        [BUG] i386 build

        [BUG] early validation	of host:port tuples in config

        [BUG] zrepl status  now  supports  TERM=screen	 (tmux	on  FreeBSD  /
	 FreeNAS)

        [BUG]	ignore connection reset	by peer	errors when shutting down con-
	 nections

        [BUG] correct	error  messages	 when  receive-side  pool  or  root_fs
	 dataset is not	imported

        [BUG] fail fast for misconfigured local transport

        [BUG] race condition in replication report generation would crash the
	 daemon	when running zrepl status

        [BUG]	rpc  goroutine leak in push mode if zfs	recv fails on the sink
	 side

        [MAINTAINER NOTICE] Go	modules	for dependency management both	inside
	 and outside of	GOPATH (lazy.sh	and Makefile force GO111MODULE=on)

        [MAINTAINER NOTICE] make platformtest target to check zrepl's ZFS ab-
	 stractions  (screen scraping, etc.).  These tests only	work on	a sys-
	 tem with ZFS installed, and must be run as root because they create a
	 file-backed pool for each test	case.  The pool	name zreplplatformtest
	 is reserved for this use case.	 Only run make	platformtest  on  test
	 systems, e.g. a FreeBSD VM image.

   0.1.1
        [BUG]	issue #162 commit d6304f4 : fix	I/O timeout errors on variable
	 receive rate

	  A significant reduction or sudden stall of the receive  rate	 (e.g.
	   recv	pool has other I/O to do) would	cause a	writev I/O timeout er-
	   ror after approximately ten seconds.

   0.1
       This  release  is a milestone for zrepl and required significant	refac-
       toring if not rewrites of substantial parts  of	the  application.   It
       breaks  both configuration and transport	format,	and thus requires man-
       ual intervention	and updates on both sides of a replication setup.

       DANGER:
	  The changes in the pruning system for	this release  require  you  to
	  explicitly  define  keep  rules:  for	 any snapshot that you want to
	  keep,	at least one rule must match.  This is different from previous
	  releases where pruning only affected snapshots with  the  configured
	  snapshotting prefix.	Make sure that snapshots to be kept or ignored
	  by zrepl are covered,	e.g. by	using the regex	keep rule.  Learn more
	  in the config	docs...

   Notes to Package Maintainers
        Notify	users about config changes and migrations (see changes attrib-
	 uted with [BREAK] and [MIGRATION] below)

        If the	daemon crashes,	the stack trace	produced by the	Go runtime and
	 possibly  diagnostic output of	zrepl will be written to stderr.  This
	 behavior is independent from the stdout  outlet  type.	  Please  make
	 sure  the stderr output of the	daemon is captured somewhere.  To con-
	 serve precious	stack traces, make sure	that multiple service restarts
	 do not	directly discard previous stderr output.

        Make it obvious for users how	to  set	 the  GOTRACEBACK  environment
	 variable to GOTRACEBACK=crash.	 This functionality will cause SIGABRT
	 on  panics  and  can  be  used	to capture a coredump of the panicking
	 process.  To that end, make sure that your package build system,
	 your  OS's  coredump  collection  and	the Go delve debugger work to-
	 gether.  Use your build system	to package the Go program in this  tu-
	 torial on Go coredumps and the delve debugger, and make sure the
	 symbol	resolution etc.	work on	coredumps  captured  from  the	binary
	 produced  by  your  build system. (Special focus on symbol stripping,
	 etc.)

        Consider using	the zrepl configcheck subcommand in startup scripts to
	 abort a restart that would fail due to	an invalid config.

   Changes
        [BREAK] [MIGRATION] Placeholder property representation changed

	  The placeholder property now	 uses  on|off  as  values  instead  of
	   hashes  of  the  dataset  path.  This  permits  renames of the sink
	   filesystem without updating all placeholder properties.

	  Relevant for	0.0.X-0.1-rc* to 0.1 migrations

	  Make	sure your config is valid with zrepl configcheck

	  Run zrepl migrate 0.0.X:0.1:placeholder

        [FEATURE] issue #55 : Push replication	(see push job and sink job)

        [FEATURE] TCP Transport

        [FEATURE] TCP + TLS client authentication transport

        [FEATURE] issue #111: RPC protocol rewrite

	  [BREAK] Protocol breakage; Update and restart of all	zrepl  daemons
	   is required.

	  Use	gRPC  for  control  RPCs  and  a custom	protocol for bulk data
	   transfer.

	  Automatic retries for network-temporary errors

	    Limited to	errors during replication for this release.  Addresses
	     the common	problem	of ISP-forced reconnection at night, but  will
	     become way	more useful with resumable send	& recv support.	 Prun-
	     ing  errors are handled per FS, i.e., a prune RPC is attempted at
	     least once	per FS.

        [FEATURE] Proper timeout handling for the SSH transport

	  [BREAK] Requires Go 1.11 or later.

        [BREAK] [CONFIG]: mappings are	no longer supported

	  Receiving sides (pull and sink job) specify a single	root_fs.   Re-
	   ceived    filesystems    are	   then	   stored    per   client   in
	   ${root_fs}/${client_identity}.  See Jobs & How They	Work  Together
	   for details.

        [FEATURE] [BREAK] [CONFIG] Manual snapshotting	+ triggering of	repli-
	 cation

	  [FEATURE] issue #69:	include	manually created snapshots in replica-
	   tion

	  [CONFIG] manual and periodic	snapshotting types

	  [FEATURE] zrepl signal wakeup JOB subcommand	to trigger replication
	   + pruning

	  [FEATURE] zrepl signal reset	JOB subcommand to abort	current	repli-
	   cation + pruning

        [FEATURE] [BREAK] [CONFIG] New	pruning	system

	  The	active	side  of  a replication	(pull or push) decides what to
	   prune for both sender and receiver.	The RPC	protocol  is  used  to
	   execute the destroy operations on the remote	side.

	  New pruning policies	(see configuration documentation )

	    The  decision what	snapshots shall	be pruned is now made based on
	     keep rules

	    [FEATURE] issue #68: keep rule not_replicated prevents divergence
	     of	sender and receiver

	  [FEATURE] [BREAK] Bookmark pruning is no longer necessary

	    Per filesystem, zrepl creates a single bookmark  (#zrepl_replica-
	     tion_cursor)  and	moves it forward with the most recent success-
	     fully replicated snapshot on the receiving	side.

	    Old bookmarks created by prior  versions  of  zrepl  (named  like
	     their corresponding snapshot) must	be deleted manually.

	    [CONFIG]  keep_bookmarks parameter	of the grid keep rule has been
	     removed

        [FEATURE] zrepl status	for live-updating replication  progress	 (it's
	 really	cool!)

        [FEATURE]  Snapshot- &	pruning-only job type (for local snapshot man-
	 agement)

        [FEATURE] issue #67: Expose Prometheus	metrics	via HTTP (config docs)

	  Compatible Grafana dashboard	shipping in dist/grafana

        [CONFIG] Logging outlet types must be specified using	the  type  in-
	 stead of outlet key

        [BREAK]  issue	 #53:  CLI: zrepl control * subcommands	have been made
	 direct	subcommands of zrepl *

        [BUG] Goroutine leak on ssh transport connection timeouts

        [BUG] issue #81 issue #77 : handle failed accepts  correctly  (source
	 job)

        [BUG] issue #100: fix incompatibility with ZoL	0.8

        [FEATURE] issue #115: logging:	configurable syslog facility

        [FEATURE] Systemd unit	file in	dist/systemd

   Previous Releases
       NOTE:
	  Due  to  limitations	in  our	documentation system, we only show the
	  changelog since the last release and the time	this documentation  is
	  built.   For the changelog of	previous releases, use the version se-
	  lection in the hosted	version	of these docs at zrepl.github.io.
       Donate via Patreon Donate via  GitHub  Sponsors	Donate	via  Liberapay
       Donate via PayPal

       zrepl is	a spare-time project primarily developed by Christian Schwarz.
       You  can	support	maintenance and	feature	development through one	of the
       services	listed above.  For SEPA	wire transfer and commercial  support,
       please contact Christian	directly.

       Thanks for your support!

       NOTE:
	  PayPal  takes	a relatively high fixed	processing fee plus percentage
	  of the donation.  Larger less-frequent  donations  make  more	 sense
	  there.

   Supporters
       We would like to thank the following people and organizations for sup-
       porting zrepl through monetary and other means:

	  Max Christian Pohle
	  Prominic.NET, Inc.
	  Torsten Blum
	  Cyberiada GmbH
	  Gordon Schulz
	  @jwittlincohen
	  Michael D. Schmitt
	  Hans Schulz
	  Henning Kessler
	  John Ramsden
	  DrLuke
	  Mateusz Kwiatkowski (runhyve.app)
	  Gaelan D'costa
	  Tenzin Lhakhang
	  Lapo Luchini
	  F. Schmid
	  MNX.io
	  Marshall Clyburn
	  Ross Williams
	  Mike T.
	  Justin Scholz
	  InsanePrawn
	  Ben Woods
	  Janis Streib
	  Anton Schirg

AUTHOR
       Christian Schwarz

COPYRIGHT
       2017-2023, Christian Schwarz

				 Apr 14, 2025			      ZREPL(1)
