FreeBSD Manual Pages

home | help
bup-split(1)		    General Commands Manual		  bup-split(1)

NAME
       bup-split - save	individual files to bup	backup sets

SYNOPSIS
       bup split [-t] [-c] [-n name] COMMON_OPTIONS

       bup split -b COMMON_OPTIONS

       bup split -copy COMMON_OPTIONS

       bup split -noop [-t|-b] COMMON_OPTIONS

       COMMON_OPTIONS
	      [-r  host:path]  [-v]  [-q]  [-d	seconds-since-epoch] [--bench]
	      [--max-pack-size=bytes] [-#]  [--bwlimit=bytes]  [--max-pack-ob-
	      jects=n] [--fanout=count]	[--keep-boundaries] [--git-ids | file-
	      names...]

DESCRIPTION
       bup  split concatenates the contents of the given files (or if no file-
       names are given,	reads from stdin), splits the content into  chunks  of
       around 8k using a rolling checksum algorithm, and saves the chunks into
       a  bup  repository.   Chunks  which have	previously been	stored are not
       stored again (ie.  they are `deduplicated').

       Because of the way the rolling checksum works, chunks tend to  be  very
       stable  across changes to a given file, including adding, deleting, and
       changing	bytes.

       For example, if you use bup split to back up an XML dump	of a database,
       and the XML file	changes	slightly from one run to the next, nearly  all
       the  data  will still be	deduplicated and the size of each backup after
       the first will typically	be quite small.

       Another technique is to pipe the	output of the tar(1) or	 cpio(1)  pro-
       grams  to  bup  split.	When  individual  files	 in the	tarball	change
       slightly	or are added or	removed, bup still processes the remainder  of
       the  tarball  efficiently.  (Note that bup save is usually a more effi-
       cient way to accomplish this, however.)

       To get the data back, use bup-join(1).

MODES
       These options select the	primary	behavior of the	command, with -n being
       the most	likely choice.

       -n, --name=name
	      after creating the dataset, create a git branch  named  name  so
	      that  it	can  be	accessed using that name.  If name already ex-
	      ists, the	new dataset will be considered a descendant of the old
	      name.  (Thus, you	can continually	create new datasets  with  the
	      same name, and later view	the history of that dataset to see how
	      it  has  changed	over  time.)   The  original data will also be
	      available	as a top-level file named "data" in the	VFS,  accessi-
	      ble via bup fuse,	bup ftp, etc.

       -t, --tree
	      output the git tree id of	the resulting dataset.

       -c, --commit
	      output the git commit id of the resulting	dataset.

       -b, --blobs
	      output a series of git blob ids that correspond to the chunks in
	      the dataset.  Incompatible with -n, -t, and -c.

       --noop read  the	 data and split	it into	blocks based on	the "bupsplit"
	      rolling checksum algorithm, but  don't  store  anything  in  the
	      repo.   Can be combined with -b or -t to compute (but not	store)
	      the git blobs or tree ids	for the	dataset.  This is mostly  use-
	      ful for benchmarking and validating the bupsplit algorithm.  In-
	      compatible with -n and -c.

       --copy like  --noop,  but  also	write the data to stdout.  This	can be
	      useful for benchmarking the  speed  of  read+bupsplit+write  for
	      large amounts of data.  Incompatible with	-n, -t,	-c, and	-b.

OPTIONS
       -r, --remote=host:path
	      save  the	 backup	 set  to  the given remote server.  If path is
	      omitted, uses the	default	path on	the remote server  (you	 still
	      need  to	include	the `:').  The connection to the remote	server
	      is made with SSH.	 If you'd like to specify which	port, user  or
	      private  key to use for the SSH connection, we recommend you use
	      the ~/.ssh/config	file.  Even though the destination is  remote,
	      a	local bup repository is	still required.

       -d, --date=seconds-since-epoch
	      specify	the  date  inscribed  in  the  commit  (seconds	 since
	      1970-01-01).

       -q, --quiet
	      disable progress messages.

       -v, --verbose
	      increase verbosity (can be used more than	once).

       --git-ids
	      stdin is a list of git object ids	 instead  of  raw  data.   bup
	      split will read the contents of each named git object (if	it ex-
	      ists  in the bup repository) and split it.  This might be	useful
	      for converting a git repository with large binary	files  to  use
	      bup-style	 hashsplitting	instead.  This option is probably most
	      useful when combined with	--keep-boundaries.

       --keep-boundaries
	      if multiple filenames are	given on the command  line,  they  are
	      normally concatenated together as	if the content all came	from a
	      single  file.  That is, the set of blobs/trees produced is iden-
	      tical to what it would have been if there	had been a single  in-
	      put  file.   However, if you use --keep-boundaries, each file is
	      split separately.	 You still only	get a single tree or commit or
	      series of	blobs, but each	blob comes from	only one of the	files;
	      the end of one of	the input files	always ends a blob.

       --bench
	      print benchmark timings to stderr.

       --max-pack-size=bytes
	      never create git packfiles  larger  than	the  given  number  of
	      bytes.   Default is 1 billion bytes.  Usually there is no	reason
	      to change	this.

       --max-pack-objects=numobjs
	      never create git packfiles with more than	the  given  number  of
	      objects.	 Default is 200	thousand objects.  Usually there is no
	      reason to	change this.

       --fanout=numobjs
	      when splitting very large	files, try and keep the	number of ele-
	      ments in trees to	an average of numobjs.

       --bwlimit=bytes/sec
	      don't transmit more than	bytes/sec  bytes  per  second  to  the
	      server.	This  is  good for making your backups not suck	up all
	      your network bandwidth.  Use a suffix like k, M, or G to specify
	      multiples	of 1024, 1024*1024, 1024*1024*1024 respectively.

       -#, --compress=#
	      set the compression level	to # (a	value from 0-9,	where 9	is the
	      highest and 0 is no compression).	 The default is	1 (fast, loose
	      compression)

EXAMPLES
	      $	tar -cf	- /etc | bup split -r myserver:	-n mybackup-tar
	      tar: Removing leading /' from member names
	      Indexing objects:	100% (196/196),	done.

	      $	bup join -r myserver: mybackup-tar | tar -tf - | wc -l
	      1961

SEE ALSO
       bup-join(1), bup-index(1), bup-save(1), bup-on(1), ssh_config(5)

BUP
       Part of the bup(1) suite.

AUTHORS
       Avery Pennarun <apenwarr@gmail.com>

Bup 0.32			  2021-01-09			  bup-split(1)
Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=bup-split&sektion=1&manpath=FreeBSD+Ports+15.1.quarterly>
home | help
Header And Logo

Peripheral Links

Site Navigation

FreeBSD Manual Pages

Header And Logo

Peripheral Links

Search

Site Navigation

FreeBSD Manual Pages