NCCOPY(1)		       UNIDATA UTILITIES		     NCCOPY(1)

NAME
       nccopy  -  Copy a netCDF	file, optionally changing format, compression,
       or chunking in the output.

SYNOPSIS

       nccopy [-k  kind_name ] [-kind_code] [-d	 n ]  [-s]  [-c	  chunkspec  ]
	      [-u]  [-w]  [-[v|V] var1,...]  [-[g|G] grp1,...]	[-m  bufsize ]
	      [-h  chunk_cache ] [-e  cache_elems ] [-r] [-F  filterspec ] [-L
	      n	] [-M  n ]  infile  outfile

DESCRIPTION
       The nccopy utility copies an input netCDF file in any supported	format
       variant	to  an output netCDF file, optionally converting the output to
       any compatible netCDF format variant, compressing the data, or rechunk-
       ing the data.  For example, if  built  with  the	 netCDF-3  library,  a
       netCDF  classic file may	be copied to a netCDF 64-bit offset file, per-
       mitting larger variables.  If built with	the netCDF-4 library, a	netCDF
       classic file may	be copied to a netCDF-4	file or	to a netCDF-4  classic
       model  file  as	well,  permitting  data	 compression, efficient	schema
       changes,	larger variable	sizes, and use of other	netCDF-4 features.

       If  no  output  format  is  specified,  with  either  -k	 kind_name  or
       -kind_code,  then the output will use the same format as	the input, un-
       less the	input is classic or 64-bit offset and either chunking or  com-
       pression	 is specified, in which	case the output	will be	netCDF-4 clas-
       sic model format.  Attempting some kinds	of format conversion will  re-
       sult  in	 an error, if the conversion is	not possible.  For example, an
       attempt to copy a netCDF-4 file that uses features of the enhanced mod-
       el, such	as groups or variable-length strings,  to  any	of  the	 other
       kinds  of  netCDF  formats that use the classic model will result in an
       error.
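
       The default-format rule above can be sketched as a small shell
       function.  This is only an illustrative model of the paragraph, not
       nccopy's own code; the function name default_output_kind and its two
       arguments are assumptions made for the example.

```shell
# Hypothetical helper modelling the documented default when neither
# -k kind_name nor -kind_code is given.
#   $1 = kind of the input file (e.g. "classic", "64-bit offset", "netCDF-4")
#   $2 = "yes" if chunking (-c) or compression (-d) was requested, else "no"
default_output_kind() {
    case "$1:$2" in
        "classic:yes" | "64-bit offset:yes")
            # Classic or 64-bit offset input plus chunking or compression
            # is promoted to netCDF-4 classic model.
            echo "netCDF-4 classic model" ;;
        *)
            # Otherwise the output keeps the input format.
            echo "$1" ;;
    esac
}

default_output_kind "classic" yes
default_output_kind "64-bit offset" no
default_output_kind "netCDF-4" yes
```

       The kind of an existing file can be checked with ncdump -k.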

       nccopy also serves as an	example	of a generic  netCDF-4	program,  with
       its  ability  to	 read  any valid netCDF	file and handle	nested groups,
       strings,	and user-defined types,	including arbitrarily nested  compound
       types, variable-length types, and data of any valid netCDF-4 type.

       If  DAP	support	 was  enabled when nccopy was built, the file name may
       specify a DAP URL. This may be used to convert data on DAP  servers  to
       local netCDF files.

OPTIONS
	-k   kind_name
	      Use  format  name	to specify the kind of file to be created and,
	      by  inference,  the  data	 model	(i.e.  netcdf-3	 (classic)  or
	      netcdf-4 (enhanced)).  The possible arguments are:

		     'nc3' or 'classic'	=> netCDF classic format

		     'nc6' or '64-bit offset' => netCDF	64-bit format

		     'nc4' or 'netCDF-4' => netCDF-4 format (enhanced data
		     model)

		     'nc7' or 'netCDF-4	classic	model' => netCDF-4 classic
		     model format

	      Note: The old format numbers '1', '2', '3', and '4', equivalent to
	      the format names 'nc3', 'nc6', 'nc4', and 'nc7' respectively, are
	      also still accepted but deprecated, because format numbers are
	      easily confused with format names.

	-kind_code
	      Use  format numeric code (instead	of format name)	to specify the
	      kind of file to be created and, by  inference,  the  data	 model
	      (i.e.  netcdf-3  (classic) versus	netcdf-4 (enhanced)).  The nu-
	      meric codes are:

		     3 => netcdf classic format

		     6 => netCDF 64-bit	format

		     4 => netCDF-4 format (enhanced data model)

		     7 => netCDF-4 classic model format

	      The numeric code "7" is used because "7=3+4", specifying the
	      format that uses the netCDF-3 data model for compatibility with
	      the netCDF-4 storage format for performance.  Credit is due to
	      NCO for use of these numeric codes instead of the old and
	      confusing format numbers.
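
	      The relationship between the deprecated format numbers, the
	      format names, and the numeric codes can be laid out as a lookup.
	      The helper name old_number_to_flags is hypothetical, and the
	      flag spellings -3, -6, -4 and -7 are an assumption read from the
	      "-kind_code" entry in the synopsis.

```shell
# Hypothetical helper: translate a deprecated -k format number into the
# equivalent format name and numeric kind-code flag.
old_number_to_flags() {
    case "$1" in
        1) echo "nc3 -3" ;;
        2) echo "nc6 -6" ;;
        3) echo "nc4 -4" ;;
        4) echo "nc7 -7" ;;
        *) echo "unknown format number: $1" >&2; return 1 ;;
    esac
}

# The deprecated number 3 means netCDF-4, whereas the numeric code 3
# means classic, which is the confusion the codes are meant to avoid.
old_number_to_flags 3
```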

	-d   n
	      For  netCDF-4  output, including netCDF-4	classic	model, specify
	      deflation	level (level of	compression) for variable data output.
	      0	corresponds to no compression and 9  to	 maximum  compression,
	      with higher levels of compression	requiring marginally more time
	      to compress or uncompress than lower levels.  Specifying a
	      compression level of 0 (via "-d 0") turns off deflation
	      altogether.  Compression achieved may also depend
	      on output	chunking parameters.  If this option is	specified  for
	      a	 classic  format or 64-bit offset format input file, it	is not
	      necessary	to also	specify	that the  output  should  be  netCDF-4
	      classic  model,  as that will be the default.  If	this option is
	      not specified and	the input file has compressed  variables,  the
	      compression  will	 still	be  preserved in the output, using the
	      same chunking as in the input by default.

	      Note that	nccopy requires	all variables to be  compressed	 using
	      the same compression level, but the API has no such restriction.
	      With  a  program you can customize compression for each variable
	      independently.
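
	      One way to choose a level is to copy the same input at a few
	      levels and compare the output sizes.  The sketch below assumes
	      an input file named foo.nc (a placeholder) and only runs nccopy
	      when both the tool and the file are present; otherwise it prints
	      what it would have run.

```shell
in=foo.nc                      # hypothetical input file
for level in 1 5 9; do
    out="foo_d${level}.nc"     # hypothetical output name, one per level
    if command -v nccopy >/dev/null 2>&1 && [ -f "$in" ]; then
        # Copy with the given deflation level, then show the output size.
        nccopy -d "$level" "$in" "$out" && ls -l "$out"
    else
        echo "skipped: nccopy -d $level $in $out"
    fi
done
```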

	-s    For netCDF-4 output, including netCDF-4 classic  model,  specify
	      shuffling	of variable data bytes before compression or after de-
	      compression.   Shuffling	refers	to  interlacing	 of bytes in a
	      chunk so that the	first bytes of all values  are	contiguous  in
	      storage,	followed by all	the second bytes, and so on, which of-
	      ten improves compression.	 This option is	ignored	unless a  non-
	      zero  deflation level is specified.  Using -d0 to	specify	no de-
	      flation on input data that  has  been  compressed	 and  shuffled
	      turns off	both compression and shuffling in the output.
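
	      The byte interlacing described above can be illustrated with a
	      toy example on values written as hex strings.  This is not how
	      the library performs shuffling; shuffle_hex is a hypothetical
	      helper that only shows the reordering.

```shell
# Toy byte shuffle: emit the first byte of every value, then the second
# byte of every value, and so on.
#   $1 = bytes per value; remaining arguments = values as hex strings
shuffle_hex() {
    width=$1
    shift
    result=""
    byte=1
    while [ "$byte" -le "$width" ]; do
        first=$(( (byte - 1) * 2 + 1 ))   # first hex digit of this byte
        last=$(( first + 1 ))             # second hex digit of this byte
        for value in "$@"; do
            result="$result$(printf '%s' "$value" | cut -c "${first}-${last}")"
        done
        byte=$(( byte + 1 ))
    done
    printf '%s\n' "$result"
}

# Three 4-byte integers 1, 2, 3 stored most significant byte first.
shuffle_hex 4 00000001 00000002 00000003
# prints 000000000000000000010203
```

	      Unshuffled, the same values read 000000010000000200000003; after
	      shuffling, the zero bytes form one long run, which a compressor
	      can encode more compactly.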

	-u    Convert any unlimited size dimensions in the input to fixed size
	      dimensions  in the output.  This can speed up variable-at-a-time
	      access, but slow down record-at-a-time access to multiple	 vari-
	      ables along an unlimited dimension.

	-w    Keep output in memory (as a diskless netCDF file) until the output
	      is closed, at which time the output file is written to disk.  This
	      can greatly speed up operations such as converting unlimited
	      dimensions to fixed size (-u option), chunking, rechunking, or
	      compressing the input.  It requires that available memory be large
	      enough to hold the output file.  This option may provide a larger
	      speedup than careful tuning of the -m, -h, or -e options, and
	      it is certainly simpler.

	-c  chunkspec
	      For netCDF-4 output, including netCDF-4 classic model, specify
	      chunking (multidimensional tiling) for variable data in the
	      output.  This is useful to specify the units of disk access,
	      compression, or other filters such as checksums.  Changing the
	      chunking in a netCDF file can also greatly speed up access, by
	      choosing chunk shapes that are appropriate for the most common
	      access patterns.

	      The chunkspec argument has several forms.	The first form is  the
	      original,	deprecated form	and is a string	of comma-separated as-
	      sociations,  each	 specifying a dimension	name, a	'/' character,
	      and optionally the corresponding chunk length  for  that	dimen-
	      sion.   No  blanks should	appear in the chunkspec	string,	except
	      possibly escaped blanks that are part of a  dimension  name.   A
	      chunkspec	 names at least	one dimension, and may omit dimensions
	      which are	not to be chunked  or  for  which  the	default	 chunk
	      length  is  desired.   If	 a dimension name is followed by a '/'
	      character	but no subsequent chunk	length,	the  actual  dimension
	      length  is  assumed.   If	 copying  a  classic  model  file to a
	      netCDF-4 output file  and	 not  naming  all  dimensions  in  the
	      chunkspec, unnamed dimensions will also use the actual dimension
	      length  for  the	chunk  length.	 An example of a chunkspec for
	      variables	that use 'm' and 'n' dimensions	might be 'm/100,n/200'
	      to specify 100 by	200 chunks. To see the chunking	resulting from
	      copying with a chunkspec,	use the	'-s' option of ncdump  on  the
	      output file.

	      The chunkspec '/' that omits all dimension names and corresponding
	      chunk lengths specifies that no chunking is to occur in the
	      output, so it can be used to unchunk all the chunked variables.

	      As  an  I/O optimization,	nccopy has a threshold for the minimum
	      size of non-record variables that	get  chunked,  currently  8192
	      bytes. The -M flag can be	used to	override this value.

	      Note that nccopy requires variables that share a dimension to
	      also share the chunk size associated with that dimension, but
	      the programming interface has no such restriction.  To customize
	      chunking for variables independently, use the second form of
	      chunkspec.

	      The second form of chunkspec has the syntax var:n1,n2,...,nn,
	      which assumes that the variable named "var" has rank n.  The
	      chunking to be applied to each dimension of the variable is
	      specified by the values of n1 through nn.  This form can be
	      repeated multiple times to specify the exact chunking for
	      different variables.  If the variable is specified but no chunk
	      sizes are specified (i.e. -c var: ) then chunking is disabled
	      for that variable.  If the same variable is specified more than
	      once, the second and later specifications are ignored.  This
	      per-variable form takes precedence over any per-dimension
	      chunking except the bare "/" case.

	      The  third form of the chunkspec has the syntax:	var:compact or
	      var:contiguous.  This explicitly attempts	to  set	 the  variable
	      storage  type  as	compact	or contiguous, respectively. These may
	      be overridden if other flags require the variable	to be chunked.
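
	      The chunkspec forms can be compared side by side.  The sketch
	      below is a dry run that prints the command lines rather than
	      executing them; dim_chunkspec is a hypothetical helper, and
	      slow.nc, fast.nc and the variable name temp are placeholders.

```shell
# Hypothetical helper: join dimension/length pairs with commas so the
# per-dimension chunkspec contains no blanks.
dim_chunkspec() {
    ( IFS=,; printf '%s\n' "$*" )
}

per_dim=$(dim_chunkspec time/1000 lat/40 lon/40)

echo "nccopy -c $per_dim slow.nc fast.nc"          # first form, per dimension
echo "nccopy -c temp:1000,40,40 slow.nc fast.nc"   # second form, per variable
echo "nccopy -c temp:contiguous slow.nc fast.nc"   # third form, storage type
echo "nccopy -c / slow.nc fast.nc"                 # unchunk all variables
```

	      Running ncdump -s on the output file (optionally with -h to
	      print only the header) shows the chunking that resulted.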

	-v   var1,...
	      The output will include data values for the specified variables,
	      in addition to the declarations of  all  dimensions,  variables,
	      and  attributes. One or more variables must be specified by name
	      in the comma-delimited list following this option. The list must
	      be a single argument to the command, hence  cannot  contain  un-
	      escaped  blanks or other white space characters. The named vari-
	      ables must be valid netCDF variables in the input-file. A	 vari-
	      able  within a group in a	netCDF-4 file may be specified with an
	      absolute path name, such as '/GroupA/GroupA2/var'.  Use of a
	      relative path name such as 'var' or 'grp/var' specifies all
	      matching variable	names in the file.  The	default, without  this
	      option,  is  to  include	data values for	 all  variables	in the
	      output.

	-V   var1,...
	      The output will include the specified variables only but all di-
	      mensions and global or group attributes. One or  more  variables
	      must  be specified by name in the	comma-delimited	list following
	      this option. The list must be a single argument to the  command,
	      hence cannot contain unescaped blanks or other white space char-
	      acters.  The  named  variables must be valid netCDF variables in
	      the input-file. A	variable within	a group	in a netCDF-4 file may
	      be   specified   with   an   absolute   path   name,   such   as
	      '/GroupA/GroupA2/var'.   Use  of	a  relative  path name such as
	      'var' or 'grp/var' specifies all matching	variable names in  the
	      file.   The  default,  without  this  option, is to include  all
	      variables	in the output.

	-g   grp1,...
	      The output will include  data  values  only  for	the  specified
	      groups.	One  or	 more  groups must be specified	by name	in the
	      comma-delimited list following this option. The list must	 be  a
	      single  argument	to the command.	The named groups must be valid
	      netCDF groups in the input-file. The default, without  this  op-
	      tion, is to include data values for all groups in	the output.

	-G   grp1,...
	      The  output will include only the	specified groups.  One or more
	      groups must be specified by name	in  the	 comma-delimited  list
	      following	this option. The list must be a	single argument	to the
	      command. The named groups	must be	valid netCDF groups in the in-
	      put-file.	 The  default,	without	this option, is	to include all
	      groups in	the output.

	-m   bufsize
	      An integer or floating-point number that specifies the size,  in
	      bytes,  of the copy buffer used to copy large variables.	A suf-
	      fix of K,	M, G, or T multiplies the  copy	 buffer	 size  by  one
	      thousand,	 million, billion, or trillion,	respectively.  The de-
	      fault is 5 Mbytes, but will be increased if necessary to hold at
	      least one	chunk of netCDF-4 chunked variables in the input file.
	      You may want to specify a	value  larger  than  the  default  for
	      copying  large files over	high latency networks.	Using the '-w'
	      option may provide better	performance, if	 the  output  fits  in
	      memory.
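
	      The suffixes are decimal multipliers, as a short conversion
	      shows.  The helper size_to_bytes is hypothetical and handles
	      only integer values, whereas nccopy also accepts floating-point
	      numbers.

```shell
# Hypothetical helper: expand a K, M, G or T suffix into a byte count
# using the decimal multipliers documented for -m, -h and -e.
size_to_bytes() {
    case "$1" in
        *K) echo $(( ${1%K} * 1000 )) ;;
        *M) echo $(( ${1%M} * 1000000 )) ;;
        *G) echo $(( ${1%G} * 1000000000 )) ;;
        *T) echo $(( ${1%T} * 1000000000000 )) ;;
        *)  echo "$1" ;;                 # no suffix: already in bytes
    esac
}

size_to_bytes 5M     # the documented default copy buffer size
size_to_bytes 64K
```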

	-h   chunk_cache
	      For  netCDF-4 output, including netCDF-4 classic model, an inte-
	      ger or floating-point number that	specifies the size in bytes of
	      chunk cache allocated for	each chunked variable.	This is	not  a
	      property	of the file, but merely	a performance tuning parameter
	      for avoiding compressing or decompressing	the same data multiple
	      times while copying and changing chunk shapes.  A	suffix	of  K,
	      M, G, or T multiplies the	chunk cache size by one	thousand, mil-
	      lion,  billion,  or  trillion,  respectively.   The  default  is
	      4.194304 Mbytes (or whatever was specified  for  the  configure-
	      time  constant  CHUNK_CACHE_SIZE	when  the  netCDF  library was
	      built).  Ideally,	the nccopy utility should accept only one mem-
	      ory buffer size and divide it optimally between  a  copy	buffer
	      and  chunk cache,	but no general algorithm for computing the op-
	      timum chunk cache	size has been implemented yet. Using the  '-w'
	      option  may  provide  better  performance, if the	output fits in
	      memory.

	-e   cache_elems
	      For netCDF-4 output, including netCDF-4 classic model, specifies
	      number of	chunks that the	chunk cache can	hold. A	suffix	of  K,
	      M,  G,  or T multiplies the number of chunks that	can be held in
	      the cache	by one thousand, million, billion,  or	trillion,  re-
	      spectively.   This  is  not a property of	the file, but merely a
	      performance tuning parameter for avoiding	compressing or	decom-
	      pressing the same	data multiple times while copying and changing
	      chunk  shapes.   The  default is 1009 (or	whatever was specified
	      for the  configure-time  constant	 CHUNK_CACHE_NELEMS  when  the
	      netCDF  library  was built).  Ideally, the nccopy	utility	should
	      determine	an optimum value for this parameter,  but  no  general
	      algorithm	 for  computing	the optimum number of chunk cache ele-
	      ments has	been implemented yet.

	-r    Read netCDF classic or 64-bit offset input file into a  diskless
	      netCDF  file in memory before copying.  Requires that input file
	      be small enough to fit into memory.  For	nccopy,	 this  doesn't
	      seem  to provide any significant speedup,	so may not be a	useful
	      option.

	-L  n Set the log level; only usable if nccopy supports netCDF-4
	      (enhanced).

	-M  n Set the minimum size, in bytes, of non-record variables that get
	      chunked (currently 8192 by default); only usable if nccopy
	      supports netCDF-4 (enhanced).

	-F  filterspec
	      For netCDF-4 output, including netCDF-4 classic model, specify a
	      filter to	apply to a specified set of variables in  the  output.
	      As  a  rule, the filter is a compression/decompression algorithm
	      with a unique numeric identifier assigned	by the HDF Group  (see
	      https://support.hdfgroup.org/services/filters.html).

	      The filterspec argument has one of these general forms:

		     fqn1|fqn2...,filterid,param1,param2...paramn

		     *,filterid,param1,param2...paramn

	      An fqn (fully qualified name) is the name of a variable prefixed by
	      its containing groups, with the group names separated by forward
	      slashes ('/').  An example might be /g1/g2/var.  Alternatively, just
	      the variable name can be given if it is in the root group, e.g.
	      var.  Backslash escapes may be used as needed.  A note of warning:
	      the '|' separator is a bash reserved character, so you will
	      probably need to put the filterspec in quotes or otherwise escape
	      it.

	      The filterid is an unsigned positive integer representing the id
	      assigned by the HDF Group to the filter.  Following the id is a
	      sequence of parameters defining the operation of the filter.
	      Each parameter is a 32-bit unsigned integer.

	      This option may be repeated multiple times with different
	      variable names.
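
	      A quoted filterspec might be assembled as follows.  This is a
	      dry run that prints the command instead of executing it.  The
	      filter id 307 with the single parameter 9 assumes that the bzip2
	      filter registered with the HDF Group is available as a plugin;
	      the variable names and the files in.nc and out.nc are
	      placeholders.

```shell
vars='/g1/g2/var|var'      # two hypothetical variables joined by '|'
spec="${vars},307,9"       # variable list, filter id, one parameter

# Single quotes around the filterspec stop the shell from treating '|'
# as a pipe.
printf "nccopy -F '%s' in.nc out.nc\n" "$spec"
```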

EXAMPLES
       Make a copy of foo1.nc, a netCDF	file of	any type, to foo2.nc, a	netCDF
       file of the same	type:

	      nccopy foo1.nc foo2.nc

       Note that the above copy	will not be as fast as use of cp or other sim-
       ple copy	utility, because the file is copied using only the netCDF API.
       If  the	input  file  has extra bytes after the end of the netCDF data,
       those will not be copied, because they are not accessible  through  the
       netCDF interface.  If the original file was generated in	"No fill" mode
       so  that	fill values are	not stored for padding for data	alignment, the
       output file may have different padding bytes.

       Convert a netCDF-4 classic model	file, compressed.nc,  that  uses  com-
       pression, to a netCDF-3 file classic.nc:

	      nccopy -k	classic	compressed.nc classic.nc

       Note that 'nc3' could be	used instead of	'classic'.

       Download	the variable 'time_bnds' and its associated attributes from an
       OPeNDAP server and copy the result to a netCDF file named 'tb.nc':

	      nccopy 'http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds' tb.nc

       Note that URLs that name	specific variables as  command-line  arguments
       should  generally  be  quoted,  to avoid	the shell interpreting special
       characters such as '?'.

       Compress	all the	variables in the input file foo.nc, a netCDF  file  of
       any type, to the	output file bar.nc:

	      nccopy -d1 foo.nc	bar.nc

       If  foo.nc was a	classic	or 64-bit offset netCDF	file, bar.nc will be a
       netCDF-4	classic	model netCDF file, because the classic and 64-bit off-
       set format  variants  don't  support  compression.   If	foo.nc	was  a
       netCDF-4	 file  with  some variables compressed using various deflation
       levels, the output will also be a netCDF-4 file of the same  type,  but
       all  the	 variables, including any uncompressed variables in the	input,
       will now	use deflation level 1.

       Assume the input	data includes gridded variables	that  use  time,  lat,
       lon  dimensions,	 with 1000 times by 1000 latitudes by 1000 longitudes,
       and that	the time dimension varies most slowly.	Also assume that users
       want quick access to data at all	times  for  a  small  set  of  lat-lon
       points.	 Accessing data	for 1000 times would typically require access-
       ing 1000	disk blocks, which may be slow.

       Reorganizing the	data into chunks on disk that have  all	 the  time  in
       each  chunk  for	 a  few	lat and	lon coordinates	would greatly speed up
       such access.  To	chunk the data in the input  file  slow.nc,  a	netCDF
       file of any type, to the output file fast.nc, you could use:

	      nccopy -c	time/1000,lat/40,lon/40	slow.nc	fast.nc

       to  specify data	chunks of 1000 times, 40 latitudes, and	40 longitudes.
       If you had enough memory	to contain the output file, you	could speed up
       the rechunking operation	significantly by creating the output in	memory
       before writing it to disk on close (using the -w	flag):

	      nccopy -w	-c time/1000,lat/40,lon/40 slow.nc fast.nc

       Alternatively, one could write this using the alternate,
       variable-specific chunking specification, assuming that time, lat,
       and lon are variables:

	      nccopy -c	time:1000 -c lat:40 -c lon:40 slow.nc fast.nc

CHUNKING RULES
       The complete set	of chunking rules is captured here.  As	a rough	summa-
       ry, these rules preserve	all chunking properties	from the  input	 file.
       These  rules apply only when the	selected output	format supports	chunk-
       ing, i.e. for the netcdf-4 variants.

       The variable-specific chunking specification translates directly to
       the corresponding "nc_def_var_chunking" API call.

       The original per-dimension chunking specification requires some
       interpretation by nccopy.  The following rules are applied in the given
       order independently for each variable to be copied from input to output.
       The rules are written assuming we are trying to determine the chunking
       for a given output variable Vout that comes from an input variable Vin.

       1.     If  there	 is  no	'-c' option that applies to a variable and the
	      corresponding input variable is contiguous or the	input is  some
	      netcdf-3	variant, then let the netcdf-c library make all	chunk-
	      ing decisions.

       2.     For each dimension of Vout explicitly specified on  the  command
	      line  (using the '-c' option), apply the chunking	value for that
	      dimension	regardless of input format or input properties.

       3.     For dimensions of	Vout not named on the command line in  a  '-c'
	      option,  preserve	chunk sizes from the corresponding input vari-
	      able, if it is chunked.

       4.     If Vin is	contiguous, and	none of	its dimensions	are  named  on
	      the command line,	and chunking is	not mandated by	other options,
	      then make	Vout be	contiguous.

       5.     If the input variable is contiguous (or is some netcdf-3
	      variant) and there are no options requiring chunking, or the '/'
	      special case for the '-c' option is specified, then the output
	      variable Vout is marked as contiguous.

       6.     Final, default case: some	or all chunk sizes are not  determined
	      by  the  command	line  or the input variable. This includes the
	      non-chunked input	cases such as  netcdf-3,  cdf5,	 and  DAP.  In
	      these cases retain all chunk sizes determined by previous	rules,
	      and use the full dimension size as the default. The exception is
	      unlimited	dimensions, where the default is 4 megabytes.
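
       For a single fixed-size dimension, the precedence in rules 2, 3 and 6
       can be sketched as below.  This is a simplified model rather than
       nccopy's implementation: chunk_length is a hypothetical helper, and
       the contiguous cases (rules 1, 4 and 5) and unlimited dimensions are
       left out.

```shell
# Hypothetical helper: pick the chunk length for one dimension of Vout.
#   $1 = chunk length given with -c for this dimension ("" if none)
#   $2 = chunk length of Vin along this dimension ("" if Vin is not chunked)
#   $3 = full length of the dimension
chunk_length() {
    if [ -n "$1" ]; then
        echo "$1"        # rule 2: a value on the command line wins
    elif [ -n "$2" ]; then
        echo "$2"        # rule 3: preserve the input chunk size
    else
        echo "$3"        # rule 6: default to the full dimension size
    fi
}

chunk_length 40 100 1000    # dimension named with -c
chunk_length "" 100 1000    # chunked input, dimension not named
chunk_length "" "" 1000     # non-chunked input, dimension not named
```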

SEE ALSO
       ncdump(1), ncgen(1), netcdf(3)

Release	4.2			  2012-03-08			     NCCOPY(1)
