rwuniq(1)			SiLK Tool Suite			     rwuniq(1)

NAME
       rwuniq -	Bin SiLK Flow records by a key and print each bin's volume

SYNOPSIS
	 rwuniq	--fields=KEY [--values=VALUES]
	       [{--threshold=VALUE_FIELD=MIN-MAX | --threshold=VALUE_FIELD=MIN}]
	       [--presorted-input] [--sort-output]
	       [{--bin-time=SECONDS | --bin-time}]
	       [--timestamp-format=FORMAT] [--epoch-time]
	       [--ip-format=FORMAT] [--integer-ips] [--zero-pad-ips]
	       [--integer-sensors] [--integer-tcp-flags]
	       [--no-titles] [--no-columns] [--column-separator=CHAR]
	       [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
	       [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
	       [--pager=PAGER_PROG] [--temp-directory=DIR_PATH]
	       [{--legacy-timestamps | --legacy-timestamps={1,0}}]
	       [--all-counts] [{--bytes	| --bytes=MIN |	--bytes=MIN-MAX}]
	       [{--packets | --packets=MIN | --packets=MIN-MAX}]
	       [{--flows | --flows=MIN | --flows=MIN-MAX}]
	       [--stime] [--etime]
	       [{--sip-distinct	| --sip-distinct=MIN | --sip-distinct=MIN-MAX}]
	       [{--dip-distinct	| --dip-distinct=MIN | --dip-distinct=MIN-MAX}]
	       [--ipv6-policy={ignore,asv4,mix,force,only}]
	       [--site-config-file=FILENAME]
	       [--plugin=PLUGIN	[--plugin=PLUGIN ...]]
	       [--python-file=PATH [--python-file=PATH ...]]
	       [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [--pmap-column-width=NUM]
	       {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

	 rwuniq	[--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [--plugin=PLUGIN	...] [--python-file=PATH ...] --help

	 rwuniq	[--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [--plugin=PLUGIN	...] [--python-file=PATH ...] --help-fields

	 rwuniq	--version

DESCRIPTION
       rwuniq reads SiLK Flow records and groups them by a key composed	of
       user-specified attributes of the	flows.	For each group (or bin), a
       collection of user-specified aggregate values is	computed; these	values
       are typically related to	the volume of the bin, such as the sum of the
       bytes fields for	all records that match the key.	 Once all the SiLK
       Flow records are	read, the key fields and the aggregate values are
       printed.	 For some of the built-in aggregate values, it is possible to
       limit the output	to the bins where the aggregate	value meets a user-
       specified minimum and/or	maximum.
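
       The binning described above can be pictured with a short Python
       sketch.  It is an illustration only, not how rwuniq is implemented:
       the dictionary records and the names SAMPLE_FLOWS and bin_flows are
       hypothetical, and real SiLK Flow records are binary.

```python
from collections import defaultdict

# Hypothetical flow records used only for illustration.
SAMPLE_FLOWS = [
    {"sIP": "192.0.2.1", "protocol": 6, "packets": 3, "bytes": 1200},
    {"sIP": "192.0.2.1", "protocol": 6, "packets": 1, "bytes": 40},
    {"sIP": "192.0.2.7", "protocol": 17, "packets": 2, "bytes": 300},
]


def bin_flows(flows, key_fields):
    """Group records by the key fields; count records, sum packets and bytes."""
    bins = defaultdict(lambda: {"records": 0, "packets": 0, "bytes": 0})
    for flow in flows:
        # The key is the tuple of the selected attributes, in the given order.
        key = tuple(flow[name] for name in key_fields)
        aggregate = bins[key]
        aggregate["records"] += 1
        aggregate["packets"] += flow["packets"]
        aggregate["bytes"] += flow["bytes"]
    return dict(bins)
```

       Calling bin_flows(SAMPLE_FLOWS, ["sIP", "protocol"]) yields one entry
       per distinct (sIP, protocol) pair, each holding the record count and
       the packet and byte sums for that bin.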

       There is	no need	to sort	the input to rwuniq since rwuniq normally
       rearranges the records as they are read.	 To have rwuniq	sort its
       output, use the --sort-output switch.

       rwuniq reads SiLK Flow records from the files named on the command line
       or from the standard input when no file names are specified and --xargs
       is not present.	To read	the standard input in addition to the named
       files, use "-" or "stdin" as a file name.  If an	input file name	ends
       in ".gz", the file is uncompressed as it	is read.  When the --xargs
       switch is provided, rwuniq reads	the names of the files to process from
       the named text file or from the standard	input if no file name argument
       is provided to the switch.  The input to	--xargs	must contain one file
       name per	line.

       The user	must provide the --fields switch to select the flow
       attribute(s) (or	field(s)) that comprise	the key	for each bin.  The
       available fields	are similar to those supported by rwcut(1); see	the
       description of the --fields switch in the "OPTIONS" section below for
       the details.  The list of fields	can be extended	by loading PySiLK
       files (see silkpython(3)) or plug-ins (silk-plugin(3)).	The fields are
       printed in the order in which they occur	in the --fields	switch.	 The
       size of the key is limited to 256 octets.  A larger key uses the
       available memory more quickly, leading to slower performance.

       The aggregate value(s) to compute for each bin are also chosen by the
       user.  As with the key fields, the user can extend the list of
       aggregate fields	by using PySiLK	or plug-ins.  Specify the aggregate
       fields with the --values	switch;	the aggregate fields are printed in
       the order they occur in the --values switch.  If	the user does not
       provide --values	or a --threshold switch	(described next), rwuniq
       defaults	to computing the number	of flow	records	for each bin.  As with
       the key fields, requesting more aggregate values	slows performance.

       The --threshold switch (added in	SiLK 3.17.0) allows the	user to	print
       only bins where a value field is	within a certain range.	 The switch's
       argument	contains the name of the value field, an equals	sign, the
       minimum value (start of the range), and optionally a hyphen and the
       maximum value (end of the range); e.g., "--threshold=bytes=1000-2000".
       The upper bound is unlimited when no maximum is specified.  The
       --threshold switch may be repeated to set multiple thresholds, and only
       those bins that meet all	thresholds are printed.	 Each field named by
       --threshold is appended to the set of aggregate value fields unless
       that field was named in the --values switch.

       The --presorted-input switch may	allow rwuniq to	process	data more
       efficiently by causing rwuniq to	assume the input has been previously
       sorted with the rwsort(1) command.  With	this switch, rwuniq typically
       does not	need large amounts of memory because it	does not bin each
       flow; instead, it keeps a running summation and outputs the bin
       whenever	the key	changes.  For the output to be meaningful, rwsort and
       rwuniq must be invoked with the same --fields value.  When multiple
       input files are specified and --presorted-input is given, rwuniq	merge-
       sorts the flow records from the input files.  rwuniq typically runs
       faster if you do	not include the	--presorted-input switch when counting
       distinct	values,	even when reading sorted input.	 Finally, you may get
       unusual results with --presorted-input when the --fields	switch
       contains	multiple time-related key fields ("sTime", "duration",
       "eTime"), or when the time-related key is not the final key listed in
       --fields; see the "NOTES" section for details.
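
       The running summation used with sorted input can be sketched in
       Python as follows.  This is an assumption-laden illustration rather
       than rwuniq's code: summarize_sorted is a hypothetical name and the
       records are plain dictionaries.

```python
def summarize_sorted(flows, key_fields):
    """Yield (key, records, bytes) for each run of equal keys in the input."""
    current_key = None
    records = 0
    total_bytes = 0
    for flow in flows:
        key = tuple(flow[name] for name in key_fields)
        if key != current_key:
            # The key changed: emit the finished bin and start a new one.
            if current_key is not None:
                yield current_key, records, total_bytes
            current_key, records, total_bytes = key, 0, 0
        records += 1
        total_bytes += flow["bytes"]
    if current_key is not None:
        yield current_key, records, total_bytes
```

       Only one bin is held in memory at a time.  If the input is not
       actually sorted by the same key, the same key is emitted more than
       once, which is why the sort fields and the key fields must match.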

       rwuniq attempts to keep all key and aggregate value data	in the
       computer's memory.  If rwuniq runs out of memory, the current key and
       aggregate value data is written to a temporary file.  Once all input
       has been	processed, the data from the temporary files is	merged to
       produce the final output.  By default, these temporary files are	stored
       in the /tmp directory.  Because these files can be large, it is
       strongly	recommended that /tmp not be used as the temporary directory.
       To modify the temporary directory used by rwuniq, provide the
       --temp-directory	switch,	set the	SILK_TMPDIR environment	variable, or
       set the TMPDIR environment variable.

OPTIONS
       Option names may	be abbreviated if the abbreviation is unique or	is an
       exact match for an option.  A parameter to an option may	be specified
       as --arg=param or --arg param, though the first form is required	for
       options that take optional parameters.
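
       One reading of the abbreviation rule is sketched below in Python.
       The matching logic is an assumption based on the wording above, not
       rwuniq's option parser; resolve_switch is a hypothetical name and
       SWITCHES is a deliberately incomplete list taken from the SYNOPSIS.

```python
# A few switch names from the SYNOPSIS; the list is intentionally partial.
SWITCHES = [
    "fields", "flows", "values", "version", "help", "help-fields",
    "presorted-input", "print-filenames",
]


def resolve_switch(text, switches=SWITCHES):
    """Return the full switch name for an exact match or a unique prefix."""
    if text in switches:
        return text  # an exact match wins even if it prefixes another name
    matches = [name for name in switches if name.startswith(text)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown switch --{text}")
    raise ValueError(f"ambiguous switch --{text}: {', '.join(sorted(matches))}")
```

       Under this sketch "help" resolves to itself although "help-fields"
       shares the prefix, "pres" resolves to "presorted-input", and "f" is
       rejected because it matches both "fields" and "flows".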

       The --fields switch is required.	 rwuniq	fails when it is not provided.

       --fields=KEY
	   KEY contains	the list of flow attributes (a.k.a. fields or columns)
	   that	make up	the key	into which flows are binned.  The columns are
	   displayed in	the order the fields are specified.  Each field	may be
	   specified once only.	 KEY is	a comma	separated list of field-names,
	   field-integers, and ranges of field-integers; a range is specified
	   by separating the start and end of the range	with a hyphen (-).
	   Field-names are case	insensitive.  Example:

	    --fields=stime,10,1-5

	   There is no default value for the --fields switch; the switch must
	   be specified.
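
	   How a KEY argument expands can be sketched in Python.  The sketch
	   is illustrative only: parse_fields is a hypothetical helper, and
	   FIELD_IDS holds just the first twelve field numbers documented
	   under this switch.

```python
# Subset of the documented field names and their integer identifiers.
FIELD_IDS = {
    "sIP": 1, "dIP": 2, "sPort": 3, "dPort": 4, "protocol": 5, "packets": 6,
    "bytes": 7, "flags": 8, "sTime": 9, "duration": 10, "eTime": 11,
    "sensor": 12,
}
BY_ID = {number: name for name, number in FIELD_IDS.items()}
BY_NAME = {name.lower(): name for name in FIELD_IDS}
BY_NAME["pkts"] = "packets"  # documented alias


def _name_for_id(number):
    try:
        return BY_ID[number]
    except KeyError:
        raise ValueError(f"unknown field number {number}") from None


def parse_fields(key):
    """Expand a KEY such as 'stime,10,1-5' into an ordered list of names."""
    result = []
    for token in key.split(","):
        token = token.strip()
        lowered = token.lower()  # field names are case insensitive
        if lowered in BY_NAME:
            names = [BY_NAME[lowered]]
        elif token.isdigit():
            names = [_name_for_id(int(token))]
        else:
            start, dash, end = token.partition("-")
            if not (dash and start.isdigit() and end.isdigit()):
                raise ValueError(f"unrecognized field {token!r}")
            if int(start) > int(end):
                raise ValueError(f"invalid range {token!r}")
            names = [_name_for_id(n) for n in range(int(start), int(end) + 1)]
        for name in names:
            if name in result:
                raise ValueError(f"field {name} specified more than once")
            result.append(name)
    return result
```

	   With this sketch, parse_fields("stime,10,1-5") returns sTime,
	   duration, sIP, dIP, sPort, dPort, and protocol, in that order.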

	   The complete	list of	built-in fields	that the SiLK tool suite
	   supports follows, though note that not all fields are present in
	   all SiLK file formats; when a field is not present, its value is 0.

	   sIP,1
	       source IP address

	   dIP,2
	       destination IP address

	   sPort,3
	       source port for TCP and UDP, or equivalent

	   dPort,4
	       destination port	for TCP	and UDP, or equivalent.	 See note at
	       "iType".

	   protocol,5
	       IP protocol

	   packets,pkts,6
	       packet count

	   bytes,7
	       byte count

	   flags,8
	       bit-wise	OR of TCP flags	over all packets

	   sTime,9
	       starting	time of	flow (seconds resolution unless	--bin-time
	       includes	fractional seconds). When the time-related fields
	       "sTime","duration","eTime" are all in use, rwuniq ignores the
	       final time field	when binning the records.

	   duration,10
	       duration	of flow	(seconds resolution unless --bin-time includes
	       fractional seconds).  This field	is not adjusted	by --bin-time
	       unless --fields includes	both "sTime" and "eTime".  See note at
	       "sTime,9".

	   eTime,11
	       end time	of flow	(seconds resolution unless --bin-time includes
	       fractional seconds).  See note at "sTime,9".

	   sensor,12
	       name or ID of the sensor	where the flow was collected

	   class,20
	       class assigned to the flow by rwflowpack(8).  Binning by
	       "class" and/or "type" equates to	binning	by the integer value
	       used internally to represent the	class/type pair.  When
	       --fields	contains "class" but not "type", rwuniq's output
	       contains	multiple rows with the same value(s) for the key
	       field(s).

	   type,21
	       type assigned to	the flow by rwflowpack(8).  See	note on
	       previous	entry.

	   iType
	       the ICMP	type value for ICMP or ICMPv6 flows and	empty
	       (numerically zero) for non-ICMP flows.  Internally, SiLK	stores
	       the ICMP	type and code in the "dPort" field.  To	avoid getting
	       very odd	results, either	do not use the "dPort" field when your
	       key includes ICMP field(s) or be	certain	to include the
	       "protocol" field	as part	of your	key.  This field was
	       introduced in SiLK 3.8.1.

	   iCode
	       the ICMP	code value for ICMP or ICMPv6 flows and	empty for non-
	       ICMP flows.  See	note at	"iType".

	   icmpTypeCode,25
	       equivalent to "iType","iCode" when used in --fields.  This
	       field may not be	mixed with "iType" or "iCode", and this	field
	       is deprecated as	of SiLK	3.8.1.	As of SiLK 3.8.1,
	       "icmpTypeCode" may no longer be used as the argument to the
	       "Distinct:" value field;	the "dPort" field provides an
	       equivalent result as long as the	input is limited to ICMP flow
	       records.

	   Many SiLK file formats do not store the following fields, and their
	   values are always 0; they are listed here for completeness:

	   in,13
	       router SNMP input interface or vlanId if	packing	tools were
	       configured to capture it	(see sensor.conf(5))

	   out,14
	       router SNMP output interface or postVlanId

	   nhIP,15
	       router next hop IP

	   SiLK	can store flows	generated by enhanced collection software that
	   provides more information than NetFlow v5.  These flows may support
	   some	or all of these	additional fields; for flows without this
	   additional information, the field's value is	always 0.

	   initialFlags,26
	       TCP flags on first packet in the	flow

	   sessionFlags,27
	       bit-wise	OR of TCP flags	over all packets except	the first in
	       the flow

	   attributes,28
	       flow attributes set by the flow generator:

	       "S" all the packets in this flow	record are exactly the same
		   size

	       "F" flow	generator saw additional packets in this flow
		   following a packet with a FIN flag (excluding ACK packets)

	       "T" flow	generator prematurely created a	record for a long-
		   running connection due to a timeout.	 (When the flow
		   generator yaf(1) is run with	the --silk switch, it
		   prematurely creates a flow and marks it with "T" if the byte
		   count of the flow cannot be stored in a 32-bit value.)

	       "C" flow generator created this flow as a continuation of a
		   long-running connection, where the previous flow for this
		   connection met a timeout (or	a byte threshold in the	case
		   of yaf).

	       Consider	a long-running ssh session that	exceeds	the flow
	       generator's active timeout.  (This is the active	timeout	since
	       the flow	generator creates a flow for a connection that still
	       has activity).  The flow	generator will create multiple flow
	       records for this	ssh session, each spanning some	portion	of the
	       total session.  The first flow record will be marked with a "T"
	       indicating that it hit the timeout.  The	second through next-
	       to-last records will be marked with "TC"	indicating that	this
	       flow both timed out and is a continuation of a flow that	timed
	       out.  The final flow will be marked with	a "C", indicating that
	       it was created as a continuation	of an active flow.

	   application,29
	       guess as	to the content of the flow.  Some software that
	       generates flow records from packet data,	such as	yaf, will
	       inspect the contents of the packets that	make up	a flow and use
	       traffic signatures to label the content of the flow.  SiLK
	       calls this label	the application; yaf refers to it as the
	       appLabel.  The application is the port number that is
	       traditionally used for that type	of traffic (see	the
	       /etc/services file on most UNIX systems).  For example, traffic
	       that the	flow generator recognizes as FTP will have a value of
	       21, even	if that	traffic	is being routed	through	the standard
	       HTTP/web	port (80).

	   The following fields	provide	a way to label the IPs or ports	on a
	   record.  These fields require external files	to provide the mapping
	   from	the IP or port to the label:

	   sType,16
	       for the source IP address, the value 0 if the address is	non-
	       routable, 1 if it is internal, or 2 if it is routable and
	       external.  Uses the mapping file	specified by the
	       SILK_ADDRESS_TYPES environment variable,	or the
	       address_types.pmap mapping file,	as described in	addrtype(3).

	   dType,17
	       as sType	for the	destination IP address

	   scc,18
	       for the source IP address, a two-letter country code
	       abbreviation denoting the country where that IP address is
	       located.	 Uses the mapping file specified by the
	       SILK_COUNTRY_CODES environment variable,	or the
	       country_codes.pmap mapping file,	as described in	ccfilter(3).
	       The abbreviations are those defined by ISO 3166-1 (see for
	       example <https://www.iso.org/iso-3166-country-codes.html> or
	       <https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2>) or the
	       following special codes:	-- N/A (e.g. private and experimental
	       reserved	addresses); a1 anonymous proxy;	a2 satellite provider;
	       o1 other

	   dcc,19
	       as scc for the destination IP

	   src-map-name
	       label contained in the prefix map file associated with map-
	       name.  If the prefix map	is for IP addresses, the label is that
	       associated with the source IP address.  If the prefix map is
	       for protocol/port pairs,	the label is that associated with the
	       protocol	and source port.  See also the description of the
	       --pmap-file switch below	and the	pmapfilter(3) manual page.

	   dst-map-name
	       as src-map-name for the destination IP address or the protocol
	       and destination port.

	   sval
	       as src-map-name when no map-name	is associated with the prefix
	       map file

	   dval
	       as dst-map-name when no map-name	is associated with the prefix
	       map file

	   Finally, the	list of	built-in fields	may be augmented by the	run-
	   time	loading	of PySiLK code or plug-ins written in C	(also called
	   shared object files or dynamic libraries), as described by the
	   --python-file and --plugin switches.

       --values=VALUES
	   Specify the aggregate values	to compute for each bin	as a comma
	   separated list of names.  Names are case insensitive.  When the
	   --threshold switch specifies an aggregate value field that does not
	   appear in VALUES, that field is appended to VALUES.  When neither
	   the --values	switch nor any --threshold switch is specified,	rwuniq
	   counts the number of	flow records for each bin.  The	aggregate
	   fields are printed in the order they	occur in VALUES.  The names of
	   the built-in	value fields follow.  This list	can be augmented
	   through the use of PySiLK and plug-ins.

	   Records
	       Count the number	of flow	records	that mapped to each bin.

	   Packets
	       Sum the number of packets across	all records that mapped	to
	       each bin.

	   Bytes
	       Sum the number of bytes across all records that mapped to each
	       bin.

	   sTime-Earliest
	       Keep track of the earliest start	time (minimum time) seen
	       across all records that mapped to each bin, in seconds
	       resolution.  The	--bin-time switch does not normally affect
	       this value; however, this value uses milliseconds resolution
	       when --bin-time includes	fractional seconds.

	   eTime-Latest
	       Keep track of the latest	end time (maximum time)	seen across
	       all records that	mapped to each bin, in seconds resolution.
	       The --bin-time switch does not normally affect this value;
	       however,	this value uses	milliseconds resolution	when
	       --bin-time includes fractional seconds.

	   sIP-Distinct
	       Count the number	of distinct source IP addresses	that were seen
	       for each	bin, an	alias for Distinct:sIP.

	   dIP-Distinct
	       Count the number	of distinct destination	IP addresses that were
	       seen for	each bin, an alias for Distinct:dIP.

	   Distinct:KEY_FIELD
	       Count the number	of distinct values for KEY_FIELD, where
	       KEY_FIELD is any	field that can be used as an argument to
	       --fields	except "icmpTypeCode".	For example, "Distinct:sPort"
	       counts the number of distinct source ports for each bin.	 When
	       this aggregate value field is used, the specified KEY_FIELD
	       cannot be present in the	argument to --fields.

	   Flows
	       Count the number	of flow	records	that mapped to each bin; an
	       alias for Records.
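
	   The Distinct:KEY_FIELD value described above can be pictured as
	   keeping one set of observed values per bin.  The Python sketch
	   below is only an illustration; count_distinct is a hypothetical
	   name and rwuniq's internal data structures are not described here.

```python
from collections import defaultdict


def count_distinct(flows, key_fields, distinct_field):
    """Count the distinct values of one field within each bin."""
    if distinct_field in key_fields:
        # The manual forbids using the same field as key and Distinct target.
        raise ValueError(f"{distinct_field} is already part of the key")
    seen = defaultdict(set)
    for flow in flows:
        key = tuple(flow[name] for name in key_fields)
        seen[key].add(flow[distinct_field])
    return {key: len(values) for key, values in seen.items()}
```

	   Because every bin keeps its own set, memory use grows with the
	   number of distinct values seen and not only with the number of
	   bins.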

       --plugin=PLUGIN
	   Augment the list of key fields and/or aggregate value fields	by
	   using run-time loading of the plug-in (shared object) whose path is
	   PLUGIN.  The	switch may be repeated to load multiple	plug-ins.  The
	   creation of plug-ins	is described in	the silk-plugin(3) manual
	   page.  When PLUGIN does not contain a slash ("/"), rwuniq attempts
	   to find a file named	PLUGIN in the directories listed in the
	   "FILES" section.  If	rwuniq finds the file, it uses that path.  If
	   PLUGIN contains a slash or if rwuniq	does not find the file,	rwuniq
	   relies on your operating system's dlopen(3) call to find the	file.
	   When	the SILK_PLUGIN_DEBUG environment variable is non-empty,
	   rwuniq prints status	messages to the	standard error as it attempts
	   to find and open each of its	plug-ins.

       --threshold=VALUE_FIELD=MIN-MAX
       --threshold=VALUE_FIELD=MIN
	   Limit the output of rwuniq to the bins where	the value of the
	   aggregate value field VALUE_FIELD is	not less than MIN and not more
	   than	MAX.  If MAX is	not given, limit the output to the bins	where
	   the value of	VALUE_FIELD is at least	MIN.  The VALUE_FIELD argument
	   is case insensitive and may be abbreviated to the shortest unique
	   prefix.  This switch	may be repeated	to set thresholds for multiple
	   fields, and rwuniq only prints bins that meet all thresholds.  A
	   MIN of 0 is treated as 1.  If VALUE_FIELD is	not present in the
	   argument to the --values switch, it is appended to those aggregate
	   values.  VALUE_FIELD	may be Records (or Flows), Packets, Bytes,
	   sIP-Distinct, dIP-Distinct, or Distinct:KEY_FIELD.  Setting
	   thresholds for aggregate value fields defined by plug-ins is	not
	   supported.  Since SiLK 3.17.0.
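
	   The argument syntax and the "all thresholds must hold" rule can be
	   sketched in Python.  The names parse_threshold and meets_all are
	   hypothetical, the sketch does not implement prefix abbreviation of
	   VALUE_FIELD, and it is not taken from rwuniq's source.

```python
def parse_threshold(argument):
    """Split 'FIELD=MIN' or 'FIELD=MIN-MAX' into (field, minimum, maximum)."""
    field, equals, limits = argument.partition("=")
    if not (equals and field and limits):
        raise ValueError(f"expected FIELD=MIN or FIELD=MIN-MAX, got {argument!r}")
    low, dash, high = limits.partition("-")
    minimum = int(low)
    maximum = int(high) if dash else None  # no MAX means no upper bound
    if minimum == 0:
        minimum = 1  # the manual states that a MIN of 0 is treated as 1
    if maximum is not None and maximum < minimum:
        raise ValueError(f"maximum is below minimum in {argument!r}")
    return field.lower(), minimum, maximum


def meets_all(values, thresholds):
    """Return True when a bin's values satisfy every (field, min, max)."""
    for field, minimum, maximum in thresholds:
        value = values[field]
        if value < minimum or (maximum is not None and value > maximum):
            return False
    return True
```

	   For example, parse_threshold("bytes=1000-2000") gives the triple
	   ("bytes", 1000, 2000), and a bin whose bytes value is 2500 fails
	   that threshold regardless of any other thresholds it meets.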

       Miscellaneous options:

       --presorted-input
	   Cause rwuniq	to assume that it is reading sorted input; i.e., that
	   rwuniq's input file(s) were generated by rwsort(1) using the	exact
	   same	value for the --fields switch.	When no	distinct counts	are
	   being computed, rwuniq can process its input	without	needing	to
	   write temporary files.  When	multiple input files are specified,
	   rwuniq merge-sorts the flow records from the	input files.  See the
	   "NOTES" section for issues that may occur when using
	   --presorted-input.

       --sort-output
	   Cause rwuniq	to present the output in sorted	numerical order.  The
	   key rwuniq uses for sorting is the same key it uses to index	each
	   bin.

       --bin-time=SECONDS
       --bin-time
	   Adjust the times in the key fields "sTime" and "eTime" to appear on
	   SECONDS-second boundaries (the floor	of the time is used).  As of
	   SiLK	3.17.0,	SECONDS	may be a fractional value of 0.001 or greater,
	   and rwuniq uses millisecond timestamps when SECONDS includes	a
	   fractional value that is non-zero.  When this switch	is not
	   specified, times appear on 1-second boundaries.  When the switch is
	   used	but no argument	is given, rwuniq uses 60-second	time bins.
	   (When the start-time	is the only key	field and time binning is
	   desired, consider using rwcount(1) instead.)
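
	   Flooring a time to a bin boundary can be sketched in Python.  To
	   avoid floating-point drift the sketch works in integer
	   milliseconds, which is a choice made for the example;
	   floor_to_bin is a hypothetical name.

```python
def floor_to_bin(epoch_ms, bin_seconds):
    """Floor a timestamp in epoch milliseconds to the start of its time bin."""
    bin_ms = round(bin_seconds * 1000)
    if bin_ms < 1:
        raise ValueError("bin size must be at least 0.001 seconds")
    # Subtracting the remainder moves the time back to the bin boundary.
    return epoch_ms - (epoch_ms % bin_ms)
```

	   With a 60-second bin, a time 125.5 seconds after the epoch is
	   placed in the bin that starts at 120 seconds; with a 0.25-second
	   bin, 125.567 seconds is placed in the bin that starts at 125.5
	   seconds.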

       --timestamp-format=FORMAT
	   Specify the format and/or timezone to use when printing timestamps.
	   When	this switch is not specified, the SILK_TIMESTAMP_FORMAT
	   environment variable	is checked for a default format	and/or
	   timezone.  If it is empty or	contains invalid values, timestamps
	   are printed in the default format, and the timezone is UTC unless
	   SiLK	was compiled with local	timezone support.  FORMAT is a comma-
	   separated list of a format and/or a timezone.  The format is	one
	   of:

	   default
	       Print the timestamps as "YYYY/MM/DDThh:mm:ss".

	   iso Print the timestamps as "YYYY-MM-DD hh:mm:ss".

	   m/d/y
	       Print the timestamps as "MM/DD/YYYY hh:mm:ss".

	   epoch
	       Print the timestamps as the number of seconds since 00:00:00
	       UTC on 1970-01-01.

	   When	a timezone is specified, it is used regardless of the default
	   timezone support compiled into SiLK.	 The timezone is one of:

	   utc Use Coordinated Universal Time to print timestamps.

	   local
	       Use the TZ environment variable or the local timezone.

       --epoch-time
	   Print timestamps as epoch time (number of seconds since midnight
	   GMT on 1970-01-01).	This switch is equivalent to
	   --timestamp-format=epoch, it	is deprecated as of SiLK 3.0.0,	and it
	   will	be removed in the SiLK 4.0 release.

       --ip-format=FORMAT
	   Specify how IP addresses are	printed, where FORMAT is a comma-
	   separated list of the arguments described below.  When this switch
	   is not specified, the SILK_IP_FORMAT	environment variable is
	   checked for a value and that	format is used if it is	valid.	The
	   default FORMAT is "canonical".  Since SiLK 3.7.0.

	   canonical
	       Print IP	addresses in the canonical format.  If the key only
	       contains	IPv4 addresses,	use dot-separated decimal (192.0.2.1).
	       Otherwise, use colon-separated hexadecimal ("2001:db8::1") or a
	       mixed IPv4-IPv6 representation for IPv4-mapped IPv6 addresses
	       (the ::ffff:0:0/96 netblock, e.g., "::ffff:192.0.2.1") and
	       IPv4-compatible IPv6 addresses (the ::/96 netblock other	than
	       ::/127, e.g., "::192.0.2.1").

	   no-mixed
	       Print IP addresses in the canonical format (192.0.2.1 or
	       "2001:db8::1") but do not use the mixed IPv4-IPv6
	       representations.	 For example, use "::ffff:c000:201" instead of
	       "::ffff:192.0.2.1".  Since SiLK 3.17.0.

	   decimal
	       Print IP	addresses as integers in decimal format.  For example,
	       print 192.0.2.1 and "2001:db8::1" as 3221225985 and
	       42540766411282592856903984951653826561, respectively.

	   hexadecimal
	       Print IP	addresses as integers in hexadecimal format.  For
	       example, print 192.0.2.1 and "2001:db8::1" as "c0000201" and
	       "20010db8000000000000000000000001", respectively.

	   zero-padded
	       Make all	IP address strings contain the same number of
	       characters by padding numbers with leading zeros.  For example,
	       print 192.0.2.1 and "2001:db8::1" as 192.000.002.001 and
	       "2001:0db8:0000:0000:0000:0000:0000:0001", respectively.	 For
	       IPv6 addresses, this setting implies "no-mixed",	so that
	       "::ffff:192.0.2.1" is printed as
	       "0000:0000:0000:0000:0000:ffff:c000:0201".  As of SiLK 3.17.0,
	       may be combined with any	of the above, including	"decimal" and
	       "hexadecimal".

	   The following arguments modify certain IP addresses prior to
	   printing.  These arguments may be combined with the above formats.

	   map-v4
	       Change IPv4 addresses to	IPv4-mapped IPv6 addresses (addresses
	       in the ::ffff:0:0/96 netblock) prior to formatting.  Since SiLK
	       3.17.0.

	   unmap-v6
	       When the	key contains IPv6 addresses, change any	IPv4-mapped
	       IPv6 addresses (addresses in the	::ffff:0:0/96 netblock)	to
	       IPv4 addresses prior to formatting.  Since SiLK 3.17.0.

	   The following argument is also available:

	   force-ipv6
	       Set FORMAT to "map-v4","no-mixed".
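
	   The numeric formats above can be reproduced with Python's standard
	   ipaddress module.  The sketch is an approximation for checking
	   values by hand: format_ip is a hypothetical name, it treats the
	   three styles as mutually exclusive, and it omits "canonical" and
	   "no-mixed".

```python
import ipaddress

V4_MAPPED_PREFIX = 0xFFFF << 32  # the ::ffff:0:0/96 netblock as an integer


def format_ip(text, style, map_v4=False):
    """Render an address as 'decimal', 'hexadecimal', or 'zero-padded'."""
    address = ipaddress.ip_address(text)
    number, version = int(address), address.version
    if map_v4 and version == 4:
        # Move the IPv4 address into the IPv4-mapped IPv6 netblock first.
        number, version = V4_MAPPED_PREFIX | number, 6
    if style == "decimal":
        return str(number)
    if style == "hexadecimal":
        return format(number, "x")
    if style == "zero-padded":
        if version == 4:
            return ".".join(f"{octet:03d}" for octet in number.to_bytes(4, "big"))
        digits = format(number, "032x")
        return ":".join(digits[i:i + 4] for i in range(0, 32, 4))
    raise ValueError(f"unsupported style {style!r}")
```

	   For instance, format_ip("192.0.2.1", "hexadecimal") returns
	   "c0000201", and format_ip("192.0.2.1", "zero-padded", map_v4=True)
	   returns "0000:0000:0000:0000:0000:ffff:c000:0201".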

       --integer-ips
	   Print IP addresses as integers.  This switch	is equivalent to
	   --ip-format=decimal,	it is deprecated as of SiLK 3.7.0, and it will
	   be removed in the SiLK 4.0 release.

       --zero-pad-ips
	   Print IP addresses as fully-expanded, zero-padded values in their
	   canonical form.  This switch	is equivalent to
	   --ip-format=zero-padded, it is deprecated as	of SiLK	3.7.0, and it
	   will	be removed in the SiLK 4.0 release.

       --integer-sensors
	   Print the integer ID	of the sensor rather than its name.

       --integer-tcp-flags
	   Print the TCP flag fields (flags, initialFlags, sessionFlags) as an
	   integer value.  Typically, the characters "F,S,R,P,A,U,E,C" are
	   used	to represent the TCP flags.
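
	   The relationship between the integer and the letters can be
	   sketched in Python using the bit positions of the TCP header flags
	   (FIN is the lowest bit, CWR the highest).  The function names are
	   hypothetical, and the sketch does not reproduce the column padding
	   of rwuniq's textual output.

```python
# TCP header flag bits, lowest bit first.
TCP_FLAG_BITS = [
    ("F", 0x01), ("S", 0x02), ("R", 0x04), ("P", 0x08),
    ("A", 0x10), ("U", 0x20), ("E", 0x40), ("C", 0x80),
]


def flags_to_letters(value):
    """Return the letters for every flag bit set in the integer."""
    return "".join(letter for letter, bit in TCP_FLAG_BITS if value & bit)


def letters_to_flags(text):
    """Return the integer whose set bits correspond to the given letters."""
    lookup = dict(TCP_FLAG_BITS)
    value = 0
    for letter in text.upper():
        value |= lookup[letter]
    return value
```

	   A flow whose packets carried only SYN and ACK has the integer
	   value 18, which flags_to_letters renders as "SA".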

       --no-titles
	   Turn	off column titles.  By default,	titles are printed.

       --no-columns
	   Disable fixed-width columnar	output.

       --column-separator=C
	   Use specified character between columns and after the final column.
	   When	this switch is not specified, the default of '|' is used.

       --no-final-delimiter
	   Do not print	the column separator after the final column.  Normally
	   a delimiter is printed.

       --delimited
       --delimited=C
	   Run as if --no-columns --no-final-delimiter --column-sep=C had been
	   specified.  That is,	disable	fixed-width columnar output; if
	   character C is provided, it is used as the delimiter	between
	   columns instead of the default '|'.

       --print-filenames
	   Print to the	standard error the names of input files	as they	are
	   opened.

       --copy-input=PATH
	   Copy	all binary SiLK	Flow records read as input to the specified
	   file	or named pipe.	PATH may be "stdout" or	"-" to write flows to
	   the standard	output as long as the --output-path switch is
	   specified to	redirect rwuniq's textual output to a different
	   location.

       --output-path=PATH
	   Write the textual output to PATH, where PATH	is a filename, a named
	   pipe, the keyword "stderr" to write the output to the standard
	   error, or the keyword "stdout" or "-" to write the output to	the
	   standard output (and	bypass the paging program).  If	PATH names an
	   existing file, rwuniq exits with an error unless the	SILK_CLOBBER
	   environment variable	is set,	in which case PATH is overwritten.  If
	   this	switch is not given, the output	is either sent to the pager or
	   written to the standard output.

       --pager=PAGER_PROG
	   When	output is to a terminal, invoke	the program PAGER_PROG to view
	   the output one screenful at a time.  This switch overrides the
	   SILK_PAGER environment variable, which in turn overrides the	PAGER
	   variable.  If the --output-path switch is given or if the value of
	   the pager is	determined to be the empty string, no paging is
	   performed and all output is written to the terminal.

       --ipv6-policy=POLICY
	   Determine how IPv4 and IPv6 flows are handled when SiLK has been
	   compiled with IPv6 support.	When the switch	is not provided, the
	   SILK_IPV6_POLICY environment	variable is checked for	a policy.  If
	   it is also unset or contains	an invalid policy, the POLICY is mix.
	   When	SiLK has not been compiled with	IPv6 support, IPv6 flows are
	   always ignored, regardless of the value passed to this switch or in
	   the SILK_IPV6_POLICY	variable.  The supported values	for POLICY
	   are:

	   ignore
	       Ignore any flow record marked as	IPv6, regardless of the	IP
	       addresses it contains.

	   asv4
	       Convert IPv6 flow records that contain addresses	in the
	       ::ffff:0:0/96 netblock (that is,	IPv4-mapped IPv6 addresses) to
	       IPv4 and	ignore all other IPv6 flow records.

	   mix Process the input as a mixture of IPv4 and IPv6 flow records.
	       When an IP address is used as part of the key or	value, this
	       policy is equivalent to force.

	   force
	       Convert IPv4 flow records to IPv6, mapping the IPv4 addresses
	       into the	::ffff:0:0/96 netblock.

	   only
	       Process only flow records that are marked as IPv6 and ignore
	       IPv4 flow records in the	input.
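
	   The five policies can be summarized in a Python sketch.  It models
	   a record as an IPv6 marker plus a list of address strings, and it
	   assumes that asv4 converts a record only when every address in it
	   is IPv4-mapped; both are simplifications made for the example, and
	   apply_policy is a hypothetical name.

```python
import ipaddress

V4_MAPPED = ipaddress.ip_network("::ffff:0:0/96")


def apply_policy(is_ipv6, addresses, policy):
    """Return the record's addresses under POLICY, or None if it is ignored."""
    parsed = [ipaddress.ip_address(text) for text in addresses]
    if policy == "mix":
        return parsed
    if policy == "ignore":
        return None if is_ipv6 else parsed
    if policy == "only":
        return parsed if is_ipv6 else None
    if policy == "asv4":
        if not is_ipv6:
            return parsed
        if all(address in V4_MAPPED for address in parsed):
            # Keep the low 32 bits, which hold the embedded IPv4 address.
            return [ipaddress.IPv4Address(int(a) & 0xFFFFFFFF) for a in parsed]
        return None
    if policy == "force":
        if is_ipv6:
            return parsed
        return [ipaddress.IPv6Address((0xFFFF << 32) | int(a)) for a in parsed]
    raise ValueError(f"unknown policy {policy!r}")
```

	   Under asv4, a record holding "::ffff:192.0.2.1" is kept as
	   192.0.2.1, while a record holding "2001:db8::1" is dropped.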

       --temp-directory=DIR_PATH
	   Specify the name of the directory in	which to store data files
	   temporarily when the	memory is not large enough to store all	the
	   bins	and their aggregate values.  This switch overrides the
	   directory specified in the SILK_TMPDIR environment variable,	which
	   overrides the directory specified in	the TMPDIR variable, which
	   overrides the default, /tmp.
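
	   The precedence can be written out as a small Python sketch.
	   choose_temp_directory is a hypothetical name, and treating an
	   empty environment variable the same as an unset one is an
	   assumption of the example.

```python
import os


def choose_temp_directory(switch_value=None, environ=None):
    """Pick the temporary directory: switch, SILK_TMPDIR, TMPDIR, then /tmp."""
    env = os.environ if environ is None else environ
    for candidate in (switch_value, env.get("SILK_TMPDIR"), env.get("TMPDIR")):
        if candidate:
            return candidate
    return "/tmp"
```

	   Passing an explicit environ mapping makes the precedence easy to
	   check without changing the real environment.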

       --site-config-file=FILENAME
	   Read	the SiLK site configuration from the named file	FILENAME.
	   When	this switch is not provided, rwuniq searches for the site
	   configuration file in the locations specified in the	"FILES"
	   section.

       --legacy-timestamps
       --legacy-timestamps=NUM
	   When	NUM is not specified or	is 1, this switch is equivalent	to
	   --timestamp-format=m/d/y.  Otherwise, the switch has	no effect.
	   This	switch is deprecated as	of SiLK	3.0.0, and it will be removed
	   in the SiLK 4.0 release.

       --xargs
       --xargs=FILENAME
	   Read	the names of the input files from FILENAME or from the
	   standard input if FILENAME is not provided.	The input is expected
	   to have one filename	per line.  rwuniq opens	each named file	in
	   turn	and reads records from it as if	the filenames had been listed
	   on the command line.

       --help
	   Print the available options and exit.  Specifying switches that add
	   new fields, values, or additional switches before --help allows the
	   output to include descriptions of those fields or switches.

       --help-fields
	   Print the description and alias(es) of each field and value and
	   exit.  Specifying switches that add new fields before --help-fields
	   allows the output to	include	descriptions of	those fields.

       --version
	   Print the version number and	information about how SiLK was
	   configured, then exit the application.

       --pmap-file=PATH
       --pmap-file=MAPNAME:PATH
	   Load	the prefix map file located at PATH and	create fields named
	   src-map-name	and dst-map-name where map-name	is either the MAPNAME
	   part	of the argument	or the map-name	specified when the file	was
	   created (see	rwpmapbuild(1)).  If no	map-name is available, rwuniq
	   names the fields "sval" and "dval".	Specify	PATH as	"-" or "stdin"
	   to read from	the standard input.  The switch	may be repeated	to
	   load	multiple prefix	map files, but each prefix map must use	a
	   unique map-name.  The --pmap-file switch(es)	must precede the
	   --fields switch.  See also pmapfilter(3).

       --pmap-column-width=NUM
	   When	printing a label associated with a prefix map, this switch
	   gives the maximum number of characters to use when displaying the
	   textual value of the	field.

       --python-file=PATH
	   When	the SiLK Python	plug-in	is used, rwuniq	reads the Python code
	   from	the file PATH to define	additional fields that can be used as
	   part	of the key or as an aggregate value.  This file	should call
	   register_field() for	each field it wishes to	define.	 For details
	   and examples, see the silkpython(3) and pysilk(3) manual pages.

   Deprecated volume switches
       These options add the named aggregate field(s) to --values if the field
       is not present.	When an	argument is specified, the switch is
       equivalent to a --threshold switch.  Use	of these switches is
       deprecated.

       --all-counts
	   Append the following	fields to the argument of the --values switch
	   unless the field is already present:	Bytes, Packets,	Records,
	   sTime-Earliest, and eTime-Latest.  Deprecated since SiLK 2.0.0.

       --bytes
	   Append Bytes	to the argument	of the --values	switch unless it is
	   already present.  Deprecated	since SiLK 2.0.0.

       --bytes=MIN
	   Add --threshold=bytes=MIN to	the options.  Deprecated since SiLK
	   3.17.0.

       --bytes=MIN-MAX
	   Add --threshold=bytes=MIN-MAX to the	options.  Deprecated since
	   SiLK	3.17.0.

       --packets
	   Append Packets to the argument of the --values switch unless	it is
	   already present.  Deprecated	since SiLK 2.0.0.

       --packets=MIN
	   Add --threshold=packets=MIN to the options.	Deprecated since SiLK
	   3.17.0.

       --packets=MIN-MAX
	   Add --threshold=packets=MIN-MAX to the options.  Deprecated since
	   SiLK	3.17.0.

       --flows
	   Append Records to the argument of the --values switch unless	it is
	   already present.  Deprecated	since SiLK 2.0.0.

       --flows=MIN
	   Add --threshold=records=MIN to the options.	Deprecated since SiLK
	   3.17.0.

       --flows=MIN-MAX
	   Add --threshold=records=MIN-MAX to the options.  Deprecated since
	   SiLK	3.17.0.

       --sip-distinct
	   Append Distinct:sIP to the argument of the --values switch unless
	   it is already present.  Deprecated since SiLK 2.0.0.

       --sip-distinct=MIN
	   Add --threshold=distinct:sip=MIN to the options.  Deprecated	since
	   SiLK	3.17.0.

       --sip-distinct=MIN-MAX
	   Add --threshold=distinct:sip=MIN-MAX	to the options.	 Deprecated
	   since SiLK 3.17.0.

       --dip-distinct
	   Append Distinct:dIP to the argument of the --values switch unless
	   it is already present.  Deprecated since SiLK 2.0.0.

       --dip-distinct=MIN
	   Add --threshold=distinct:dip=MIN to the options.  Deprecated	since
	   SiLK	3.17.0.

       --dip-distinct=MIN-MAX
	   Add --threshold=distinct:dip=MIN-MAX	to the options.	 Deprecated
	   since SiLK 3.17.0.

       --stime
	   Append sTime-Earliest to the	argument of the	--values switch	unless
	   it is already present.  Deprecated since SiLK 2.0.0.

       --etime
	   Append eTime-Latest to the argument of the --values switch unless
	   it is already present.  Deprecated since SiLK 2.0.0.
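
       When updating old scripts, the mapping described in this section can
       be applied mechanically.  The sketch below is a hypothetical helper
       (modernize is not part of SiLK) that rewrites the deprecated switches
       as --values and --threshold arguments.  It assumes that a switch given
       with an argument also adds its field to --values, as the introduction
       to this section suggests.

```python
# Deprecated switch -> (fields appended to --values, name used in --threshold).
LEGACY = {
    "--all-counts": (["Bytes", "Packets", "Records",
                      "sTime-Earliest", "eTime-Latest"], None),
    "--bytes": (["Bytes"], "bytes"),
    "--packets": (["Packets"], "packets"),
    "--flows": (["Records"], "records"),
    "--sip-distinct": (["Distinct:sIP"], "distinct:sip"),
    "--dip-distinct": (["Distinct:dIP"], "distinct:dip"),
    "--stime": (["sTime-Earliest"], None),
    "--etime": (["eTime-Latest"], None),
}


def modernize(legacy_args):
    """Translate deprecated volume switches to --values/--threshold form."""
    values, thresholds = [], []
    for arg in legacy_args:
        name, _, limit = arg.partition("=")
        fields, threshold_name = LEGACY[name]  # KeyError for unknown switches
        for field in fields:
            if field not in values:            # append only when not present
                values.append(field)
        if limit:
            if threshold_name is None:
                raise ValueError(name + " does not take an argument")
            thresholds.append("--threshold=%s=%s" % (threshold_name, limit))
    return ["--values=" + ",".join(values)] + thresholds
```

       For example, modernize(["--bytes=100-200", "--flows"]) yields the
       arguments --values=Bytes,Records and --threshold=bytes=100-200.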

EXAMPLES
       In these	examples, the dollar sign ("$")	represents the shell prompt
       and a backslash ("\") is	used to	continue a line	for better
       readability.  Many examples assume previous rwfilter(1) commands	have
       written data files named	data.rw	and data-v6.rw.

       The --fields switch is required to specify which	field(s) comprise the
       key.  By	default, rwuniq	counts the number of records for each key.
       This example uses the source port as the	key.

	$ rwuniq --fields=sport	data.rw	| head
	sPort|	 Records|
	   53|	   62216|
	   22|	   27994|
	   67|	    7807|
	29897|	      78|
	28816|	      24|
	   80|	   27044|
	28925|	      22|
	    0|	    7801|
	29246|	      63|

       Notice how the keys are printed in an arbitrary order.  Use the
       --sort-output switch to arrange the keys	from lowest to highest.

	$ rwuniq --fields=sport	--sort-output data.rw |	head
	sPort|	 Records|
	    0|	    7801|
	   22|	   27994|
	   25|	   15568|
	   53|	   62216|
	   67|	    7807|
	   80|	   27044|
	  123|	    7741|
	  443|	    7917|
	 8080|	    3946|

       To sort the output by a volume field (such as the number	of records),
       use rwstats(1).

	$ rwstats --fields=sport --count=10 data.rw
	INPUT: 250928 Records for 4739 Bins and	250928 Total Records
	OUTPUT:	Top 10 Bins by Records
	sPort|	 Records|  %Records|   cumul_%|
	   53|	   62216| 24.794363| 24.794363|
	   22|	   27994| 11.156188| 35.950552|
	   80|	   27044| 10.777594| 46.728145|
	   25|	   15568|  6.204170| 52.932315|
	  443|	    7917|  3.155088| 56.087404|
	   67|	    7807|  3.111251| 59.198655|
	    0|	    7801|  3.108860| 62.307515|
	  123|	    7741|  3.084949| 65.392463|
	 8080|	    3946|  1.572563| 66.965026|
	29921|	     117|  0.046627| 67.011653|

       Alternatively, process the textual output of rwuniq with	the UNIX
       sort(1) utility.

	$ rwuniq --fields=sport	data.rw	 \
	  | sort -r -t '|' -k 2	| head
	sPort|	 Records|
	   53|	   62216|
	   22|	   27994|
	   80|	   27044|
	   25|	   15568|
	  443|	    7917|
	   67|	    7807|
	    0|	    7801|
	  123|	    7741|
	 8080|	    3946|
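
       The sort(1) command above compares the second column as text, which
       works because rwuniq pads the numbers to a common width.  When the
       output is post-processed in a script, the column can be compared
       numerically instead.  The following is a minimal sketch; it assumes
       the default columnar output with a title line, and the function name
       sort_by_column is hypothetical.

```python
def sort_by_column(text, column=1, reverse=True):
    """Sort rwuniq's textual output numerically on one column.

    The first non-empty line is treated as the title line and kept on top.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]

    def numeric_key(line):
        # Columns are separated by "|"; padding is removed before parsing.
        return int(line.split("|")[column].strip())

    return [header] + sorted(rows, key=numeric_key, reverse=reverse)


# A few rows copied from the unsorted example earlier in this section.
sample = """sPort|   Records|
   53|     62216|
29897|        78|
   80|     27044|
    0|      7801|
"""
sorted_lines = sort_by_column(sample)
```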

       Use the --values switch to change the volume that rwuniq computes for
       each key.  This example prints the byte-, packet-, and record-counts
       for each	protocol, sorting the results by protocol.

	$ rwuniq --fields=proto	--values=bytes,packets,records --sort data.rw
	pro|		   Bytes|	 Packets|   Records|
	  1|		 5344836|	   73473|      7801|
	  6|	     59945492930|	72127917|    165363|
	 17|		17553593|	   77764|     77764|

       The --threshold switch limits the output	to rows	where a	value field
       meets a minimum value or	falls within a specific	range.	For example,
       print the number	of records and packets seen for	each source port for
       bins having at least 1000 records.

	$ rwuniq --fields=sport	--values=records,packets \
	       --threshold=records=1000	data.rw
	sPort|	 Records|	 Packets|
	   53|	   62216|	   62216|
	   22|	   27994|	23434615|
	   67|	    7807|	    7807|
	   80|	   27044|	 8271125|
	    0|	    7801|	   73473|
	  123|	    7741|	    7741|
	   25|	   15568|	  427777|
	  443|	    7917|	 2421124|
	 8080|	    3946|	 1202528|

       Multiple	thresholds may be specified.

	$ rwuniq --fields=sport	--values=records,packets		 \
	       --threshold=records=1000-5000 --threshold=packets=1000000 \
	       data.rw
	sPort|	 Records|	 Packets|
	 8080|	    3946|	 1202528|

       The --bin-time switch adjusts the times used by the "sTime" and "eTime"
       key fields.  An argument	of 86400 moves the starting and	ending time to
       day boundaries.

	$ rwuniq --bin-time=86400 --fields=stime,etime data.rw
		      sTime|		  eTime|   Records|
	2009/02/12T00:00:00|2009/02/12T00:00:00|     82969|
	2009/02/12T00:00:00|2009/02/13T00:00:00|       360|
	2009/02/13T00:00:00|2009/02/13T00:00:00|     83594|
	2009/02/13T00:00:00|2009/02/14T00:00:00|       332|
	2009/02/14T00:00:00|2009/02/14T00:00:00|     83673|

       The --bin-time switch does not adjust the "duration" value unless both
       "sTime" and "eTime" are given.

	$ rwuniq --bin-time=86400 --fields=stime,dur --sort data.rw | head -6
		      sTime|durat|   Records|
	2009/02/12T00:00:00|	0|     29523|
	2009/02/12T00:00:00|	1|	4312|
	2009/02/12T00:00:00|	2|	4376|
	2009/02/12T00:00:00|	3|	3986|
	2009/02/12T00:00:00|	4|	 923|

	$ rwuniq --bin-time=86400 --fields=stime,dur,etime data.rw
		      sTime|durat|		eTime|	 Records|
	2009/02/12T00:00:00|	0|2009/02/12T00:00:00|	   82969|
	2009/02/12T00:00:00|86400|2009/02/13T00:00:00|	     360|
	2009/02/13T00:00:00|	0|2009/02/13T00:00:00|	   83594|
	2009/02/13T00:00:00|86400|2009/02/14T00:00:00|	     332|
	2009/02/14T00:00:00|	0|2009/02/14T00:00:00|	   83673|
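
       The effect of --bin-time on the time key fields can be modeled with a
       little arithmetic.  The sketch below assumes that each time is moved
       back to the start of the bin that contains it, which is consistent
       with the output above, and that the duration is recomputed as the
       difference of the adjusted times when all three fields are in the key.
       The function names and the sample times are made up for illustration.

```python
from datetime import datetime, timezone


def floor_to_bin(epoch_seconds, bin_seconds):
    """Move a time back to the start of the bin that contains it."""
    return epoch_seconds - (epoch_seconds % bin_seconds)


def binned_time_key(stime, etime, bin_seconds):
    """Return (sTime, duration, eTime) as they would appear in the key."""
    start = floor_to_bin(stime, bin_seconds)
    end = floor_to_bin(etime, bin_seconds)
    # With all three time fields present, duration becomes end - start.
    return start, end - start, end


def utc(year, month, day, hour=0, minute=0, second=0):
    """Whole epoch seconds for a UTC wall-clock time."""
    moment = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
    return int(moment.timestamp())


# A flow that starts late on 12 Feb and ends early on 13 Feb.
key = binned_time_key(utc(2009, 2, 12, 23, 59, 30),
                      utc(2009, 2, 13, 0, 0, 10), 86400)
```

       Under these assumptions a 40-second flow that crosses midnight falls
       into a bin whose duration is 86400, like the second and fourth rows
       of the output above.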

       As of SiLK 3.17.0, the --bin-time switch	accepts	a floating point
       value.  When the	fractional part	is non-zero, rwuniq uses millisecond
       precision for the times and the duration.

	$ rwuniq --bin-time=0.001 --fields=duration data.rw | head -6
	 duration|   Records|
	    0.000|     85565|
	 1791.045|	   4|
	    2.120|	  19|
	   22.263|	   5|
	   19.902|	   3|

       The --bin-time switch does not adjust the "sTime-Earliest" and
       "eTime-Latest" aggregate value fields, but it does determine whether
       those fields maintain millisecond precision.

	$ rwuniq --bin-time=86400 --fields=stime --value=etime data.rw
		      sTime|	   eTime-Latest|
	2009/02/12T00:00:00|2009/02/12T00:29:59|
	2009/02/13T00:00:00|2009/02/13T00:29:58|
	2009/02/14T00:00:00|2009/02/14T00:29:59|

	$ rwuniq --bin-time=0.001 --fields=proto --value=stime,etime data.rw
	pro|	     sTime-Earliest|	       eTime-Latest|
	 17|2009/02/12T00:00:02.745|1970/01/15T06:57:35.997|
	  6|2009/02/12T00:00:03.004|1970/01/15T06:57:35.998|
	  1|2009/02/12T00:00:20.601|1970/01/15T06:57:35.992|

       With an input of	both IPv4 and IPv6 records, rwuniq maps	the IPv4
       records into the	::ffff:0:0/96 netblock.	 The data is normally mapped
       back to IPv4 on output.	Given this input:

	$ rwcut	--fields=sip,packets /tmp/v4v6.rw
					    sIP|   packets|
					    ::1|	45|
				     192.0.2.22|	87|
			   ::ffff:203.0.113.113|      2662|
			 2001:db8:54:32:ab:cd::|       345|

       The rwuniq tool produces:

	$ rwuniq --fields=sip --values=packets /tmp/v4v6.rw
					    sIP|	Packets|
					    ::1|	     45|
				     192.0.2.22|	     87|
				  203.0.113.113|	   2662|
			 2001:db8:54:32:ab:cd::|	    345|

       Set the --ip-format to map-v4 to leave the values as IPv4-mapped IPv6.
       (Using an --ipv6-policy of "force" has the same effect.)

	$ rwuniq --fields=sip --values=packets --ip-format=map-v4 /tmp/v4v6.rw
					    sIP|	Packets|
					    ::1|	     45|
			      ::ffff:192.0.2.22|	     87|
			   ::ffff:203.0.113.113|	   2662|
			 2001:db8:54:32:ab:cd::|	    345|
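
       The mapping into and out of the ::ffff:0:0/96 netblock can be
       reproduced with Python's standard ipaddress module.  This sketch only
       illustrates the address arithmetic; it is not how rwuniq is
       implemented, and the function names are hypothetical.

```python
import ipaddress

V4_MAPPED = ipaddress.ip_network("::ffff:0:0/96")


def to_mapped(addr):
    """Return addr as an IPv6Address, mapping IPv4 into ::ffff:0:0/96."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        base = int(V4_MAPPED.network_address)
        return ipaddress.IPv6Address(base | int(ip))
    return ip


def for_display(ip6):
    """Convert an IPv4-mapped address back to IPv4; leave others alone."""
    v4 = ip6.ipv4_mapped  # None unless ip6 lies in ::ffff:0:0/96
    return v4 if v4 is not None else ip6
```

       Note that 192.0.2.22 and ::ffff:192.0.2.22 become the same key under
       this mapping, which is why the two forms are indistinguishable once
       the records are binned.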

       Print the source addresses that sent at least 10,000,000 bytes, and
       for each address print the number of unique destination hosts it
       contacted:

	$ rwuniq --fields=sip --values=bytes,distinct:dip \
	       --threshold=bytes=10000000 data-v6.rw
			      sIP|		 Bytes|dIP-Distin|
	     2001:db8:a:fd::90:bd|	      14529210|		2|

       Print the number	of bytes that host shared with each destination	(first
       use rwfilter to limit the input to that host):

	$ rwfilter --saddr=2001:db8:a:fd::90:bd	--pass=- data-v6.rw	   \
	  | rwuniq --fields=dip	--values=bytes
			      dIP|		 Bytes|
	    2001:db8:c0:a8::fa:5d|	       7097847|
	     2001:db8:c0:a8::dd:6|	       7431363|

       Print the packet	and byte counts	for each IPv4 source-destination pair,
       where the prefix	length is 16 (use rwnetmask(1) on the input to
       rwuniq):

	$ rwnetmask --4sip-prefix=16 --4dip-prefix=16 data.rw	   \
	  | rwuniq --fields=sip,dip --values=packet,byte | head
		   sIP|		   dIP|	 Packets|	 Bytes|
	    10.139.0.0|	   192.168.0.0|	   33490|     22950353|
	     10.40.0.0|	   192.168.0.0|	     258|	 18544|
	    10.204.0.0|	   192.168.0.0|	  353233|    288736424|
	    10.106.0.0|	   192.168.0.0|	   13051|      3843693|
	     10.71.0.0|	   192.168.0.0|	    4355|      1391194|
	     10.98.0.0|	   192.168.0.0|	    7312|      7328359|
	    10.114.0.0|	   192.168.0.0|	    2538|      4137927|
	    10.168.0.0|	   192.168.0.0|	   92094|     86883062|
	    10.176.0.0|	   192.168.0.0|	  122101|    116555051|
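
       The masking step that rwnetmask performs before rwuniq bins the
       records can be illustrated with the standard ipaddress module.  This
       is a sketch of the idea only; mask_v4 and the sample addresses are
       hypothetical.

```python
import ipaddress


def mask_v4(addr, prefix_len):
    """Zero the host bits of an IPv4 address, keeping prefix_len bits."""
    network = ipaddress.ip_network("%s/%d" % (addr, prefix_len), strict=False)
    return str(network.network_address)


# Two made-up hosts in the same /16 collapse to a single key.
keys = {(mask_v4(sip, 16), mask_v4(dip, 16))
        for sip, dip in [("10.139.7.200", "192.168.233.171"),
                         ("10.139.90.1", "192.168.4.4")]}
```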

       Given a file of scan traffic, print the source of TCP traffic with no
       more than 3 packets and which also appears at least 4 times.  First use
       rwfilter	to limit the traffic to	TCP and	find the flow records where
       the packet count	in that	flow record is no more than 3.

	$ rwfilter --proto=6 --packets=1-3 --pass=- scandata.rw		 \
	  | rwuniq --field=sip --values=flow,packets --threshold=flows=4 \
	  | head -5
		    sIP|   Records|	   Packets|
	  10.249.216.38|       256|	       256|
	   10.155.55.93|       256|	       256|
	  10.61.255.154|       256|	       256|
	   10.60.122.82|       256|	       256|

       The silkpython(3) manual	page provides examples that use	PySiLK to
       create arbitrary	fields to use as part of the key for rwuniq.

       When using rwuniq on input that contains	both incoming and outgoing
       flow records, consider using the	int-ext-fields(3) plug-in which
       defines four additional fields representing the external	IP address,
       the external port, the internal IP address, and the internal port.  The
       plug-in requires	the user to specify which class/type pairs are
       incoming	and which are outgoing.	 See its manual	page for additional
       information.  As	an example, here we run	rwuniq on a file containing
       incoming	and outgoing web traffic.

	$ rwuniq --fields=sip,sport,dip,dport --values=bytes \
	       --sort-output data.rw | head -7
		    sIP|sPort|		  dIP|dPort|		   Bytes|
	    10.4.52.235|29631|192.168.233.171|	 80|		   18260|
	   10.5.231.251|   80|192.168.226.129|28770|		  536169|
	    10.9.77.117|29906| 192.168.184.65|	 80|		   55386|
	    10.11.88.88|   80|192.168.251.222|28902|		  433198|
	  10.14.110.214|29989| 192.168.249.96|	 80|		   25903|
	   10.15.224.27|  443| 192.168.231.49|29779|		  163759|

       Here the	int-ext-fields plug-in is used:

	$ export INCOMING_FLOWTYPES=all/in,all/inweb
	$ export OUTGOING_FLOWTYPES=all/out,all/outweb
	$ rwuniq --plugin=int-ext-fields.so \
	       --fields=ext-ip,ext-port,int-ip,int-port	--value=bytes \
	       --sort-output data.rw | head -7
		 ext-ip|ext-p|	       int-ip|int-p|		   Bytes|
	    10.4.52.235|29631|192.168.233.171|	 80|		  726111|
	   10.5.231.251|   80|192.168.226.129|28770|		  561654|
	    10.9.77.117|29906| 192.168.184.65|	 80|		 1811738|
	    10.11.88.88|   80|192.168.251.222|28902|		  444277|
	  10.14.110.214|29989| 192.168.249.96|	 80|		  393068|
	   10.15.224.27|  443| 192.168.231.49|29779|		  167696|
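
       The byte counts differ between the two runs because the plug-in puts
       both directions of a conversation into one bin.  The sketch below
       models that idea; it is not the plug-in's code.  The flowtype sets
       come from the environment settings above, the second record's byte
       count is an assumption inferred from the difference between the two
       outputs, and rejecting other flowtypes is a choice made for the
       sketch.

```python
from collections import defaultdict

# Flowtypes from INCOMING_FLOWTYPES and OUTGOING_FLOWTYPES above.
INCOMING = {"all/in", "all/inweb"}
OUTGOING = {"all/out", "all/outweb"}


def int_ext_key(flow):
    """Return (ext-ip, ext-port, int-ip, int-port) for one flow record."""
    if flow["flowtype"] in INCOMING:
        # Incoming: the source is external, the destination internal.
        return (flow["sip"], flow["sport"], flow["dip"], flow["dport"])
    if flow["flowtype"] in OUTGOING:
        # Outgoing: the destination is external, the source internal.
        return (flow["dip"], flow["dport"], flow["sip"], flow["sport"])
    raise ValueError("flowtype is neither incoming nor outgoing")


def bytes_by_int_ext(flows):
    """Sum bytes per (ext-ip, ext-port, int-ip, int-port) key."""
    totals = defaultdict(int)
    for flow in flows:
        totals[int_ext_key(flow)] += flow["bytes"]
    return dict(totals)


flows = [
    {"flowtype": "all/inweb", "sip": "10.4.52.235", "sport": 29631,
     "dip": "192.168.233.171", "dport": 80, "bytes": 18260},
    # Reverse direction; this byte count is an assumption (see above).
    {"flowtype": "all/outweb", "sip": "192.168.233.171", "sport": 80,
     "dip": "10.4.52.235", "dport": 29631, "bytes": 707851},
]
totals = bytes_by_int_ext(flows)
```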

ENVIRONMENT
       SILK_IPV6_POLICY
	   This	environment variable is	used as	the value for --ipv6-policy
	   when	that switch is not provided.

       SILK_IP_FORMAT
	   This	environment variable is	used as	the value for --ip-format when
	   that	switch is not provided.	 Since SiLK 3.11.0.

       SILK_TIMESTAMP_FORMAT
	   This	environment variable is	used as	the value for
	   --timestamp-format when that	switch is not provided.	 Since SiLK
	   3.11.0.

       SILK_PAGER
	   When	set to a non-empty string, rwuniq automatically	invokes	this
	   program to display its output a screen at a time.  If set to	an
	   empty string, rwuniq	does not automatically page its	output.

       PAGER
	   When	set and	SILK_PAGER is not set, rwuniq automatically invokes
	   this	program	to display its output a	screen at a time.
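
	   The interaction of SILK_PAGER and PAGER can be summarized in a
	   few lines.  This sketch covers only the two environment variables
	   described here; choose_pager is a hypothetical name, and treating
	   an empty PAGER as unset is an assumption.

```python
def choose_pager(environ):
    """Return the pager program to invoke, or None for no paging."""
    if "SILK_PAGER" in environ:
        # A non-empty value names the pager; an empty value disables paging.
        return environ["SILK_PAGER"] or None
    # PAGER is consulted only when SILK_PAGER is not set at all.
    return environ.get("PAGER") or None
```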

       SILK_TMPDIR
	   When	set and	--temp-directory is not	specified, rwuniq writes the
	   temporary files it creates to this directory.  SILK_TMPDIR
	   overrides the value of TMPDIR.

       TMPDIR
	   When	set and	SILK_TMPDIR is not set,	rwuniq writes the temporary
	   files it creates to this directory.

       PYTHONPATH
	   This	environment variable is	used by	Python to locate modules.
	   When	--python-file is specified, rwuniq must	load the Python	files
	   that	comprise the PySiLK package, such as silk/__init__.py.	If
	   this	silk/ directory	is located outside Python's normal search path
	   (for	example, in the	SiLK installation tree), it may	be necessary
	   to set or modify the	PYTHONPATH environment variable	to include the
	   parent directory of silk/ so	that Python can	find the PySiLK
	   module.

       SILK_PYTHON_TRACEBACK
	   When	set, Python plug-ins print traceback information on Python
	   errors to the standard error.

       SILK_COUNTRY_CODES
	   This	environment variable allows the	user to	specify	the country
	   code	mapping	file that rwuniq uses when computing the scc and dcc
	   fields.  The	value may be a complete	path or	a file relative	to the
	   SILK_PATH.  See the "FILES" section for standard locations of this
	   file.

       SILK_ADDRESS_TYPES
	   This	environment variable allows the	user to	specify	the address
	   type	mapping	file that rwuniq uses when computing the sType and
	   dType fields.  The value may	be a complete path or a	file relative
	   to the SILK_PATH.  See the "FILES" section for standard locations
	   of this file.

       SILK_CLOBBER
	   The SiLK tools normally refuse to overwrite existing	files.
	   Setting SILK_CLOBBER	to a non-empty value removes this restriction.

       SILK_CONFIG_FILE
	   This	environment variable is	used as	the value for the
	   --site-config-file when that	switch is not provided.

       SILK_DATA_ROOTDIR
	   This environment variable specifies the root directory of the data
	   repository.  As described in the "FILES" section, rwuniq may use
	   this	environment variable when searching for	the SiLK site
	   configuration file.

       SILK_PATH
	   This	environment variable gives the root of the install tree.  When
	   searching for configuration files and plug-ins, rwuniq may use this
	   environment variable.  See the "FILES" section for details.

       TZ  When	the argument to	the --timestamp-format switch includes "local"
	   or when a SiLK installation is built	to use the local timezone, the
	   value of the	TZ environment variable	determines the timezone	in
	   which rwuniq	displays timestamps.  (If both of those	are false, the
	   TZ environment variable is ignored.)	 If the	TZ environment
	   variable is not set,	the machine's default timezone is used.
	   Setting TZ to the empty string or 0 causes timestamps to be
	   displayed in	UTC.  For system information on	the TZ variable, see
	   tzset(3) or environ(7).  (To	determine if SiLK was built with
	   support for the local timezone, check the "Timezone support"	value
	   in the output of rwuniq --version.)

       SILK_PLUGIN_DEBUG
	   When	set to 1, rwuniq prints	status messages	to the standard	error
	   as it attempts to find and open each	of its plug-ins.  In addition,
	   when	an attempt to register a field fails, rwuniq prints a message
	   specifying the additional function(s) that must be defined to
	   register the	field in rwuniq.  Be aware that	the output can be
	   rather verbose.

       SILK_TEMPFILE_DEBUG
	   When	set to 1, rwuniq prints	debugging messages to the standard
	   error as it creates,	re-opens, and removes temporary	files.

       SILK_UNIQUE_DEBUG
	   When	set to 1, the binning engine used by rwuniq prints debugging
	   messages to the standard error.

FILES
       ${SILK_ADDRESS_TYPES}
       ${SILK_PATH}/share/silk/address_types.pmap
       ${SILK_PATH}/share/address_types.pmap
       /usr/local/share/silk/address_types.pmap
       /usr/local/share/address_types.pmap
	   Possible locations for the address types mapping file required by
	   the sType and dType fields.

       ${SILK_CONFIG_FILE}
       ${SILK_DATA_ROOTDIR}/silk.conf
       /data/silk.conf
       ${SILK_PATH}/share/silk/silk.conf
       ${SILK_PATH}/share/silk.conf
       /usr/local/share/silk/silk.conf
       /usr/local/share/silk.conf
	   Possible locations for the SiLK site	configuration file which are
	   checked when	the --site-config-file switch is not provided.
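
	   A script that needs to predict which file is used could build the
	   same list of candidates.  The sketch below assumes the locations
	   are checked in the order listed and that unset variables are
	   skipped; both are assumptions, and the function names are
	   hypothetical.

```python
import os
import posixpath


def config_candidates(environ):
    """List the silk.conf locations above, skipping unset variables."""
    paths = []
    if environ.get("SILK_CONFIG_FILE"):
        paths.append(environ["SILK_CONFIG_FILE"])
    if environ.get("SILK_DATA_ROOTDIR"):
        paths.append(posixpath.join(environ["SILK_DATA_ROOTDIR"], "silk.conf"))
    paths.append("/data/silk.conf")
    if environ.get("SILK_PATH"):
        root = environ["SILK_PATH"]
        paths.append(posixpath.join(root, "share/silk/silk.conf"))
        paths.append(posixpath.join(root, "share/silk.conf"))
    paths.append("/usr/local/share/silk/silk.conf")
    paths.append("/usr/local/share/silk.conf")
    return paths


def find_config(environ, exists=os.path.exists):
    """Return the first candidate that exists, or None."""
    for path in config_candidates(environ):
        if exists(path):
            return path
    return None
```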

       ${SILK_COUNTRY_CODES}
       ${SILK_PATH}/share/silk/country_codes.pmap
       ${SILK_PATH}/share/country_codes.pmap
       /usr/local/share/silk/country_codes.pmap
       /usr/local/share/country_codes.pmap
	   Possible locations for the country code mapping file	required by
	   the scc and dcc fields.

       ${SILK_PATH}/lib64/silk/
       ${SILK_PATH}/lib64/
       ${SILK_PATH}/lib/silk/
       ${SILK_PATH}/lib/
       /usr/local/lib64/silk/
       /usr/local/lib64/
       /usr/local/lib/silk/
       /usr/local/lib/
	   Directories that rwuniq checks when attempting to load a plug-in.

       ${SILK_TMPDIR}/
       ${TMPDIR}/
       /tmp/
	   Directory in	which to create	temporary files.

NOTES
       If multiple thresholds are given (e.g., "--threshold=bytes=80
       --threshold=flows=2"), the values must meet all thresholds before the
       entry is printed.  For example, if a given key saw a single 100-byte
       flow, the entry would not be printed given the switches above.
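
       The rule can be expressed as a small predicate.  This sketch is only
       an illustration; the name meets_all is hypothetical, and treating MAX
       as an inclusive bound is an assumption.

```python
def meets_all(values, thresholds):
    """Return True when every threshold is satisfied by the bin's values.

    thresholds maps a value field to (minimum, maximum); a maximum of
    None means the range has no upper bound.
    """
    for field, (minimum, maximum) in thresholds.items():
        value = values[field]
        if value < minimum:
            return False
        if maximum is not None and value > maximum:
            return False
    return True


# The case from the text: a single 100-byte flow.
single_flow = {"bytes": 100, "flows": 1}
limits = {"bytes": (80, None), "flows": (2, None)}
printed = meets_all(single_flow, limits)  # False: the flows threshold fails
```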

       rwuniq functionally replaces the	combination of

	rwcut |	sort | uniq -c
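
       In other words, the default behavior is a count of records per key.
       A rough Python equivalent of that pipeline, using made-up records
       represented as dictionaries, looks like this:

```python
from collections import Counter


def count_keys(records, fields):
    """Count records per key, where the key is a tuple of field values."""
    return Counter(tuple(record[name] for name in fields)
                   for record in records)


# Made-up records for illustration.
records = [
    {"sport": 53, "proto": 17},
    {"sport": 53, "proto": 17},
    {"sport": 80, "proto": 6},
]
bins = count_keys(records, ["sport", "proto"])
```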

       To get a	list of	unique IP addresses in a data set without the counting
       or threshold abilities of rwuniq, consider using	the IPset tools
       rwset(1)	and rwsetcat(1)	for improved performance:

	rwset --sip-set=stdout | rwsetcat --print-ips

       For situations where the	key and	value are each a single	field, the Bag
       tools (rwbag(1),	rwbagcat(1)) often provide better performance,
       especially when the key length is one or	two bytes:

	rwbag --bag-file=sport,bytes,stdout | rwbagcat

       To create a binary file that contains rwuniq-like output, use
       rwaggbag(1) or rwaggbagbuild(1).	 The content of	these files may	be
       printed with rwaggbagcat(1).

       rwgroup(1) works	similarly to rwuniq, except the	data remains in	the
       form of SiLK Flow records, and the next-hop-IP field is modified	to
       denote the records that form a bin.

       rwstats(1) can do the same binning as rwuniq, and then sort the data by
       an aggregate field.

       When the	--bin-time switch is given and the three time fields
       (starting-time ("sTime"), ending-time ("eTime"),	and duration
       ("duration")) are present in the	key, the duration field's value	will
       be modified to be the difference	between	the ending and starting	times.

       When the	three time-related key fields ("sTime","duration","eTime") are
       all in use, rwuniq will ignore the final	time field when	binning	the
       records,	but the	field will appear in the output.  Due to truncation of
       the milliseconds	values,	rwuniq will print a different number of	rows
       depending on the	order in which those three values appear in the
       --fields	switch.

       rwuniq supports counting	distinct source	and/or destination IPs.	 To
       see the number of distinct sources for each 10 minute bin, run:

	rwuniq --fields=stime --values=distinct:sip --bin-time=600 --sort-output

       When computing distinct counts over a field, the field may not be part
       of the key; that is, you cannot have "--fields=sip
       --values=distinct:sip".
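
       Conceptually, a distinct count keeps a set of the values seen in each
       bin and reports the size of that set.  The sketch below models the
       10-minute example with made-up flows; it says nothing about how
       rwuniq stores the sets internally, and the function name is
       hypothetical.

```python
from collections import defaultdict


def distinct_sources_per_bin(flows, bin_seconds=600):
    """Map each time bin to the number of distinct source addresses.

    flows is an iterable of (start_time_in_epoch_seconds, source_ip).
    """
    seen = defaultdict(set)
    for stime, sip in flows:
        seen[stime - (stime % bin_seconds)].add(sip)
    # Sorted by bin, like --sort-output.
    return {start: len(sources) for start, sources in sorted(seen.items())}


# Made-up flows: two sources in the first bin, one in the second.
flows = [(0, "192.0.2.1"), (10, "192.0.2.1"), (20, "192.0.2.2"),
         (600, "192.0.2.1")]
counts = distinct_sources_per_bin(flows)
```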

       Using the --presorted-input switch sometimes introduces more issues
       than it solves, and --presorted-input is	less necessary now that	rwuniq
       can use temporary files while processing	input.

       When computing distinct IP counts, rwuniq will typically	run faster if
       you do not use the --presorted-input switch, even if the	data was
       previously sorted.

       When using the --presorted-input	switch,	it is highly recommended that
       you use no more than one	time-related key field ("sTime", "duration",
       "eTime")	in the --fields	switch and that	the time-related key appear
       last in --fields.  The issue is caused by rwsort	considering the
       millisecond values on the times when sorting, while rwuniq truncates
       the millisecond value.  The result may be unsorted output and multiple
       rows in the output that have the	same values for	the key	fields:

	$ rwsort --fields=stime,duration data.rw       \
	  | rwuniq --fields=stime,dur --presorted
		      sTime|durat|   Records|
	...
	2009/02/12T00:00:57|	0|	   2|
	2009/02/12T00:00:57|   29|	   2|
	2009/02/12T00:00:57|	0|	   2|
	2009/02/12T00:00:57|   13|	   2|
	...
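
       The duplicate rows can be reproduced with a small model.  The sketch
       assumes that, with --presorted-input, a new bin is started whenever
       the truncated key changes from one record to the next; that is an
       assumption about the behavior, and the millisecond values are made
       up.

```python
from itertools import groupby


def presorted_bins(records_ms):
    """Model binning of presorted input with millisecond truncation.

    records_ms holds (stime_ms, duration_ms) pairs.  sorted() stands in
    for rwsort and compares the full millisecond values.
    """
    ordered = sorted(records_ms)
    # The key uses whole seconds, so the sort order is no longer respected.
    truncated = [(stime // 1000, dur // 1000) for stime, dur in ordered]
    # groupby() only merges adjacent equal keys, like a streaming binner.
    return [(key, len(list(group))) for key, group in groupby(truncated)]


rows = presorted_bins([(57100, 900), (57200, 29100),
                       (57300, 200), (57400, 13000)])
```

       Here the key (57, 0) appears twice in rows, separated by (57, 29),
       which mirrors the repeated rows in the output above.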

       rwuniq's	strength is its	ability	to build arbitrary keys	and aggregate
       fields.	For a key of a single IP address, see rwaddrcount(1) and
       rwbag(1); for a key made	up of a	single CIDR block (/8, /16, /24	only),
       a single	port, or a single protocol, use	rwtotal(1) or rwbag(1).

       As of SiLK 3.17.0, fields that are specified with the legacy
       thresholding switches (e.g., --bytes) and not with --values are printed
       in the order in which those switches appear.  Previously, the order was
       always bytes, packets, flows, stime, etime, sip-distinct, dip-distinct.

SEE ALSO
       rwfilter(1), rwbag(1), rwbagcat(1), rwaggbag(1),	rwaggbagbuild(1),
       rwaggbagcat(1), rwcut(1), rwset(1), rwsetcat(1),	rwaddrcount(1),
       rwgroup(1), rwstats(1), rwnetmask(1), rwsort(1),	rwtotal(1),
       rwcount(1), rwpmapbuild(1), addrtype(3),	ccfilter(3),
       int-ext-fields(3), pmapfilter(3), pysilk(3), silkpython(3),
       silk-plugin(3), sensor.conf(5), rwflowpack(8), silk(7), yaf(1),
       dlopen(3), tzset(3), environ(7)

SiLK 3.19.1			  2020-08-27			     rwuniq(1)
