RABINS(1)		    General Commands Manual		     RABINS(1)

NAME
       rabins -	process	argus(8) data within specified bins.

SYNOPSIS
       rabins [-B secs] [-M splitmode [options]] [raoptions]
              [-- filter-expression]

DESCRIPTION
       Rabins reads argus data from an argus-data source, and adjusts the data
       so that it is aligned to	a set of bins, or slots, that are based	on ei-
       ther  time, input size, or count.  The resulting	output is split, modi-
       fied, and optionally aggregated so that	the  data  fits	 to  the  con-
       straints	of the specified bins.	rabins is designed to be a combination
       of rasplit and racluster, acting	on multiple contexts of	argus data.

       The  principal function of rabins is to align input data	to a series of
       bins, and then process the data within the context of each  bin.	  This
       is the basis for	real-time stream block processing.  Time series	stream
       block processing is critical for flow data graphing, comparison,
       analysis, and correlation.  Fixed load stream block processing, based on the
       number  of  argus  data	records	 ('count'),  or	a fixed	volume of data
       ('size')	allows for control of resources	 in  processing.   While  load
       based  options  are very	useful,	they are rather	esoteric.  See the on-
       line examples and rasplit.1 for examples	of using these modes of	opera-
       tion.

Time Series Bins
       Time series binning is specified using the -M time option.  Time bins
       are  specified by the size and granularity of the time bin.  The	granu-
       larity, 's'econds, 'm'inutes, 'h'ours, 'd'ays, 'w'eeks,	'M'onths,  and
       'y'ears,	 dictates  where  the bin boundaries lie.  To ensure that 0.5d
       and 12h start on	the same point in time,	second,	minute,	hour, and  day
       based bins start	at midnight, Jan 1st of	the year of processing.	 Week,
       month  and  year	bins all start on natural time boundaries, for the pe-
       riod.
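
       The boundary rule for second, minute, hour, and day bins can be
       sketched as follows.  This is an illustration only, not rabins source
       code: the function name bin_start, the UNIT_SECONDS table, and the use
       of UTC are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Seconds per unit for the fixed-length granularities (s, m, h, d).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def bin_start(ts, size, unit):
    """Return the start of the bin holding ts.

    Bins are counted from midnight, Jan 1st of the year of ts,
    so 0.5d and 12h share the same boundaries. (Hypothetical helper.)
    """
    width = size * UNIT_SECONDS[unit]
    origin = datetime(ts.year, 1, 1, tzinfo=ts.tzinfo)
    offset = (ts - origin).total_seconds()
    return origin + timedelta(seconds=(offset // width) * width)

ts = datetime(2012, 2, 15, 13, 37, 4, tzinfo=timezone.utc)
print(bin_start(ts, 0.5, "d"))   # same boundary as the 12h bin
print(bin_start(ts, 12, "h"))
print(bin_start(ts, 10, "s"))
```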

       rabins provides a separate processing context for each bin, so that ag-
       gregation and sorting occur only	within the context of  each  time  pe-
       riod.   Records	are  placed into bins based on load or time.  For load
       based bins, input records are processed in received order and  are  not
       modified.  When	using  time  based  bins, records are placed into bins
       based on	the starting time of the record.   By  default,	 records  that
       span  a	time  boundary are split into as many records as needed	to fit
       the record into appropriate bin sizes, using  the  algorithms  used  by
       rasplit.1.  Metrics are distributed uniformly within all the appropriate
       bins.  The result is a series of data and/or fragments that are time
       aligned, appropriate for time series analysis and visualization.
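
       A minimal sketch of splitting one record across time bins with a
       uniform distribution of a metric is shown here.  It assumes times in
       epoch seconds and a single packet counter; split_record is a
       hypothetical name, and the real rasplit.1 algorithms may round or
       treat individual metrics differently.

```python
def split_record(start, end, pkts, width):
    """Split the interval [start, end) at multiples of width.

    Each fragment receives a share of pkts proportional to the time
    it covers (uniform distribution). Returns (start, end, pkts) tuples.
    """
    if end <= start:
        # Zero-duration record: nothing to split.
        return [(start, end, pkts)]
    duration = end - start
    pieces = []
    t = start
    while t < end:
        # Next bin boundary after t, capped at the record's end time.
        edge = min(end, (t // width + 1) * width)
        pieces.append((t, edge, pkts * (edge - t) / duration))
        t = edge
    return pieces

for piece in split_record(5.0, 25.0, 100, 10.0):
    print(piece)
```

       In this sketch the fragment timestamps stay inside the original
       record's time range, so the first and last fragments need not start or
       end on a bin boundary.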

       When a record is	split to conform to a time series bin,	the  resulting
       starting	 and  ending timestamps	may or may not coincide	with the time-
       stamps of the bins themselves. For some applications, this treatment is
       critical	to the analytics that are working on the resulting data,  such
       as transaction duration,	and flow traffic burst behavior.  However, for
       other  analytics,  like	average	load, and rate analysis	and reporting,
       the timestamps need to be modified so that they reflect the time	 range
       of  the	actual time bin	boundaries.  Rabins supports the optional hard
       option to specify that timestamps should	 conform  to  bin  boundaries.
       One  of	the  results  of  this	is  that all durations in the reported
       records will be the bin duration.  This	is  extremely  important  when
       processing certain time series metrics, like load.
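
       The effect of conforming timestamps to bin boundaries on a load
       calculation can be illustrated as follows.  The helper names
       hard_timestamps and load_bps are hypothetical, and the numbers are
       invented for the example; this is not how rabins itself is
       implemented.

```python
def hard_timestamps(start, width):
    """Return the (start, end) of the bin that contains start."""
    lo = (start // width) * width
    return lo, lo + width

def load_bps(nbytes, start, end):
    """Average load in bits per second over [start, end)."""
    return nbytes * 8 / (end - start)

# A fragment covering only the last 5 seconds of a 10 second bin.
frag_start, frag_end, nbytes = 5.0, 10.0, 1000

print(load_bps(nbytes, frag_start, frag_end))                  # load over the fragment
print(load_bps(nbytes, *hard_timestamps(frag_start, 10.0)))    # load over the whole bin
```

       With the original timestamps the load is computed over the fragment's
       own duration; with bin-aligned timestamps it is computed over the full
       bin, which is what a per-bin load series needs.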

Load Based Bins
       Load based binning is specified using the -M size or -M count options.
       Load bins are used to constrain the resources used in bin processing.
       A fixed amount of load is input, aggregation is performed on that load,
       and when the threshold is reached, the entire aggregation cache is
       dumped, reinitialized, and reused.  These can be used effectively to
       provide realtime data reduction, but within a fixed amount of memory.
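
       The dump-and-reuse cycle for a count based bin might look like the
       sketch below.  It assumes records are (key, packets) pairs and that
       aggregation is a simple sum per key; count_bins is a hypothetical
       name chosen for the illustration.

```python
from collections import defaultdict

def count_bins(records, threshold):
    """Aggregate records per key, dumping the cache every threshold records."""
    cache = defaultdict(int)
    seen = 0
    for key, pkts in records:
        cache[key] += pkts
        seen += 1
        if seen == threshold:
            yield dict(cache)   # dump the aggregation cache
            cache.clear()       # reinitialize and reuse it
            seen = 0
    if cache:
        yield dict(cache)       # flush the final partial bin at end of input

records = [("tcp", 1), ("udp", 2), ("tcp", 3), ("tcp", 4), ("udp", 5)]
for agg in count_bins(records, 2):
    print(agg)
```

       Memory use in this sketch is bounded by the number of distinct keys
       seen within one bin, never by the total length of the input.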

Output Processing
       rabins has two basic modes of output.  The default holds all output in
       main memory until EOF is encountered on input, at which point each
       sorted bin is written out.  In the second output mode, rabins writes
       out the contents of individual sorted bins periodically, based on a
       holding time specified using the -B secs option.  The secs value should
       be chosen such that rabins will have seen all the appropriate incoming
       data for that time period.  This is determined by the
       ARGUS_FLOW_STATUS_INTERVAL used by the collection of argus data sources
       in the input data stream, as well as any time drift that may exist
       among argus data processing elements.  When there is good time sync,
       and with an ARGUS_FLOW_STATUS_INTERVAL of 5 seconds, appropriate secs
       values are between 5 and 15 seconds.
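
       One way to picture the holding time is that a bin becomes eligible for
       output once its end time plus the hold has passed.  The sketch below
       assumes bins are identified by their start time in seconds;
       ready_bins is a hypothetical helper, not part of rabins.

```python
def ready_bins(open_bins, now, width, hold):
    """Return start times of bins that may be written out, oldest first.

    A bin covering [s, s + width) is ready once now >= s + width + hold.
    """
    return sorted(s for s in open_bins if now >= s + width + hold)

# 10 second bins with a 5 second holding time.
open_bins = {0, 10, 20}
print(ready_bins(open_bins, 24, 10, 5))
print(ready_bins(open_bins, 26, 10, 5))
```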

       The output of rabins when using the -B secs option is appropriate to
       drive a number of processing elements, such as near real-time
       visualizations, alarms, and reporting.

Output Stream
       Like  all ra.1 client programs, the output of rabins.1 is an argus data
       stream, that can	be written as binary data to a file or	standard  out-
       put,  or	can be printed.	 rabins	supports all the output	functions pro-
       vided by	rasplit.1.

       The output file name consists of a prefix, which is specified using
       the -w ra option, and for all modes except time mode, a suffix, which
       is created for each resulting file.  If no prefix is provided, then
       rabins will use 'x' as the default prefix.  The suffix that is used is
       determined by the mode of operation.  When rabins is using the default
       count mode or the size mode, the suffix is a group of letters 'aa',
       'ab', and so on, such that concatenating the output files in sorted
       order by file name produces the original input file.  If rabins will
       need to create more output files than are allowed by the default suffix
       strategy, more letters will be added in order to accommodate the needed
       files.
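
       The two-letter suffix sequence can be sketched as a base-26 counter.
       This is an illustration under assumptions: the function name suffix
       is hypothetical, and the way extra letters are added once 'zz' is
       exhausted is not modelled here.

```python
from string import ascii_lowercase

def suffix(index, width=2):
    """Return the suffix for the index-th file: 0 -> 'aa', 1 -> 'ab', ..."""
    letters = []
    for _ in range(width):
        index, rem = divmod(index, 26)
        letters.append(ascii_lowercase[rem])
    if index:
        raise ValueError("index does not fit in the requested width")
    return "".join(reversed(letters))

# 'x' is the default prefix described above.
print(["x" + suffix(i) for i in (0, 1, 25, 26)])
```

       Because the suffixes have a fixed width, sorting the file names
       lexically gives the order in which the files were written.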

       When rabins is splitting based on time, rabins uses a default extension
       of %Y.%m.%d.%h.%m.%s.  This default can be overridden by adding a '%'
       extension to the name provided using the -w option.

       When standard out is specified, using -w	-, rabins will output a	single
       argus-stream with START and STOP	argus management records inserted  ap-
       propriately  to	indicate  where	the output is split.  See argus(8) for
       more information	on output stream formats.

       When rabins is splitting on output record count (the default), the
       number of records is specified as an ordinal counter; the default is
       1000 records.  When rabins is splitting based on the maximum output
       file size, the size is specified in bytes.  The scale of the bytes can
       be specified by appending 'b', 'k' or 'm' to the number provided.

       When rabins is splitting based on time, the time period is specified
       with the -M time option, and can be any period based in seconds (s),
       minutes (m), hours (h), days (d), weeks (w), months (M) or years (y).
       Rabins will create and modify records as required to split on
       prescribed time boundaries.  If any record spans a time boundary, the
       record is split and the metrics are adjusted using a uniform
       distribution model to distribute the statistics between the two
       records.

       See rasplit.1 for specifics.

RABINS SPECIFIC	OPTIONS
       rabins, like all ra based clients, supports a number of ra options
       including remote data access, reading from multiple files and filtering
       of input argus records through a terminating filter expression.  Rabins
       also provides all the functions of racluster.1 and rasplit.1 for
       processing and outputting data.  rabins specific options are:

       -B secs
            Holding time in seconds before closing a bin and outputting its
            contents.

       -M splitmode
            Supported splitting modes are:

	      time <n[smhdwMy]>
                   bin records into time slots of n size.  This is used for
                   time series analytics, especially graphing.  By default,
                   records are split so that their timestamps do not span the
                   time range specified.  Metrics are uniformly distributed
                   among the resulting records.

	      count <n[kmb]>
		   bin	records	 into  chunks  based on	the number of records.
		   This	is used	for archive management and parallel processing
		   analytics, to limit the size	of data	 processing  to	 fixed
		   numbers of records.

	      size <n[kmb]>
		   bin records into chunks based on the	number of total	bytes.
		   This	is used	for archive management and parallel processing
		   analytics,  to  limit  the size of data processing to fixed
		   byte	limitations.

       -M modes
	    Supported processing modes are:
              hard split on hard time boundaries.  Each flow record's start and
                   stop times will be the time boundary times.  The default is
                   to use the original start and stop timestamps from the
                   records that make up the resulting aggregation.
	      nomodify
                   Do not split the record when including it into a time bin.
                   This allows a time bin to represent times outside of its
                   definition.  This option should not be used with the 'hard'
                   option, as you will modify metrics and semantics.
       -m aggregation object
	    Supported aggregation objects are:
	      none	     use a null	flow key.
	      srcid	     argus source identifier.
	      smac	     source mac(ether) addr.
	      dmac	     destination mac(ether) addr.
	      soui	     oui portion of the	source mac(ether) addr.
	      doui	     oui portion of the	destination mac(ether) addr.
	      smpls	     source mpls label.
              dmpls          destination mpls label.
	      svlan	     source vlan label.
              dvlan          destination vlan label.
	      saddr/[l|m]    source IP addr/[cidr len |	m.a.s.k].
	      daddr/[l|m]    destination IP addr/[cidr len | m.a.s.k].
	      matrix/l	     sorted src	and dst	IP addr/cidr len.
	      proto	     transaction protocol.
	      sport	     source port number. Implies use of	'proto'.
	      dport	     destination port number. Implies use of 'proto'.
	      stos	     source TOS	byte value.
	      dtos	     destination TOS byte value.
	      sttl	     src -> dst	TTL value.
	      dttl	     dst -> src	TTL value.
	      stcpb	     src -> dst	TCP base sequence number.
	      dtcpb	     dst -> src	TCP base sequence number.
	      inode[/l|m]]   intermediate node IP addr/[cidr len  |  m.a.s.k],
			     source of ICMP mapped events.
	      sco	     source ARIN country code, if present.
	      dco	     destination ARIN country code, if present.
	      sas	     source node origin	AS number, if available.
	      das	     destination node origin AS	number,	if available.
	      ias	     intermediate node origin AS number, if available.
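
       As an illustration of the matrix/l aggregation object, the sketch
       below masks both addresses to a CIDR length and sorts the pair, so
       that both directions of a conversation map to the same key.  The
       function name matrix_key and the reading of "sorted" as
       direction-independent are assumptions; this is not racluster.1 or
       rabins code.

```python
import ipaddress

def matrix_key(src, dst, prefix_len):
    """Return a direction-independent (network, network) key."""
    nets = [
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in (src, dst)
    ]
    return tuple(str(net) for net in sorted(nets))

# Both directions of the same conversation yield one key.
print(matrix_key("192.168.0.68", "208.59.201.75", 24))
print(matrix_key("208.59.201.75", "192.168.0.68", 24))
```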

       -P sort field
	    Rabins  can	 sort  its output based	on a sort field	specification.
	    Because the	-m option is used for aggregation fields, -P  is  used
	    to	specify	 the print priority order.  See	rasort(1) for the list
	    of sortable	fields.

       -w filename
	    Rabins supports an extended	 -w  option  that  allows  for	output
	    record  contents  to be inserted into the output filename.	Speci-
	    fied using '$' (dollar) notation, any printable field can be used.
	    Care should	be taken to honor any shell escape  requirements  when
	    specifying	on the command line.  See ra(1)	for the	list of	print-
	    able fields.

            As another extended feature, when using time mode, rabins will
            process the supplied filename using strftime(3), so that time
            fields can be inserted into the resulting output filename.
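
       The strftime(3) style expansion of a -w template can be previewed with
       a short sketch.  It assumes the bin start time is the time being
       formatted and that UTC is used; both are assumptions made for the
       example, and output_name is a hypothetical helper.

```python
from datetime import datetime, timezone

def output_name(template, bin_start):
    """Expand strftime conversions in template using the bin start time."""
    return bin_start.strftime(template)

start = datetime(2012, 2, 15, 13, 35, 0, tzinfo=timezone.utc)
print(output_name("/matrix/%Y/%m/%d/argus.%H.%M.%S", start))
```

       Each distinct bin start time yields a distinct file name, which is
       how one template can produce one file per time bin.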

INVOCATION
       This invocation aggregates inputfile based on 10 minute time
       boundaries.  Input is split to fit within a 10 minute time boundary,
       and within those boundaries, argus records are aggregated.  The
       resulting output is streamed to a single file.

	  rabins -r * -M time 10m -w outputfile

       This next invocation aggregates inputfiles based on 5 minute time
       boundaries, and the output is written to 5 minute files.  Input is
       split such that all records conform to hard 5 minute time boundaries,
       and within those boundaries, argus records are aggregated, in this
       case, based on IP address matrix.  The resulting output is streamed to
       files that are named relative to the record's output content, with a
       prefix of /matrix/%Y/%m/%d/argus. and the suffixes %H.%M.%S.

	  rabins -r * -M hard time 5m -m matrix	-w "/matrix/%Y/%m/%d/argus.%H.%M.%S"

       This next invocation aggregates input.stream based on matrix/24 into 10
       second time boundaries, holds the data for an additional	5 seconds  af-
       ter  the	 time boundary has passed, and then prints the complete	sorted
       contents	of each	bin to standard	output.	 The output is printed	at  10
       second intervals, and the output	is the content of the previous	10 sec
       time bin.  This example is meant to provide, every 10 seconds, the
       summary of all Class C subnet activity seen.  It is intended to run
       indefinitely, printing out aggregated summary records.  By modifying
       the aggregation model, using the "-f racluster.conf" option, you can
       achieve a great deal of data reduction with a lot of semantic
       reporting.

       % rabins	-S localhost -m	matrix/24 -B 5s	-M hard	time 10s -p0 -s	+1trans	- ipv4
		  StartTime  Trans  Proto	     SrcAddr   Dir	      DstAddr  SrcPkts	DstPkts	    SrcBytes	 DstBytes State
	2012/02/15.13:37:00	 5     ip     192.168.0.0/24   <->     192.168.0.0/24	    41	     40		2860	    12122   CON
	2012/02/15.13:37:00	 2     ip     192.168.0.0/24	->	 224.0.0.0/24	     2	      0		 319		0   INT
       [ 10 seconds pass]
	2012/02/15.13:37:10	13     ip     192.168.0.0/24   <->    208.59.201.0/24	   269	    351	       97886	   398700   CON
	2012/02/15.13:37:10	14     ip     192.168.0.0/24   <->     192.168.0.0/24	    86	     92		7814	    46800   CON
	2012/02/15.13:37:10	 1     ip    17.172.224.0/24   <->     192.168.0.0/24	    52	     37	       68125	     4372   CON
	2012/02/15.13:37:10	 1     ip     192.168.0.0/24   <->	199.7.55.0/24	     7	      7		 784	     2566   CON
	2012/02/15.13:37:10	 1     ip     184.85.13.0/24   <->     192.168.0.0/24	     6	      5		3952	     2204   CON
	2012/02/15.13:37:10	 2     ip    66.235.132.0/24   <->     192.168.0.0/24	     5	      6		 915	     3732   CON
	2012/02/15.13:37:10	 1     ip    74.125.226.0/24   <->     192.168.0.0/24	     3	      4		 709	      888   CON
	2012/02/15.13:37:10	 3     ip	66.39.3.0/24   <->     192.168.0.0/24	     3	      3		 369	      198   CON
	2012/02/15.13:37:10	 1     ip     192.168.0.0/24   <->     205.188.1.0/24	     1	      1		  54	      356   CON
       [ 10 seconds pass]
	2012/02/15.13:37:20	 6     ip     192.168.0.0/24   <->    208.59.201.0/24	   392	    461	       60531	   623894   CON
	2012/02/15.13:37:20	 8     ip     192.168.0.0/24   <->     192.168.0.0/24	    95	    111		6948	    93536   CON
	2012/02/15.13:37:20	 3     ip     72.14.204.0/24   <->     192.168.0.0/24	    38	     32	       38568	     4414   CON
	2012/02/15.13:37:20	 1     ip    17.112.156.0/24   <->     192.168.0.0/24	    26	     13	       21798	     7116   CON
	2012/02/15.13:37:20	 2     ip    66.235.132.0/24   <->     192.168.0.0/24	     6	      3		1232	     4450   CON
	2012/02/15.13:37:20	 1     ip    66.235.133.0/24   <->     192.168.0.0/24	     1	      2		  82	      132   CON
       [ 10 seconds pass]
	2012/02/15.13:37:30    117     ip     192.168.0.0/24   <->    208.59.201.0/24	   697	    663	      369769	   134382   CON
	2012/02/15.13:37:30	11     ip     192.168.0.0/24   <->     192.168.0.0/24	   147	    187	       11210	   193253   CON
	2012/02/15.13:37:30	 1     ip     184.85.13.0/24   <->     192.168.0.0/24	    13	      9	       13408	     9031   CON
	2012/02/15.13:37:30	 2     ip    66.235.132.0/24   <->     192.168.0.0/24	     8	      7		1920	    11563   CON
	2012/02/15.13:37:30	 1     ip     192.168.0.0/24   <->    207.46.193.0/24	     5	      3		 802	      562   CON
	2012/02/15.13:37:30	 1     ip    17.112.156.0/24   <->     192.168.0.0/24	     5	      2		 646	     3684   CON
	2012/02/15.13:37:30	 2     ip     192.168.0.0/24	->	 224.0.0.0/24	     2	      0		 382		0   REQ
       [ 10 seconds pass]

       This next invocation reads IP argus(8) data from argusfile and
       processes the argus(8) data stream based on input byte size of no
       greater than 1 Megabyte.  The resulting output stream is written to a
       single argus.out data file.

	  rabins -r argusfile -M size 1m -s +1dur -m proto -w argus.out	- ip

       This invocation reads IP argus(8) data from argusfile and aggregates
       the argus(8) data stream based on an input record count of no greater
       than 1K flows.  The resulting output stream is printed to the screen as
       standard argus records.

	  rabins -r argusfile -M count 1k -m proto -s stime dur	proto spkts dpkts - ip

COPYRIGHT
       Copyright (c) 2000-2016 QoSient.	All rights reserved.

SEE ALSO
       ra(1), racluster(1), rasplit(1), rarc(5), argus(8)

AUTHORS
       Carter Bullard (carter@qosient.com).

rabins 3.0.8			12 August 2003			     RABINS(1)
