RRDCREATE(1)			    rrdtool			  RRDCREATE(1)

NAME
       rrdcreate - Set up a new	Round Robin Database

SYNOPSIS
       rrdtool	 create	  filename   [--start|-b start time]  [--step|-s step]
       [DS:ds-name:DST:dst arguments] [RRA:CF:cf arguments]

DESCRIPTION
       The create function of RRDtool lets you set up new Round	Robin Database
       (RRD) files.  The file is created at its	final, full  size  and	filled
       with *UNKNOWN* data.

       filename
	   The	name  of the RRD you want to create. RRD files should end with
	   the extension .rrd. However,	RRDtool	will accept any	filename.

       --start|-b start	time (default: now - 10s)
	   Specifies the time in seconds since 1970-01-01 UTC when  the	 first
	   value  should be added to the RRD. RRDtool will not accept any data
	   timed before	or at the time specified.

	   See also the AT-STYLE TIME SPECIFICATION section in the rrdfetch
	   documentation for other ways to specify time.

       --step|-s step (default:	300 seconds)
	   Specifies  the base interval	in seconds with	which data will	be fed
	   into	the RRD.

       DS:ds-name:DST:dst arguments
	   A single RRD	can accept input from several data sources  (DS),  for
	   example  incoming  and outgoing traffic on a	specific communication
	   line. With the DS configuration option you must define  some	 basic
	   properties of each data source you want to store in the RRD.

	   ds-name is the name you will use to reference this particular data
	   source from an RRD. A ds-name must be 1 to 19 characters long and
	   may contain only the characters [a-zA-Z0-9_].

	   DST defines the Data	Source Type. The remaining arguments of	a data
	   source  entry  depend  on the data source type. For GAUGE, COUNTER,
	   DERIVE, and ABSOLUTE	the format for a data source entry is:

	   DS:ds-name:GAUGE | COUNTER |	DERIVE | ABSOLUTE:heartbeat:min:max

	   For COMPUTE data sources, the format	is:

	   DS:ds-name:COMPUTE:rpn-expression

	   In order to decide which data source	type to	use, review the	defin-
	   itions that follow. Also consult the	section	on  "HOW  TO  MEASURE"
	   for further insight.

	   GAUGE
	       is  for	things like temperatures or number of people in	a room
	       or the value of a RedHat	share.

	   COUNTER
	       is for continuously incrementing counters like the ifInOctets
	       counter	in  a router. The COUNTER data source assumes that the
	       counter never decreases,	except when a counter overflows.   The
	       update  function	 takes the overflow into account.  The counter
	       is stored as a per-second rate.	When  the  counter  overflows,
	       RRDtool	checks	if the overflow	happened at the	32bit or 64bit
	       border and acts accordingly by adding an	appropriate  value  to
	       the result.

	   DERIVE
	       will  store  the	 derivative of the line	going from the last to
	       the current value of the	data source. This can  be  useful  for
	       gauges,	for example, to	measure	the rate of people entering or
	       leaving a room. Internally, DERIVE works exactly like COUNTER
	       but without overflow checks. So if your counter does not wrap
	       at 32 or 64 bits you might want to use DERIVE and combine it
	       with a MIN value of 0.

	       NOTE on COUNTER vs DERIVE

	       by Don Baarda <don.baarda@baesystems.com>

	       If  you	cannot	tolerate ever mistaking	the occasional counter
	       reset for a legitimate counter  wrap,  and  would  prefer  "Un-
	       knowns" for all legitimate counter wraps	and resets, always use
	       DERIVE with min=0. Otherwise, using COUNTER with	a suitable max
	       will  return  correct  values for all legitimate	counter	wraps,
	       mark some counter resets	as "Unknown",  but  can	 mistake  some
	       counter resets for a legitimate counter wrap.

	       For a 5 minute step and 32-bit counter, the probability of mis-
	       taking  a counter reset for a legitimate	wrap is	arguably about
	       0.8% per	1Mbps of maximum bandwidth. Note that this equates  to
	       80%  for	 100Mbps  interfaces, so for high bandwidth interfaces
	       and a 32bit counter, DERIVE with	min=0 is probably  preferable.
	       If  you	are  using a 64bit counter, just about any max setting
	       will eliminate the possibility  of  mistaking  a	 reset	for  a
	       counter wrap.

	   ABSOLUTE
	       is  for counters	which get reset	upon reading. This is used for
	       fast counters which tend	to overflow.  So  instead  of  reading
	       them  normally you reset	them after every read to make sure you
	       have a maximum time available before the	next overflow. Another
	       usage is	for things you count like number of messages since the
	       last update.

	   COMPUTE
	       is for storing the result of a formula applied  to  other  data
	       sources in the RRD. This	data source is not supplied a value on
	       update,	but rather its Primary Data Points (PDPs) are computed
	       from the	PDPs of	the data sources according to the  rpn-expres-
	       sion that defines the formula. Consolidation functions are then
	       applied normally to the PDPs of the COMPUTE data source (that
	       is, the rpn-expression is only applied to generate PDPs). In
	       database	 software, such	data sets are referred to as "virtual"
	       or "computed" columns.
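
	   The overflow handling described for COUNTER above can be
	   illustrated with a small shell sketch. It is not RRDtool's actual
	   implementation: the function name counter_rate is hypothetical, at
	   most one wrap between readings is assumed, and only the 32-bit
	   border is handled because 2^64 does not fit in typical shell
	   arithmetic.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: per-second rate between two
# readings of a 32-bit counter, assuming at most one wrap in between.
# Usage: counter_rate PREV CUR SECONDS
counter_rate() {
    delta=$(( $2 - $1 ))
    if [ "$delta" -lt 0 ]; then
        # The counter went backwards: assume it wrapped at 2^32.
        delta=$(( delta + 4294967296 ))
    fi
    # Integer division; RRDtool itself works with floating point.
    echo $(( delta / $3 ))
}

counter_rate 1000 301000 300         # no wrap
counter_rate 4294967000 299704 300   # wrapped at the 32-bit border
```

	   Both calls print 1000. A counter reset looks exactly like a wrap to
	   this sketch, which is the failure mode the note on COUNTER vs
	   DERIVE warns about.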

	   heartbeat defines the maximum number	of seconds that	may  pass  be-
	   tween  two updates of this data source before the value of the data
	   source is assumed to	be *UNKNOWN*.

	   min and max define the expected range of values for data supplied
	   by a data source. If min and/or max are specified, any value
	   outside the defined range will be regarded as *UNKNOWN*. If you do
	   not know or care about min and max, set them to U for unknown.
	   Note that min and max always
	   refer to the	processed values of the	DS. For	a traffic-COUNTER type
	   DS this would be the	maximum	and minimum  data-rate	expected  from
	   the device.

	   If information on minimal/maximal expected values is	available, al-
	   ways	 set  the min and/or max properties. This will help RRDtool in
	   doing a simple sanity check on the data supplied when  running  up-
	   date.

	   rpn-expression  defines  the	 formula used to compute the PDPs of a
	   COMPUTE data source from other data sources in the same RRD. It
	   is  similar	to  defining  a	 CDEF  argument	for the	graph command.
	   Please refer	to that	manual page for	a list and description of  RPN
	   operations  supported.  For COMPUTE data sources, the following RPN
	   operations are not supported: COUNT,	PREV, TIME, and	LTIME. In  ad-
	   dition, in defining the RPN expression, the COMPUTE data source may
	   only refer to the names of data sources listed previously in the
	   create command. This	is similar to the restriction that CDEFs  must
	   refer  only	to DEFs	and CDEFs previously defined in	the same graph
	   command.
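
	   Since min and max refer to the processed rate, a max for a traffic
	   counter can be derived from the link speed. The following is a
	   minimal sketch under stated assumptions: a 100 Mbit/s interface
	   whose counter counts octets, with the ds-name inOctets and the
	   heartbeat of 600 chosen purely for illustration.

```shell
#!/bin/sh
# Assumption: a 100 Mbit/s interface whose counter counts octets (bytes).
link_mbps=100
max_rate=$(( link_mbps * 1000000 / 8 ))   # bytes per second

# ds-name "inOctets" and heartbeat 600 are illustrative choices.
ds="DS:inOctets:COUNTER:600:0:${max_rate}"
echo "$ds"
```

	   The printed string, DS:inOctets:COUNTER:600:0:12500000, could then
	   be passed as a DS argument to rrdtool create.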

       RRA:CF:cf arguments
	   The purpose of an RRD is to store data in the round robin  archives
	   (RRA). An archive consists of a number of data values or statistics
	   for	each  of  the defined data-sources (DS)	and is defined with an
	   RRA line.

	   When	data is	entered	into an	RRD, it	is first fit into  time	 slots
	   of  the  length defined with	the -s option, thus becoming a primary
	   data	point.

	   The data is also processed with the consolidation function (CF)  of
	   the archive.	There are several consolidation	functions that consol-
	   idate  primary data points via an aggregate function: AVERAGE, MIN,
	   MAX, LAST. The format of an RRA line for these consolidation
	   functions is:

	   RRA:AVERAGE | MIN | MAX | LAST:xff:steps:rows

	   xff The xfiles factor defines what part of a	consolidation interval
	   may be made up from *UNKNOWN* data while the	consolidated value  is
	   still  regarded  as known. It is given as the ratio of allowed *UN-
	   KNOWN* PDPs to the number of	PDPs in	the interval. Thus, it	ranges
	   from	0 to 1 (exclusive).

	   steps  defines  how	many  of these primary data points are used to
	   build a consolidated	data point which then goes into	the archive.

	   rows	defines	how many generations of	data values  are  kept	in  an
	   RRA.
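
	   The amount of history an RRA covers follows from the three numbers
	   involved: the step, the steps argument, and the rows argument. A
	   small sketch of that arithmetic, with rra_span as a hypothetical
	   helper name:

```shell
#!/bin/sh
# Hypothetical helper: seconds of history covered by one RRA.
# Usage: rra_span STEP STEPS ROWS
rra_span() {
    echo $(( $1 * $2 * $3 ))
}

step=300
echo "$(( $(rra_span "$step" 1 1200) / 3600 )) hours"
echo "$(( $(rra_span "$step" 12 2400) / 86400 )) days"
```

	   With a 300 second step this prints "100 hours" for 1200 rows of
	   single PDPs and "100 days" for 2400 rows of 12 PDPs each.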

Aberrant Behavior Detection with Holt-Winters Forecasting
       In addition to the aggregate functions, there is a set of specialized
       functions that enable RRDtool to provide data smoothing (via the Holt-
       Winters forecasting algorithm), confidence bands, and the flagging of
       aberrant behavior in the data source time series:

          RRA:HWPREDICT:rows:alpha:beta:seasonal period[:rra-num]

          RRA:SEASONAL:seasonal period:gamma:rra-num

          RRA:DEVSEASONAL:seasonal period:gamma:rra-num

          RRA:DEVPREDICT:rows:rra-num

          RRA:FAILURES:rows:threshold:window length:rra-num

       These RRAs differ from the  true	 consolidation	functions  in  several
       ways.   First,  each of the RRAs	is updated once	for every primary data
       point.  Second, these RRAs are interdependent.  To  generate  real-time
       confidence  bounds,  a matched set of HWPREDICT,	SEASONAL, DEVSEASONAL,
       and DEVPREDICT must exist. Generating smoothed values  of  the  primary
       data  points  requires  both a HWPREDICT	RRA and	SEASONAL RRA. Aberrant
       behavior	detection requires FAILURES, HWPREDICT,	DEVSEASONAL, and  SEA-
       SONAL.

       The actual predicted, or smoothed, values are stored in the HWPREDICT
       RRA. The predicted deviations are stored in DEVPREDICT (think of a
       standard deviation which can be scaled to yield a confidence band).
       The FAILURES RRA stores binary indicators. A 1 marks the indexed
       observation as a failure; that is, the number of confidence bounds
       violations in the preceding window of observations met or exceeded a
       specified threshold. An example of using these RRAs to graph
       confidence bounds and failures appears in rrdgraph.

       The  SEASONAL  and DEVSEASONAL RRAs store the seasonal coefficients for
       the Holt-Winters	forecasting algorithm and the seasonal deviations, re-
       spectively.  There is one entry per observation time point in the  sea-
       sonal  cycle.  For  example, if primary data points are generated every
       five minutes and	the seasonal cycle is 1	day, both SEASONAL and DEVSEA-
       SONAL will have 288 rows.
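
       The seasonal period is simply the cycle length divided by the step. As
       a sketch, the following derives it for a one day cycle and assembles an
       HWPREDICT argument; the choice of keeping five cycles and the alpha and
       beta values are illustrative assumptions only.

```shell
#!/bin/sh
step=300            # seconds between primary data points
cycle=86400         # seasonal cycle of one day, in seconds
cycles_kept=5       # illustrative choice: keep five cycles of forecasts

seasonal_period=$(( cycle / step ))
rows=$(( cycles_kept * seasonal_period ))

# alpha=0.1 and beta=0.0035 are sample values, not recommendations.
echo "RRA:HWPREDICT:${rows}:0.1:0.0035:${seasonal_period}"
```

       This prints RRA:HWPREDICT:1440:0.1:0.0035:288.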

       In order	to simplify the	creation for the novice	user, in  addition  to
       supporting  explicit  creation  of the HWPREDICT, SEASONAL, DEVPREDICT,
       DEVSEASONAL, and	FAILURES RRAs, the RRDtool create command supports im-
       plicit creation of the other four when HWPREDICT	is specified alone and
       the final argument rra-num is omitted.

       rows specifies the length of the	RRA prior  to  wrap  around.  Remember
       that  there  is a one-to-one correspondence between primary data	points
       and entries in these RRAs. For the HWPREDICT CF,	rows should be	larger
       than  the seasonal period. If the DEVPREDICT RRA	is implicitly created,
       the default number of rows is the same as the HWPREDICT rows  argument.
       If the FAILURES RRA is implicitly created, rows will be set to the sea-
       sonal  period argument of the HWPREDICT RRA. Of course, the RRDtool re-
       size command is available if these defaults are not sufficient and  the
       creator	wishes	to  avoid  explicit creations of the other specialized
       function	RRAs.

       seasonal	period specifies the number of primary data points in  a  sea-
       sonal  cycle.  If SEASONAL and DEVSEASONAL are implicitly created, this
       argument	for those RRAs is set automatically to the value specified  by
       HWPREDICT.  If  they  are explicitly created, the creator should	verify
       that all	three seasonal period arguments	agree.

       alpha is	the adaption parameter of the intercept	(or baseline)  coeffi-
       cient  in the Holt-Winters forecasting algorithm. See rrdtool for a de-
       scription of this algorithm. alpha must lie between 0 and  1.  A	 value
       closer to 1 means that more recent observations carry greater weight in
       predicting  the baseline	component of the forecast. A value closer to 0
       means that past history carries greater weight in predicting the	 base-
       line component.

       beta  is	 the adaption parameter	of the slope (or linear	trend) coeffi-
       cient in	the Holt-Winters forecasting algorithm.	beta must lie  between
       0  and 1	and plays the same role	as alpha with respect to the predicted
       linear trend.

       gamma is	the adaption parameter of the  seasonal	 coefficients  in  the
       Holt-Winters  forecasting algorithm (HWPREDICT) or the adaption parame-
       ter in the exponential smoothing	update of the seasonal deviations.  It
       must lie	between	0 and 1. If the	SEASONAL and DEVSEASONAL RRAs are cre-
       ated  implicitly,  they	will  both  have the same value	for gamma: the
       value specified for the HWPREDICT alpha	argument.  Note	 that  because
       there  is  one  seasonal	coefficient (or	deviation) for each time point
       during the seasonal cycle, the adaptation rate is much slower than  the
       baseline.  Each	seasonal  coefficient is only updated (or adapts) when
       the observed value occurs at the	offset in the  seasonal	 cycle	corre-
       sponding	to that	coefficient.

       If SEASONAL and DEVSEASONAL RRAs	are created explicitly,	gamma need not
       be  the same for	both. Note that	gamma can also be changed via the RRD-
       tool tune command.

       rra-num provides	the links between related RRAs.	If HWPREDICT is	speci-
       fied alone and the other	RRAs are created implicitly, then there	is  no
       need to worry about this argument. If RRAs are created explicitly, then
       pay careful attention to this argument. For each RRA which includes
       this argument, there is a dependency between that RRA and another  RRA.
       The  rra-num argument is	the 1-based index in the order of RRA creation
       (that is, the order they	appear in the create command).	The  dependent
       RRA for each RRA	requiring the rra-num argument is listed here:

          HWPREDICT rra-num is	the index of the SEASONAL RRA.

          SEASONAL rra-num is the index of the	HWPREDICT RRA.

          DEVPREDICT rra-num is the index of the DEVSEASONAL RRA.

          DEVSEASONAL rra-num is the index of the HWPREDICT RRA.

          FAILURES rra-num is the index of the	DEVSEASONAL RRA.
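
       Because rra-num is a 1-based position in the argument list, it can help
       to number the RRA arguments before checking the links. A minimal
       sketch, with number_rras as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print each RRA argument with its 1-based index,
# in the order given, so rra-num references can be checked by eye.
number_rras() {
    i=0
    for rra in "$@"; do
        i=$(( i + 1 ))
        printf '%d %s\n' "$i" "$rra"
    done
}

number_rras \
    RRA:AVERAGE:0.5:1:2016 \
    RRA:HWPREDICT:1440:0.1:0.0035:288:3 \
    RRA:SEASONAL:288:0.1:2 \
    RRA:DEVPREDICT:1440:5 \
    RRA:DEVSEASONAL:288:0.1:2 \
    RRA:FAILURES:288:7:9:5
```

       In this listing HWPREDICT is index 2 and points at index 3 (SEASONAL),
       while DEVPREDICT and FAILURES both point at index 5 (DEVSEASONAL),
       matching the table above.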

       threshold  is the minimum number	of violations (observed	values outside
       the confidence bounds) within a window that constitutes a  failure.  If
       the FAILURES RRA	is implicitly created, the default value is 7.

       window  length  is  the number of time points in	the window. Specify an
       integer greater than or equal to	the threshold and less than  or	 equal
       to  28.	The time interval this window represents depends on the	inter-
       val between primary data	points.	If the FAILURES	RRA is implicitly cre-
       ated, the default value is 9.
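
       The constraint on threshold and window length can be checked before
       building a FAILURES argument. A sketch, with check_failures_window as a
       hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if threshold <= window <= 28.
# Usage: check_failures_window THRESHOLD WINDOW
check_failures_window() {
    [ "$1" -le "$2" ] && [ "$2" -le 28 ]
}

if check_failures_window 7 9; then
    echo "threshold 7, window 9: ok"
fi
if ! check_failures_window 7 30; then
    echo "threshold 7, window 30: window too long"
fi
```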

The HEARTBEAT and the STEP
       Here is an explanation by Don Baarda on the inner workings of  RRDtool.
       It  may	help you to sort out why all this *UNKNOWN* data is popping up
       in your databases:

       RRDtool gets fed	samples/updates	at  arbitrary  times.  From  these  it
       builds  Primary	Data  Points (PDPs) on every "step" interval. The PDPs
       are then	accumulated into the RRAs.

       The "heartbeat" defines the maximum acceptable  interval	 between  sam-
       ples/updates. If	the interval between samples is	less than "heartbeat",
       then  an	 average  rate is calculated and applied for that interval. If
       the interval between samples is longer than "heartbeat",	then that  en-
       tire interval is	considered "unknown". Note that	there are other	things
       that  can  make a sample	interval "unknown", such as the	rate exceeding
       limits, or a sample that	was explicitly marked as unknown.

       The known rates during a	PDP's "step" interval are used to calculate an
       average rate for	that PDP. If the total	"unknown"  time	 accounts  for
       more  than half the "step", the entire PDP is marked as "unknown". This
       means that a mixture of known and "unknown" sample times	 in  a	single
       PDP "step" may or may not add up to enough "known" time to warrant a
       known PDP.

       The "heartbeat" can be short (unusual) or long  (typical)  relative  to
       the "step" interval between PDPs. A short "heartbeat" means you require
       multiple samples per PDP, and if you don't get them, the PDP is marked
       unknown. A long heartbeat can span multiple "steps", which means it is
       acceptable  to  have  multiple PDPs calculated from a single sample. An
       extreme example of this might be	a "step" of 5 minutes  and  a  "heart-
       beat"  of  one day, in which case a single sample every day will	result
       in all the PDPs for that	entire day period being	set to the same	 aver-
       age rate. -- Don	Baarda <don.baarda@baesystems.com>

	      time|
	      axis|
	begin__|00|
	       |01|
	      u|02|----* sample1, restart "hb"-timer
	      u|03|   /
	      u|04|  /
	      u|05| /
	      u|06|/	 "hbt" expired
	      u|07|
	       |08|----* sample2, restart "hb"
	       |09|   /
	       |10|  /
	      u|11|----* sample3, restart "hb"
	      u|12|   /
	      u|13|  /
	step1_u|14| /
	      u|15|/	 "swt" expired
	      u|16|
	       |17|----* sample4, restart "hb",	create "pdp" for step1 =
	       |18|   /	 = unknown due to 10 "u" labeled secs > 0.5 * step
	       |19|  /
	       |20| /
	       |21|----* sample5, restart "hb"
	       |22|   /
	       |23|  /
	       |24|----* sample6, restart "hb"
	       |25|   /
	       |26|  /
	       |27|----* sample7, restart "hb"
	step2__|28|   /
	       |29|  /
	       |30|----* sample8, restart "hb", create "pdp" for step2, create "cdp"
	       |31|   /
	       |32|  /

       graphics	by vladimir.lavrov@desy.de.
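
       The rule that a PDP becomes unknown once unknown time exceeds half the
       step can be written down in a few lines. This is only a sketch of the
       rule as stated above, not RRDtool code; pdp_state is a hypothetical
       helper name.

```shell
#!/bin/sh
# Hypothetical helper: classify a PDP from the unknown seconds in its step.
# A PDP is unknown when unknown time exceeds half the step.
# Usage: pdp_state STEP UNKNOWN_SECONDS
pdp_state() {
    # Compare 2*unknown with step to stay in integer arithmetic.
    if [ $(( 2 * $2 )) -gt "$1" ]; then
        echo unknown
    else
        echo known
    fi
}

pdp_state 14 10   # step1 in the diagram: 10 "u" seconds out of 14
pdp_state 14 0    # a step with no unknown seconds
```

       The first call prints "unknown" and the second prints "known".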

HOW TO MEASURE
       Here are	a few hints on how to measure:

       Temperature
	   Usually you have some type of meter you can read to get the temper-
	   ature.   The	 temperature  is not really connected with a time. The
	   only	connection is that the temperature reading happened at a  cer-
	   tain	time. You can use the GAUGE data source	type for this. RRDtool
	   will	then record your reading together with the time.

       Mail Messages
	   Assume  you	have  a	 method	to count the number of messages	trans-
	   ported by your mailserver in	a certain amount of time,  giving  you
	   data like '5 messages in the last 65 seconds'. If you treat the
	   count of 5 as an ABSOLUTE data type you can simply update the RRD
	   with	the number 5 and the end time of your monitoring period.  RRD-
	   tool	will then record the number of messages	per second. If at some
	   later  stage	you want to know the number of messages	transported in
	   a day, you can get the average messages per second from RRDtool for
	   the day in question and multiply this number	 with  the  number  of
	   seconds  in a day. Because all math is run with Doubles, the	preci-
	   sion	should be acceptable.

       It's always a Rate
	   RRDtool stores rates	in amount/second for COUNTER, DERIVE  and  AB-
	   SOLUTE  data.   When	 you plot the data, you	will get on the	y axis
	   amount/second which you might be tempted to convert to an  absolute
	   amount by multiplying by the	delta-time between the points. RRDtool
	   plots continuous data, and as such is not appropriate for plotting
	   absolute amounts such as "total bytes" sent and received in a
	   router. What you probably want is to plot rates that you can scale
	   to bytes/hour, for example, or plot absolute amounts with another
	   tool	 that  draws  bar-plots,  where	the delta-time is clear	on the
	   plot	for each point (such that when you read	the graph you see  for
	   example  GB	on the y axis, days on the x axis and one bar for each
	   day).
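
	   The arithmetic behind the mail message hint can be sketched with
	   awk. The daily average of 0.05 messages per second is an assumed
	   figure for illustration, and per_day is a hypothetical helper name.

```shell
#!/bin/sh
# Rate stored for "5 messages in the last 65 seconds" (ABSOLUTE).
rate=$(awk 'BEGIN { printf "%.4f", 5 / 65 }')
echo "messages per second: $rate"

# Hypothetical helper: scale an average per-second rate to a daily total.
# Usage: per_day RATE
per_day() {
    awk -v r="$1" 'BEGIN { printf "%.0f\n", r * 86400 }'
}

per_day 0.05   # assumed daily average of 0.05 messages per second
```

	   The update is stored as roughly 0.0769 messages per second, and a
	   daily average of 0.05 messages per second scales back to 4320
	   messages.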

EXAMPLE
	rrdtool	create temperature.rrd --step 300 \
	 DS:temp:GAUGE:600:-273:5000 \
	 RRA:AVERAGE:0.5:1:1200	\
	 RRA:MIN:0.5:12:2400 \
	 RRA:MAX:0.5:12:2400 \
	 RRA:AVERAGE:0.5:12:2400

       This sets up an RRD called temperature.rrd which	accepts	 one  tempera-
       ture  value every 300 seconds. If no new	data is	supplied for more than
       600 seconds, the	temperature becomes *UNKNOWN*.	The minimum acceptable
       value is	-273 and the maximum is	5'000.

       A few archive areas are also defined. The first stores the temperatures
       supplied	for 100	hours (1'200 * 300 seconds = 100  hours).  The	second
       RRA  stores  the	minimum	temperature recorded over every	hour (12 * 300
       seconds = 1 hour), for 100 days (2'400 hours). The third and the fourth
       RRAs do the same for the maximum and average temperature,
       respectively.

EXAMPLE	2
	rrdtool	create monitor.rrd --step 300	     \
	  DS:ifOutOctets:COUNTER:1800:0:4294967295   \
	  RRA:AVERAGE:0.5:1:2016		     \
	  RRA:HWPREDICT:1440:0.1:0.0035:288

       This  example  is a monitor of a	router interface. The first RRA	tracks
       the traffic flow	in octets; the second RRA  generates  the  specialized
       functions  RRAs	for aberrant behavior detection. Note that the rra-num
       argument	of HWPREDICT is	missing, so the	other RRAs will	implicitly  be
       created with default parameter values. In this example, the forecasting
       algorithm  baseline adapts quickly; in fact the most recent one hour of
       observations (each at 5 minute intervals) accounts for 75% of the base-
       line prediction.	The linear trend forecast adapts much more slowly. Ob-
       servations made during the last day (at 288 observations	per  day)  ac-
       count  for only 65% of the predicted linear trend. Note:	these computa-
       tions rely on an	exponential smoothing formula described	 in  the  LISA
       2000 paper.
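
       As a rough cross-check of the percentages above, one can assume plain
       exponential smoothing, in which the n most recent observations carry a
       combined weight of 1 - (1 - a)^n. That formula is an assumption made
       here for illustration; the exact computation in the paper may differ.
       recent_weight is a hypothetical helper name.

```shell
#!/bin/sh
# Assumption: with simple exponential smoothing and parameter a, the n most
# recent observations carry a combined weight of 1 - (1 - a)^n.
# Usage: recent_weight A N   (prints a percentage)
recent_weight() {
    awk -v a="$1" -v n="$2" 'BEGIN { printf "%.0f\n", 100 * (1 - (1 - a) ^ n) }'
}

recent_weight 0.1 12       # alpha, one hour of 5 minute observations
recent_weight 0.0035 288   # beta, one day of observations
```

       Under that assumption the results are 72 and 64, in the same
       neighborhood as the figures quoted above.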

       The  seasonal  cycle  is	 one day (288 data points at 300 second	inter-
       vals), and the seasonal adaption	parameter will be set to 0.1. The  RRD
       file  will  store 5 days	(1'440 data points) of forecasts and deviation
       predictions before wrap around. The file	will store 1 day  (a  seasonal
       cycle) of 0-1 indicators	in the FAILURES	RRA.

       The  same  RRD  file  and  RRAs are created with	the following command,
       which explicitly	creates	all specialized	function RRAs.

	rrdtool	create monitor.rrd --step 300 \
	  DS:ifOutOctets:COUNTER:1800:0:4294967295 \
	  RRA:AVERAGE:0.5:1:2016 \
	  RRA:HWPREDICT:1440:0.1:0.0035:288:3 \
	  RRA:SEASONAL:288:0.1:2 \
	  RRA:DEVPREDICT:1440:5	\
	  RRA:DEVSEASONAL:288:0.1:2 \
	  RRA:FAILURES:288:7:9:5

       Of course, explicit creation need not replicate implicit creation; a
       number of arguments could be changed.

EXAMPLE	3
	rrdtool	create proxy.rrd --step	300 \
	  DS:Requests:DERIVE:1800:0:U  \
	  DS:Duration:DERIVE:1800:0:U  \
	  DS:AvgReqDur:COMPUTE:Duration,Requests,0,EQ,1,Requests,IF,/ \
	  RRA:AVERAGE:0.5:1:2016

       This example is monitoring the average request duration during each 300
       sec interval for	requests processed by a	web proxy during the interval.
       In this case, the proxy exposes two counters, the  number  of  requests
       processed since boot and	the total cumulative duration of all processed
       requests. Clearly these counters	both have some rollover	point, but us-
       ing  the	DERIVE data source also	handles	the reset that occurs when the
       web proxy is stopped and	restarted.

       In the RRD, the first data source stores	the requests per  second  rate
       during  the  interval. The second data source stores the	total duration
       of all requests processed during the interval divided by 300. The
       COMPUTE data source divides each PDP of the Duration data source by the
       corresponding PDP of the Requests data source and stores the average
       request duration. The remainder of the RPN expression handles the
       divide by zero case.
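
       Read in infix form, the RPN expression divides Duration by Requests,
       substituting 1 for the divisor when Requests is 0. The sketch below
       mirrors that logic with awk; avg_req_dur is a hypothetical helper name
       and the input values are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper mirroring Duration,Requests,0,EQ,1,Requests,IF,/
# Usage: avg_req_dur DURATION_RATE REQUEST_RATE
avg_req_dur() {
    awk -v d="$1" -v r="$2" 'BEGIN {
        divisor = (r == 0) ? 1 : r   # Requests,0,EQ,1,Requests,IF
        print d / divisor            # Duration,<divisor>,/
    }'
}

avg_req_dur 6 3   # 6 / 3
avg_req_dur 0 0   # zero requests: divide by 1 instead of 0
```

       The two calls print 2 and 0.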

AUTHOR
       Tobias Oetiker <tobi@oetiker.ch>

1.2.30				  2009-01-19			  RRDCREATE(1)
