Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
CDEFTUTORIAL(1)			    rrdtool		       CDEFTUTORIAL(1)

NAME
       cdeftutorial - Alex van den Bogaerdt's CDEF tutorial

DESCRIPTION
       Intention of this document: to provide some examples of the commonly
       used parts of RRDtool's CDEF language.

       If you think some important feature is not explained properly, and if
       adding it to this document would	benefit	most users, please do ask me
       to add it.  I will then try to provide an answer	in the next release of
       this tutorial.  No feedback equals no changes! Additions	to this
       document	are also welcome.  -- Alex van den Bogaerdt
       <alex@vandenbogaerdt.nl>

   Why this tutorial?
       One of the powerful parts of RRDtool is its ability to do all sorts of
       calculations on the data	retrieved from its databases. However,
       RRDtool's many options and syntax make it difficult for the average
       user to understand. The manuals are good	at explaining what these
       options do; however they	do not (and should not)	explain	in detail why
       they are	useful.	As with	my RRDtool tutorial: if	you want a simple
       document	in simple language you should read this	tutorial.  If you are
       happy with the official documentation, you may find this	document too
       simple or even boring. If you do	choose to read this tutorial, I	also
       expect you to have read and fully understand my other tutorial.

   More	reading
       If you have difficulties	with the way I try to explain it please	read
       Steve Rader's rpntutorial. It may help you understand how this all
       works.

What are CDEFs?
       When retrieving data from an RRD, you are using a "DEF" to work with
       that data. Think	of it as a variable that changes over time (where time
       is the x-axis). The value of this variable is what is found in the
       database	at that	particular time	and you	can't do any modifications on
       it. This	is what	CDEFs are for: they takes values from DEFs and perform
       calculations on them.

Syntax
	  DEF:var_name_1=some.rrd:ds_name:CF
	  CDEF:var_name_2=RPN_expression

       You first define	"var_name_1" to	be data	collected from data source
       "ds_name" found in RRD "some.rrd" with consolidation function "CF".

       Assume the ifInOctets SNMP counter is saved in mrtg.rrd as the DS "in".
       Then the	following DEF defines a	variable for the average of that data
       source:

	  DEF:inbytes=mrtg.rrd:in:AVERAGE

       Say you want to display bits per	second (instead	of bytes per second as
       stored in the database.)	 You have to define a calculation (hence
       "CDEF") on variable "inbytes" and use that variable (inbits) instead of
       the original:

	  CDEF:inbits=inbytes,8,*

       This tells RRDtool to multiply inbytes by eight to get inbits. I'll
       explain later how this works. In	the graphing or	printing functions,
       you can now use inbits where you	would use inbytes otherwise.

       Note that the variable name used	in the CDEF (inbits) must not be the
       same as the variable named in the DEF (inbytes)!

RPN-expressions
       RPN is short-hand for Reverse Polish Notation. It works as follows.
       You put the variables or	numbers	on a stack. You	also put operations
       (things-to-do) on the stack and this stack is then processed. The
       result will be placed on	the stack. At the end, there should be exactly
       one number left:	the outcome of the series of operations. If there is
       not exactly one number left, RRDtool will complain loudly.

       Above multiplication by eight will look like:

       1.  Start with an empty stack

       2.  Put the content of variable inbytes on the stack

       3.  Put the number eight	on the stack

       4.  Put the operation multiply on the stack

       5.  Process the stack

       6.  Retrieve the	value from the stack and put it	in variable inbits

       We  will	 now  do  an  example  with real numbers. Suppose the variable
       inbytes would have value	10, the	stack would be:

       1.  ||

       2.  |10|

       3.  |10|8|

       4.  |10|8|*|

       5.  |80|

       6.  ||

       Processing the stack (step 5) will retrieve one value  from  the	 stack
       (from  the  right  at  step 4). This is the operation multiply and this
       takes two values	off the	stack as input.	The result is put back on  the
       stack (the value	80 in this case). For multiplication the order doesn't
       matter, but for other operations	like subtraction and division it does.
       Generally speaking you have the following order:

	  y = A	- B  -->  y=minus(A,B)	-->  CDEF:y=A,B,-

       This  is	 not very intuitive (at	least most people don't	think so). For
       the function f(A,B) you reverse the position of "f",  but  you  do  not
       reverse the order of the	variables.

Converting your	wishes to RPN
       First,  get  a  clear  picture  of  what	you want to do.	Break down the
       problem in smaller portions until they cannot be	split anymore. Then it
       is rather simple	to convert your	ideas into RPN.

       Suppose you have	several	RRDs and would like to add up some counters in
       them. These could be, for instance, the counters	for every WAN link you
       are monitoring.

       You have:

	  router1.rrd with link1in link2in
	  router2.rrd with link1in link2in
	  router3.rrd with link1in link2in

       Suppose you would like to add up	all these counters, except for link2in
       inside router2.rrd. You need to do:

       (in this	example, "router1.rrd:link1in" means the DS link1in inside the
       RRD router1.rrd)

	  router1.rrd:link1in
	  router1.rrd:link2in
	  router2.rrd:link1in
	  router3.rrd:link1in
	  router3.rrd:link2in
	  --------------------	 +
	  (outcome of the sum)

       As a mathematical function, this	could be written:

       "add(router1.rrd:link1in	, router1.rrd:link2in ,	router2.rrd:link1in  ,
       router3.rrd:link1in , router3.rrd:link2.in)"

       With RRDtool and	RPN, first, define the inputs:

	  DEF:a=router1.rrd:link1in:AVERAGE
	  DEF:b=router1.rrd:link2in:AVERAGE
	  DEF:c=router2.rrd:link1in:AVERAGE
	  DEF:d=router3.rrd:link1in:AVERAGE
	  DEF:e=router3.rrd:link2in:AVERAGE

       Now, the	mathematical function becomes: "add(a,b,c,d,e)"

       In  RPN,	there's	no operator that sums more than	two values so you need
       to do several additions.	You add	a and b, add c to the result, add d to
       the result and add e to the result.

	  push a:	  a	stack contains the value of a
	  push b and add: b,+	stack contains the result of a+b
	  push c and add: c,+	stack contains the result of a+b+c
	  push d and add: d,+	stack contains the result of a+b+c+d
	  push e and add: e,+	stack contains the result of a+b+c+d+e

       What was	calculated here	would be written down as:

	  ( ( (	(a+b) +	c) + d)	+ e) >

       This is in RPN:	"CDEF:result=a,b,+,c,+,d,+,e,+"

       This is correct but it can be made more clear to	humans.	 It  does  not
       matter if you add a to b	and then add c to the result or	first add b to
       c  and  then add	a to the result. This makes it possible	to rewrite the
       RPN   into   "CDEF:result=a,b,c,d,e,+,+,+,+"   which    is    evaluated
       differently:

	  push value of	variable a on the stack: a
	  push value of	variable b on the stack: a b
	  push value of	variable c on the stack: a b c
	  push value of	variable d on the stack: a b c d
	  push value of	variable e on the stack: a b c d e
	  push operator	+ on the stack:		 a b c d e +
	  and process it:			 a b c P   (where P == d+e)
	  push operator	+ on the stack:		 a b c P +
	  and process it:			 a b Q	   (where Q == c+P)
	  push operator	+ on the stack:		 a b Q +
	  and process it:			 a R	   (where R == b+Q)
	  push operator	+ on the stack:		 a R +
	  and process it:			 S	   (where S == a+R)

       As  you	can see	the RPN	expression "a,b,c,d,e,+,+,+,+,+" will evaluate
       in   "((((d+e)+c)+b)+a)"	  and	it   has   the	 same	 outcome    as
       "a,b,+,c,+,d,+,e,+".   This  is called the commutative law of addition,
       but you may forget this right away, as long as  you  remember  what  it
       means.

       Now look	at an expression that contains a multiplication:

       First  in  normal  math:	 "let  result =	a+b*c".	In this	case you can't
       choose the order	yourself, you have to start  with  the	multiplication
       and  then  add a	to it. You may alter the position of b and c, you must
       not alter the position of a and b.

       You have	to take	this in	consideration when converting this  expression
       into RPN. Read it as: "Add the outcome of b*c to	a" and then it is easy
       to write	the RPN	expression: "result=a,b,c,*,+" Another expression that
       would return the	same: "result=b,c,*,a,+"

       In normal math, you may encounter something like	"a*(b+c)" and this can
       also  be	converted into RPN. The	parenthesis just tell you to first add
       b and c,	and then multiply a with the result. Again, now	it is easy  to
       write  it in RPN: "result=a,b,c,+,*". Note that this is very similar to
       one  of	the  expressions  in  the   previous   paragraph,   only   the
       multiplication and the addition changed places.

       When  you  have	problems with RPN or when RRDtool is complaining, it's
       usually a good thing to write down the stack on a piece	of  paper  and
       see  what  happens.  Have  the  manual ready and	pretend	to be RRDtool.
       Just do all the math by hand to see what	happens, I'm  sure  this  will
       solve most, if not all, problems	you encounter.

Some special numbers
   The unknown value
       Sometimes  collecting  your  data  will	fail. This can be very common,
       especially when querying	over busy links. RRDtool can be	configured  to
       allow for one (or even more) unknown value(s) and calculate the missing
       update.	You can, for instance, query your device every minute. This is
       creating	one so called PDP or primary data point	 per  minute.  If  you
       defined	your  RRD  to  contain an RRA that stores 5-minute values, you
       need five of those PDPs to create one CDP  (consolidated	 data  point).
       These PDPs can become unknown in	two cases:

       1.  The	updates	are too	far apart. This	is tuned using the "heartbeat"
	   setting.

       2.  The update was set to unknown on  purpose  by  inserting  no	 value
	   (using the template option) or by using "U" as the value to insert.

       When  a	CDP is calculated, another mechanism determines	if this	CDP is
       valid or	not. If	there are too many PDPs	unknown, the CDP is unknown as
       well.  This is determined by the	 xff  factor.  Please  note  that  one
       unknown	counter	 update	 can  result  in two unknown PDPs! If you only
       allow for one unknown PDP per CDP, this makes the CDP go	unknown!

       Suppose the counter increments with one per second and you retrieve  it
       every minute:

	  counter value	   resulting rate
	  10'000
	  10'060	    1; (10'060-10'000)/60 == 1
	  10'120	    1; (10'120-10'060)/60 == 1
	  unknown	    unknown; you don't know the	last value
	  10'240	    unknown; you don't know the	previous value
	  10'300	    1; (10'300-10'240)/60 == 1

       If  the	CDP  was to be calculated from the last	five updates, it would
       get two unknown PDPs and	three known PDPs. If xff would have  been  set
       to 0.5 which by the way is a commonly used factor, the CDP would	have a
       known  value of 1. If xff would have been set to	0.2 then the resulting
       CDP would be unknown.

       You have	to decide the proper values for	heartbeat, number of PDPs  per
       CDP  and	 the  xff  factor.  As you can see from	the previous text they
       define the behavior of your RRA.

   Working with	unknown	data in	your database
       As you have read	in the previous	chapter, entries in an RRA can be  set
       to  the	unknown	value. If you do calculations with this	type of	value,
       the result has to be unknown too. This means that an expression such as
       "result=a,b,+" will be unknown if either	a or b is unknown.   It	 would
       be  wrong  to just ignore the unknown value and return the value	of the
       other parameter.	By doing so, you would assume "unknown"	 means	"zero"
       and this	is not true.

       There  has  been	 a  case where somebody	was collecting data for	over a
       year.  A	new piece of equipment was installed, a	new  RRD  was  created
       and the scripts were changed to add a counter from the old database and
       a  counter from the new database. The result was	disappointing, a large
       part of the statistics seemed to	have vanished mysteriously  ...	  They
       of  course  didn't,  values  from  the old database (known values) were
       added to	values from the	new database (unknown values) and  the	result
       was unknown.

       In this case, it	is fairly reasonable to	use a CDEF that	alters unknown
       data  into zero.	The counters of	the device were	unknown	(after all, it
       wasn't installed	yet!) but you know that	 the  data  rate  through  the
       device  had  to	be  zero  (because  of	the  same  reason:  it was not
       installed).

       There are some examples below that make this change.

   Infinity
       Infinite	data is	another	form of	a special number. It cannot be graphed
       because by definition you would never reach the infinite	value. You can
       think of	positive and  negative	infinity  depending  on	 the  position
       relative	to zero.

       RRDtool	is  capable  of	 representing  (-not-  graphing!)  infinity by
       stopping	at its current maximum (for positive infinity) or minimum (for
       negative	infinity) without knowing this maximum (minimum).

       Infinity	in RRDtool is mostly used to draw an AREA without knowing  its
       vertical	 dimensions.  You  can	think of it as drawing an AREA with an
       infinite	height and displaying only the part that  is  visible  in  the
       current	graph. This is probably	a good way to approximate infinity and
       it sure allows for some neat tricks. See	below for examples.

   Working with	unknown	data and infinity
       Sometimes you would like	to discard unknown data	and pretend it is zero
       (or any other value for that matter) and	sometimes you  would  like  to
       pretend that known data is unknown (to discard known-to-be-wrong	data).
       This  is	 why  CDEFs  have  support  for	 unknown  data.	There are also
       examples	available that show unknown data by using infinity.

Some examples
   Example: using a recently created RRD
       You are keeping statistics on your router for over a year now. Recently
       you installed an	extra router and you would like	to show	 the  combined
       throughput for these two	devices.

       If  you	just  add up the counters from router.rrd and router2.rrd, you
       will  add  known	 data  (from  router.rrd)  to	unknown	  data	 (from
       router2.rrd) for	the bigger part	of your	stats. You could solve this in
       a few ways:

          While  creating the new database, fill it with zeros	from the start
	   to now.  You	have to	make the database start	at or before the least
	   recent time in the other database.

          Alternatively, you could use	CDEF and alter unknown data to zero.

       Both methods have their pros and	cons. The first	method is  troublesome
       and  if	you  want to do	that you have to figure	it out yourself. It is
       not possible to create a	database filled	with zeros, you	 have  to  put
       them in manually. Implementing the second method	is described next:

       What  we	want is: "if the value is unknown, replace it with zero". This
       could be	written	in pseudo-code as:  if (value is unknown) then	(zero)
       else  (value).  When  reading  the  rrdgraph manual you notice the "UN"
       function	that returns zero or one. You also notice  the	"IF"  function
       that takes zero or one as input.

       First  look at the "IF" function. It takes three	values from the	stack,
       the first value is the decision point, the second value is returned  to
       the  stack  if  the evaluation is "true"	and if not, the	third value is
       returned	to the stack. We want the "UN" function	to decide what happens
       so we combine those two functions in one	CDEF.

       Lets write down the two possible	paths for the "IF" function:

	  if true  return a
	  if false return b

       In RPN:	"result=x,a,b,IF" where	"x" is either true or false.

       Now we have to fill in "x", this	should be  the	"(value	 is  unknown)"
       part and	this is	in RPN:	 "result=value,UN"

       We  now	combine	them: "result=value,UN,a,b,IF" and when	we fill	in the
       appropriate things for "a" and "b" we're	finished:

       "CDEF:result=value,UN,0,value,IF"

       You may want to read Steve Rader's RPN guide if you  have  difficulties
       with the	way I explained	this last example.

       If you want to check this RPN expression, just mimic RRDtool behavior:

	  For any known	value, the expression evaluates	as follows:
	  CDEF:result=value,UN,0,value,IF  (value,UN) is not true so it	becomes	0
	  CDEF:result=0,0,value,IF	   "IF"	will return the	3rd value
	  CDEF:result=value		   The known value is returned

	  For the unknown value, this happens:
	  CDEF:result=value,UN,0,value,IF  (value,UN) is true so it becomes 1
	  CDEF:result=1,0,value,IF	   "IF"	sees 1 and returns the 2nd value
	  CDEF:result=0			   Zero	is returned

       Of  course, if you would	like to	see another value instead of zero, you
       can use that other value.

       Eventually, when	all unknown data is removed from the RRD, you may want
       to remove this rule so that unknown data	is properly displayed.

   Example: better handling of unknown data, by	using time
       The above example has one drawback. If you do log unknown data in  your
       database	  after	 installing  your  new	equipment,  it	will  also  be
       translated into zero and	therefore you  won't  see  that	 there	was  a
       problem.	This is	not good and what you really want to do	is:

          If  there  is  unknown  data, look at the time that this sample was
	   taken.

          If the unknown value	is before time xxx, make it zero.

          If it is after time xxx, leave it as	unknown	data.

       This is doable: you can compare the time	that the sample	was  taken  to
       some  known time. Assuming you started to monitor your device on	Friday
       September 17, 1999, 00:35:57 MET	DST. Translate this  time  in  seconds
       since  1970-01-01  and  it  becomes 937'521'357.	If you process unknown
       values that were	received after this  time,  you	 want  to  leave  them
       unknown	and  if	 they  were  "received"	 before	this time, you want to
       translate them into zero	(so you	 can  effectively  ignore  them	 while
       adding them to your other routers counters).

       Translating   Friday   September	  17,  1999,  00:35:57	MET  DST  into
       937'521'357 can be done by, for instance, using gnu date:

	  date -d "19990917 00:35:57" +%s

       You could also dump the database	and see	where the data	starts	to  be
       known. There are	several	other ways of doing this, just pick one.

       Now  we	have  to  create  the  magic that allows us to process unknown
       values different	depending on the time that the sample was taken.  This
       is a three step process:

       1.  If the timestamp of the value is after 937'521'357, leave it	as is.

       2.  If the value	is a known value, leave	it as is.

       3.  Change the unknown value into zero.

       Lets look at part one:

	   if (true) return the	original value

       We rewrite this:

	   if (true) return "a"
	   if (false) return "b"

       We need to calculate true or false from step 1.	There  is  a  function
       available  that	returns	 the  timestamp	 for the current sample. It is
       called, how surprisingly, "TIME". This time has to  be  compared	 to  a
       constant	 number, we need "GT". The output of "GT" is true or false and
       this is good input to "IF". We want "if (time > 937521357) then (return
       a) else (return b)".

       This process was	already	described thoroughly in	the  previous  chapter
       so lets do it quick:

	  if (x) then a	else b
	     where x represents	"time>937521357"
	     where a represents	the original value
	     where b represents	the outcome of the previous example

	  time>937521357       --> TIME,937521357,GT

	  if (x) then a	else b --> x,a,b,IF
	  substitute x	       --> TIME,937521357,GT,a,b,IF
	  substitute a	       --> TIME,937521357,GT,value,b,IF
	  substitute b	       --> TIME,937521357,GT,value,value,UN,0,value,IF,IF

       We		     end		    up			 with:
       "CDEF:result=TIME,937521357,GT,value,value,UN,0,value,IF,IF"

       This looks very complex,	however, as you	can see, it was	not  too  hard
       to come up with.

   Example: Pretending weird data isn't	there
       Suppose	you have a problem that	shows up as huge spikes	in your	graph.
       You know	this happens and  why,	so  you	 decide	 to  work  around  the
       problem.	 Perhaps you're	using your network to do a backup at night and
       by  doing  so  you  get	almost	10mb/s	while the rest of your network
       activity	does not produce numbers higher	than 100kb/s.

       There are two options:

       1.  If the number exceeds 100kb/s it is wrong and you  want  it	masked
	   out by changing it into unknown.

       2.  You don't want the graph to show more than 100kb/s.

       Pseudo code: if (number > 100) then unknown else	number or Pseudo code:
       if (number > 100) then 100 else number.

       The  second  "problem"  may also	be solved by using the rigid option of
       RRDtool graph, however this has not the same result.  In	 this  example
       you can end up with a graph that	does autoscaling. Also,	if you use the
       numbers to display maxima they will be set to 100kb/s.

       We  use "IF" and	"GT" again. "if	(x) then (y) else (z)" is written down
       as "CDEF:result=x,y,z,IF"; now fill in x, y and z.  For x you  fill  in
       "number	greater	 than  100kb/s"	 becoming  "number,100000,GT" (kilo is
       1'000 and b/s is	what we	measure!).  The	"z" part is "number"  in  both
       cases  and  the	"y"  part is either "UNKN" for unknown or "100000" for
       100kb/s.

       The two CDEF expressions	would be:

	   CDEF:result=number,100000,GT,UNKN,number,IF
	   CDEF:result=number,100000,GT,100000,number,IF

   Example: working on a certain time span
       If you want a graph that	spans a	few weeks, but would only want to  see
       some  routers'  data  for  one week, you	need to	"hide" the rest	of the
       time frame. Don't ask me	when this would	be useful, it's	just here  for
       the example :)

       We  need	 to  compare  the  time	stamp to a begin date and an end date.
       Comparing isn't difficult:

	       TIME,begintime,GE
	       TIME,endtime,LE

       These two parts of the CDEF produce either 0 for	false or 1  for	 true.
       We  can	now  check if they are both 0 (or 1) using a few IF statements
       but, as Wataru Satoh pointed out, we can	use the	"*" or	"+"  functions
       as logical AND and logical OR.

       For  "*",  the  result  will  be	 zero (false) if either	one of the two
       operators is zero.  For "+", the	result will only be false (0) when two
       false (0) operators will	be added.  Warning: *any* number not equal  to
       0  will	be  considered "true". This means that,	for instance, "-1,1,+"
       (which should be	"true or true")	will become FALSE ...  In other	words,
       use "+" only if you know	for sure that you have	positive  numbers  (or
       zero) only.

       Let's compile the complete CDEF:

	       DEF:ds0=router1.rrd:AVERAGE
	       CDEF:ds0modified=TIME,begintime,GT,TIME,endtime,LE,*,ds0,UNKN,IF

       This  will return the value of ds0 if both comparisons return true. You
       could also do it	the other way around:

	       DEF:ds0=router1.rrd:AVERAGE
	       CDEF:ds0modified=TIME,begintime,LT,TIME,endtime,GT,+,UNKN,ds0,IF

       This will return	an UNKNOWN if either comparison	returns	true.

   Example: You	suspect	to have	problems and want to see unknown data.
       Suppose you add up the number  of  active  users	 on  several  terminal
       servers.	  If  one of them doesn't give an answer (or an	incorrect one)
       you get "NaN" in	the database ("Not a Number") and NaN is evaluated  as
       Unknown.

       In  this	 case,	you  would like	to be alerted to it and	the sum	of the
       remaining values	is of no value to you.

       It would	be something like:

	   DEF:users1=location1.rrd:onlineTS1:LAST
	   DEF:users2=location1.rrd:onlineTS2:LAST
	   DEF:users3=location2.rrd:onlineTS1:LAST
	   DEF:users4=location2.rrd:onlineTS2:LAST
	   CDEF:allusers=users1,users2,users3,users4,+,+,+

       If you now plot allusers, unknown data in one  of  users1..users4  will
       show  up	 as  a	gap  in	 your graph. You want to modify	this to	show a
       bright red line,	not a gap.

       Define an extra CDEF that is unknown if all is okay and is infinite  if
       there is	an unknown value:

	   CDEF:wrongdata=allusers,UN,INF,UNKN,IF

       "allusers,UN" will evaluate to either true or false, it is the (x) part
       of  the	"IF"  function	and it checks if allusers is unknown.  The (y)
       part of the "IF"	function is set	to "INF" (which	 means	infinity)  and
       the (z) part of the function returns "UNKN".

       The  logic  is:	if  (allusers  == unknown) then	return INF else	return
       UNKN.

       You can now use AREA to display this "wrongdata"	in bright red.	If  it
       is unknown (because allusers is known) then the red AREA	won't show up.
       If  the	value  is  INF (because	allusers is unknown) then the red AREA
       will be filled in on the	graph at that particular time.

	  AREA:allusers#0000FF:combined	user count
	  AREA:wrongdata#FF0000:unknown	data

   Same	example	useful with STACKed data:
       If you use stack	in the previous	example	(as I would do)	then you don't
       add up the values. Therefore, there is no relationship between the four
       values and you don't get	a single value to test.	 Suppose users3	 would
       be  unknown  at one point in time: users1 is plotted, users2 is stacked
       on top of users1, users3	is  unknown  and  therefore  nothing  happens,
       users4 is stacked on top	of users2.  Add	the extra CDEFs	anyway and use
       them to overlay the "normal" graph:

	  DEF:users1=location1.rrd:onlineTS1:LAST
	  DEF:users2=location1.rrd:onlineTS2:LAST
	  DEF:users3=location2.rrd:onlineTS1:LAST
	  DEF:users4=location2.rrd:onlineTS2:LAST
	  CDEF:allusers=users1,users2,users3,users4,+,+,+
	  CDEF:wrongdata=allusers,UN,INF,UNKN,IF
	  AREA:users1#0000FF:users at ts1
	  STACK:users2#00FF00:users at ts2
	  STACK:users3#00FFFF:users at ts3
	  STACK:users4#FFFF00:users at ts4
	  AREA:wrongdata#FF0000:unknown	data

       If there	is unknown data	in one of users1..users4, the "wrongdata" AREA
       will  be	 drawn	and  because  it starts	at the X-axis and has infinite
       height it will effectively overwrite the	STACKed	parts.

       You could combine the two CDEF lines into one (we don't use "allusers")
       if you like.  But there are good	reasons	for writing two	CDEFS:

          It improves the readability of the script.

          It can be used inside GPRINT	to display the total number of users.

       If you choose to	combine	them, you can substitute the "allusers"	in the
       second CDEF with	the part after the equal sign from the first line:

	  CDEF:wrongdata=users1,users2,users3,users4,+,+,+,UN,INF,UNKN,IF

       If you do so, you won't be able to use these next GPRINTs:

	  COMMENT:"Total number	of users seen"
	  GPRINT:allusers:MAX:"Maximum:	%6.0lf"
	  GPRINT:allusers:MIN:"Minimum:	%6.0lf"
	  GPRINT:allusers:AVERAGE:"Average: %6.0lf"
	  GPRINT:allusers:LAST:"Current: %6.0lf\n"

The examples from the RRD graph	manual page
   Degrees Celsius vs. Degrees Fahrenheit
       To convert Celsius into Fahrenheit use the formula F=9/5*C+32

	  rrdtool graph	demo.png --title="Demo Graph" \
	     DEF:cel=demo.rrd:exhaust:AVERAGE \
	     CDEF:far=9,5,/,cel,*,32,+ \
	     LINE2:cel#00a000:"D. Celsius" \
	     LINE2:far#ff0000:"D. Fahrenheit\c"

       This example gets the DS	called "exhaust" from database "demo.rrd"  and
       puts  the  values  in  variable	"cel".	The  CDEF used is evaluated as
       follows:

	  CDEF:far=9,5,/,cel,*,32,+
	  1. push 9, push 5
	  2. push function "divide" and	process	it
	     the stack now contains 9/5
	  3. push variable "cel"
	  4. push function "multiply" and process it
	     the stack now contains 9/5*cel
	  5. push 32
	  6. push function "plus" and process it
	     the stack contains	now the	temperature in Fahrenheit

   Changing unknown into zero
	  rrdtool graph	demo.png --title="Demo Graph" \
	     DEF:idat1=interface1.rrd:ds0:AVERAGE \
	     DEF:idat2=interface2.rrd:ds0:AVERAGE \
	     DEF:odat1=interface1.rrd:ds1:AVERAGE \
	     DEF:odat2=interface2.rrd:ds1:AVERAGE \
	     CDEF:agginput=idat1,UN,0,idat1,IF,idat2,UN,0,idat2,IF,+,8,* \
	     CDEF:aggoutput=odat1,UN,0,odat1,IF,odat2,UN,0,odat2,IF,+,8,* \
	     AREA:agginput#00cc00:Input	Aggregate \
	     LINE1:aggoutput#0000FF:Output Aggregate

       These two CDEFs are built from several functions.  It  helps  to	 split
       them  when  viewing what	they do. Starting with the first CDEF we would
       get:

	idat1,UN --> a
	0	 --> b
	idat1	 --> c
	if (a) then (b)	else (c)

       The result is therefore "0" if it is true that "idat1" equals "UN".  If
       not, the	original value of "idat1" is put back on the stack.  Lets call
       this answer "d".	The process is repeated	for the	next five items	on the
       stack, it is done the same and will return answer  "h".	The  resulting
       stack  is  therefore  "d,h".   The  expression  has  been simplified to
       "d,h,+,8,*" and it will now be easy to see that we add "d" and "h", and
       multiply	the result with	eight.

       The end result is that we have added "idat1" and	 "idat2"  and  in  the
       process we effectively ignored unknown values. The result is multiplied
       by eight, most likely to	convert	bytes/s	to bits/s.

   Infinity demo
	  rrdtool graph	example.png --title="INF demo" \
	     DEF:val1=some.rrd:ds0:AVERAGE \
	     DEF:val2=some.rrd:ds1:AVERAGE \
	     DEF:val3=some.rrd:ds2:AVERAGE \
	     DEF:val4=other.rrd:ds0:AVERAGE \
	     CDEF:background=val4,POP,TIME,7200,%,3600,LE,INF,UNKN,IF \
	     CDEF:wipeout=val1,val2,val3,val4,+,+,+,UN,INF,UNKN,IF \
	     AREA:background#F0F0F0 \
	     AREA:val1#0000FF:Value1 \
	     STACK:val2#00C000:Value2 \
	     STACK:val3#FFFF00:Value3 \
	     STACK:val4#FFC000:Value4 \
	     AREA:whipeout#FF0000:Unknown

       This  demo demonstrates two ways	to use infinity. It is a bit tricky to
       see what	happens	in the "background" CDEF.

	  "val4,POP,TIME,7200,%,3600,LE,INF,UNKN,IF"

       This RPN	takes the value	 of  "val4"  as	 input	and  then  immediately
       removes	it from	the stack using	"POP". The stack is now	empty but as a
       side effect we now know the time	that this sample was taken.  This time
       is put on the stack by the "TIME" function.

       "TIME,7200,%" takes the modulo of time and 7'200	(which is two  hours).
       The  resulting  value on	the stack will be a number in the range	from 0
       to 7199.

       For people who don't know the modulo  function:	it  is	the  remainder
       after an	integer	division. If you divide	16 by 3, the answer would be 5
       and the remainder would be 1. So, "16,3,%" returns 1.

       We  have	 the result of "TIME,7200,%" on	the stack, lets	call this "a".
       The start of the	RPN has	become "a,3600,LE" and this checks if  "a"  is
       less or equal than "3600". It is	true half of the time.	We now have to
       process	the  rest  of  the RPN and this	is only	a simple "IF" function
       that returns either "INF" or "UNKN" depending  on  the  time.  This  is
       returned	to variable "background".

       The second CDEF has been	discussed earlier in this document so we won't
       do that here.

       Now  you	 can draw the different	layers.	Start with the background that
       is either unknown (nothing to see) or infinite (the whole positive part
       of the graph gets filled).

       Next you	draw the data on top of	this background, it will  overlay  the
       background.  Suppose  one  of val1..val4	would be unknown, in that case
       you end up with only three bars stacked on  top	of  each  other.   You
       don't  want  to	see  this because the data is only valid when all four
       variables are valid. This is why	you  use  the  second  CDEF,  it  will
       overlay the data	with an	AREA so	the data cannot	be seen	anymore.

       If  your	 data can also have negative values you	also need to overwrite
       the other half of your graph. This can be done in a  relatively	simple
       way:  what you need is the "wipeout" variable and place a negative sign
       before it:  "CDEF:wipeout2=wipeout,-1,*"

   Filtering data
       You may do some complex data filtering:

	 MEDIAN	FILTER:	filters	shot noise

	   DEF:var=database.rrd:traffic:AVERAGE
	   CDEF:prev1=PREV(var)
	   CDEF:prev2=PREV(prev1)
	   CDEF:median=var,prev1,prev2,3,SORT,POP,EXC,POP
	   LINE3:median#000077:filtered
	   LINE1:prev2#007700:'raw data'

	 DERIVATE:

	   DEF:var=database.rrd:traffic:AVERAGE
	   CDEF:prev1=PREV(var)
	   CDEF:time=var,POP,TIME
	   CDEF:prevtime=PREV(time)
	   CDEF:derivate=var,prev1,-,time,prevtime,-,/
	   LINE3:derivate#000077:derivate
	   LINE1:var#007700:'raw data'

Out of ideas for now
       This document was created from questions	asked by either	myself	or  by
       other  people  on  the  RRDtool mailing list. Please let	me know	if you
       find errors in it or if you have	trouble	understanding it. If you think
       there should be an addition, mail me: <alex@vandenbogaerdt.nl>

       Remember: No feedback equals no changes!

SEE ALSO
       The RRDtool manpages

AUTHOR
       Alex van	den Bogaerdt <alex@vandenbogaerdt.nl>

1.9.0				  2024-07-29		       CDEFTUTORIAL(1)

Want to link to this manual page? Use this URL:
<https://man.freebsd.org/cgi/man.cgi?query=cdeftutorial&sektion=1&manpath=FreeBSD+Ports+15.0>

home | help