Using CDEFs to manipulate data in RRDTool

Introduction

The CDEF directive provides a means for manipulating the “raw” data stored in a round-robin archive (RRA).  It is typically used to apply a mathematical function to each data point referenced by one or more DEF statements.  The result is an array of new values, each of which remains associated with the time of the original “raw” data point.  This is an in-memory transformation only; the original data remains unchanged in the RRA.

There is often confusion about the differences between the DEF, CDEF, and VDEF directives.  It may help to think of them in the following manner:

  • A DEF directive references a set of “raw” data as it is stored in an RRA
  • A CDEF directive applies a function to each data point it references
  • A VDEF directive applies a function to an aggregate of data points
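
As a rough analogy only (this is not how rrdtool is implemented), the three directives can be pictured as a list, a per-element transformation of that list, and a single value computed from the whole list.  The sample values below are hypothetical.

```python
# DEF: the raw series as fetched from the RRA (hypothetical byte counts).
raw = [2097152, 3145728, 1048576]

# CDEF: a function applied to each data point, giving a new series
# of the same length.
megabytes = [value / 1048576 for value in raw]

# VDEF: a function applied to the series as a whole, giving one value
# (here a maximum).
peak = max(megabytes)

print(megabytes)  # [2.0, 3.0, 1.0]
print(peak)       # 3.0
```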

The CDEF Directive

The basic format for a CDEF directive is as follows:

CDEF:Label=RPN Expression

Label is the name of the CDEF.  It may be referenced in other directives for inclusion in LINE or AREA charts, or even in other CDEF calculations.  It may be from 1 to 19 characters long and consist of characters in the set [a-zA-Z0-9_].  Note that the label must be unique and cannot overlap with any labels assigned to other DEFs, CDEFs, or VDEFs.

RPN Expression is the mathematical or logical expression that may be used to manipulate the raw data values as referenced by DEF or CDEF directives or even a pure mathematical function.  The expression uses Reverse Polish Notation to eliminate confusion or errors that may occur with the precedence rules required of traditional infix notation.  There are a number of mathematical, boolean and logical operators available for inclusion in a CDEF directive.
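
To make the evaluation order concrete, here is a minimal Python sketch of a stack-based evaluator for a small subset of RPN (the four arithmetic operators only).  It is an illustration, not rrdtool's implementation; the function names eval_rpn and apply_cdef and the sample data are assumptions made for this example.

```python
import operator

# Only the four arithmetic operators are modeled here; rrdtool supports many more.
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression, values):
    """Evaluate a comma-separated RPN expression for a single data point.

    `values` maps labels (as a DEF or CDEF would define them) to numbers.
    """
    stack = []
    for token in expression.split(","):
        if token in BINARY_OPS:
            # The operator consumes the two most recent operands.
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[token](left, right))
        elif token in values:
            stack.append(values[token])
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression: " + expression)
    return stack[0]


def apply_cdef(expression, series):
    """Apply the expression to every data point, keeping each timestamp."""
    return {
        timestamp: eval_rpn(expression, point)
        for timestamp, point in series.items()
    }


# Two hypothetical data points, keyed by Unix timestamp.
raw = {
    1257033600: {"disk1": 2097152, "disk2": 1048576},
    1257033900: {"disk1": 3145728, "disk2": 1048576},
}
print(apply_cdef("disk1,disk2,+,1048576,/", raw))
# {1257033600: 3.0, 1257033900: 4.0}
```

Because operands are pushed before the operator that consumes them, the expression needs no parentheses: the sum is complete on the stack before the division token is reached.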

CDEF Examples

Example 1

This example illustrates the commonly used transformation of bytes to megabytes.  Disk and memory readings are often reported in bytes, but frequently this is not the most convenient unit for visualization.  In this case, the CDEF directive takes the “raw” data value as referenced by the DEF and divides it by 1048576 (1024 x 1024).  Note that the AREA directive now references the CDEF label and that the vertical label has been updated to reflect the proper units.

rrdtool graph "Example 1 CDEF.png" \
--start "end-48 hours" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 1" \
--vertical-label "Megabytes" \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
CDEF:megadisk1=disk1,1048576,/ \
AREA:megadisk1#0000FF:"Disk 1"

Example 1 CDEF

Example 2

This example shows a slightly more complex instance of the reverse-polish math that may be used in a CDEF.  In this case, the CDEF first sums the “raw” values referenced by the two DEF statements and then converts the sum to megabytes.

rrdtool graph "Example 2 CDEF.png" \
--start "end-48 hours" --end "Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 2" \
--vertical-label "Megabytes" \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
DEF:disk2=sysinfo.rrd:disk2_used:AVERAGE \
CDEF:megadisk=disk1,disk2,+,1048576,/ \
AREA:megadisk#0000FF:"Total Disk Used"

Example 2 CDEF

Example 3

This example illustrates a common technique for differentiating several types of related measurements.  It is a frequent graphing style for disk IO (reads vs. writes) as well as network IO (octets in vs. octets out).  It is achieved by simply negating the values of one of the operations and graphing the result.  In this example, a horizontal rule (HRULE) at the 0 point has also been added in order to highlight the baseline.  Note that the HRULE is specified as the last element to be drawn, which ensures that it will overlay the other graphed elements.

rrdtool graph "Example 3 CDEF.png" \
--start "end-48 hours" --end "Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 3" \
--vertical-label "Bytes" \
DEF:read=sysinfo.rrd:bytes_read:AVERAGE \
DEF:write=sysinfo.rrd:bytes_written:AVERAGE \
CDEF:negwrite=0,write,- \
AREA:read#0000FF:"Bytes Read" \
AREA:negwrite#00FF00:"Bytes Written" \
HRULE:0#000000

Example 3 CDEF

Example 4

This is a more complex example which illustrates the use of the IF operation as well as a more advanced graphing style that is useful for calling out anomalous behavior.  In this case, the temperature values (referenced by the DEF “temp”) are first assessed by the LE (less than or equal to) and GT (greater than) operations.  The values assigned in these CDEFs (iscool, ishot) are then used in the CDEFs with the IF operations.  Each IF operation evaluates iscool or ishot and, if it is “true” (i.e. not zero), returns the value for temp.  Otherwise, the special constant “unknown” (UNKN) is returned.  The cool and hot CDEFs are then graphed, which results in a clear demarcation of where the system is overheated.

rrdtool graph "Example 4 CDEF.png" \
--start "end-48 hours" --end "Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 4" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
CDEF:iscool=temp,175,LE \
CDEF:ishot=temp,175,GT \
CDEF:cool=iscool,temp,UNKN,IF \
CDEF:hot=ishot,temp,UNKN,IF \
AREA:cool#0000FF:"cool" \
AREA:hot#FF0000:"hot"

Example 4 CDEF
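
The logic of the four CDEFs in Example 4 can be sketched in Python as follows.  This is only a model of the per-point arithmetic: None stands in for rrdtool's unknown value, the function name split_by_threshold is hypothetical, and the sample temperatures are made up.

```python
def split_by_threshold(temps, threshold=175):
    """Split one series into a 'cool' and a 'hot' series of equal length."""
    cool, hot = [], []
    for temp in temps:
        is_cool = temp <= threshold              # temp,175,LE
        is_hot = temp > threshold                # temp,175,GT
        cool.append(temp if is_cool else None)   # iscool,temp,UNKN,IF
        hot.append(temp if is_hot else None)     # ishot,temp,UNKN,IF
    return cool, hot


cool, hot = split_by_threshold([160, 175, 190, 170])
print(cool)  # [160, 175, None, 170]
print(hot)   # [None, None, 190, None]
```

Every time slot holds a value in exactly one of the two series, which is why the two AREAs never overlap.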

Example 5

This example illustrates the use of the LIMIT operation to achieve a similar effect for highlighting anomalous conditions.  In this case, the entire data set is initially graphed using the “hot” color and then the “cool” data set is overlaid on top of the appropriate sections.

rrdtool graph "Example 5 CDEF.png" \
--start "end-48 hours" --end "12am Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 5" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
CDEF:cool=temp,0,175,LIMIT \
AREA:temp#FF0000:"hot" \
AREA:cool#0000FF:"cool"

Example 5 CDEF
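
As a sketch of what LIMIT does to each data point in Example 5, the Python below passes values inside the inclusive range through and turns everything else into unknown.  None again stands in for unknown, and the helper name limit and the sample data are assumptions for illustration.

```python
def limit(value, low, high):
    """Model of value,low,high,LIMIT: keep in-range values, else unknown."""
    if value is None or value < low or value > high:
        return None
    return value


temps = [160, 175, 190, 170]
cool = [limit(temp, 0, 175) for temp in temps]
print(cool)  # [160, 175, None, 170]
```

Where the “cool” series is unknown nothing is drawn over the “hot” AREA, so the full-height red column remains visible for those time slots.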

Example 6

This example shows the use of the MIN operation to provide a “layer cake effect” in the graph.  This is a popular graphing technique that can be used either to signify a state change above a given threshold or simply to provide a color gradient in the graph for extra polish. This particular example again relies on overlaying the “cool” graph to mask out the relevant sections of the data.

rrdtool graph "Example 6 CDEF.png" \
--start "end-48 hours" --end "12am Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 6" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
CDEF:cool=temp,175,MIN \
AREA:temp#FF0000:"hot" \
AREA:cool#0000FF:"cool"

Example 6 CDEF
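
The difference from the LIMIT approach can be seen in a small Python model of the MIN operation.  The function name cap and the sample data are hypothetical.

```python
def cap(values, threshold):
    """Model of value,threshold,MIN applied to each data point."""
    return [min(value, threshold) for value in values]


temps = [160, 175, 190, 170]
cool = cap(temps, 175)
print(cool)  # [160, 175, 175, 170]
```

Unlike LIMIT, the capped series never becomes unknown.  The “cool” AREA therefore covers the “hot” AREA up to the threshold everywhere, and only the portion above 175 shows through in the “hot” color.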

Rules, Legends and Scales with RRDTool

Introduction

This section covers some of the basic options and directives that can be used to personalize the graph.  There are a number of options available to control how the legends and other text are displayed, adjustments for the scaling behavior, as well as means for demarcating significant levels or marking points in time.

Rule Directives

Rules are simply straight lines that are drawn in the graphing area.  There are two variations:  Horizontal Rules (HRULEs) and Vertical Rules (VRULEs).  Typical uses of rules include displaying a bar across the graph to indicate an important threshold, or a delimiter that reflects a system change.  Note that rules will not be drawn if they are not within the scope of the actual data ranges being displayed.  The general format for rules is as follows:

HRULE:Value#Color:Legend

VRULE:Time#Color:Legend

Value represents the value on the y-axis at which a horizontal rule is drawn.  Note that horizontal rules can only be perfectly horizontal; it is not possible to supply a formula for a sloped line.  If the values of the data being displayed are not within the range of the horizontal line, then the line will not be displayed.  A frequent use of an HRULE is to demarcate a critical threshold in the data.

Time represents the value on the x-axis that will apply to a vertical rule.  The time must be presented as a standard Unix epoch value (number of seconds since Jan 1, 1970 UTC).  Note that if the time range of the data displayed does not contain the time specified for the VRULE, the rule will not be displayed.  A common use of a VRULE is to flag a “state change” in the measured systems, such as a new software release.

Color defines the color of the rule.  This is expressed as a web-standard RGB hexadecimal triplet and must be separated from the value or time by a ‘#’.  Example color definitions are: 333399 (blue), 33FF00 (bright green), and CC0000 (red).

Legend is the text associated with the legend entry for the graph.  It is an optional element, and if it is omitted the ‘:’ separator should also be omitted.
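
Since a VRULE takes its time as a Unix epoch value, it can help to compute that value rather than work it out by hand.  The following is a small Python sketch using only the standard library; the helper name to_epoch is an assumption for this example, and the input is interpreted as UTC.

```python
from datetime import datetime, timezone


def to_epoch(year, month, day, hour=0, minute=0):
    """Return the Unix epoch seconds for a UTC wall-clock time."""
    moment = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return int(moment.timestamp())


print(to_epoch(2009, 12, 30, 19, 30))
# 1262201400

# The reverse direction, for checking an existing VRULE value.
print(datetime.fromtimestamp(1262201400, tz=timezone.utc).isoformat())
# 2009-12-30T19:30:00+00:00
```

Note that 19:30 UTC corresponds to 11:30am at a UTC-8 offset, so a rule placed from a local wall-clock time needs that offset taken into account.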

Rule Examples

Example 1

The following is a simple example of the different rule types.  The HRULE specifies a red horizontal line drawn at the value of 750,000,000 on the y-axis.  The VRULE specifies a light green vertical line drawn at 11:30am local time on Dec 30, 2009 (1262201400 in Unix epoch format, i.e. 19:30 UTC).

rrdtool graph "Example 1.png" \
--start "end-48 hours" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 1" \
--vertical-label "Bytes" \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
AREA:disk1#0000FF:"Disk 1" \
HRULE:750000000#FF0000:"750 MB Warning" \
VRULE:1262201400#00FF00:"Software Rollout"

Example 1

Example 2

The following example illustrates the use of a time value for the VRULE (Dec 18, 2009) that is not within the scope of the graphed data.  Note that the rule in this case is not displayed, nor is there a legend element.  If the scope of the graphed data is changed to encompass the value of the VRULE, then the rule will again be displayed.  This behavior provides some utility in demarcating significant events without causing the x-axis to be “pinned” to a specified time.

rrdtool graph "Example 2 Rule.png" \
--start "end-48 hours" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 2" \
--vertical-label "Bytes" \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
AREA:disk1#0000FF:"Disk 1" \
HRULE:750000000#FF0000:"750 MB Warning" \
VRULE:1261201400#00FF00:"Software Rollout"

Example 2

Legend and Title Options

--title specifies the text to be displayed above the graphing canvas.

--vertical-label specifies the text to be displayed next to the y-axis.  Typically, this indicates the units of the measurement.

--right-axis specifies an alternate y-axis scale that will be displayed on the right side of the graph.  This can be used to realign the units displayed which may have been adjusted to increase the legibility of the graph.  This parameter requires both a scale and offset (scale:offset) be specified.

--right-axis-label specifies the text to be displayed on the right axis.

--no-legend omits the legend information from being drawn.

--legend-position defines where the legend will be displayed in the graph.  Acceptable values are north, south, east and west.  The default value is south.

Legend and Title Examples

Example 1

This simple example illustrates the title, vertical-label, and legend-position parameters in use.  Note that the text for the label and title is quoted to ensure proper parsing.  The legend-position overrides the default value of south with east.

rrdtool graph "Example 1 Legend.png" \
--start "end-48 hours" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 1" \
--vertical-label "Bytes" \
--legend-position east \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
AREA:disk1#0000FF:"Disk 1"

Example 1 Legend

Example 2

This example illustrates the use of the right-axis to specify an alternate scale.  In this case, the y-axis scale on the left (the default) reflects the CPU temperature measured in Celsius.  The y-axis scale on the right reflects the fan speed measured in RPM.  For this example, the scale has been set to 1 and the offset specified as 0.

rrdtool graph "Example 2 Legend.png" \
--start "end-48 hours" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 2" \
--vertical-label "Celsius" \
--right-axis 1:0 \
--right-axis-label RPM \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
DEF:fan=sysinfo.rrd:cpu_fan:AVERAGE \
AREA:temp#0000FF:"CPU Temperature" \
LINE1:fan#00FF00:"Fan"

Example 2 Legend

Example 3

This example further refines the graph defined in Example 2 by adjusting the values of the fan so that they are closer to the range of the temperatures.  (This is performed by the CDEF directive, which will be further described in later examples.)  To keep the units correct, the scale for the right axis is adjusted to compensate.  Note that in this example the variations in temperature and RPM are now both clearly visible, and the left and right scales on the y-axis reflect the different scales of the two data types.

rrdtool graph "Example 3 Legend.png" \
--start "end-48 hours" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 3" \
--vertical-label "Celsius" \
--right-axis 20:0 \
--right-axis-label RPM \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
DEF:fan=sysinfo.rrd:cpu_fan:AVERAGE \
CDEF:fan20=fan,20,/ \
AREA:temp#0000FF:"CPU Temperature" \
LINE1:fan20#00FF00:"Fan"

Example 3 Legend
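
The relationship between the CDEF and the right-axis setting in Example 3 is simple arithmetic, sketched below in Python.  It assumes that a right-axis label is the left-axis value multiplied by the scale plus the offset; the function names and the fan reading are hypothetical.

```python
def to_left_axis(value, scale, shift=0):
    """What the CDEF does: map a right-axis quantity into left-axis units."""
    return (value - shift) / scale


def right_axis_value(left_value, scale, shift=0):
    """What a right-axis label shows for a given left-axis position."""
    return left_value * scale + shift


fan_rpm = 1500
plotted = to_left_axis(fan_rpm, scale=20)            # fan,20,/
print(plotted)                                       # 75.0
print(right_axis_value(plotted, scale=20))           # 1500.0
```

Dividing the data by 20 and setting the right-axis scale to 20 cancel out, so the fan line can be read in RPM against the right axis while being drawn in the same range as the temperatures.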

Scale Options

--upper-limit provides an override of the default auto-scaling behavior by setting an explicit upper limit value for the y-axis.  Note that the value provided by this option will continue to be overridden if an actual data value in the graph exceeds the limit specified.

--lower-limit provides an override of the default auto-scaling behavior by setting an explicit lower limit value for the y-axis.  Note that the value provided by this option will continue to be overridden if an actual data value in the graph is lower than the limit specified.

--rigid is used in conjunction with the upper-limit and lower-limit options to provide a definitive maximum and minimum y-axis value.  With this option specified, the auto-scaling behavior will not adjust the scale even if a data value would exceed (or fall below) the upper or lower limits specified.

--logarithmic changes the y-axis to use a logarithmic scale instead of the default linear scale.  This can be useful to visualize fine-grain patterns in the data that may otherwise be obscured if the values are wide-ranging.

--units-exponent sets the exponent expressed in the y-axis scale to a fixed value.  For example, setting this to 3 causes the scale to consistently use units of 1000 (10^3).

--units-length defines how many characters rrdtool should assume are present in the y-axis scale labels.  This may be necessary to specify when deviating from the default values for the expression of the units (via the units-exponent, logarithmic, or units=si options) in order to prevent rrdtool from overlapping the vertical label and scale labels.

--units=si overrides the exponential notation with the standard SI unit symbols (k, M, etc.).  Note that the exponential notation is the default only for logarithmic graphs; linear graphs already use the SI notation.

Scale Examples

Example 1

This example illustrates the use of the upper and lower limit options.  Note that in this case the specified lower-limit is ignored, as an actual data value is lower than the value specified and so the auto-scaler adjusts the lower range of the scale to compensate.

rrdtool graph "Example 1 Scale.png" \
--start "end-1 month" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 1" \
--vertical-label "Bytes" \
--upper-limit 1000000000 \
--lower-limit 500000000 \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
AREA:disk1#0000FF:"Disk"

Example 1 Scale

Example 2

This example demonstrates how the rigid option can be used to enforce the specified upper and lower limits.  Note that this strict adherence to the specified limits may prevent data that is out of range from being displayed.

rrdtool graph "Example 2 Scale.png" \
--start "end-1 month" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 2" \
--vertical-label "Bytes" \
--upper-limit 1000000000 \
--lower-limit 500000000 \
--rigid \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
AREA:disk1#0000FF:"Disk"

Example 2 Scale
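
The difference between Scale Examples 1 and 2 can be summarized with a simplified Python model of how the limits and the data combine.  This is a sketch of the behavior described above, not of rrdtool's actual auto-scaler (which may also adjust the range to suit its grid); the function name axis_range and the data values are assumptions.

```python
def axis_range(data, lower=None, upper=None, rigid=False):
    """Return the (bottom, top) of the y-axis under a simplified model."""
    low, high = min(data), max(data)
    if rigid:
        # Rigid: the specified limits win, even if data falls outside them.
        return (
            lower if lower is not None else low,
            upper if upper is not None else high,
        )
    # Not rigid: a limit only widens the range and never clips the data.
    if lower is not None:
        low = min(low, lower)
    if upper is not None:
        high = max(high, upper)
    return (low, high)


data = [400_000_000, 900_000_000]  # hypothetical disk usage in bytes
print(axis_range(data, lower=500_000_000, upper=1_000_000_000))
# (400000000, 1000000000)
print(axis_range(data, lower=500_000_000, upper=1_000_000_000, rigid=True))
# (500000000, 1000000000)
```

In the first call the lower limit is ignored because a data value falls below it; in the second, the rigid flag keeps the limit and the data below it would be cut off.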

Example 3

In this example, the logarithmic option is specified to alter the y-axis scaling behavior.  The units-exponent is also specified, which keeps the scale expressed in fixed units of 10^3 (thousands).

rrdtool graph "Example 3 Scale.png" \
--start "end-1 month" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 3" \
--vertical-label "kilobytes" \
--logarithmic \
--units-exponent 3 \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
AREA:disk1#0000FF:"Disk"

Example 3 Scale

Example 4

This is an alternate view of the previous graph using the SI notation instead of the default exponential notation for a logarithmic graph.  In addition, the units-length option is specified to facilitate the alignment of the axis label and scale units.

rrdtool graph "Example 4 Scale.png" \
--start "end-1 month" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 4" \
--vertical-label "Bytes" \
--logarithmic \
--units=si \
--units-length 5 \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
AREA:disk1#0000FF:"Disk"

Example 4 Scale

Basic Area Graphs with RRDTool

Introduction

This post describes some basic techniques for generating area charts using RRDTool.  It builds upon the previous Line Graphs posting, which describes DEFs and other basic graphing options.  The data used for the examples was generated using the tool specified in the RRD Example Data posting.

Area Chart Graphing Directives

As specified in the Line Graphs posting, the data definitions (DEFs) must be declared prior to the display directives.  The general format for an area directive is as follows:

AREA:Label#Color:Legend:STACK

Label refers to the label given to a previously defined data definition (DEF).  This defines the source of the data to be graphed.

Color defines the color of the line or area graph.  This is expressed in the web-standard  RGB hexadecimal triplet and must be separated from the label by a ‘#’.  Example color definitions are: 333399 (blue), 33FF00 (bright green), and CC0000 (red).  If the color is not specified, then the area will be “invisible”.

Legend is the text associated with the legend entry for the graph.  It is an optional element, and if it is omitted the ‘:’ separator should also be omitted.

STACK specifies that the results in the graph element be offset from the top of the previous display output instead of the normal 0-based offset.  It is case-sensitive and should always be specified in all-caps.  It is an optional element, and if it is omitted the ‘:’ separator should also be omitted.  To specify a stacked element without a legend, simply omit any entry for the legend but remember the colon.  (e.g. AREA:label#00FFAA::STACK)
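
The offsetting performed by STACK amounts to a running total at each time step.  Here is a minimal Python sketch of that idea for a single time step; the function name stack_layers and the values are hypothetical.

```python
def stack_layers(layers):
    """Return a (bottom, top) pair for each layer at one time step."""
    bands = []
    base = 0
    for value in layers:
        # Each layer starts where the previous one ended.
        bands.append((base, base + value))
        base += value
    return bands


print(stack_layers([300, 200]))
# [(0, 300), (300, 500)]
```

The top of the last band is the sum of all the layers, which is why stacked areas are a natural way to show a total across several servers or services.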

Examples

Example 1

This is a simple example of an area graph.  In this case, there is only one data element defined and displayed (“disk1”).

rrdtool graph "Example 1.png" \
--start "end-28 hours" --end "Dec 31, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 1" \
--vertical-label "Bytes" \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
AREA:disk1#0000FF:"Disk 1"

Example 1

Example 2

This example illustrates more than one display element.  In this case, the elements are not stacked and so both use the default 0-based offset.  Note that the order of display directives also defines the drawing order, which may result in some (or all) of an area graph being hidden.

rrdtool graph "Example 2.png" \
--start "end-48 hours" --end "Jan 4, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 2" \
--vertical-label "Bytes" \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
DEF:disk2=sysinfo.rrd:disk2_used:AVERAGE \
AREA:disk1#0000FF:"Disk 1" \
AREA:disk2#FF0000:"Disk 2"

Example 2

Example 3

This example illustrates the use of the STACK option for the AREA directive.  Remember that the STACK offsets the data display to start from the top of the previously displayed element.  If there is no previous element, then the STACK option uses the default 0-based offset.  Any number of display directives can use the STACK option.  A frequent use of the STACK option is to illustrate a “summation” of a set of servers or services.

Compare this graph with that of Example 2, which covers the same set of data.

rrdtool graph "Example 3.png" \
--start "end-48 hours" --end "Jan 4, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 3" \
--vertical-label "Bytes" \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
DEF:disk2=sysinfo.rrd:disk2_used:AVERAGE \
AREA:disk1#0000FF:"Disk 1":STACK \
AREA:disk2#FF0000:"Disk 2":STACK

Example 3

Example 4

This example illustrates the technique of making an “invisible” element.  In this case, the “disk1” directive has no color or legend specified.  Although it is not displayed, it still counts as the previous element for the STACK option, so the “disk2” element is offset by its values.

rrdtool graph "Example 4.png" \
--start "end-48 hours" --end "Jan 4, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 4" \
--vertical-label "Bytes" \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
DEF:disk2=sysinfo.rrd:disk2_used:AVERAGE \
AREA:disk1::STACK \
AREA:disk2#FF0000:"Disk 2":STACK

Example 4