Graphing Running Averages in RRDTool

There are a number of times where the data collected may appear so erratic that it is difficult to identify any trends.  While use of a VDEF can provide a gross average of a data set, it doesn’t provide the utility of a true running average.  The running average can provide a consistent calculation regardless as to the time scale displayed in the graph.  Fortunately, RRDTool provides a flexible means via its TREND operator for calculating a running average to address this need.

To begin, setup a chart that graphs the desired data in a typical manner before adding in the running average.  The following graph command creates a somewhat typical visualization of the last day’s network traffic as illustrated by the following graph:

rrdtool graph example.png \
  --title "Network Traffic" \
  --width=500 \
  --slope-mode \
  --end=now \
  --start=end-24h \
  --base=1000 \
  DEF:bytesIn=network.rrd:bytesIn:AVERAGE \
  DEF:bytesOut=network.rrd:bytesOut:AVERAGE \
  CDEF:bpsIn=bytesIn,8,* \
  CDEF:bpsOut=bytesOut,8,* \
  CDEF:bpsOutNeg=bpsOut,-1,* \
  VDEF:bpsInTot=bpsIn,TOTAL \
  VDEF:bpsOutTot=bpsOut,TOTAL \
  COMMENT:"              current     min        max      total\n" \
  AREA:bpsIn#FF880044:"Bits/s In " \
  GPRINT:bpsIn:LAST:"%6.1lf%s\t" \
  GPRINT:bpsIn:MIN:"%6.1lf%s\t" \
  GPRINT:bpsIn:MAX:"%6.1lf%s\t" \
  GPRINT:bpsInTot:"%6.1lf%s\n" \
  LINE1:bpsIn#FF8800CC \
  AREA:bpsOutNeg#44C80044:"Bits/s Out" \
  GPRINT:bpsOut:LAST:"%6.1lf%s\t" \
  GPRINT:bpsOut:MIN:"%6.1lf%s\t" \
  GPRINT:bpsOut:MAX:"%6.1lf%s\t" \
  GPRINT:bpsOutTot:"%6.1lf%s\n" \
  LINE1:bpsOutNeg#44C800CC \
  HRULE:0#000000

With the basic graph setup, adding the elements necessary for the running average can now begin.  In this example, it appears there is a roughly hourly cycle (Time Machine backups) that defines the period so a reasonable start is to work with a minimum average calculated over 60 minutes.

First, additional CDEF statements need to be declared applying the TREND operator to the desired data.  The format for this operation is straightforward:

CDEF:label=data_source,time_span,TREND

In this case, two new data sources are defined and are given the labels bspInTrend and bpsOutTrendNeg.  The data sources for these are the previously declared bpsIn and bpsOutNeg and the time span for them is 1 hour (3600 seconds).

  CDEF:bpsInTrend=bpsIn,3600,TREND \
  CDEF:bpsOutTrendNeg=bpsOutNeg,3600,TREND \

With the running averages now defined, it is possible to graph the data.  For the purposes of this example, the running average is graphed as a “shadow” on top of the base data.

  LINE2:bpsInTrend#000000CC \
  LINE2:bpsOutTrendNeg#000000CC \

The graph shown below illustrates the new running averages.

The running averages are now shown and it illustrates that despite the hourly peaks, the overall trend is fairly flat with the slight exception around 00:15.  However, the running averages also show a gap at the beginning of the graph. This gap is because it takes an hour of data before the hourly running average can be computed.  While this may be acceptable, it doesn’t look polished.

A minor tweak of the original DEF statements can provide “padding” necessary to eliminate the displayed gap in the running average lines.  This can be accomplished by extending the start time of the data sources by at least the same duration specified in the running averages.  Note that the display area is not affected by altering the start time of the DEF statements — that is controlled by the --start and --end options.  For the example data, the DEF statements need to extend the range of the data set by 1 hour so as to encompass 25 hours instead of the display area’s 24 hours:

  DEF:bytesIn=network.rrd:bytesIn:AVERAGE:start=end-25h \
  DEF:bytesOut=network.rrd:bytesOut:AVERAGE:start=end-25h \

Putting it together, the following command will pull in all the data necessary to calculate a running average for the extent of the display area, calculate the running averages, and display the results:

rrdtool graph example.png \
  --title "Network Traffic" \
  --width=500 \
  --slope-mode \
  --end=now \
  --start=end-24h \
  --base=1000 \
  DEF:bytesIn=network.rrd:bytesIn:AVERAGE:start=end-25h \
  DEF:bytesOut=network.rrd:bytesOut:AVERAGE:start=end-25h \
  CDEF:bpsIn=bytesIn,8,* \
  CDEF:bpsOut=bytesOut,8,* \
  CDEF:bpsOutNeg=bpsOut,-1,* \
  CDEF:bpsInTrend=bpsIn,3600,TREND \
  CDEF:bpsOutTrendNeg=bpsOutNeg,3600,TREND \
  VDEF:bpsInTot=bpsIn,TOTAL \
  VDEF:bpsOutTot=bpsOut,TOTAL \
  COMMENT:"              current     min        max      total\n" \
  AREA:bpsIn#FF880044:"Bits/s In " \
  GPRINT:bpsIn:LAST:"%6.1lf%s\t" \
  GPRINT:bpsIn:MIN:"%6.1lf%s\t" \
  GPRINT:bpsIn:MAX:"%6.1lf%s\t" \
  GPRINT:bpsInTot:"%6.1lf%s\n" \
  LINE1:bpsIn#FF8800CC \
  AREA:bpsOutNeg#44C80044:"Bits/s Out" \
  GPRINT:bpsOut:LAST:"%6.1lf%s\t" \
  GPRINT:bpsOut:MIN:"%6.1lf%s\t" \
  GPRINT:bpsOut:MAX:"%6.1lf%s\t" \
  GPRINT:bpsOutTot:"%6.1lf%s\n" \
  LINE1:bpsOutNeg#44C800CC \
  LINE2:bpsInTrend#000000CC \
  LINE2:bpsOutTrendNeg#000000CC \
  HRULE:0#000000

 

Graphing Time Shifted Data in RRDTool

While visualizing performance or activity data using a well-designed chart can be useful, it is frequently desired to be able to compare recent performance against an earlier period of time.  Comparisons against historic data help identify changes in trends or other anomalous performance behaviors.  In some cases, this can be done simply by extending the time span of the graph.  However, expansion of the graphed time span can lead to either an unacceptable loss of resolution or an unacceptably large image.  Fortunately, RRDTool can make this type of comparison chart fairly easily without resulting in a loss of fidelity or unwieldy image size.

To begin, it helps to setup a chart that graphs the desired data in a typical manner before adding in the time shifted overlays.  The following graph command creates a somewhat typical visualization of the last hour’s network traffic as illustrated by Example 1 graph:

rrdtool graph example.png \
  --title "Network Traffic" \
  --width=500 \
  --slope-mode \
  --end=now \
  --start=end-1h \
  --base=1000 \
  DEF:bytesIn=network.rrd:bytesIn:AVERAGE \
  DEF:bytesOut=network.rrd:bytesOut:AVERAGE \
  CDEF:bpsIn=bytesIn,8,* \
  CDEF:bpsOut=bytesOut,8,* \
  CDEF:bpsOutNeg=bpsOut,-1,* \
  VDEF:bpsInTot=bpsIn,TOTAL \
  VDEF:bpsOutTot=bpsOut,TOTAL \
  COMMENT:"            current     min        max      total\n" \
  AREA:bpsIn#FF880044:"Bits/s In " \
  GPRINT:bpsIn:LAST:"%6.1lf%s\t" \
  GPRINT:bpsIn:MIN:"%6.1lf%s\t" \
  GPRINT:bpsIn:MAX:"%6.1lf%s\t" \
  GPRINT:bpsInTot:"%6.1lf%s\n" \
  LINE1:bpsIn#FF8800CC \
  AREA:bpsOutNeg#44C80044:"Bits/s Out" \
  GPRINT:bpsOut:LAST:"%6.1lf%s\t" \
  GPRINT:bpsOut:MIN:"%6.1lf%s\t" \
  GPRINT:bpsOut:MAX:"%6.1lf%s\t" \
  GPRINT:bpsOutTot:"%6.1lf%s\n" \
  LINE1:bpsOutNeg#44C800CC \
  HRULE:0#000000

With the basic chart setup as desired, now the command can be modified to support a time shifted overlay.

First, additional DEF statements need to be defined for time range for the source data to encompass the time span desired for comparison.  Note that this is not the same as the graph’s display range; those options (--start and --end) should remain unchanged.  The override start time should extend the time frame to the start of the time shifted data.  For example, the following would be suitable for comparison of the previous 1 hour:

  DEF:bytesInPrevHour=network.rrd:bytesIn:AVERAGE:start=end-2h \
  DEF:bytesOutPrevHour=network.rrd:bytesOut:AVERAGE:start=end-2h \

As another example, the following would be suitable for a comparison of the same hour, 1 day (24 hours) ago:

  DEF:bytesInPrevDay=network.rrd:bytesIn:AVERAGE:start=end-25h \
  DEF:bytesOutPrevDay=network.rrd:bytesOut:AVERAGE:start=end-25h \

Next, a new SHIFT statement needs to be introduced that shifts the data being displayed by the specified number of seconds.  The SHIFT statements must be specified after the DEF statements they are shifting.  The following statements would shift the newly inserted DEF statements by 1 hour (3600 seconds):

  SHIFT:bytesInPrevHour:3600 \
  SHIFT:bytesOutPrevHour:3600 \

Next, the CDEF and VDEF operations need to be supplied in order to transform the new data in a consistent manner with the original data.

  CDEF:bpsInPrev=bytesInPrevHour,8,* \
  CDEF:bpsOutPrev=bytesOutInPrevHour,8,* \
  CDEF:bpsOutPrevNeg=bpsOutPrev,-1,* \

Finally, the new elements are ready for graphing.  Care should be made to ensure the shifted data is displayed in a distinct manner from the current data so as to prevent visual confusion or obscuring the current data.

  LINE1:bpsInPrev#00000088 \ 
  LINE1:bpsOutPrevNeg#00000088 \

Pulling it all together, the following command will pull in the data from the extended time frame, shift it forward by the appropriate amount of time so that it aligns with the displayed time frame, perform the desired transformation options, and finally display the results:

rrdtool graph example.png \
  --title "Network Traffic" \
  --width=500 \
  --slope-mode \
  --end=now \
  --start=end-1h \
  --base=1000 \
  DEF:bytesIn=network.rrd:bytesIn:AVERAGE \
  DEF:bytesOut=network.rrd:bytesOut:AVERAGE \
  DEF:bytesInPrevHour=network.rrd:bytesIn:AVERAGE:start=end-2h \
  DEF:bytesOutPrevHour=network.rrd:bytesOut:AVERAGE:start=end-2h \
  SHIFT:bytesInPrevHour:3600 \
  SHIFT:bytesOutPrevHour:3600 \
  CDEF:bpsIn=bytesIn,8,* \
  CDEF:bpsOut=bytesOut,8,* \
  CDEF:bpsOutNeg=bpsOut,-1,* \
  CDEF:bpsInPrev=bytesInPrevHour,8,* \
  CDEF:bpsOutPrev=bytesOutInPrevHour,8,* \
  CDEF:bpsOutPrevNeg=bpsOutPrev,-1,* \
  VDEF:bpsInTot=bpsIn,TOTAL \
  VDEF:bpsOutTot=bpsOut,TOTAL \
  COMMENT:"            current     min        max      total\n" \
  AREA:bpsIn#FF880044:"Bits/s In " \
  GPRINT:bpsIn:LAST:"%6.1lf%s\t" \
  GPRINT:bpsIn:MIN:"%6.1lf%s\t" \
  GPRINT:bpsIn:MAX:"%6.1lf%s\t" \
  GPRINT:bpsInTot:"%6.1lf%s\n" \
  LINE1:bpsIn#FF8800CC \
  AREA:bpsOutNeg#44C80044:"Bits/s Out" \
  GPRINT:bpsOut:LAST:"%6.1lf%s\t" \
  GPRINT:bpsOut:MIN:"%6.1lf%s\t" \
  GPRINT:bpsOut:MAX:"%6.1lf%s\t" \
  GPRINT:bpsOutTot:"%6.1lf%s\n" \
  LINE1:bpsOutNeg#44C800CC \
  LINE1:bpsInPrev#00000088 \ 
  LINE1:bpsOutPrevNeg#00000088 \
  HRULE:0#000000

 

Automatic Log Rotation in Mac OS X

Log rotation is a standard practice to ensure that log files are periodically restarted so that they don’t grow without bounds.  In the past, when disk space was at a premium and maximum file sizes were limited, this was an absolute necessity.  For more modern systems, these constraints may be less relevant although they may continue to be a valid concern for very active systems or with verbose logging enabled.  Even with abundant disk space, appropriate managing of the system logs helps ensure the system remains healthy, logs remain available for diagnostics and analysis, and operational overhead is reduced.

In Linux, the typical log rotation utility is, appropriately enough, logrotate.  However, Mac OS X uses a different tool, newsyslog, to manage the log rotation responsibilities.  This utility is run automatically by the OS at frequently intervals and evaluates the rules defined in its configuration file — /etc/newsyslog.conf.  For the log files defined in the config, it can evaluate the need to rotate based on a number of criteria, such as file size, at a specific time, at a specific interval, etc.  It can also specify additional actions to take on the rotated log, including compression of the old log, maintaining a fixed number of previous log files, setting appropriate permissions, etc.  Full documentation is available in the man page for newsyslog.conf.

Each entry in the newsyslog.conf file is specified as a single line with space-delimited fields for the various options.  The general format is as follows (with [] surrounding the optional fields):

logfilename [owner:group] mode count size when flags [/pid_file] [sig_num]

An example entry for rotating the named.stats log file can be specified as follows:

/var/log/named.stats                    640  5     1000 *     J

In the above example, the full path is specified for the target log file — /var/log/named.stats.  Permissions (640) are specified in the typical Unix manner, in this case enabling read/write for the owner and read-only for the group.  This log file will maintain a maximum of 5 files, removing the oldest file as newer archived logs are created.  This log file is also set to rotate at a fixed size — 1000 kilobytes and can rotate at any time (“*”) this size threshold is exceeded.  The “J” flag specified calls for the the older log files to be compressed (further reducing used disk space) using the bzip2 utility.

A special note for binary (not text-based) logs:  by default, newsyslog will inject a line at the beginning of a file indicating when and why the log rotated.  For most text-based logs, this can be easily skipped over but for binary logs this might result in an unreadable file.  In this case, be sure to specify the “B” flag which suppresses the informational line from being injected.