Automatic Log Rotation in Mac OS X

Log rotation is a standard practice that periodically restarts log files so that they don’t grow without bound. In the past, when disk space was at a premium and maximum file sizes were limited, this was an absolute necessity. On more modern systems these constraints are less pressing, although they remain a valid concern for very active systems or when verbose logging is enabled. Even with abundant disk space, appropriate management of the system logs helps ensure the system remains healthy, logs remain available for diagnostics and analysis, and operational overhead is reduced.

In Linux, the typical log rotation utility is, appropriately enough, logrotate. Mac OS X, however, uses a different tool, newsyslog, to handle log rotation. This utility is run automatically by the OS at frequent intervals and evaluates the rules defined in its configuration file, /etc/newsyslog.conf. For each log file defined in the config, it decides whether to rotate based on criteria such as file size, a specific time, or a specific interval. An entry can also specify additional actions to take on the rotated log, including compressing the old log, keeping a fixed number of previous log files, and setting appropriate permissions. Full documentation is available in the man page for newsyslog.conf.

Each entry in the newsyslog.conf file is specified as a single line with space-delimited fields for the various options.  The general format is as follows (with [] surrounding the optional fields):

logfilename [owner:group] mode count size when flags [/pid_file] [sig_num]

An example entry for rotating the named.stats log file can be specified as follows:

/var/log/named.stats                    640  5     1000 *     J

In the above example, the full path of the target log file is specified: /var/log/named.stats. Permissions (640) are specified in the typical Unix manner, in this case enabling read/write for the owner and read-only for the group. A maximum of 5 archived files is kept, with the oldest removed as newer archives are created. The log is set to rotate on size (1000 kilobytes) and can rotate at any time (“*”) once this threshold is exceeded. The “J” flag calls for the older log files to be compressed with the bzip2 utility, further reducing disk usage.
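To make the field layout concrete, here is a minimal Python sketch that splits an entry into the fields described above. It is only an illustration, not how newsyslog itself parses its configuration: the names RotationRule, parse_entry, and permission_string are hypothetical, and the optional pid_file and sig_num fields are ignored for brevity.

```python
import stat
from dataclasses import dataclass
from typing import Optional


@dataclass
class RotationRule:
    logfile: str
    owner_group: Optional[str]  # None when the optional field is absent
    mode: int                   # permissions, parsed as octal
    count: int                  # number of archived files to keep
    size_kb: Optional[int]      # None when the size field is "*"
    when: str
    flags: str


def parse_entry(line: str) -> RotationRule:
    """Split one whitespace-delimited entry into its fields."""
    fields = line.split()
    logfile = fields.pop(0)
    # The optional owner:group field is recognised by its colon.
    owner_group = fields.pop(0) if ":" in fields[0] else None
    mode, count, size, when = fields[:4]
    flags = fields[4] if len(fields) > 4 else ""
    return RotationRule(
        logfile=logfile,
        owner_group=owner_group,
        mode=int(mode, 8),
        count=int(count),
        size_kb=None if size == "*" else int(size),
        when=when,
        flags=flags,
    )


def permission_string(mode: int) -> str:
    """Render an octal mode the way 'ls -l' would for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


rule = parse_entry("/var/log/named.stats                    640  5     1000 *     J")
print(rule, permission_string(rule.mode))
```

Parsing the mode as octal and rendering it with stat.filemode is a quick way to double-check that a permission value such as 640 grants what was intended.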

A special note for binary (not text-based) logs: by default, newsyslog will inject a line at the beginning of a file indicating when and why the log rotated. For most text-based logs this line can easily be skipped over, but for binary logs it might result in an unreadable file. In this case, be sure to specify the “B” flag, which prevents the informational line from being injected.

Advanced Color Graphing Techniques using RRDTool

Introduction

This section covers some of the more advanced uses of color in an RRDTool graph. These techniques not only add a bit of polish to an otherwise ordinary graph, but can also clarify interpretation and convey additional information.

Despite the availability of the techniques outlined below, it is still essential that proper color conventions and patterns be observed.  For example, red typically denotes “hot” in the context of temperature or “severe” in a notification/aberration detection.  Using red to denote a cool temperature on a temperature graph or a “normal” operating condition is very likely to lead to viewer confusion.  For more information, please be sure to consult a good reference on color theory.

Examples

Example 1

This is a simple example of a “stock” graph.  It uses no special color treatments but it is still able to clearly convey the necessary information.  In this case, it simply presents a graph of the system temperature.

rrdtool graph "Example 1 Colors.png" \
--start "end-48 hours" --end "12am Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 1" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
AREA:temp#FF0000

Example 1 Colors

Example 2

In this example, the use of an “alpha channel” in the color specification is introduced. The easiest way to think of an alpha channel is as a transparency measure. An alpha channel of “FF” would be completely opaque, while an alpha channel of “00” would be completely transparent. The alpha channel is specified in hexadecimal form at the end of the RGB color value. In the example below, the color value of “FF000044” has specified an alpha channel value of “44”. If no alpha channel is specified, then a default value of “FF” (fully opaque) is used.

rrdtool graph "Example 2 Colors.png" \
--start "end-48 hours" --end "12am Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 2" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
AREA:temp#FF000044

Example 2 Colors
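For readers who prefer to see the arithmetic, the following Python sketch decodes a color value of the form used above into its red, green, blue, and alpha components. The function names parse_color and opacity are hypothetical helpers written for this illustration and are not part of rrdtool.

```python
from typing import Tuple


def parse_color(spec: str) -> Tuple[int, int, int, int]:
    """Split 'RRGGBB' or 'RRGGBBAA' (optional leading '#') into (r, g, b, a)."""
    digits = spec.lstrip("#")
    if len(digits) not in (6, 8):
        raise ValueError("expected 6 or 8 hexadecimal digits: %r" % spec)
    if len(digits) == 6:
        digits += "FF"  # no alpha given: default to fully opaque
    return tuple(int(digits[i:i + 2], 16) for i in range(0, 8, 2))


def opacity(spec: str) -> float:
    """Alpha channel as a fraction: 0.0 is transparent, 1.0 is opaque."""
    return parse_color(spec)[3] / 255


print(parse_color("FF000044"), round(opacity("FF000044"), 3))
```

An alpha of “44” is decimal 68, so the area in Example 2 is drawn at roughly a quarter of full opacity.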

Example 3

This example illustrates the use of the “layer cake” effect.  This technique can help to provide additional context to a graph, as the colored layers can help clearly delineate when a system is operating within tolerances or not.  Thus, the health of the system can be determined at a glance and is not dependent on the viewer being intimate with the operational thresholds.  This example breaks down the temperature readings into four layers (cold, cool, warm, and hot) of 50 degrees each, but it would be a trivial extension to increase/decrease the number of layers.

The CDEF for the middle layers can be somewhat intimidating for those who are not experts in Reverse Polish Notation.  In this example, each layer relies on being stacked and so the appropriate calculation is determining the portion of the temperature (if any) that makes up the layer.  The following breakdown may help make it more palatable:

cool=temp,50,GT,temp,100,GT,50,temp,50,-,IF,UNKN,IF
if (temp > 50) then
  if (temp > 100) then
    cool = 50
  else
    cool = temp - 50
else
  cool = UNKN

As each layer is a maximum of 50 degrees, the trick is to determine how much (if any) of the temperature falls within each layer. If the actual temperature exceeds the top of the layer, then simply use the maximum value (50). If the temperature falls within the layer, then the value should be the temperature less the total of any previous layers. If the temperature is below the minimum temperature for the layer, then simply return the “unknown” value to prevent any graphing.
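The same layer logic can be sketched in Python, which may be easier to follow than the RPN form. This is only an illustration of the calculation: split_into_layers is a hypothetical helper, and None stands in for rrdtool's unknown (UNKN) value.

```python
from typing import Dict, Optional

BAND = 50  # height of each layer in degrees, as in the example
LAYER_NAMES = ["cold", "cool", "warm", "hot"]


def split_into_layers(temp: float) -> Dict[str, Optional[float]]:
    """Return the portion of temp that belongs to each stacked layer."""
    layers: Dict[str, Optional[float]] = {}
    for i, name in enumerate(LAYER_NAMES):
        floor = i * BAND
        is_top = i == len(LAYER_NAMES) - 1
        if i > 0 and temp <= floor:
            layers[name] = None          # below this layer: UNKN, nothing drawn
        elif not is_top and temp > floor + BAND:
            layers[name] = BAND          # layer is completely filled
        else:
            layers[name] = temp - floor  # temperature ends inside this layer
    return layers


print(split_into_layers(120))
```

Because the layers are stacked, the known portions always add back up to the original temperature: a reading of 120 becomes 50 + 50 + 20.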

rrdtool graph "Example 3 Colors.png" \
--start "end-48 hours" --end "12am Dec 5, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 3" \
--vertical-label "Temperature" \
--lower-limit 0 --rigid \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
CDEF:cold=temp,50,LE,temp,50,IF \
CDEF:cool=temp,50,GT,temp,100,GT,50,temp,50,-,IF,UNKN,IF \
CDEF:warm=temp,100,GT,temp,150,GT,50,temp,100,-,IF,UNKN,IF \
CDEF:hot=temp,150,GT,temp,150,-,UNKN,IF \
AREA:cold#0000FFAA:cold:STACK \
AREA:cool#0000FF44:cool:STACK \
AREA:warm#FF000044:warm:STACK \
AREA:hot#FF0000AA:hot:STACK

Example 3 Colors

Example 4

There are several techniques for “feathering” the colors in a graph as shown in this example. The technique illustrated here is suitable for representing a single color palette with the gradient lightest at the top and darkest at the bottom. It is achieved by simply overlaying the graph with the same color at selected proportions and relying on the alpha channel to “build up” as the layers overlap. Care should be taken when using this technique not to make the top layers so translucent that they become difficult to discern.

This example simply maps the set of data values into sets for 1/4, 1/2 and 3/4 values and then overlays the original value graph.  Additional looks can also be achieved through the use of alternative data transformation maps/ratios.
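The “build up” effect can be estimated with a little arithmetic. The sketch below assumes that overlapping translucent layers are combined with standard “over” alpha compositing, which is an assumption about the renderer rather than something stated in the rrdtool documentation. The names layers_covering and combined_opacity are hypothetical.

```python
# Fractions of the full value covered by each overlaid area:
# the original graph plus the 3/4, 1/2 and 1/4 tiers.
TIER_RATIOS = [1.0, 0.75, 0.5, 0.25]


def layers_covering(height_fraction: float) -> int:
    """Count the areas that cover a point at this fraction of the full height."""
    return sum(1 for ratio in TIER_RATIOS if height_fraction <= ratio)


def combined_opacity(alpha_hex: str, layers: int) -> float:
    """Opacity after stacking identical translucent layers (assumed 'over' blend)."""
    alpha = int(alpha_hex, 16) / 255
    return 1 - (1 - alpha) ** layers


for fraction in (0.9, 0.6, 0.3, 0.1):
    count = layers_covering(fraction)
    print(fraction, count, round(combined_opacity("22", count), 3))
```

Under that assumption, the bottom quarter of the graph is covered by all four areas and ends up noticeably darker than the top quarter, which is covered by only one.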

rrdtool graph "Example 4 Colors.png" \
--start "end-48 hours" --end "12am Dec 5, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 4" \
--vertical-label "Temperature" \
--lower-limit 0 --rigid \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
CDEF:tier1=temp,4,/ \
CDEF:tier2=temp,2,/ \
CDEF:tier3=temp,4,/,3,* \
AREA:temp#FF000022: \
AREA:tier3#FF000022: \
AREA:tier2#FF000022: \
AREA:tier1#FF000022:

Example 4 Colors

Example 5

This example illustrates another method of “feathering” the colors in the graph.  In this case, the color gradient is lightest at the bottom and darkest at the top.  In order to achieve this, the value is simply divided up and then each layer is stacked on top of the other while steadily increasing the alpha channel value.

rrdtool graph "Example 5 Colors.png" \
--start "end-48 hours" --end "12am Dec 5, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 5" \
--vertical-label "Temperature" \
--lower-limit 0 --rigid \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
CDEF:tier=temp,4,/ \
AREA:tier#FF000022::STACK \
AREA:tier#FF000044::STACK \
AREA:tier#FF000066::STACK \
AREA:tier#FF000088::STACK

Example 5 Colors

Example 6

This example illustrates the use of highlights to clearly delineate the borders between stacked area graphs.  It allows the use of a softer color palette without having to resort to a clashing color scheme to define the borders.

The highlight lines should be specified after all the area graphs have been declared. Each highlight should be specified in the same order as its corresponding area graph to ensure the proper color is “on top” should the data sets overlap. It is typically easiest to maintain the color scheme by using the same RGB value as the area graph but specifying a higher alpha channel value.

rrdtool graph "Example 6 Colors.png" \
--start "end-48 hours" --end "12am Jan 15, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 6" \
--vertical-label "Bytes" \
--lower-limit 0 --rigid \
DEF:disk1=sysinfo.rrd:disk_used:AVERAGE \
DEF:disk2=sysinfo.rrd:disk2_used:AVERAGE \
AREA:disk1#0000FF22:: \
AREA:disk2#00F00022::STACK \
LINE1:disk1#0000FFAA:"Disk 1" \
LINE1:disk2#00F000AA:"Disk 2":STACK

Example 6 Colors

Using VDEFs for set calculations in RRDTool

Introduction

The VDEF directive provides a mechanism for applying mathematical operations on sets of data.  Unlike the CDEF directive, the result of a VDEF is a single value.  The VDEF can reference either a DEF or CDEF data set and can perform a number of operations including calculating averages, standard deviations, least squares lines, and more.  These values can then be further referenced in CDEFs for graphing or printed out.

There is often much confusion regarding the differences of the DEF, CDEF, and VDEF directives.  It may help to think of the different directives in the following manner:

  • A DEF directive references a set of “raw” data as it is stored in an RRA
  • A CDEF directive applies a function to each data point it references
  • A VDEF directive applies a function to an aggregate of data points
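The distinction can be mimicked with ordinary Python lists. This is only an analogy, not rrdtool code: the readings are made-up values, and apply_cdef and apply_vdef are hypothetical helpers.

```python
from statistics import mean
from typing import Callable, List

# DEF-like: the "raw" series as fetched (hypothetical readings).
temp: List[float] = [61.0, 64.5, 70.2, 68.9, 63.4]


def apply_cdef(series: List[float], fn: Callable[[float], float]) -> List[float]:
    """CDEF-like: apply fn to every data point, producing a new series."""
    return [fn(value) for value in series]


def apply_vdef(series: List[float], fn: Callable[[List[float]], float]) -> float:
    """VDEF-like: reduce the whole series to a single value."""
    return fn(series)


above_50 = apply_cdef(temp, lambda t: t - 50)  # one result per data point
average = apply_vdef(temp, mean)               # one result for the whole set
print(above_50, average)
```

The CDEF-like result has one value per input point, while the VDEF-like result is a single number.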

The VDEF Directive

The basic format for a VDEF directive is as follows:

VDEF:Label=RPN Expression

Label is the name of the VDEF.  It may be referenced in other directives including HRULEs or CDEF calculations.  It may be from 1-19 characters long and consists of characters in the set [a-zA-Z0-9_].  Note that the label must be unique and cannot overlap with any labels assigned to other DEFs, CDEFs, or VDEFs.

RPN Expression is the mathematical or logical expression that is applied to a data set.  The expression uses Reverse Polish Notation to eliminate confusion or errors that may occur with the precedence rules required of traditional infix notation.  There are a number of mathematical, boolean and logical operators available for inclusion in a VDEF directive.

VDEF Examples

Example 1

This is a simple example which illustrates the use of VDEFs to reveal information about the set of data. In this case, the MINIMUM, MAXIMUM, and AVERAGE operations are used to determine the respective values for the data set. The resulting value for each operation is then used as the value for an HRULE, providing a clear illustration in the resulting graph. Note that the VDEF values are based only on the data referenced. In this case, the data is bounded by the start and end time specified, so the values would likely differ for a different window into the original data set.
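The dependence on the time window can be shown with a small Python sketch. The samples are hypothetical (timestamp, value) pairs and summarize is an illustrative helper, not an rrdtool function.

```python
from typing import List, Tuple

# Hypothetical (timestamp, temperature) samples.
samples: List[Tuple[int, float]] = [(0, 60), (1, 70), (2, 90), (3, 65), (4, 55)]


def summarize(data: List[Tuple[int, float]], start: int, end: int) -> Tuple[float, float, float]:
    """Return (minimum, maximum, average) for samples with start <= t <= end."""
    window = [value for t, value in data if start <= t <= end]
    if not window:
        raise ValueError("no samples in the requested window")
    return min(window), max(window), sum(window) / len(window)


print(summarize(samples, 0, 2))
print(summarize(samples, 2, 4))
```

The two windows share a maximum here but have different minimums and averages, which is why the horizontal rules move when the start and end times change.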

rrdtool graph "Example 1 VDEF.png" \
--start "end-48 hours" --end "12am Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 1" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
VDEF:min=temp,MINIMUM \
VDEF:max=temp,MAXIMUM \
VDEF:avg=temp,AVERAGE \
AREA:temp#FFEF00 \
HRULE:max#FF0000:"Max" \
HRULE:avg#000000:"Average" \
HRULE:min#0000FF:"Min"

Example 1 VDEF

Example 2

This example uses the average and standard deviation VDEF operations to identify temperatures that lie more than one standard deviation above or below the average. The VDEF operations determine the appropriate values, which are then used in the CDEF operations to generate new data sets. The original data set is graphed in the “hot” color, with the “normal” and “cool” graphs overlaid on top of the appropriate sections.

The CDEF calculations may be difficult to parse for a novice to Reverse Polish syntax.  It may help to break it down as follows:

  1. cool=temp,avg,stdev,-,LE,temp,UNKN,IF
  2. cool=temp,(avg - stdev),LE,temp,UNKN,IF
  3. cool=(temp <= (avg - stdev)),temp,UNKN,IF
  4. cool=if (temp <= (avg - stdev)) then temp else UNKN
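The breakdown above can also be sketched in Python. The classify function is hypothetical, and it uses the population standard deviation (statistics.pstdev), which is an assumption about how the STDEV operation is computed.

```python
from statistics import mean, pstdev
from typing import List


def classify(series: List[float]) -> List[str]:
    """Label each point relative to one standard deviation around the average."""
    avg = mean(series)
    stdev = pstdev(series)  # assumption: population standard deviation
    labels = []
    for temp in series:
        if temp <= avg - stdev:
            labels.append("cool")    # drawn last in the graph, so it ends up on top
        elif temp <= avg + stdev:
            labels.append("normal")
        else:
            labels.append("hot")     # only the base "hot" area remains visible
    return labels


print(classify([10, 20, 30, 40, 50]))
```

In the graph itself the “normal” data set also contains the cool points; they simply end up hidden because the “cool” area is drawn over them.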

rrdtool graph "Example 2 VDEF.png" \
--start "end-48 hours" --end "12am Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 2" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
VDEF:avg=temp,AVERAGE \
VDEF:stdev=temp,STDEV \
CDEF:cool=temp,avg,stdev,-,LE,temp,UNKN,IF \
CDEF:medium=temp,avg,stdev,+,LE,temp,UNKN,IF \
AREA:temp#FF0000:"Hot" \
AREA:medium#FFEF00:"Normal" \
AREA:cool#0000FF:"Cool"

Example 2 VDEF

Example 3

This example illustrates how to use the least squares line VDEF operations to draw a trendline. The LSLSLOPE operation determines the slope of the line and the LSLINT operation provides the y-axis intercept value. A trendline can then be generated using the classic "y=mx+b" formula (where m is the slope and b is the intercept).

The CDEF operation implements this formula using several “tricks” of rrdtool: a workaround for the CDEF requirement to reference a DEF or CDEF, and the use of the COUNT operation to increment a value for each data point in the graph set.  In the example, the CDEF reference requirement is satisfied by the “temp,POP” elements, which effectively puts a value from the temp DEF on the stack and then pops it back off, discarding it.
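As a rough illustration of what the trendline calculation does, the following Python sketch fits a least squares line against the position of each data point and then evaluates "y=mx+b" at each position. The helper names are hypothetical, and numbering the positions from 1 (to mirror COUNT) is an assumption; rrdtool's internal indexing for the fit may differ.

```python
from typing import List, Tuple


def least_squares(series: List[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the best-fit line over positions 1..n."""
    n = len(series)
    if n < 2:
        raise ValueError("at least two data points are required")
    xs = range(1, n + 1)
    sum_x = sum(xs)
    sum_y = sum(series)
    sum_xy = sum(x * y for x, y in zip(xs, series))
    sum_xx = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


def trendline(series: List[float]) -> List[float]:
    """Evaluate slope * count + intercept for every position in the series."""
    slope, intercept = least_squares(series)
    return [slope * count + intercept for count in range(1, len(series) + 1)]


print(least_squares([3, 5, 7, 9]), trendline([3, 5, 7, 9]))
```

The slope and intercept are single values (like the VDEFs), while the trendline is a full series with one value per data point (like the CDEF).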

rrdtool graph "Example 3 VDEF.png" \
--start "end-48 hours" --end "12am Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 3" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
VDEF:slope=temp,LSLSLOPE \
VDEF:intercept=temp,LSLINT \
CDEF:trendline=temp,POP,COUNT,slope,*,intercept,+ \
AREA:temp#FFEF00 \
LINE2:trendline#000000

Example 3 VDEF

Example 4

This example uses the PERCENTNAN operation to identify the 5% coolest and 5% hottest temperatures in the data set. These values are then applied as part of the CDEF calculations to generate the appropriate data sets for graphing. Note that it is frequently best to use the PERCENTNAN operation instead of the PERCENT operation, as the PERCENTNAN variant handles any gaps in the data more gracefully.
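The effect of gaps on a percentile can be illustrated with a short Python sketch. This is not rrdtool's exact algorithm: the percentile helper is hypothetical, it uses a simple nearest-rank rule, and it assumes that unknown values sort below all known values when they are not ignored.

```python
import math
from typing import List


def percentile(values: List[float], pct: float, ignore_nan: bool) -> float:
    """Nearest-rank percentile; NaN values are dropped or sorted first."""
    known = sorted(v for v in values if not math.isnan(v))
    if ignore_nan:
        ordered = known
    else:
        # Assumption: unknown values rank below every known value.
        ordered = [math.nan] * (len(values) - len(known)) + known
    if not ordered:
        return math.nan
    index = round(pct / 100 * (len(ordered) - 1))
    return ordered[index]


nan = math.nan
readings = [nan, nan, 10, 20, 30, 40, 50, 60, 70, 80]  # hypothetical data with gaps
print(percentile(readings, 5, ignore_nan=True))
print(percentile(readings, 5, ignore_nan=False))
```

With the gaps ignored, the 5th percentile is a real temperature; with the gaps included, it lands on an unknown value, which would leave the corresponding threshold undefined.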

rrdtool graph "Example 4 VDEF.png" \
--start "end-48 hours" --end "12am Nov 1, 2009" \
--imgformat PNG --width 500 --height 120 \
--title "Example 4" \
--vertical-label "Temperature" \
DEF:temp=sysinfo.rrd:temperature:AVERAGE \
VDEF:cool5=temp,5,PERCENTNAN \
VDEF:hot95=temp,95,PERCENTNAN \
CDEF:cool=temp,cool5,LE,temp,UNKN,IF \
CDEF:medium=temp,hot95,LE,temp,UNKN,IF \
AREA:temp#FF0000:"Hot" \
AREA:medium#FFEF00:"Normal" \
AREA:cool#0000FF:"Cool"

Example 4 VDEF