>If I compare the last 100 samples for example, from both AVERAGE and
>MAX, they are all the same line for line (although I am just showing
>10 below to reduce the size of this email):

This is expected: the RRA being used has 1cdp=1pdp, so AVERAGE and MAX will 
yield the same value (the max of a set of 1 is the same as the average of a 
set of 1).
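
To make that concrete, here is a minimal Python sketch. It is not RRDtool code; the consolidate() helper and the sample values are made up purely for illustration. It groups pdp into cdp and applies a CF to each group. With one pdp per cdp, the AVERAGE and MAX rows come out identical; with two pdp per cdp, they no longer do.

```python
def consolidate(pdps, steps, cf):
    """Group pdps into cdps of `steps` points each and apply cf to each group."""
    groups = [pdps[i:i + steps] for i in range(0, len(pdps), steps)]
    return [cf(g) for g in groups]


def average(values):
    return sum(values) / len(values)


pdps = [10.0, 40.0, 20.0, 30.0]  # hypothetical sample values

avg_1 = consolidate(pdps, 1, average)  # 1cdp=1pdp, AVERAGE
max_1 = consolidate(pdps, 1, max)      # 1cdp=1pdp, MAX
avg_2 = consolidate(pdps, 2, average)  # 1cdp=2pdp, AVERAGE
max_2 = consolidate(pdps, 2, max)      # 1cdp=2pdp, MAX

print(avg_1 == max_1)  # the two CFs agree when each cdp holds a single pdp
print(avg_2, max_2)    # they differ once a cdp covers more than one pdp
```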

I am assuming here that you have a single very large RRA of 1cdp=1pdp and no 
other RRAs.  You may have some consolidation RRAs defined, which will affect 
graphing functions.

One thing to note is that (if you have RRDtool 1.4.x) you can define a 95th 
Percentile set of rules that will perform the 95th percentile calculations for 
you, removing the need for the external script.

>Using AVERAGE produces lower values on a graph for
>current/average/max/total statistics at the bottom of the graph, and
>the graph is drawn differently showing this. This is the same DS as
>above but using AVERAGE instead of MAX, you will notice the 95th
>percentile value is the same because it is generate by the external
>PHP script as I mentioned: http://i.imgur.com/P27kjfc.png

This is because, when drawing a graph, there is potentially additional 
consolidation being performed, and other RRAs may be selected if you have 
them defined.

When making a graph, RRDtool will select the RRA that most closely matches the 
granularity of the pixels of the graph.  If you only have the one RRA 
(1cdp=1pdp) then this will have to be used.  Next, it may need to perform 
additional consolidation, for example, if 1 pixel = 2cdp.  In this case, it 
will have to average/max this set of cdp as well (which consolidation function 
is used depends on your DEF declaration).

As a result, you may find that the data set is further averaged after being 
selected.
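
As a rough illustration of that second step, the sketch below works out how many cdp fall under one pixel column and reduces the rows accordingly. This is a simplification written in Python, not RRDtool's actual code: the reduce_for_graph() function, the 300 second step, the time window and the row values are all assumptions chosen so that 1 pixel = 2cdp.

```python
import math


def average(values):
    return sum(values) / len(values)


def reduce_for_graph(rows, rra_step, start, end, width, cf):
    """Reduce fetched rows so there is roughly one value per pixel column."""
    seconds_per_pixel = (end - start) / width
    # Number of cdp that land in one pixel column (never less than 1).
    factor = max(1, math.ceil(seconds_per_pixel / rra_step))
    reduced = [cf(rows[i:i + factor]) for i in range(0, len(rows), factor)]
    return factor, reduced


# Hypothetical case: 300 s RRA step, a 2400 s window, drawn 4 pixels wide.
rows = [1.0, 3.0, 2.0, 8.0, 4.0, 4.0, 5.0, 1.0]
factor, as_avg = reduce_for_graph(rows, 300, 0, 2400, 4, average)
_, as_max = reduce_for_graph(rows, 300, 0, 2400, 4, max)

print(factor)  # cdp per pixel column
print(as_avg)  # rows left after an AVERAGE reduction
print(as_max)  # rows left after a MAX reduction
```

The same eight rows give different per-pixel values depending on which function does the reduction, which is why the graph and its statistics change with the CF.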

This is not a problem for the average summary statistics, but for the MAX you 
will get MAX(AVG(x)), which will usually be lower than MAX(x).  One way around 
this is to define a second DEF that uses MAX and use it for the MAX 
statistics, though this can be wasteful as it requires a corresponding MAX RRA, 
which is redundant if your normal RRA is 1cdp=1pdp.
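
A small numeric sketch of the effect, again in plain Python with made-up numbers (the reduce_rows() helper and the sample list are hypothetical, not anything RRDtool exposes). A single short spike survives a MAX reduction but is diluted by an AVERAGE reduction, while the overall average is unaffected when the groups are of equal size.

```python
def average(values):
    return sum(values) / len(values)


def reduce_rows(rows, factor, cf):
    """Collapse every `factor` rows into one value using cf."""
    return [cf(rows[i:i + factor]) for i in range(0, len(rows), factor)]


# Hypothetical traffic samples with one short spike.
x = [10.0, 12.0, 90.0, 14.0, 11.0, 13.0]

max_of_raw = max(x)                            # MAX(x)
max_of_avg = max(reduce_rows(x, 2, average))   # MAX(AVG(x))
max_of_max = max(reduce_rows(x, 2, max))       # MAX(MAX(x))
avg_of_raw = average(x)                        # AVG(x)
avg_of_avg = average(reduce_rows(x, 2, average))  # AVG(AVG(x))

print(max_of_raw, max_of_avg, max_of_max)
print(avg_of_raw, avg_of_avg)
```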

So, this explains why you see lower MAX values than you expect.  I originally 
had this problem in Routers2 but changed the code to explicitly use the MAX RRA 
when calculating the MAX stats and the AVG RRA for the AVG and LAST.  At least 
you don't get the problem of the 95th percentile becoming more inaccurate as 
the granularity increases, as you are working on 1cdp=1pdp throughout, and the 
calculation is performed on the unconsolidated data. 

In this case, RRDtool is being a little too 'helpful', by performing the 
additional consolidation step before calculating the statistics but using an 
inappropriate consolidation function.  You need to override its selection and 
force it to use MAX -- under RRDtool 1.4.x you can provide a CF override on the 
DEF declaration using 'reduce'.

For example:

DEF:x=foo.rrd:ds:AVERAGE
DEF:x2=foo.rrd:ds:AVERAGE:reduce=MAX
VDEF:avgx=x,AVERAGE
VDEF:maxx=x2,MAXIMUM
VDEF:percx=x,95,PERCENTNAN
LINE:x#00ff00:Average value
HRULE:percx#ff0000:95th Percentile
GPRINT:avgx:Average is %.2lf %sbps
GPRINT:maxx:Maximum is %.2lf %sbps

See how the x2 DEF specifies the MAX reduction function, and is used only for 
creating the maxx VDEF, which is only used for the GPRINT.  Everything else uses 
the x DEF.  I've also used the PERCENTNAN function to calculate the 95th 
percentile internally, though this may also fall afoul of the consolidation 
functions, resulting in a lower than accurate value - I've not yet tested this.
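
To show why the percentile could come out low, here is a hedged Python sketch. It uses a simple nearest-rank percentile, which is an assumption for illustration and not necessarily how RRDtool's PERCENTNAN is implemented; the sample data and helper names are made up. It only demonstrates the general effect of averaging before taking a percentile, not RRDtool's actual behaviour.

```python
def average(values):
    return sum(values) / len(values)


def reduce_rows(rows, factor, cf):
    """Collapse every `factor` rows into one value using cf."""
    return [cf(rows[i:i + factor]) for i in range(0, len(rows), factor)]


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct (illustrative only)."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct% of n
    return ordered[max(rank, 1) - 1]


# Hypothetical data: 20 samples at 10.0 with three isolated spikes of 100.0.
x = [10.0] * 20
x[3] = x[8] = x[15] = 100.0

p95_raw = percentile_nearest_rank(x, 95)
p95_avg = percentile_nearest_rank(reduce_rows(x, 2, average), 95)

print(p95_raw)  # percentile of the unconsolidated samples
print(p95_avg)  # percentile after averaging pairs of samples
```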

Have a good Christmas and new year...

Steve

Steve Shipway
University of Auckland ITS
UNIX Systems Design Lead
s.ship...@auckland.ac.nz
Ph: +64 9 373 7599 ext 86487

