[noctool-devel] average "monitor"
Ingvar
ingvar at hexapodia.net
Mon Jul 28 15:12:56 UTC 2008
Jim writes:
> Hello,
>
> I'm wondering if anyone has any thoughts on how one might make a
> "monitor" that represents the average of several other monitors.
I'd actually be inclined to call that a "view", composed of one or more
"equipment"s, one or more "monitor"s or, possibly, a mix of the two. It would
default to one of "min", "avg" or "max": for "equipment"-centred views I
suspect either "avg" or "max" would be the right default; for views composed
of "monitor"s I am less sure. I can probably think of even more interesting
ways of aggregating measures into useful values, given a few more moments to
think about it. [1]
Just so that's on record, somewhere. :)
The typical use-case, as I see it, is to slap sufficiently inter-related things
into one or more views, so that all you'd look at frequently is the status of
the view as such, then open the view up to watch the components within it (be
that one or more equipment objects or one or more monitors; much as the
equipment aggregates monitors).
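
A minimal Python sketch of the view idea, purely illustrative: the class names
Monitor, Equipment and View, the level() method and the choice of "max" as the
default are all assumptions made for the example, not anything taken from
NOCtool. It only shows a view holding equipment and/or monitors and reducing
them to one value with a pluggable aggregate.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List


@dataclass
class Monitor:
    """Hypothetical monitor: one named measure with a numeric alert level."""
    name: str
    alert_level: float

    def level(self) -> float:
        return self.alert_level


@dataclass
class Equipment:
    """Hypothetical equipment: aggregates its own monitors."""
    name: str
    monitors: List[Monitor] = field(default_factory=list)

    def level(self) -> float:
        # Assumption: a piece of equipment reports its worst monitor.
        return max(m.level() for m in self.monitors)


@dataclass
class View:
    """Hypothetical view: any mix of monitors, equipment or other views."""
    name: str
    members: list = field(default_factory=list)
    aggregate: Callable = max  # assumed default; "avg" or "min" also possible

    def level(self) -> float:
        return self.aggregate([m.level() for m in self.members])


web = Equipment("web1", [Monitor("load", 2.0), Monitor("disk", 5.0)])
db = Equipment("db1", [Monitor("load", 1.0)])

worst = View("site", [web, db])                     # default: max
average = View("site-avg", [web, db], aggregate=mean)
```

Because a view exposes the same level() method as its members, a view can
itself be placed inside another view, which matches the "open the view up"
way of drilling down.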
> I'm looking at http://meta.rocksclusters.org/ganglia/ right now where they
> display a graph of the average load over some 450+ machines. How might
> you implement something like that in NOCtool?
That depends, I would've thought.
> p.s. anyone seen any "good" monitoring UIs? something they like... I
> can't say I ever really have :P
The closest I've seen so far are HP OpenView and Spectrum (no longer Cabletron,
but surprisingly still alive). Both rely heavily on the admin(s) to set up
decent views, as careless addition of monitored elements tends towards
"crowded".
//Ingvar
[1] Off the top of my head, I could probably make a case for:
      minimum measure/alert level
      arithmetic mean of measure/alert level
      median of measure/alert level
      geometric mean of measure/alert level (this'd be "multiply all N
        measures together, extract the Nth root"; this is the geomean)
      maximum of measure/alert level
Minimum is the one I'd have the hardest time defending, but...
The three averages are variously useful for performance indication: the
arithmetic mean is useful for fine-grained load-balancing; the median is handy
for most practical purposes, I would've thought; and the geomean ought to
spike as you start to have more servers with an issue, while staying fairly
unresponsive when there's just a small problem.
Max is handy whenever you have a small number of things aggregated (or not
much load-balancing between them).
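
The aggregates listed above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the names geometric_mean,
AGGREGATES and summarise are made up for the example, the sample load figures
are invented, and the geometric mean is computed via logarithms, which
requires every measure to be strictly positive.

```python
import math
import statistics


def geometric_mean(values):
    """Nth root of the product of N strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, all strictly positive")
    # exp(mean(log(v))) equals the Nth root of the product, without overflow.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical registry of the aggregates from the footnote.
AGGREGATES = {
    "min": min,
    "avg": statistics.mean,
    "median": statistics.median,
    "geomean": geometric_mean,
    "max": max,
}


def summarise(values):
    """Apply every aggregate to the same list of measures."""
    return {name: fn(values) for name, fn in AGGREGATES.items()}


# Invented sample data: ten servers, load 1.0 when healthy, 100.0 when not.
one_bad = [1.0] * 9 + [100.0]
half_bad = [1.0] * 5 + [100.0] * 5

for label, loads in (("one bad", one_bad), ("half bad", half_bad)):
    print(label, summarise(loads))
```

With one unhealthy server out of ten, the geomean stays below 2 while the
arithmetic mean is already near 11; with half of them unhealthy, the geomean
reaches about 10. The max reports 100 in both cases.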