[chrony-users] Monitoring Chrony

So far, I haven't been able to find a good programmatic way to extract stats with chronyc. There are a bunch of annoying parsing issues with things like the sourcestats command. The offset includes a precision, so I have to parse the precision and convert that to be all in one precision. I haven't seen much documentation on the protocol between chronyc and chronyd.

Take a look at the chronyd log files. The data is more amenable to machine reading than the chronyc output.

- Ben Kochie

Bryan Christianson
***@whatroute.net

--
To unsubscribe email chrony-users-***@chrony.tuxfamily.org
with "unsubscribe" in the subject.
For help email chrony-users-***@chrony.tuxfamily.org
with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.

Ben Kochie

2016-02-11 11:14:32 UTC

Log parsing is possible; there is a tool called mtail that can parse logs
to collect metrics. However, it is generally recommended to ask services
directly about their current state rather than regexping logs.

This way the stats are read in a more on-demand way.

Miroslav Lichvar

2016-02-11 12:58:21 UTC

Post by Ben Kochie
So far, I haven't been able to find a good programmatic way to extract
stats with chronyc. There are a bunch of annoying parsing issues with
things like the sourcestats command. The offset includes a precision, so I
have to parse the precision and convert that to be all in one precision.

Yeah, I've struggled with that too. I like the human readable format
when inspecting the chrony state, but it does complicate parsing quite
a bit.
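
The unit problem described above can be sketched in a few lines of Python. This is only an illustration, assuming the human-readable columns carry one of the suffixes ns, us, ms or s directly after the number; the suffix table and the function name `to_seconds` are assumptions made for this example, not something taken from chrony itself.

```python
import re

# Multipliers to seconds for the unit suffixes this sketch assumes
# (an assumption for illustration, not a list taken from chrony).
_UNITS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}

# Optional sign, a decimal number, then one of the assumed suffixes.
_VALUE_RE = re.compile(r"^([+-]?\d+(?:\.\d+)?)(ns|us|ms|s)$")


def to_seconds(text):
    """Convert a string such as '+26us' or '15ms' to a float in seconds."""
    match = _VALUE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised value: {text!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit]
```

With every value normalised to seconds, the rest of a collector no longer needs to care which unit a particular column happened to be printed in.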

Post by Ben Kochie
A couple of specific questions.
* Would chrony be interested in supporting the Prometheus metrics format?

I looked at the page describing the architecture, but it's not clear to
me what support in chrony would look like. Would chronyd or something
using the chronyc protocol be listening on a port for requests? Or
would it periodically push data over a socket somewhere? The page
listing client libraries doesn't include a C library.

Post by Ben Kochie
* Is there a mode for the various metrics outputs to be more machine
readable? (json?)

No, not yet. I'd like to add a raw mode to chronyc that would print
the values in something easily parseable. I'm not sure about json, I'd
probably prefer something usable even from shell using just sed or
awk.

Post by Ben Kochie
* Is there documentation for the chronyc protocol outside the code?

No, unfortunately not. FWIW, the protocol is quite simple, almost all
information you would need to implement a new client is contained in
candm.h.

Post by Ben Kochie
* Are there any non-C chronyc client implementations? (python/ruby/whatever)

Probably not, at least I've not seen anything. At some point I'd like
to split chronyc into a library and a client application. Bindings for
other languages could then be easily created.

--
Miroslav Lichvar
--
To unsubscribe email chrony-users-***@chrony.tuxfamily.org
with "unsubscribe" in the subject.
For help email chrony-users-***@chrony.tuxfamily.org
with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.

Ben Kochie

2016-02-11 13:12:36 UTC

Post by Ben Kochie
so I have to parse the precision and convert that to be all in one precision.

Yeah, I've struggled with that too. I like the human readable format
when inspecting the chrony state, but it does complicate parsing quite
a bit.

Post by Ben Kochie
A couple of specific questions.
* Would chrony be interested in supporting the Prometheus metrics format?

Typically we do this one of a few ways.
#1 - The application listens on a port for HTTP requests; the default path is
/metrics. It can then respond with text/plain in the format I posted
above. Or it will content negotiate and use grpc, a nice compact protobuf
format. The grpc format is the most efficient, but we've had few problems
collecting text metrics at scale.

#2 - We run a side-car exporter. We do this quite a lot for existing open
source software, like mysql, that would never listen on http, but can
provide metrics with their own protocol.

#3 - The way we collect metrics for ntpd is that we have a loop script, or cron
script, that parses output and puts that output in prometheus format into a
text file. Then we access these metrics via the node_exporter's textfile
reader.

#4 - We use something like mtail[0] and parse log files. This is what I do
for things like apache[1] that have minimal useful internal metrics.

[0]: https://github.com/google/mtail
[1]:
https://github.com/google/mtail/blob/master/examples/apache_metrics.mtail
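
Option #3 above can be sketched roughly as follows in Python. This is a minimal sketch, not the script actually used for ntpd: the function names are made up for illustration, and the values are assumed to have been collected elsewhere. The file is written to a temporary name and then renamed, so a reader never sees a half-written file.

```python
import os
import tempfile


def render_metrics(metrics):
    """Render {name: value} as one 'name value' line per metric."""
    return "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))


def write_textfile(path, metrics):
    """Write metrics to `path` atomically via a temporary file and rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(render_metrics(metrics))
        # mkstemp creates the file readable by the owner only; relax that
        # so a collector running as another user can read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

A loop or cron job would call `write_textfile()` with a path inside whatever directory the textfile reader is configured to watch; the metric names passed in are up to the caller.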

Post by Ben Kochie
* Is there a mode for the various metrics outputs to be more machine
readable? (json?)

One idea I had would be to add a "metrics" command to chronyc. Then you
could run a loop/cron job that would be basically "chronyc metrics >
chrony_metrics.prom"

The output format would be sed/awk friendly as you always get one metric
key and value per line.

Post by Ben Kochie
* Is there documentation for the chronyc protocol outside the code?

No, unfortunately not. FWIW, the protocol is quite simple, almost all
information you would need to implement a new client is contained in
candm.h.

Ok, I will take a look.

Post by Ben Kochie
* Are there any non-C chronyc client implementations? (python/ruby/whatever)

Probably not, at least I've not seen anything. At some point I'd like
to split chronyc into a library and a client application. Bindings for
other languages could then be easily created.

This would be pretty nice.

Miroslav Lichvar

2016-02-12 09:05:28 UTC

Post by Miroslav Lichvar
I looked at the page describing the architecture, but it's not clear to
me what support in chrony would look like. Would chronyd or something
using the chronyc protocol be listening on a port for requests? Or
would it periodically push data over a socket somewhere? The page
listing client libraries doesn't include a C library.

Typically we do this one of a few ways.
#2 - We run a side-car exporter. We do this quite a lot for existing open
source software, like mysql, that would never listen on http, but can
provide metrics with their own protocol.

This one seems most reasonable to me. A separate service that uses the
chronyc protocol to read the metrics from chronyd.
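
As a rough illustration of such a separate service, here is a minimal Python sketch using only the standard library. The `collect_metrics()` function is a placeholder that returns a fixed, made-up value; a real exporter would talk to chronyd there, and that part is deliberately left out. The names `respond`, `MetricsHandler` and `make_server` are chosen for this example only.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_metrics():
    """Placeholder collector; a real exporter would query chronyd here."""
    return {"chrony_up": 1}


def respond(path):
    """Return (status, body) for a request path; only /metrics is served."""
    if path != "/metrics":
        return 404, b"not found\n"
    lines = (f"{name} {value}\n" for name, value in sorted(collect_metrics().items()))
    return 200, "".join(lines).encode()


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = respond(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; no per-request logging


def make_server(port):
    """Build a server bound to localhost; call serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

Calling `make_server(port).serve_forever()` with a port of your choice would then answer scrapes on /metrics, with the metrics read on demand at each request.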

Post by Ben Kochie
#3 - The way we collect metrics for ntpd is that we have a loop script, or cron
script, that parses output and puts that output in prometheus format into a
text file. Then we access these metrics via the node_exporter's textfile
reader.

This is probably the easiest way :).

Post by Ben Kochie
#4 - We use something like mtail[0] and parse log files. This is what I do
for things like apache[1] that have minimal useful internal metrics.

The chrony logs are good at showing exactly when the state has
changed, but if you are interested in metrics like root dispersion,
which are constantly changing (in a deterministic way), you would have
to calculate their current value.
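
To make that calculation concrete, here is a small Python sketch. It assumes the value grows linearly between log entries and takes the growth rate as an input, since the actual rate depends on the daemon's state and configuration and is not specified here; the function and parameter names are illustrative.

```python
def current_root_dispersion(logged_dispersion, logged_at, now, rate):
    """Extrapolate a logged root dispersion to the present.

    logged_dispersion: value from the log, in seconds
    logged_at, now:    timestamps in seconds on the same clock
    rate:              assumed growth in seconds per second (an input,
                       not a constant taken from chrony)
    """
    elapsed = now - logged_at
    if elapsed < 0:
        raise ValueError("now precedes the log timestamp")
    return logged_dispersion + rate * elapsed
```

A log-based collector would therefore need both the last logged value and its timestamp, whereas asking the daemon directly returns the current value without this step.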

Post by Ben Kochie
* Is there a mode for the various metrics outputs to be more machine
readable? (json?)

One idea I had would be to add a "metrics" command to chronyc. Then you
could run a loop/cron job that would be basically "chronyc metrics >
chrony_metrics.prom"

Which metrics would it print? With the "clients" command for instance
there can be megabytes of data, which in most cases probably wouldn't be
useful to collect, but in some cases I think it might, e.g. monitoring
if clients are alive from the server in a small network.

Post by Ben Kochie
The output format would be sed/awk friendly as you always get one metric
key and value per line.

If there was just one key/value per line, wouldn't it be more
difficult for a simple sed/awk parser to group data by source, as in
sourcestats?

I was considering something like CSV, which can be parsed in shell
with a single "read" command and can be easily converted to more
verbose formats like json.

$ chronyc -r tracking
#refid,address,stratum,...
10.16.255.1,10.16.255.1,2,...

$ chronyc -r sources | grep -v '^#' | while IFS=, read mode state ...
do
echo $mode $state ...
done
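
The conversion from such CSV output to a more verbose format can be sketched in Python as well. This assumes the proposed layout shown above, where the first line is a '#'-prefixed header; the sample text uses only the three column names that appear in the example and is not real chronyc output.

```python
import csv
import json


def parse_raw(text):
    """Parse CSV whose first line is a '#'-prefixed header into dicts."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("#"):
        raise ValueError("expected a '#'-prefixed header line")
    # Strip the leading '#' and let the csv module split the header.
    header = next(csv.reader([lines[0][1:]]))
    data = (line for line in lines[1:] if line and not line.startswith("#"))
    return [dict(zip(header, row)) for row in csv.reader(data)]


# Hypothetical sample limited to the columns named in the example above.
sample = "#refid,address,stratum\n10.16.255.1,10.16.255.1,2\n"
print(json.dumps(parse_raw(sample), indent=2))
```

Because each row stays together, grouping values by source comes for free, which is the concern raised about a one-key-per-line format.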

Ben Kochie

2016-02-27 14:08:12 UTC

So I started work on adding a "metrics" command to client.c. It's pretty
hacky, but works.

https://github.com/SuperQ/chrony/pull/1

Comments welcome.

- Ben Kochie

Miroslav Lichvar

2016-02-29 09:10:14 UTC

(this discussion would better fit the chrony-devel list)

Post by Ben Kochie
So I started work on adding a "metrics" command to client.c. It's pretty
hacky, but works.
https://github.com/SuperQ/chrony/pull/1
Comments welcome.

Ok, so you implemented the metrics command as a new function which
does the same as the serverstats command, but uses a different output
format. I assume you would extend it later to include also the
tracking, sources and sourcestats data. That would be a lot of
duplicated code.

As I said in the previous mail, I'd rather see it implemented as a
different output format for the existing commands. A new chronyc
option could be added to select the format, with default being the
currently used human-readable output. A new printf-like function would
be added, which would support printing hostnames or IP addresses, time
intervals, offsets, and all other data that need to be printed.
Depending on what output mode chronyc was running in, it would print
the labels, align the columns, print the values with units, print end
of lines, etc. All functions that implement the individual commands
would then be modified to use this new function.
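
The idea of one printing function with selectable output modes can be illustrated with a short Python sketch. This is not how chronyc is written (it is C, and the design above was still only a plan); the field kinds, mode names and function names here are all assumptions made for the example.

```python
def format_seconds(value):
    """Pick a readable unit for an interval or offset given in seconds."""
    magnitude = abs(value)
    for unit, scale in (("s", 1.0), ("ms", 1e-3), ("us", 1e-6)):
        if magnitude >= scale:
            return f"{value / scale:+.3f}{unit}"
    return f"{value / 1e-9:+.3f}ns"


def format_report(fields, mode="human"):
    """Format (label, kind, value) tuples; kind is 'text' or 'seconds'."""
    if mode == "csv":
        # Machine output: no labels, no units, fixed precision in seconds.
        cells = (f"{v:.9f}" if kind == "seconds" else str(v) for _, kind, v in fields)
        return ",".join(cells) + "\n"
    if mode == "human":
        # Human output: aligned labels and values with units.
        width = max(len(label) for label, _, _ in fields)
        lines = []
        for label, kind, value in fields:
            shown = format_seconds(value) if kind == "seconds" else str(value)
            lines.append(f"{label.ljust(width)} : {shown}")
        return "\n".join(lines) + "\n"
    raise ValueError(f"unknown mode: {mode!r}")
```

Each command then only describes its fields once, and adding another output format means adding one more branch in the formatter rather than duplicating every command.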

I'm planning to look into this in the next few weeks. At this point
I'm mainly interested in adding the CSV format to allow easy parsing
in shell, but I think the Prometheus format could be added too.

Does this make sense?

Ben Kochie

2016-02-29 09:40:49 UTC