Discussion:
[chrony-users] Time offset on versions 3+ without hw timestamping
Thibaut BEYLER
2017-08-29 16:29:18 UTC
Permalink
Hi,

I have tested different versions and setups of chrony over the last few weeks. My
goal is to get the most accurate and stable time possible relative to UTC, using
NTP and a GPS timeserver.

My setup is pretty simple: Linux boxes with one switch between everything. For
my tests I am also using third-party monitoring software, connected
via PTP and PPS to my timeservers, which polls the different clients over NTP
(with HW timestamping) every second and graphs the offsets.

I started first with older versions of chrony (1.30) and got pretty good
results, then tried to play with client side hardware timestamping and
tested version 3.

The results with HW timestamps were pretty good, especially with a source
that uses HW timestamps as well, showing a very steady line on my monitoring
system with a slight constant offset (+5us), maybe due to an
asymmetry somewhere?

However, I then noticed that all my chrony 3+ clients without hardware
timestamping enabled suddenly report an inaccurate time, with a constant
offset to UTC (approximately -20 to -25 usec) on my monitoring
system. If I switch back to chrony 1.30 with the same configuration, the
time offset is good again.

I first thought that there was a problem on the server side of chrony, but
if I sync a 1.30 client to a 3+ client, it reports the same offset too.

Another behaviour I noticed on version 3+ with kernel timestamping is that
for some servers I can get a very long poll interval (60 seconds) even though I
configured minpoll and maxpoll to shorter intervals (0 and 1).

I tried version 2.4.1 and there is no such problem, so I assume it starts
with version 3.0. I just tested chrony 3.2-pre2 and the problem is
still there.

Any idea where this could come from?

Happy to provide more info/data if needed.
Miroslav Lichvar
2017-08-29 17:09:01 UTC
Permalink
Post by Thibaut BEYLER
My setup is pretty simple, linux boxes, one switch between everything. For
my tests I am also using a third-party software as monitoring, connected
via PTP and PPS on my timeservers, which does ntp polling (with hw
timestamping) every second on the different clients and graph the offsets.
It's not very clear to me what the setup looks like. Does the monitoring
software run on the clients and use the same or a different NTP
server, or does it monitor the clients (operating also as servers) from a
different machine over NTP?
Post by Thibaut BEYLER
Another behaviour i noticed on version 3+ with kernel timestamping is that
for some server i can have very long poll time (60 seconds) even if i
configured minpoll and maxpoll to shorter interval (0 & 1)
You mean the update interval (as reported by chronyc tracking) is 60
seconds and the responses are not passing the test C (reported in
chronyc ntpdata)? That's normal if there is a lot of jitter for longer
periods of time. The difference between older versions and 3.0+ is
that the synchronization may be more stable due to SW timestamping or
the correction for asymmetric jitter (reported in chronyc ntpdata) and
it takes longer before chronyd is willing to accept a measurement with
larger delay. You can increase the maxdelaydevratio value if you would
prefer more frequent updates.
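
To illustrate why a more stable source can end up rejecting more samples, here is a minimal Python sketch of the rule as the chrony documentation describes maxdelaydevratio. It is a simplification for illustration, not chronyd's actual code; the function name and the sample numbers are made up, and the default ratio of 10.0 is taken from the documentation.

```python
def passes_delay_dev_test(delay, min_delay, stddev, max_delay_dev_ratio=10.0):
    """Simplified form of the documented maxdelaydevratio rule.

    A sample is rejected when the increase of its round-trip delay over the
    minimum delay of previous samples, divided by the standard deviation of
    previous measurements, exceeds the configured ratio.
    """
    if stddev <= 0.0:
        # Assumption for this sketch only: with no spread estimate, accept.
        return True
    return (delay - min_delay) / stddev <= max_delay_dev_ratio


# Hypothetical numbers in microseconds: the same 30 us increase in delay is
# accepted while the source is noisy, but rejected once it has become stable.
min_delay = 100.0
sample_delay = 130.0

noisy = passes_delay_dev_test(sample_delay, min_delay, stddev=5.0)    # ratio 6
stable = passes_delay_dev_test(sample_delay, min_delay, stddev=1.0)   # ratio 30
relaxed = passes_delay_dev_test(sample_delay, min_delay, stddev=1.0,
                                max_delay_dev_ratio=50.0)
print(noisy, stable, relaxed)
```

With a small standard deviation the acceptance window shrinks, which matches the observation that updates become rarer after synchronization improves; raising the ratio widens the window again.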
Post by Thibaut BEYLER
I tried version 2.4.1 and there are no such problem, so I assume it starts
with version 3.0. I just tested the chrony 3.2-pre2 and the problem is
still there.
Any idea where this could come from ?
The difference is most likely in SW timestamping, which is supported
since 3.0. Before 3.0, chronyd used kernel RX timestamps and daemon TX
timestamps. If the client is using kernel/kernel timestamps with a server
which uses kernel/daemon timestamps, or the server has kernel/kernel
timestamps but the client is not using the interleaved mode, there
will likely be an asymmetry even if the client and server have
identical hardware.

Enabling HW timestamping only on one side will likely improve
stability, but may have a negative effect on accuracy as the
asymmetries are less likely to cancel out.

An interesting test would be to enable TX-only HW timestamping with
"rxfilter none" and see how the asymmetry changes.

In any case, if you can measure the asymmetry and it's stable, it can
be corrected with the offset option.
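
As a worked illustration of how an asymmetry turns into a constant offset, the following Python sketch applies the standard NTP offset and delay formulas to a made-up exchange. The delays and the processing time are hypothetical, and the sign convention of chrony's offset option is not shown here; it should be checked in the chrony.conf documentation.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP calculation from the four timestamps of one exchange.

    t1: client transmit, t2: server receive, t3: server transmit,
    t4: client receive. Returns (offset, delay), where offset is the
    estimated server-minus-client clock difference.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange in microseconds. Both clocks are perfectly in sync,
# but the request path appears 20 us slower than the response path.
d_request, d_response, processing = 35.0, 15.0, 5.0
t1 = 0.0
t2 = t1 + d_request
t3 = t2 + processing
t4 = t3 + d_response

offset, delay = ntp_offset_delay(t1, t2, t3, t4)
bias = (d_request - d_response) / 2   # error caused by the asymmetry alone
corrected = offset - bias             # only meaningful if the bias is stable
print(offset, delay, corrected)
```

The measured offset is half the difference between the two one-way delays, even though the clocks agree. A fixed correction removes it only as long as the asymmetry itself does not move.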
--
Miroslav Lichvar
--
To unsubscribe email chrony-users-***@chrony.tuxfamily.org
with "unsubscribe" in the subject.
For help email chrony-users-***@chrony.tuxfamily.org
with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Thibaut BEYLER
2017-08-30 18:07:41 UTC
Permalink
Post by Miroslav Lichvar
It's not very clear to me how the setup looks like. The monitoring
software runs on the clients and it uses the same or different NTP
server, or it monitors the clients (operating also as servers) from a
different machine over NTP?
The monitoring software runs on another server and monitors the chrony
clients (which are operating as servers in that case) over NTP.


Post by Miroslav Lichvar
You mean the update interval (as reported by chronyc tracking) is 60
seconds and the responses are not passing the test C (reported in
chronyc ntpdata)? That's normal if there is a lot of jitter for longer
periods of time. The difference between older versions and 3.0+ is
that the synchronization may be more stable due to SW timestamping or
the correction for asymmetric jitter (reported in chronyc ntpdata) and
it takes longer before chronyd is willing to accept a measurement with
larger delay. You can increase the maxdelaydevratio value if you would
prefer more frequent updates.
Yes indeed, test C is failing pretty quickly

Post by Miroslav Lichvar
The difference is most likely in SW timestamping which is supported
since 3.0. Before 3.0, chronyd used kernel RX timestamps and daemon TX
timestamps. If the client is using kernel/kernel timestamps with server
which uses kernel/daemon timestamps, or the server has kernel/kernel
timestamps, but the client is not using the interleaved mode, there
will likely be an asymmetry even if the client and server have
identical hardware.
Enabling HW timestamping only on one side will likely improve
stability, but may have a negative effect on accuracy as the
asymmetries are less likely to cancel out.
An interesting test would be to enable TX-only HW timestamping with
"rxfilter none" and see how the asymmetry changes.



I did some tests today regarding timestamp generation. I have a client
running chrony 3 on a 3.16 kernel, so it is actually still using the good old
daemon/kernel (DK) timestamps (at least that's what is printed in ntpdata).

In the end, the offsets reported by my monitoring are almost the same whether
I'm using DK, KK, or HK. Only with full HH does the reported offset change
drastically.

I notice, however, that when using DK, KK or HK the reported jitter
asymmetry is almost always at +0.50, while with HH it's always 0.

Could the asymmetric jitter correction added in chrony 3 produce a wrong time
in my setup? Is it possible to disable it in order to test?

Post by Miroslav Lichvar
In any case, if you can measure the asymmetry and it's stable, it can
be corrected with the offset option.
Actually, after a longer period of monitoring I can see it's not stable (at
least not with KK or HK): sometimes the offset jumps by 5, 10 or 20us, stays
stable, and then jumps again.
Miroslav Lichvar
2017-08-30 18:42:35 UTC
Permalink
Post by Thibaut BEYLER
The monitoring software run on another server and monitor the chrony
clients (which are operating as servers in that case) over NTP
So the same timestamping combination that the client is using to
synchronize its clock is used in the monitoring? I'm not sure if that
is a valid test. If there is a large asymmetry and the clock has a
large error, I don't think the monitoring client would see it, because
the asymmetry would cancel the error out in the opposite direction, in
which the monitoring client is making measurements.
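
The cancellation described above can be checked with a few lines of arithmetic. This Python sketch assumes, purely for illustration, that the whole asymmetry comes from the tested machine's timestamping, that the monitoring client and the upstream server are themselves on true time, and that the delays (in microseconds) are the made-up values shown.

```python
def measured_offset(true_offset, d_request, d_response):
    """Offset an NTP client computes: the true (remote - local) difference
    plus half the difference between request and response delays."""
    return true_offset + (d_request - d_response) / 2


# Hypothetical apparent one-way delays caused by the tested machine's
# timestamping: packets leaving it versus packets arriving at it.
d_out = 10.0   # tested machine -> other host
d_in = 50.0    # other host -> tested machine

# Step 1: the tested machine synchronizes until its measured offset to the
# server is zero, which leaves a real clock error of half the asymmetry.
server_minus_client = (d_in - d_out) / 2
believed_offset = measured_offset(server_minus_client, d_out, d_in)
client_error = -server_minus_client   # tested clock minus true time

# Step 2: the monitor polls the tested machine over the same kind of path,
# but in the opposite direction (request arrives, response leaves).
seen_by_monitor = measured_offset(client_error, d_in, d_out)

# A reference with a symmetric path (e.g. HW timestamps on both ends).
seen_by_reference = measured_offset(client_error, 5.0, 5.0)
print(believed_offset, seen_by_monitor, seen_by_reference)
```

Under these assumptions the monitor reports zero while the clock is actually 20 us off, and only the measurement over a symmetric path reveals the error.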

In order to measure the error with SW timestamping it's necessary to
use something better, e.g. HW timestamping or a reference clock.

You could run a separate server instance of chronyd on a different
port with HW timestamping for the monitoring client. It needs to
support the interleaved mode to be able to get the server's HW
transmit timestamps.
Post by Thibaut BEYLER
I can notice however that when using DK, KK or HK the reported jitter
asymmetry is almost always all the time to +0.50 , with HH it's always 0
Could the asymmetric jitter correction added in chrony 3 produce wrong time
in my setup ? Is it possible to disable this in order to test ?
With 3.2-pre2 you can disable it with "asymmetry 0.0" in the server
directive.
--
Miroslav Lichvar
Bill Unruh
2017-08-30 19:14:05 UTC
Permalink
William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ ***@physics.ubc.ca
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
Post by Miroslav Lichvar
Post by Thibaut BEYLER
The monitoring software run on another server and monitor the chrony
clients (which are operating as servers in that case) over NTP
So the same timestamping combination that the client is using to
synchronize its clock is used in the monitoring? I'm not sure if that
is a valid test. If there is a large asymmetry and the clock has a
large error, I don't think the monitoring client would see it, because
the asymmetry would cancel the error out in the opposite direction, in
which the monitoring client is making measurements.
In order to measure the error with SW timestamping it's necessary to
use something better, e.g. HW timestamping or a reference clock.
You could run a separate server instance of chronyd on a different
port with HW timestamping for the monitoring client. It needs to
support the interleaved mode to be able to get the server's HW
transmit timestamps.
The problem is interrupt contention, and clock reading contention. I once
tried to have two different readers (chrony and ntpd, or two versions of
chrony) reading the interrupt at the same time and one of them was about 8us
out because it did not get the interrupt until the first one had finished.
I.e., in monitoring you do not want to do it "on the second" since it might
interfere with the other one.
Miroslav Lichvar
2017-08-31 06:40:41 UTC
Permalink
Post by Bill Unruh
Post by Miroslav Lichvar
You could run a separate server instance of chronyd on a different
port with HW timestamping for the monitoring client. It needs to
support the interleaved mode to be able to get the server's HW
transmit timestamps.
The problem is interrupt contention, and clock reading contention. I once
tried to have two different readers (chrony and ntpd, or two versions of
chrony) reading the interrupt at the same time and one of them was about 8us
out because it did not get the interrupt until the first one had finished.
Ie, in monitoring you do not want to do it "on the second" since it might
interfere with the other one.
Was that with timestamping of PPS in userspace using ioctl(TIOCMIWAIT)?

I don't think that applies to timestamping of NTP packets. A receive
timestamp is made by the kernel, not by the application, and separate
instances of chronyd do not both receive the same packet. They
won't share the same UDP port.

When running multiple instances of chronyd, it's important that only
one of them is adjusting the clock. Also, they should be configured to
use different "port", "cmdport", "bindcmdaddress", and "pidfile".
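
One way to catch a shared setting before starting a second instance is to compare the two configuration files. The following Python sketch is a hypothetical helper, not part of chrony; the file contents, paths and port numbers are invented for the example, and the parser only handles the simple "directive arguments" line form.

```python
# Directives listed above as needing distinct values per instance.
PER_INSTANCE = ("port", "cmdport", "bindcmdaddress", "pidfile")


def parse_conf(text):
    """Return {directive: [argument strings]} for a chrony.conf-style text."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!;%":
            continue  # blank line or comment
        parts = line.split(None, 1)
        args = parts[1].strip() if len(parts) > 1 else ""
        conf.setdefault(parts[0], []).append(args)
    return conf


def shared_settings(conf_a, conf_b, names=PER_INSTANCE):
    """List directives that are identical, or absent (default), in both."""
    a, b = parse_conf(conf_a), parse_conf(conf_b)
    return [name for name in names if a.get(name) == b.get(name)]


# Hypothetical configurations for a main and a monitoring instance.
main_conf = """
port 123
cmdport 323
bindcmdaddress /var/run/chrony/chronyd_main.sock
"""
monitor_conf = """
port 11123
cmdport 11323
bindcmdaddress /var/run/chrony/chronyd_main.sock
"""
print(shared_settings(main_conf, monitor_conf))
```

A directive missing from both files is reported too, since both instances would then fall back to the same default. The check does not notice the case where one file spells out the default explicitly and the other omits it.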
--
Miroslav Lichvar
Bill Unruh
2017-08-31 12:58:34 UTC
Permalink
It has been a while now and I do not entirely remember. I wrote my own
interrupt handler module, which would grab the time immediately when it was
called by the kernel to service the interrupt, so it was effectively the
kernel that was servicing the timestamping. If I recall correctly I had two
modules attached to two interrupts, but of course one would be called before
the other, and that would give a delay in the servicing.
Thibaut BEYLER
2017-08-31 16:48:14 UTC
Permalink
Post by Miroslav Lichvar
So the same timestamping combination that the client is using to
synchronize its clock is used in the monitoring? I'm not sure if that
is a valid test. If there is a large asymmetry and the clock has a
large error, I don't think the monitoring client would see it, because
the asymmetry would cancel the error out in the opposite direction, in
which the monitoring client is making measurements.
In order to measure the error with SW timestamping it's necessary to
use something better, e.g. HW timestamping or a reference clock.
You could run a separate server instance of chronyd on a different
port with HW timestamping for the monitoring client. It needs to
support the interleaved mode to be able to get the server's HW
transmit timestamps.
OK, I think I understand what you mean. I will try to configure a server
instance to see if there are differences. What source should I configure on
that instance?

Post by Miroslav Lichvar
With 3.2-pre2 you can disable it with "asymmetry 0.0" in the server
directive.
I tried forcing the jitter asymmetry to 0 and got good results again. Also,
with this setting test C is not failing anymore.

With the default chrony settings, the asymmetry and test C failures all
occur with sources coming from different NTP outputs (SW and HW) of two
timeservers from the same vendor.

With a third timeserver (SW NTP output only) no asymmetry is detected and
the results are good.

The jitter asymmetry goes very quickly from 0 to +0.50, especially with the
sources that have HW timestamping. Here are some logs from just after a chrony
restart, for instance:

https://pastebin.com/fSHuw7Mx
Miroslav Lichvar
2017-09-01 13:47:29 UTC
Permalink
Post by Thibaut BEYLER
Post by Miroslav Lichvar
You could run a separate server instance of chronyd on a different
port with HW timestamping for the monitoring client. It needs to
support the interleaved mode to be able to get the server's HW
transmit timestamps.
Ok i think i understand what you mean, i will try to configure a server
instance to see if there is differencies. What source should i configure on
that instance ?
Just the local clock, e.g. 'local stratum 10' in chrony.conf.
Post by Thibaut BEYLER
I tried to force the jitter asymmetry to 0 and got good results again. Also
with this settings test C are not failing anymore.
The question is whether "good" really means more accurate here. I'm still
not sure what your reference is and how the offset of the
clients is measured, what timestamping is involved, etc. If the monitoring client
used DK timestamping, it wouldn't be surprising if DK on the machine
which is tested gave the smallest offset.

I think a good way to measure the accuracy of the clients would be to
get an independent stratum-1 NTP server which is known to be accurate
(e.g. the LeoNTP unit), add a network card which has HW timestamping
and fast reading of the clock (e.g. i210 or i350) to the client that
should be tested, connect it directly to the reference server and
measure the offset on the client with a separate chronyd instance
configured to not adjust the clock.

Something like this:

Production NTP server ---> Network ---> Client <---- Reference NTP server
Post by Thibaut BEYLER
The jitter asymmetry goes especially very fast from 0 to +0.50 with the
sources that have hw timestamping, here are some logs just after a chrony
https://pastebin.com/fSHuw7Mx
That suggests some of the timestamps (server's or client's) are not HW
timestamps, or that the client is not using the interleaved mode when
it needs to. An HH client with HH server connected to same switch with
low network traffic in my experience doesn't show any asymmetry. With
kernel and daemon timestamps that's normal.

A good way to confirm that all timestamps used for synchronization are
HW timestamps is to check the delay as reported in measurements.log or
chronyc ntpdata. If you know the switch adds 20 microseconds, but
ntpdata shows a delay larger than, say, 30 microseconds (assuming 1Gb
ethernet), you know something is wrong.
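
The check described above amounts to a simple threshold comparison. A minimal Python sketch, with a hypothetical function name and a 10 us margin chosen to match the 20 us and 30 us figures in the example (the right margin for a given network is an assumption to be tuned):

```python
def delay_consistent_with_hw_timestamps(measured_delay_us, expected_delay_us,
                                        margin_us=10.0):
    """Return True if the measured round-trip delay stays within a margin of
    the delay expected from the network alone (e.g. a known switch latency)."""
    return measured_delay_us <= expected_delay_us + margin_us


# The switch is assumed to add 20 us; the measured values are made up.
plausible = delay_consistent_with_hw_timestamps(24.0, 20.0)
suspicious = delay_consistent_with_hw_timestamps(85.0, 20.0)
print(plausible, suspicious)
```

A delay well above the expected value suggests that at least one of the four timestamps was taken in software rather than by the NIC.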
--
Miroslav Lichvar
Thibaut BEYLER
2017-09-01 17:00:08 UTC
Permalink
I already have a secondary interface and addresses configured on my servers,
so I was thinking of using that for the second instance instead of a
non-standard port (which my monitoring software doesn't support).

But how can I configure the server instance to serve the local time? If
I don't put any server directive it won't serve NTP, and if I put a server
directive it will serve that time.
Miroslav Lichvar
2017-09-11 09:07:35 UTC
Permalink
Post by Thibaut BEYLER
I already have a secondary interface & addresses configured on my servers,
so I was thinking to use it for the second instance instead of using a
non-standard port (which my monitoring software doesn't support)
But how can I configure the server instance it to serve the local time ? if
I don't put any server directive it won't serve ntp, and if i put a server
directive it will serve that time..
There is the "local" directive for that. It allows chronyd to serve
the local system time, even if it's not synchronized to anything.
--
Miroslav Lichvar
Thibaut BEYLER
2017-09-11 12:00:06 UTC
Permalink
OK thanks. I tried with "local" before and got very bad results. I did some
digging and it turns out that was because I had the same driftfile on
both instances (the parameter probably isn't necessary in the server
config). It's better now.

I specified a custom socket as bindcmdaddress for each instance
(/var/run/chrony/chronyd_1.sock and /var/run/chrony/chronyd_1.sock) and a
custom cmdport for the server instance.

I can get chronyc to work with both instances, however ntpdata is not
working on either of them: I get "501 Not authorised". Any idea?
Miroslav Lichvar
2017-09-11 12:13:30 UTC
Permalink
Post by Thibaut BEYLER
Ok thanks, I tried with 'local" before and got very bad results, did some
digging and turns out that was because I was having the same driftfile on
both instance (probably the parameter isn't necessary on the server
config), it's better now.
I specified a custom sock as bindcmdaddress for each instance
(/var/run/chrony/chronyd_1.sock and /var/run/chrony/chronyd_1.sock) and a
custom cmdport for sthe server instance.
I can get chronyc to work with both instance, however ntpdata is not
working on neitheir of them, i get "501 Not authorised", any idea ?
That means chronyc is not using the Unix domain socket. Did you
specify it with the -h option?

chronyc -h /var/run/chrony/chronyd_1.sock ntpdata
--
Miroslav Lichvar
Thibaut BEYLER
2017-09-11 13:58:07 UTC
Permalink
Yes, it's better that way! I did not realise we could use -h like this.

Coming back to my issues with chrony 3: I played a bit with CPU power
management recently, at first to figure out my problem with PPS
sources, and found out that disabling the C-states *drastically* improves
chrony performance on my systems.

When I boot my systems with C-states disabled (kernel
parameters processor.max_cstate=1 idle=poll) the std dev of my sources can be
divided by 5 to 10 with chrony 1.30.

With chrony 3 the changes are even more significant and I don't get any of
the problems I was having before: test C is not failing anymore, std dev
is minimal (under 100ns on sources with hardware timestamping, with K/K on
the chrony side, 100x lower than what I had without the kernel
parameters) and peer delay is reduced by over 50us.

Also, the offsets reported by my monitoring system are good again.

Would there be any way to get this performance without disabling C-states
system-wide?
Miroslav Lichvar
2017-10-06 07:40:33 UTC
Permalink
Post by Thibaut BEYLER
Coming back to my isues with chrony 3 : I played a bit with cpu power
management recently, at first to figure out first my problem with PPS
sources, and found out that disabling the c-states would improve *drastically*
chrony performances on my systems.
Would there be any way to get those performance without disabling c-states
system-wide ?
In case you are not following the chrony-dev list, there was a post
about a program that can disable power saving on a CPU core just for a
moment when it's waiting for the PPS interrupt.

https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-dev/2017/10/msg00012.html
--
Miroslav Lichvar
--
To unsubscribe email chrony-users-***@chrony.tuxfamily.org
with "unsubscribe" in the subject.
For help email chrony-users-***@chrony.tuxfamily.org
with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Thibaut BEYLER
2017-10-11 16:26:51 UTC
Thanks, that seems like an interesting solution for power saving. So far I
have disabled C-states and power management system-wide
(using /dev/cpu_dma_latency and governors), which gives really good results
(more stable peer delay & std dev).
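
For readers unfamiliar with the /dev/cpu_dma_latency approach mentioned above, here is a minimal sketch of how it is commonly used. It assumes the Linux PM QoS interface, where a process writes a 32-bit latency limit in microseconds to the device and the limit applies only while the file descriptor stays open. The function names are hypothetical and this is not necessarily how it was set up on the systems discussed here.

```python
import os
import struct

PM_QOS_DEVICE = "/dev/cpu_dma_latency"


def encode_latency(max_latency_us: int) -> bytes:
    """Pack the requested maximum wake-up latency as a native 32-bit integer."""
    return struct.pack("i", max_latency_us)


def hold_latency(max_latency_us: int, device: str = PM_QOS_DEVICE) -> int:
    """Write the latency request and return the open file descriptor.

    The kernel drops the request as soon as the descriptor is closed,
    so the caller must keep it open for as long as the limit is needed.
    Opening the real device normally requires root.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(max_latency_us))
    except OSError:
        os.close(fd)
        raise
    return fd
```

A small daemon would call hold_latency(0), sleep for as long as the limit is wanted, and then close the descriptor with os.close(fd) to restore normal power saving.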

My PPS stddev gets under 500ns; however, I still get a constant offset between
my PPS source and my NTP sources of about 8-9us (same with different PPS
sources).

I really don't think it's a problem on the NTP server side, as I'm using
two ASIC-powered sources with hardware timestamping. They show a constant stddev
under 10ns, and I can get values like 1.28us for "peer delay" and under 4us
for "max. error" when I bypass the switch (for testing purposes).

I guess there is still some delay somewhere in the PPS signal processing
that gives this offset.
Post by Thibaut BEYLER
Post by Thibaut BEYLER
Coming back to my issues with chrony 3: I played a bit with CPU power
management recently, at first to figure out my problem with PPS
sources, and found out that disabling the C-states would improve
*drastically* chrony performance on my systems.
Would there be any way to get that performance without disabling
C-states system-wide?
In case you are not following the chrony-dev list, there was a post
about a program that can disable power saving on a CPU core just for a
moment when it's waiting for the PPS interrupt.
https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-dev/2017/10/msg00012.html
--
Miroslav Lichvar
Miroslav Lichvar
2017-10-11 17:08:12 UTC
Post by Thibaut BEYLER
Thanks, that seems like an interesting solution for power saving. So far I
have disabled C-states and power management system-wide
(using /dev/cpu_dma_latency and governors), which gives really good results
(more stable peer delay & std dev).
My PPS stddev gets under 500ns; however, I still get a constant offset between
my PPS source and my NTP sources of about 8-9us (same with different PPS
sources).
Even if the CPU never enters a power-saving mode, I think there will
always be some delay between the interrupt and the kernel actually
making a PPS timestamp.

If the delay is stable and known, the measurements can be fixed with
the offset option.
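
Before hard-coding a correction with the offset option, it helps to check that the delay really is stable. The following is a minimal sketch, assuming you have logged a series of PPS-minus-NTP offset differences in seconds; the function name, the threshold and the sample values in the usage note are hypothetical. The sign convention of the offset option should be checked against the chrony.conf documentation for your version.

```python
import statistics


def estimate_constant_offset(differences_s, max_stddev_s=1e-6):
    """Summarize logged PPS-minus-NTP offset differences (in seconds).

    Returns (mean, stddev, stable). 'stable' is True when the spread is
    at or below max_stddev_s, i.e. the mean is a plausible constant
    correction rather than an average over noise.
    """
    mean = statistics.fmean(differences_s)
    stddev = statistics.stdev(differences_s)
    return mean, stddev, stddev <= max_stddev_s
```

For example, made-up samples of 8.2, 8.9, 8.5 and 8.4 microseconds give a mean of 8.5 microseconds with a spread of about 0.3 microseconds, so a constant correction of that magnitude would be a reasonable starting point.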

A polling driver might be able to provide better accuracy. I'm using
this one on an AR93xx-based board: https://github.com/mlichvar/pps-gpio-poll

Another way to get sub-microsecond accuracy might be with the i210
card. It has software defined pins (SDP), which can be used for
external timestamping of a PPS signal. The extpps option of the PPS
refclock in chrony enables that.
Post by Thibaut BEYLER
I really don't think it's a problem on the NTP server side, as I'm using
two ASIC-powered sources with hardware timestamping. They show a constant stddev
under 10ns, and I can get values like 1.28us for "peer delay" and under 4us
for "max. error" when I bypass the switch (for testing purposes).
I guess there is still some delay somewhere in the PPS signal processing
that gives this offset.
--
Miroslav Lichvar
Bill Unruh
2017-10-11 19:19:37 UTC
William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ ***@physics.ubc.ca
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
Post by Miroslav Lichvar
Post by Thibaut BEYLER
Thanks, that seems like an interesting solution for power saving. So far I
have disabled C-states and power management system-wide
(using /dev/cpu_dma_latency and governors), which gives really good results
(more stable peer delay & std dev).
My PPS stddev gets under 500ns; however, I still get a constant offset between
my PPS source and my NTP sources of about 8-9us (same with different PPS
sources).
Even if the CPU never enters a power-saving mode, I think there will
always be some delay between the interrupt and the kernel actually
making a PPS timestamp.
That should be of the order of a usec, not 8 or 9. On the other hand, one-way
network delays of 8 or 9 us are very possible. I would blame the NTP
network timestamping, not the PPS. What is the network path to the NTP source?
Also, does the PPS source also trigger some other IRQ service? That other
program could easily take 8-9 usec to process and delay your PPS servicing.
Post by Miroslav Lichvar
If the delay is stable and known, the measurements can be fixed with
the offset option.
A polling driver might be able to provide better accuracy. I'm using
this one on an AR93xx-based board: https://github.com/mlichvar/pps-gpio-poll
Another way to get sub-microsecond accuracy might be with the i210
card. It has software defined pins (SDP), which can be used for
external timestamping of a PPS signal. The extpps option of the PPS
refclock in chrony enables that.
Post by Thibaut BEYLER
I really don't think it's a problem on the NTP server side, as I'm using
two ASIC-powered sources with hardware timestamping. They show a constant stddev
under 10ns, and I can get values like 1.28us for "peer delay" and under 4us
for "max. error" when I bypass the switch (for testing purposes).
But how does the ntp signal get from the ntp server to the computer? That path
might have one-way delays in it.
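
To make the point about one-way delays concrete, here is a minimal sketch using the standard NTP offset and delay formulas. The function names and the numbers in the usage note are hypothetical and chosen only for illustration. The sketch shows that a difference between the forward and return path delays appears as an offset error of half that difference, even when the two clocks agree perfectly.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimates from the four timestamps of one exchange.

    t1: client transmit, t2: server receive, t3: server transmit,
    t4: client receive. The offset is that of the server clock
    relative to the client clock.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset, fwd_delay, back_delay, processing=0.0, t1=0.0):
    """Build the four timestamps for a server clock ahead by true_offset.

    fwd_delay is the client-to-server path delay, back_delay the
    server-to-client path delay, both in the same unit as the timestamps.
    """
    t2 = t1 + fwd_delay + true_offset
    t3 = t2 + processing
    t4 = t1 + fwd_delay + processing + back_delay
    return t1, t2, t3, t4
```

With perfectly synchronized clocks, a forward delay of 20 us and a return delay of 3 us, ntp_offset_and_delay(*simulate_exchange(0.0, 20.0, 3.0)) reports an offset of 8.5 us and a round-trip delay of 23 us. The round-trip delay alone cannot reveal the asymmetry.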
Post by Miroslav Lichvar
Post by Thibaut BEYLER
I guess there is still some delay somewhere in the PPS signal processing
that gives this offset.
It is hard to imagine that giving 8-9us unless you are using an 8088 computer.
Most irq time handlers will grab the system time first thing when called.
Thus it is the irq handling time in the cpu that gives the time delay.
Bill Unruh
2017-09-11 15:06:56 UTC
Post by Miroslav Lichvar
Post by Thibaut BEYLER
I already have a secondary interface & addresses configured on my servers,
so I was thinking of using it for the second instance instead of using a
non-standard port (which my monitoring software doesn't support).
But how can I configure the server instance to serve the local time? If
I don't put any server directive it won't serve NTP, and if I put a server
directive it will serve that time.
There is the "local" directive for that. It allows chronyd to serve
the local system time, even if it's not synchronized to anything.
Of course that has the disadvantage that it does nothing for the time on your
system at all. If it thinks it is Jan 2 1970, it will continue doing that and
serve that as the time to the other systems. I.e., the local directive is useful
only if your system has, at least sporadically, access to a good
server/refclock, and it is freewheeling when those go offline. It allows it to
continue serving time to others, even as its time gets worse and worse until
the next connectivity to its server.
Post by Miroslav Lichvar
--
Miroslav Lichvar