[chrony-users] HW Timestamping fails with specific source

Post by Thibaut BEYLER
I recently investigated some server on a site that struggle to use HW
timestamp and spend most of their time in deamon/deamon mode instead of
hardware/kernel. The weird part is that it occurs only with source 1, the
client kernels are stable with hw/kernel on the less acurate sources.

It's a race condition between receiving the TX timestamp of the client
request and receiving the response from the server. If the response is
so fast that it is received before the TX timestamp of the request,
the late timestamp will be ignored as there will not be a
corresponding request to which it could be applied.

What NIC do the clients have? I've seen this with an Intel card. It
happened only for a minority of requests and they were all dropped due
to failing the test C, so overall it worked well with HW timestamping.

I'm not sure if it should be treated as a driver/HW issue or if
applications should really be expected to get TX timestamps so late.
I asked about this on the Intel development list some time ago, but
didn't get a response. I think it could be addressed in chrony by
introducing a new timeout for timestamps, but I'd rather avoid the
extra complexity.

As a workaround you could try to add another switch between the server
and clients to increase the peer delay. You could also try to lower
the priority of the chronyd process to give it more time to get the
timestamp. Someone reported it happened only when chronyd was running
with a high priority.

--
Miroslav Lichvar
--
To unsubscribe email chrony-users-***@chrony.tuxfamily.org
with "unsubscribe" in the subject.
For help email chrony-users-***@chrony.tuxfamily.org
with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.

Thibaut BEYLER

2018-01-29 18:21:30 UTC

Nics are Intel 10GB FC, 82599ES on all sites.

I haven't though of that race condition, site which has problem is the most
recent so more recent switches, cpu & mobos, so that would make sense,

However, in my case 98% of he requests are Deamon/Deamon, so most of them
pasts all the tests (generally those below 60us) which is bad for time
accurancy.

Looking at the logs i can also spots some crazy vales getting 2 secondes
peer delay (and thoses are Deamon/Kernel ) .. mixed timestamp ? Will
investigate more tommorow, thanks

It's a race condition between receiving the TX timestamp of the client
request and receiving the response from the server. If the response is
so fast that it is received before the TX timestamp of the request,
the late timestamp will be ignored as there will not be a
corresponding request to which it could be applied.
What NIC do the clients have? I've seen this with an Intel card. It
happened only for a minority of requests and they were all dropped due
to failing the test C, so overall it worked well with HW timestamping.
I'm not sure if it should be treated as a driver/HW issue or if
applications should really be expected to get TX timestamps so late.
I asked about this on the Intel development list some time ago, but
didn't get a response. I think it could be addressed in chrony by
introducing a new timeout for timestamps, but I'd rather avoid the
extra complexity.
As a workaround you could try to add another switch between the server
and clients to increase the peer delay. You could also try to lower
the priority of the chronyd process to give it more time to get the
timestamp. Someone reported it happened only when chronyd was running
with a high priority.
--
Miroslav Lichvar
--
with "unsubscribe" in the subject.
with "help" in the subject.

Miroslav Lichvar

2018-02-02 10:48:59 UTC

Post by Thibaut BEYLER
However, in my case 98% of he requests are Deamon/Deamon, so most of them
pasts all the tests (generally those below 60us) which is bad for time
accurancy.

Hm, with recent kernels and drivers that support SW timestamping the
timestamps should always be at least kernel/kernel. What kernel
version do the clients have?

I've tried a few different approaches to wait for the late HW
timestamp and pushed one that I'm quite happy with. It is specific to
the Linux timestamping code and should have a minimal impact in the
usual case when the timestamps are not late (mostly limited to a
configuration with one interface that has HW timestamping, but another
does not).

Could you please build from git and see if it fixes the issue in your
environment?

Post by Thibaut BEYLER
Looking at the logs i can also spots some crazy vales getting 2 secondes
peer delay (and thoses are Deamon/Kernel ) .. mixed timestamp ? Will
investigate more tommorow, thanks

Do you have a packet capture or chrony debug output showing this?

Thibaut BEYLER

2018-02-12 17:16:02 UTC

Post by Miroslav Lichvar
Hm, with recent kernels and drivers that support SW timestamping the
timestamps should always be at least kernel/kernel. What kernel
version do the clients have ?

Clients have a kernel v4.9.51 so yes it should work (and it works with
other sources)

I've tried a few different approaches to wait for the late HW

Post by Miroslav Lichvar
timestamp and pushed one that I'm quite happy with. It is specific to
the Linux timestamping code and should have a minimal impact in the
usual case when the timestamps are not late (mostly limited to a
configuration with one interface that has HW timestamping, but another
does not).

Could you please build from git and see if it fixes the issue in your

Post by Miroslav Lichvar
environment?

Thanks ! back from holiday and tested that. Good news is that TX
timestamping is now always HW, however RX timestamp is Deamon most of the
time, sometimes Kernel. However, the general performance are worse now :
the peer delay is jittery and goes very high. Here's the debug log with my
two sources (slow : 10.214.11.23 , fast : 10.214.16.11 ) the best one
having the worst perfs : https://pastebin.com/yi0nYNCm

Post by Miroslav Lichvar
Looking at the logs i can also spots some crazy vales getting 2 secondes

Post by Thibaut BEYLER
peer delay (and thoses are Deamon/Kernel ) .. mixed timestamp ? Will
investigate more tommorow, thanks

Do you have a packet capture or chrony debug output showing this?

I could not reproduce that issue today but here's a debug log from 2 weeks
ago that I saved : https://pastebin.com/LUcYsTNz

Post by Thibaut BEYLER
However, in my case 98% of he requests are Deamon/Deamon, so most of them
pasts all the tests (generally those below 60us) which is bad for time
accurancy.

Hm, with recent kernels and drivers that support SW timestamping the
timestamps should always be at least kernel/kernel. What kernel
version do the clients have?
I've tried a few different approaches to wait for the late HW
timestamp and pushed one that I'm quite happy with. It is specific to
the Linux timestamping code and should have a minimal impact in the
usual case when the timestamps are not late (mostly limited to a
configuration with one interface that has HW timestamping, but another
does not).
Could you please build from git and see if it fixes the issue in your
environment?

Do you have a packet capture or chrony debug output showing this?
--
Miroslav Lichvar
--
with "unsubscribe" in the subject.
with "help" in the subject.

Miroslav Lichvar

2018-02-13 14:28:58 UTC

Post by Miroslav Lichvar
Could you please build from git and see if it fixes the issue in your
environment?

Thanks ! back from holiday and tested that. Good news is that TX
timestamping is now always HW, however RX timestamp is Deamon most of the
the peer delay is jittery and goes very high. Here's the debug log with my
two sources (slow : 10.214.11.23 , fast : 10.214.16.11 ) the best one
having the worst perfs : https://pastebin.com/yi0nYNCm

Thanks. I can reproduce the issue now. It's a race condition in the
kernel and it's not related to HW timestamping.

When chronyd opens a socket with kernel RX timestamping and it's the
only socket on the system using timestamping, it takes some time
before the kernel actually starts timestamping received packets. If a
response is received before that, it will only have a daemon timestamp
and that one is bad due to waiting for the transmit timestamp.

I'm not sure yet how to fix it. We could keep a dummy socket open to
permanently enable timestamping in the kernel, or maybe reuse the
cmdmon sockets for this purpose.

As a workaround, you can enable the NTP server socket, which will keep
the timestamping enabled, e.g. by adding "allow 127.0.0.1" to
chrony.conf.

Post by Miroslav Lichvar
Looking at the logs i can also spots some crazy vales getting 2 secondes

Post by Thibaut BEYLER
peer delay (and thoses are Deamon/Kernel ) .. mixed timestamp ? Will
investigate more tommorow, thanks

Do you have a packet capture or chrony debug output showing this?

I could not reproduce that issue today but here's a debug log from 2 weeks
ago that I saved : https://pastebin.com/LUcYsTNz

Interesting. I don't see anything wrong here. It looks more like an
issue with the network or server, and not the timestamping itself, as
there is a >2s gap between the log messages corresponding to the
transmission and reception.

Thibaut BEYLER

2018-02-14 10:26:13 UTC

Post by Miroslav Lichvar
When chronyd opens a socket with kernel RX timestamping and it's the
only socket on the system using timestamping, it takes some time
before the kernel actually starts timestamping received packets. If a
response is received before that, it will only have a daemon timestamp
and that one is bad due to waiting for the transmit timestamp.

Glad you found something, so the problem occurs when the ntp response
packet is received before the kernel could initialize the RX timestamping ?

As a workaround, you can enable the NTP server socket, which will keep

Post by Miroslav Lichvar
the timestamping enabled, e.g. by adding "allow 127.0.0.1" to
chrony.conf.

Works great this way, thanks ! Hardware/Kernel all the time and low peer
delay with this workaround.

Post by Miroslav Lichvar
Could you please build from git and see if it fixes the issue in your
environment?

Post by Thibaut BEYLER
two sources (slow : 10.214.11.23 , fast : 10.214.16.11 ) the best one
having the worst perfs : https://pastebin.com/yi0nYNCm

Thanks. I can reproduce the issue now. It's a race condition in the
kernel and it's not related to HW timestamping.
When chronyd opens a socket with kernel RX timestamping and it's the
only socket on the system using timestamping, it takes some time
before the kernel actually starts timestamping received packets. If a
response is received before that, it will only have a daemon timestamp
and that one is bad due to waiting for the transmit timestamp.
I'm not sure yet how to fix it. We could keep a dummy socket open to
permanently enable timestamping in the kernel, or maybe reuse the
cmdmon sockets for this purpose.
As a workaround, you can enable the NTP server socket, which will keep
the timestamping enabled, e.g. by adding "allow 127.0.0.1" to
chrony.conf.

Post by Miroslav Lichvar
Looking at the logs i can also spots some crazy vales getting 2

secondes

Post by Thibaut BEYLER
peer delay (and thoses are Deamon/Kernel ) .. mixed timestamp ? Will
investigate more tommorow, thanks

Do you have a packet capture or chrony debug output showing this?

I could not reproduce that issue today but here's a debug log from 2

weeks

Post by Thibaut BEYLER
ago that I saved : https://pastebin.com/LUcYsTNz

Miroslav Lichvar

2018-02-14 10:55:16 UTC

Glad you found something, so the problem occurs when the ntp response
packet is received before the kernel could initialize the RX timestamping ?

Yes. When chronyd is polling a server, it opens a socket, enables
timestamping and few other options on the socket, and sends a request.
The kernel has a global counter for sockets using timestamping, but it
is not updated synchronously with that setting on the socket (for
performance reasons I assume). If chronyd is the only application
using timestamping and it doesn't have the server socket open, its
client socket will be the one which turns the kernel timestamping on
and off.

Post by Thibaut BEYLER
Works great this way, thanks ! Hardware/Kernel all the time and low peer
delay with this workaround.

Great.

I'll probably push a fix which will keep one socket open all the time
just for the timestamping.

Denny Page

2018-01-29 17:39:42 UTC