Discussion:
[chrony-users] Defaulting to stepping the clock
Pedro Côrte-Real
2018-07-02 12:29:32 UTC
Hi,

I'm a user of Ubuntu both on servers and desktops/laptops. With Ubuntu
18.04 switching to chrony by default I decided to use that on all
machines. When setting up the roll-out to all machines using puppet I
checked the default config to see if I needed to make any changes. Two
lines stood out:

maxupdateskew 100.0
makestep 1 3

If I'm reading the documentation correctly this means "step the clock
in the first three corrections if the step is above one second but
below 100"

It seems at least the second line is a recommended config:

https://chrony.tuxfamily.org/faq.html#_what_is_the_minimum_recommended_configuration_for_an_ntp_client

This seems very strange to me as a default for an NTP tool. I have two
main use cases that I assume are common:

1) For a server never step the clock and if the drift is large
complain loudly because something has gone very wrong. Servers are
always on and should be always syncing so if their clock drifts a lot
something has gone wrong.
2) In a desktop/laptop stepping the clock is probably always ok if
going forward but may be bad if going back. So just accept frequency
adjustments both going forward and backwards. Machines are turned
on/off, suspend/resume and so it's less important to complain loudly.
Instead maintain monotonic clocks that are synchronized quickly even
if their frequency needs to shift a lot.

Given all this why were these defaults chosen? Are there recommended
settings to approximate 1) and 2)? Is the recommendation to do
something else?

Thanks,

Pedro
--
To unsubscribe email chrony-users-***@chrony.tuxfamily.org
with "unsubscribe" in the subject.
For help email chrony-users-***@chrony.tuxfamily.org
with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Miroslav Lichvar
2018-07-02 13:46:56 UTC
Post by Pedro Côrte-Real
maxupdateskew 100.0
makestep 1 3
If I'm reading the documentation correctly this means "step the clock
in the first three corrections if the step is above one second but
below 100"
That 100.0 is actually a limit for the estimated error in frequency
and is not related to the offset.
Post by Pedro Côrte-Real
1) For a server never step the clock and if the drift is large
complain loudly because something has gone very wrong. Servers are
always on and should be always syncing so if their clock drifts a lot
something has gone wrong.
2) In a desktop/laptop stepping the clock is probably always ok if
going forward but may be bad if going back. So just accept frequency
adjustments both going forward and backwards. Machines are turned
on/off, suspend/resume and so it's less important to complain loudly.
Instead maintain monotonic clocks that are synchronized quickly even
if their frequency needs to shift a lot.
Given all this why were these defaults chosen? Are there recommended
settings to approximate 1) and 2)? Is the recommendation to do
something else?
The recommended configuration is supposed to minimize the number of
people complaining that NTP does not work. I agree that for most
people it would be better to completely disable stepping of the clock.
But those few that have a computer which for some reason can start
with system time far from the true time, it would be a big problem.
They would need to figure out what's going on and use the chronyc
makestep command to fix it. That's not a good user experience.

I don't see a big difference between servers and desktops/laptops.
What matters is the RTC. On real HW it needs to be present, have a
battery, must not drift too much and the local/UTC setting must be
correct. Similar requirements apply to virtual machines. It needs to
work with suspend/resume and migrations.

Even if we could assume that everything always works as expected,
someone or something would need to set the RTC for the first time.
When you bought a new server/desktop/laptop, did it always come with
the clock correctly set?
--
Miroslav Lichvar
Pedro Côrte-Real
2018-07-02 14:53:58 UTC
Post by Miroslav Lichvar
That 100.0 is actually a limit for the estimated error in frequency
and is not related to the offset.
I see that now, sorry. Ignore that bit then.
Post by Miroslav Lichvar
The recommended configuration is supposed to minimize the number of
people complaining that NTP does not work. I agree that for most
people it would be better to completely disable stepping of the clock.
But those few that have a computer which for some reason can start
with system time far from the true time, it would be a big problem.
They would need to figure out what's going on and use the chronyc
makestep command to fix it. That's not a good user experience.
I think the real discussion then is the difference between adjustments
at boot time and adjustments at chrony startup time. Chrony startup
time may be early boot when stepping is not a problem but it may also
be after installing it on a running server when things may be
confused. If there was a way to say "only step when starting up at
early boot time when nothing important is running" then maybe that
would be ok.
Post by Miroslav Lichvar
I don't see a big difference between servers and desktops/laptops.
What matters is the RTC. On real HW it needs to be present, have a
battery, must not drift too much and the local/UTC setting must be
correct. Similar requirements apply to virtual machines. It needs to
work with suspend/resume and migrations.
The main difference I see between servers and desktops/laptops is that
servers may actually need to have a good clock or fail reliably.
Whereas desktops/laptops are fine as long as the clock is reasonable
(I'd argue monotonic is important but maybe not even that).
Post by Miroslav Lichvar
Even if we could assume that everything always works as expected,
someone or something would need to set the RTC for the first time.
When you bought a new server/desktop/laptop, did it always come with
the clock correctly set?
Being required to do an adjustment manually doesn't seem like a
problem but thinking about it I've probably relied on Ubuntu running
ntpdate in the past. These days, with interactive installers that
connect to the network having the installer ask if it should step the
clock would probably be ideal.

Pedro
Miroslav Lichvar
2018-07-02 15:20:00 UTC
Post by Pedro Côrte-Real
I think the real discussion then is the difference between adjustments
at boot time and adjustments at chrony startup time. Chrony startup
time may be early boot when stepping is not a problem but it may also
be after installing it on a running server when things may be
confused. If there was a way to say "only step when starting up at
early boot time when nothing important is running" then maybe that
would be ok.
The -R option tells chronyd to ignore the makestep directive. A script
could add it to the chronyd command line when uptime is more than a few
minutes for instance. However, I suspect this would be confusing. I
think users have learned to restart the NTP service when the clock is
wrong, e.g. when resume in a VM didn't work as expected, or the clock
was incorrectly set by a wrong date command. If they didn't know the
-R option was added automatically, they would have to reboot the
machine.
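
A rough sketch of the kind of wrapper script described above, for illustration only. The five-minute window, the function names and the use of /proc/uptime (Linux only) are assumptions; the only chronyd behaviour relied on is that -R makes it ignore the makestep directive.

```python
from pathlib import Path

# Hypothetical cut-off: past five minutes of uptime we no longer call it "early boot".
EARLY_BOOT_WINDOW_S = 300.0


def read_uptime_seconds(path="/proc/uptime"):
    """Return system uptime in seconds (Linux: first field of /proc/uptime)."""
    return float(Path(path).read_text().split()[0])


def chronyd_command(uptime_s, window_s=EARLY_BOOT_WINDOW_S):
    """Build the chronyd argument list, adding -R once the early-boot window has passed."""
    args = ["chronyd"]
    if uptime_s > window_s:
        # -R tells chronyd to ignore makestep, so the clock is only slewed.
        args.append("-R")
    return args
```

A service start script could call chronyd_command(read_uptime_seconds()) and execute the result. As noted above, a manual restart of the service would then no longer step the clock, which is the confusing part.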
--
Miroslav Lichvar
Bill Unruh
2018-07-02 17:10:02 UTC
[Message body hidden in the archive.]
Pedro Côrte-Real
2018-07-02 19:06:07 UTC
Post by Bill Unruh
But the clock is not good and for some reason you switched on chronyd when it
was not good. Is it more important to slowly slew the clock or to get it on
time as quickly as possible?
Personally I'd prefer a third option of just failing loudly and make
me have to fix the system manually to give me a chance of
understanding the root cause.
Post by Bill Unruh
No, there is so so much junk pasted to the screen during installing that that
request would get lost in the noise and the clock would stay way out.
I was thinking of one question that defaults to yes and has to be gone
through doing the install. But just doing it by default on install (or
maybe even first boot) with no questions would also be fine.

But I realize this is mostly bikeshedding. I could use the current
default without much issue and it's easy enough to set something else
with puppet anyway.

Thanks for everyone's input.

Pedro
Bill Unruh
2018-07-02 19:54:19 UTC
Post by Pedro Côrte-Real
Post by Bill Unruh
But the clock is not good and for some reason you switched on chronyd when it
was not good. Is it more important to slowly slew the clock or to get it on
time as quickly as possible?
Personally I'd prefer a third option of just failing loudly and make
me have to fix the system manually to give me a chance of
understanding the root cause.
What root cause? You start chronyd, either because you switched on the
computer, or because you did not have it running for a year. If you had it
running shortly before you again switched on chronyd, it would have
freewheeled the time and been reasonably on time. (One second typically would
have taken it many days to accumulate. If the system clock was out by 50 PPM,
for example because the temperature inside was very different between when
chronyd determined the rate and the time when chronyd was not running,
it would have taken more than 5 hours to accumulate a second.)
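
The drift arithmetic here can be checked with a few lines of Python. This is only a worked example of the 50 PPM case from the paragraph above; the function name is made up for illustration.

```python
def seconds_to_accumulate(offset_s, rate_error_ppm):
    """Time for a clock whose rate is off by rate_error_ppm to drift by offset_s."""
    if rate_error_ppm <= 0:
        raise ValueError("rate_error_ppm must be positive")
    # 1 PPM is 1 microsecond of error per second of elapsed time.
    return offset_s * 1_000_000 / rate_error_ppm


drift_s = seconds_to_accumulate(1.0, 50.0)
print(f"{drift_s:.0f} s, about {drift_s / 3600:.1f} hours")
```

For a 50 PPM rate error this works out to 20000 seconds, roughly five and a half hours, before a full second of offset builds up.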

So, it is really unclear to me what corner case you are worried about.
Post by Pedro Côrte-Real
Post by Bill Unruh
No, there is so so much junk pasted to the screen during installing that that
request would get lost in the noise and the clock would stay way out.
I was thinking of one question that defaults to yes and has to be gone
through doing the install. But just doing it by default on install (or
And if it never gets answered, what is chronyd to do?
Post by Pedro Côrte-Real
maybe even first boot) with no questions would also be fine.
But I realize this is mostly bikeshedding. I could use the current
default without much issue and it's easy enough to set something else
with puppet anyway.
Thanks for everyone's input.
Pedro
Pedro Côrte-Real
2018-07-03 11:11:03 UTC
Post by Bill Unruh
What root cause? You start chronyd, either because you switched on the
computer, or because you did not have it running for a year. If you had it
running shortly before you again switched on chronyd, it would have
freewheeled the time and been reasonably on time. (One second typically would
have taken it many days to accumulate. If the system clock was out by 50 PPM,
for example because the temperature inside was very different between when
chronyd determined the rate and the time when chronyd was not running,
it would have taken more than 5 hours to accumulate a second.)
So, it is really unclear to me what corner case you are worried about.
I'm thinking of things like a server rebooted and has a broken
hardware clock that stepped somewhere crazy (HW needs fixing). Or a VM
suddenly has a broken clock because of a host bug or config (host
needs fixing). Anything that's actually not a normal scenario like the
ones you've described. But I worry too much, stepping by default is
probably fine.
Post by Bill Unruh
And if it never gets answered, what is chronyd to do?
I was suggesting one of the questions in the installer, you can't not
answer it if you want to move on with the install. There used to be
(maybe still are) questions about having the hardware clock in local
time or UTC because of Windows being broken. Having the installer
doing it silently in the background might be fine as well.

Cheers,

Pedro
Bill Unruh
2018-07-02 17:00:24 UTC
makestep is a directive that takes effect when chronyd is started. At startup
it is probable that the clock, whether on a server, a laptop or whatever, is out.
So, instead of trying to fix a huge offset at the startup of chronyd by
slewing the clock, it says to step the clock if it is out by over 1 sec. on
the first three updates.

The skew is the uncertainty in the rate of the clock compared with the NTP
source. If the uncertainty is larger than 100 PPM (not the rate but the
uncertainty in the rate), do not use that source to determine the rate.
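
As a rough illustration of the two directives as described in this thread, here is a minimal Python sketch. It is not chrony's implementation; the function names and the 1-based update counter are assumptions made for the example.

```python
def should_step(offset_s, update_number, threshold_s=1.0, limit=3):
    """Sketch of `makestep threshold limit` (defaults mirror `makestep 1 3`).

    update_number counts clock updates since chronyd started, beginning at 1.
    The clock is stepped only when the offset exceeds the threshold AND the
    update is among the first `limit` updates; otherwise it is slewed.
    """
    return abs(offset_s) > threshold_s and update_number <= limit


def accept_update(skew_ppm, max_update_skew_ppm=100.0):
    """Sketch of `maxupdateskew`: the limit applies to the estimated error
    (uncertainty) of the frequency, not to the time offset."""
    return skew_ppm <= max_update_skew_ppm
```

With these defaults, should_step(5.0, 1) is true, should_step(0.5, 1) is false, and should_step(5.0, 4) is false because the fourth update falls outside the first three.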


William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ ***@physics.ubc.ca
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
Post by Pedro Côrte-Real
Hi,
I'm a user of Ubuntu both on servers and desktops/laptops. With Ubuntu
18.04 switching to chrony by default I decided to use that on all
machines. When setting up the roll-out to all machines using puppet I
checked the default config to see if I needed to make any changes. Two
maxupdateskew 100.0
makestep 1 3
If I'm reading the documentation correctly this means "step the clock
in the first three corrections if the step is above one second but
below 100"
No idea where you got "below 100". If your clock is out by, say, 100 years, it
is also stepped.
Post by Pedro Côrte-Real
https://chrony.tuxfamily.org/faq.html#_what_is_the_minimum_recommended_configuration_for_an_ntp_client
This seems very strange to me as a default for an NTP tool. I have two
1) For a server never step the clock and if the drift is large
complain loudly because something has gone very wrong. Servers are
always on and should be always syncing so if their clock drifts a lot
something has gone wrong.
The 3 means "in the first three clock measurements". If the server is always
on, then it will not be "in the first 3 measurements" situation so this is
irrelevant. If you just set up your server and just switched it on, that
server's clock could well be out by a huge amount, and you do not want to wait
while chrony tries to slew away 100 years of offset by changing the clock rate
by the max of 10%. It would take 1000 years to do so.
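
The slewing estimate can be reproduced with a short calculation. The 10% maximum rate change is the figure used in this message and is treated here as an assumption; the actual limit depends on the chrony configuration and platform. The helper name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def slew_duration_s(offset_s, max_rate_fraction):
    """Elapsed time needed to remove offset_s by slewing alone.

    max_rate_fraction is the largest fractional change of the clock rate,
    e.g. 0.10 for the 10% figure assumed in the message above.
    """
    if max_rate_fraction <= 0:
        raise ValueError("max_rate_fraction must be positive")
    return abs(offset_s) / max_rate_fraction


hundred_years_s = 100 * SECONDS_PER_YEAR
years_needed = slew_duration_s(hundred_years_s, 0.10) / SECONDS_PER_YEAR
print(f"about {years_needed:.0f} years to slew away 100 years of offset")
```

By contrast, an offset of one second at the same rate limit would be gone in about ten seconds, which is why slewing is fine for small errors and hopeless for huge ones.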
Post by Pedro Côrte-Real
2) In a desktop/laptop stepping the clock is probably always ok if
going forward but may be bad if going back. So just accept frequency
adjustments both going forward and backwards. Machines are turned
on/off, suspend/resume and so it's less important to complain loudly.
Instead maintain monotonic clocks that are synchronized quickly even
if their frequency needs to shift a lot.
Given all this why were these defaults chosen? Are there recommended
settings to approximate 1) and 2)? Is the recommendation to do
something else?
Both are reasonable defaults. Most would have something like
makestep .1 3
which would step the clock if out by .1 sec in the first three measurements.
Post by Pedro Côrte-Real
Thanks,
Pedro