Discussion:
[chrony-users] Chrony forgets servers (specified by FQDN) when no DNS server
Stephen Satchell
2017-12-20 12:25:14 UTC
TL;DR: chronyd doesn't like server specifications with FQDN when there
is no DNS resolver available -- and neither does rival ntpd. We'll see
if NXDOMAIN is just as bad.

(I did check to see if this had already been reported -- didn't find any
reference to issues like I experienced. Went back two years.)

My story:

I live in a hilly area that experiences power dropouts of about 1.5
seconds (more than a sag) during high winds. Tonight, I had four such
events. My edge router rebooted each time.

My edge router (CentOS 7 on a four-port box) is connected via an ARRIS
BGW210-700 broadband gateway to AT&T UVerse fiber (100/20). As I am
building this new edge router, I hadn't gotten A Round Tuit to set up
caching DNS yet. Translation: using 8.8.8.8 and 8.8.4.4 only. I have a
Time Machines GPS-based source indicated by IP address (10.1.1.15); the
rest of the sources are downstream, called out with FQDNs.

After one of the power cycles, I checked matters with chronyc(1) and
found the only active NTP server in the "sources" list was my local GPS
NTP box. Everything else was missing. When I restarted chronyd,
everything was there as expected.

Fortunately, I'm qualifying the circuit and new edge router, so nothing
is live on it.

It would appear that not having DNS service available is fatal to
bringing up a server. So I have installed the caching DNS server so
chrony will get *something* as a response; we'll see on the next power
fail if things look better, or if NXDOMAIN results in the same.

By the way, I have another edge server running ntpd which *is* live, and
it behaves the same way...so both NTP daemons have the same, er, difficulty.

Now some of you will be saying "where's your UPS"? Another missing
Round Tuit -- the box is in the garage waiting to be opened and tested
before I tear my rack apart.

N.B.: the constant power cycling took out my LED desk lamp...

RFC: should I consider writing a script that will call chronyc to create
the servers again, say once a day?
1. Is this recommended?
2. Would this tend to eventually add all the servers
   in [0123].centos.pool.ntp.org?
3. Is there a better way to "wake up" servers rejected
   because of no resolver, or NXDOMAIN if that causes drops?
4. If not, may I make a feature request?
--
To unsubscribe email chrony-users-***@chrony.tuxfamily.org
with "unsubscribe" in the subject.
For help email chrony-users-***@chrony.tuxfamily.org
with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Bill Unruh
2017-12-20 17:01:52 UTC
No idea if it will work, but you could put the IP addresses in instead of names
for your servers -- that way no DNS is needed.
Stephen Satchell
2017-12-20 17:57:51 UTC
Post by Bill Unruh
No idea if it will work, but you could put the IP address in instead of names
for your servers-- that way no dns is needed.
I have longstanding arrangements with several NTP time sources. For
those sources, those that don't do load-sharing with DNS, I have
manually resolved the FQDN and put the IP address in the config file.

For [0-3].centos.pool.ntp.org I have left them defined by their FQDN --
so if chrony(8) "forgets" them, I still have my GPS NTP box plus four
external sources for time synchronization.
Bill Unruh
2017-12-20 19:29:25 UTC
William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ ***@physics.ubc.ca
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
If your GPS also provides PPS, then it will be the only source carrying the time
setting anyway, because it is so much more accurate than any of the external
sources. If it is only NMEA then it will be a backup, since the network
sources will be more accurate. It is not clear to me why you need 8 network
sources (four pool and four static IP sources).
Stephen Satchell
2017-12-20 18:32:23 UTC
Post by Bill Unruh
No idea if it will work, but you could put the IP address in instead of names
for your servers-- that way no dns is needed.
http://support.ntp.org/bin/view/Servers/StratumTwoTimeServers
In that list of servers, there is a field for each NTP time provider,
"Use DNS". Many of the public servers in this table require that DNS be
used with their service. Here is one entry from the table:

ServerStratum   StratumTwo
CountryCode     US NV
Hostname        time2.nv.skyfiberinternet.com
IP Address      162.210.111.4
IPv6 Address
UseDNS          Yes
PoolMember      No
ServerLocation  Reno, Nevada, USA

A spot check shows that the owners of the servers want you to use DNS.
Their server, their rules.

That means the Rules of Engagement make how chronyd(8) currently
handles NXDOMAIN (or no resolver) a bug.

QED
Bill Unruh
2017-12-20 19:35:14 UTC
Post by Stephen Satchell
http://support.ntp.org/bin/view/Servers/StratumTwoTimeServers
In that list of servers, there is a field for each NTP time provider, "Use
DNS". Many of the public servers in this table require that DNS be used with
their service.
There is absolutely no way they would know if you are using dns or not. All
communication with the server is via IP, and whether that is static IP or dns
is unknown. The primary reason they say that is that they reserve the right to
change the IP address of their server and if you are using a static address,
there is no way you will know, except that it will not work anymore.
Post by Stephen Satchell
A spot check shows that the owners of the servers want you to use DNS. Their
server, their rules.
It is not a rule, and is not something they could know whether you are abiding
by it or not.

Remember that if they change the IP of the server, it can take up to 3 days
for that change to propagate through the set of DNS servers, so again they
cannot enforce their "rule" (which is not a rule anyway)
Rob Janssen
2017-12-20 19:51:25 UTC
Post by Bill Unruh
There is absolutely no way they would know if you are using dns or not. All
communication with the server is via IP, and whether that is static IP or dns
is unknown. The primary reason they say that is that they reserve the right to
change the IP address of their server and if you are using a static address,
there is no way you will know, except that it will not work anymore.
It is in the "terms of use". A reason to do this could be that they want to offer this service now, but want to have
some way to terminate it in the (far) future. When they no longer want to offer NTP service, they can remove
the name from DNS and the usage of the service should disappear over time. Only those that did not abide by this rule
will remain. And with most NTP services, they will remain anyway for as long as the service is running, which of
course could be a couple of years for a few users. But most of them should be gone in a couple of months.
Post by Bill Unruh
It is not a rule, and is not something they could know whether you are abiding
by it or not.
Remember that if they change the IP of the server, it can take up to 3 days
for that change to propagate through the set of DNS servers, so again they
cannot enforce their "rule" (which is not a rule anyway)
Do you think that rules are only valid when they can be actively enforced?
I think not. I think rules set by the provider of the service are there for the users to abide by, and when they
don't like the rule their option is not to use the service.

A time server that uses DNS-based rules for reference servers should fail gracefully when the DNS does not return
an IP address (anymore). So, when it does a lookup only once, it should issue an error message about that server
and proceed with its startup as if that server was never there in the configuration. When it is resolving DNS names on
a regular basis (e.g. once per day), it could keep the server configuration, keep retrying the DNS lookup at
that same interval, and start using the server when the DNS lookup succeeds.
Not starting the service at all is only an option when all the DNS lookups have failed (i.e. there is no server) and
there is no mechanism to retry the lookups. When there is, it is much better to keep the service running.
(After all, a network may not be available at boot time and may become available later.)
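
A minimal Python sketch of the behaviour described above, assuming a hypothetical table that maps each configured name to an address, or to None while the lookup is still pending. The function names (resolve_once, start_sources, retry_pending) are illustrative only and are not taken from chrony or ntpd; the resolver is passed in as a parameter so the logic can be exercised without a network.

```python
import socket
from typing import Callable, Dict, List, Optional

Resolver = Callable[[str], Optional[str]]


def resolve_once(name: str) -> Optional[str]:
    """Return one address for name, or None if the lookup fails."""
    try:
        infos = socket.getaddrinfo(name, 123, type=socket.SOCK_DGRAM)
    except socket.gaierror:
        return None
    return infos[0][4][0] if infos else None


def start_sources(names: List[str],
                  resolver: Resolver = resolve_once) -> Dict[str, Optional[str]]:
    """Resolve every configured name; keep failures instead of dropping them."""
    table: Dict[str, Optional[str]] = {}
    for name in names:
        addr = resolver(name)
        if addr is None:
            print(f"warning: cannot resolve {name}, will retry later")
        table[name] = addr  # None marks an entry that is still pending
    return table


def retry_pending(table: Dict[str, Optional[str]],
                  resolver: Resolver = resolve_once) -> int:
    """Retry only the pending entries; return how many became usable."""
    fixed = 0
    for name, addr in list(table.items()):
        if addr is None:
            new_addr = resolver(name)
            if new_addr is not None:
                table[name] = new_addr
                fixed += 1
    return fixed
```

The point of the sketch is only that a failed lookup leaves the entry in the table, so a later call to retry_pending can bring it into service once DNS is back.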

Rob
Bill Unruh
2017-12-20 20:20:28 UTC
There are two questions here:
a) chrony drops a source if the DNS does not deliver a valid IP for that
server.
b) What should the underlying philosophical stance of a program like
chrony be toward the use of DNS?

I was answering the first, not the second. You had a problem: chronyd was
dropping servers if DNS failed. My suggestion was how to get around that. You
have now expanded this discussion into the second question.

I would mildly dispute your contention that when the server administrator says
you should use DNS, that is a contract term which the user has a moral
imperative not to violate. I suspect that most operators of servers would be
surprised at that interpretation, and would rather say, "Look, I could at any
time change the IP, and then you as user would be stuck". After all, they also,
as in your example, publish the IP address of the server. Surely that line
is as important in one's interpretation as the UseDNS line.

Would it be nice if chronyd could recover on its own from a loss of DNS access?
Sure. Is it a bug if it does not? I would say maybe, but not one for which
there is a great urgency for it to be fixed. And also this is a separate
(though related) issue from your original question, "How can I solve my problem
of loss of servers when my router goes down?"

You have apparently implemented my suggestion as to how to solve that
question, despite your moral qualms about that implementation.
Stephen Satchell
2017-12-20 20:40:09 UTC
Post by Bill Unruh
You have apparently implemented my suggestion as to how to solve that
question, despite your moral qualms about that implementation.
Actually, I exchanged mail with the server administrators before moving
to IP-based specification for those server entries. That option is in
the "rules of engagement."

The pool servers? If there, great. If not, no loss.

As to why I have so many external servers when I use a GPS-based NTP
appliance: false ticker detection. I don't trust consumer-grade
equipment to hold up to all possibilities -- it could die or go crazy.

(Note to others: I'm not using a 1 PPS signal from the GPS appliance.
Yet. Not in an edge router.)
Bill Unruh
2017-12-20 22:08:39 UTC
In that case your GPS will give you about 1-10ms of accuracy. The net servers
should be in the 10-100usec range, so they should dominate and chrony should
typically latch onto them.
Certainly having a backup to a gps is a very good idea. But 8 of them seems a
tad excessive. Nothing particularly wrong with it.
Bryan Christianson
2017-12-20 20:42:00 UTC
Hi Rob, Bill
Post by Bill Unruh
a) chrony drops a source if the DNS does not deliver a valid IP for that
server.
b) What should the underlying philosophical stance of a program like
chrony be toward the use of DNS?
I was answering the first, not the second. You had a problem, chronyd was
dropping servers if dns failed. My suggestion was how to get around that. You
have now expanded this discussion into the second question.
I tend to agree with Rob on this one. I remember a case a few years back where some modem/router vendor had an NTP IP address hardcoded into the device firmware. The owner of the IP ceased providing the NTP service on that address, but for years they were still taking a hit from the millions of devices that had been sold. Even though there may have been no NTP server at that address, the packets still had to hit the routers before they were dropped/rejected. If the device vendor had used DNS then there wouldn't have been a problem for the operator of the server.

Bryan Christianson
***@whatroute.net
Bill Unruh
2017-12-20 22:15:23 UTC
Certainly having a commercial vendor hard-coding the address is really, really
horrible, unless it is an address of one of the vendor's own machines. In that case
the millions of machines the vendor shipped completely swamped the server
(especially since the software used on the vendor's system was incompetently
designed to constantly decrease the time between packets if packet loss
occurred).
But that is not what is being discussed here. Here one single person is trying to make his
device more resistant to losing all of his servers. I have never had much
sympathy with the "What if everyone did it?" argument for behaviour or
morality.

And I do agree that it would be best if his system were robust against loss of
DNS. But as a way around that lack of robustness, "hardcoding" into
/etc/chrony.conf seems a perfectly sensible and acceptable procedure.
Stephen Satchell
2017-12-20 21:34:22 UTC
Post by Rob Janssen
A time server that uses DNS based rules for reference servers should
fail gracefully when the DNS does not return
an IP address (anymore).  So, when it does a lookup only once it should
issue an error message about that server,
and proceed its startup as if that server was never there in the
configuration.  When it is resolving DNS names on
a regular basis (e.g. once per day), it could keep the server
configuration and keep retrying the DNS lookup at
that same interval and start using the server when the DNS lookup succeeds.
Not starting the service at all is only an option when all the DNS
lookups have failed (i.e. there is no server) and
there is no mechanism to re-try the lookups.  When there is, it is much
better to keep the service running.
(after all, a network may not be available at boot time and may become available later)
I find this statement of behavior (treat NOSERV/NXDOMAIN as an excuse to
forget a server/peer/pool) a bit astonishing, and very un-Unix-like.

Let's make some assumptions:
1. The daemon software has, in its data structures for
   server/peer/pool, the FQDN for each server and peer.
2. The daemon software, on NXDOMAIN or no answer, sets the IP address
   to zeros (0x00000000 for IPv4, and
   00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 for IPv6).
3. All information about the server/peer/pool entry is in the data
   structure, such as filter data.
4. The polling loop is able to fork a process to perform DNS lookups.
   (This may not necessarily be true with Windows.)

So the standard polling loop uses the poll timing specified in the
server/peer/pool command for all servers, peers, and pools, initialized
or not. If the poll interval has expired for a given server/peer/pool
entry, it does this:
a. IP address zero: reset poll interval to minpoll, and fork a
   process to do DNS lookup -- the forked process will perform the DNS
   lookup, and on success will fill in the IP address and set the
   first-time flag so the polling loop will pick it up in the next cycle
b. IP address non-zero and first-time flag set: do what the server
   currently does with a new server or peer entry
c. IP address non-zero and first-time flag not set: do what it does
   now.

Forking a process means that the daemon's polling loop doesn't lock up
the daemon on the DNS lookup when there is no DNS available, or when it takes
a double-handful of seconds to get NOSERV or NXDOMAIN. (If a process is
already forked for an entry, then don't fork it again; wait for the
forked process to die.) If/when the forked process gets a successful A
or AAAA record, it sets it in the data structure for the entry so that
the polling loop will pick it up on the next poll interval expiration.
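
The polling steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not chronyd's actual code: a worker thread from concurrent.futures stands in for the forked process, None stands in for the all-zeros address, the default minpoll of 6 is only a placeholder, and the names Source and poll_expired are hypothetical. The resolver is assumed to return None when the lookup fails.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional

UNRESOLVED = None  # stands in for the "all zeros" address in the proposal


@dataclass
class Source:
    name: str
    minpoll: int = 6
    poll: int = 6
    address: Optional[str] = UNRESOLVED
    first_time: bool = False
    lookup: Optional[Future] = None  # at most one outstanding lookup per entry


def poll_expired(src: Source,
                 resolver: Callable[[str], Optional[str]],
                 pool: ThreadPoolExecutor) -> str:
    """Handle one expired poll interval and report which branch was taken."""
    if src.lookup is not None:
        if not src.lookup.done():
            return "lookup-pending"        # never start a second lookup
        result = src.lookup.result()
        src.lookup = None
        if result is not None:
            src.address = result
            src.first_time = True
    if src.address is UNRESOLVED:          # step a
        src.poll = src.minpoll
        src.lookup = pool.submit(resolver, src.name)
        return "lookup-started"
    if src.first_time:                     # step b
        src.first_time = False
        return "initialise"
    return "poll"                          # step c
```

Because the lookup runs outside the polling loop, a slow or absent resolver only delays that one entry; the loop itself never blocks.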

Also note that it eliminates special start-up code. The config file
parser fills in the data structure for each server/peer with zero IP
address, and the polling loop handles the lookup and initialization.
This also works with chronyc(1): it causes chronyd(8) to build the new
data structure, and the polling loop does the rest. When you use
chronyc(1) to remove a server or peer, chronyd(8) just removes the data
structure for that entry. Poof.

And that's how I would remove chrony's current astonishing behavior in
the face of DNS not being there at start-up. Like in my power-fail
situation, where the edge router with chronyd(8) comes up before the
CSU/DSU to the network. Enterprise users might be surprised to learn
about this astonishing forgetfulness of chronyd(8) in the face of a
temporary failure.

How to handle entries where the NTP server has gone away?

Keep a TTL timer, set by an entry in the configuration file.
(A reasonable default would be 24 hours.) When "reach" is not 0x00, reset
the TTL timer. When the TTL timer expires, clear the filter variables,
set the poll to minpoll, zero the IP address, and reset the TTL timer.
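
As a rough Python sketch of that TTL rule, assuming a hypothetical Entry record and a tick function called once per housekeeping pass (neither exists in chrony; the field names and the default poll values are placeholders):

```python
from dataclasses import dataclass, field
from typing import List, Optional

DEFAULT_TTL = 24 * 60 * 60  # seconds; the 24-hour default suggested above


@dataclass
class Entry:
    name: str
    address: Optional[str]
    minpoll: int = 6
    poll: int = 10
    reach: int = 0                      # reachability register, 0x00 = nothing heard
    samples: List[float] = field(default_factory=list)  # stand-in for filter data
    ttl: float = DEFAULT_TTL
    ttl_deadline: float = 0.0


def tick(entry: Entry, now: float) -> bool:
    """Apply the TTL rule at time `now`; return True if the entry was reset."""
    if entry.reach != 0x00:
        entry.ttl_deadline = now + entry.ttl   # server answered: restart the timer
        return False
    if now < entry.ttl_deadline:
        return False                           # silent, but not for long enough yet
    entry.samples.clear()                      # clear the filter variables
    entry.poll = entry.minpoll                 # set the poll to minpoll
    entry.address = None                       # "zero" the address to force a new lookup
    entry.ttl_deadline = now + entry.ttl       # reset the TTL timer
    return True
```

Passing the current time in as an argument keeps the rule easy to check without waiting a day.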

The rationale for this method of handling extended tempfail is the same
rationale used for SMTP daemons: wait somewhat impatiently for the
remote server to come back, and if it doesn't come back in a reasonable
time then bounce the mail.

From the standpoint of NTP protocol, a server that is out of service
for an extended time may have different properties when it comes back
on-line. (Replaced, for example.) So the filter variables would
contain bogus data, particularly in a pool situation where you were
originally talking to a "close" server, and now switched to a "far" server.

(And, it eliminates the need for a separate "pool" command, which would
help some distribution sources (<cough> Red Hat) who use "server" when
they mean "pool" in their default configurations.)

If this should be moved to chrony-dev, I can do that.
Rob Janssen
2017-12-20 22:19:52 UTC
Post by Stephen Satchell
I find this statement of behavior (treat NOSERV/NXDOMAIN as an excuse to forget a server/peer/pool) a bit astonishing, and very un-Unix-like.
Read it again please.  When there IS a retry mechanism, it should not drop the server.  When there isn't, it should not
completely exit but only drop the server that cannot be looked up.

Rob
Miroslav Lichvar
2018-01-02 10:55:36 UTC
Post by Stephen Satchell
TL;DR: chronyd doesn't like server specifications with FQDN when there is no
DNS resolver available -- and neither does rival ntpd. We'll see if
NXDOMAIN is just as bad.
After one of the power cycles, I checked matters with chronyc(1) and found
the only active NTP server in the "sources" list was my local GPS NTP box.
Everything else was missing. When I restarted chronyd, everything was there
as expected.
chrony supports delayed name resolving. When a server hostname cannot
be resolved on start, it should keep trying again at exponentially
increasing intervals. Running "chronyc online" should force it to retry
right now.
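
To illustrate what retrying at exponentially increasing intervals looks like, here is a small Python sketch. The starting delay, the doubling factor, and the ceiling are assumptions chosen for the example and are not chrony's actual values; backoff_delays and resolve_with_backoff are hypothetical names.

```python
import itertools
from typing import Callable, Iterator, Optional


def backoff_delays(first: float = 2.0, factor: float = 2.0,
                   ceiling: float = 3600.0) -> Iterator[float]:
    """Yield retry delays that grow geometrically up to a ceiling."""
    delay = first
    while True:
        yield min(delay, ceiling)
        delay = min(delay * factor, ceiling)


def resolve_with_backoff(name: str,
                         resolver: Callable[[str], Optional[str]],
                         sleep: Callable[[float], None],
                         max_tries: int = 10) -> Optional[str]:
    """Call resolver(name) until it returns an address or the tries run out."""
    delays = backoff_delays()
    for _ in range(max_tries):
        addr = resolver(name)
        if addr is not None:
            return addr
        sleep(next(delays))  # wait longer after each failure
    return None


# The first few delays with the assumed defaults.
first_five = list(itertools.islice(backoff_delays(), 5))
```

In real use, sleep would be time.sleep and resolver a wrapper around the system resolver that returns None on failure; both are injected here so the schedule can be inspected directly.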

What chrony version do you use and on what system is it running? When
"chronyc sources" doesn't list the servers, what does "chronyc
activity" print?

If you recompile chrony with debugging enabled, running chronyd with
-d -d might give us an idea what's wrong.
--
Miroslav Lichvar
Stephen Satchell
2018-01-02 14:18:23 UTC
Post by Miroslav Lichvar
What chrony version do you use and on what system it is running? When
"chronyc sources" doesn't list the servers, what does "chronyc
activity" print?
CentOS 7.4, chrony 3.1

The edge router had crashed New Year's Day (which is a little
disturbing) which I detected when I tried to gather information to
answer your questions, so this was a great opportunity to recreate the
conditions and gather the facts you requested.

After rebooting with the uplink disconnected, the only source active was
my local GPS appliance. The entries in my configuration that
specify IP addresses were listed by "sources" as not yet active; the
rest of the entries were not there, as I reported before.

"activity" showed that four sources had unknown names.

When I plugged the uplink cable back in, after a few minutes I did
"sources" and everything was there as I would expect. So it appears
that chrony is doing its job.

Perhaps "sources" should add a line to its report when there are one or
more sources with unknown IP addresses and to use "activity" to find out
more information. Consider this a feature request of low priority.
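
Until something like that exists, a wrapper script could produce the hint itself. The Python sketch below assumes that "chronyc activity" prints a line of the form "N sources with unknown address"; that wording is an assumption and should be checked against the installed version. The function names are hypothetical.

```python
import re
import subprocess
from typing import Optional

# Assumed wording of the relevant line in "chronyc activity" output.
UNKNOWN_RE = re.compile(r"^(\d+) sources? with unknown address", re.MULTILINE)


def unknown_address_count(activity_text: str) -> Optional[int]:
    """Return the unknown-address count, or None if the line is absent."""
    match = UNKNOWN_RE.search(activity_text)
    return int(match.group(1)) if match else None


def sources_hint(activity_text: str) -> str:
    """Build the extra line that could be shown alongside "chronyc sources"."""
    count = unknown_address_count(activity_text)
    if not count:
        return ""
    return f"{count} source(s) have no resolved address yet; see 'chronyc activity'"


def run_chronyc_activity() -> str:
    """Run "chronyc activity" and return its output (needs chrony installed)."""
    return subprocess.run(["chronyc", "activity"], capture_output=True,
                          text=True, check=True).stdout
```

A script would call sources_hint(run_chronyc_activity()) and print the result after the normal "sources" listing when it is not empty.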