Discussion:
[chrony-users] understanding drift vs. clock frequency
Olaf Hering
2018-12-03 14:36:36 UTC
I'm trying to understand the correlation between the estimated clock frequency and the estimated drift. During boot the TSC frequency is estimated by the kernel, and during runtime some drift is calculated based on the time returned by remote servers. In my case it may look like this:

***@bax:~ # dmesg | grep -Ei '(hz|msr|tsc|tol|clocksource)'
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.004000] tsc: Detected 2194.858 MHz processor
[ 0.221809] TSC deadline timer enabled
[ 0.221811] smpboot: CPU0: Intel(R) Xeon(R) CPU D-1531 @ 2.20GHz (family: 0x6, model: 0x56, stepping: 0x3)
[ 0.460963] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.929214] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[ 0.930094] clocksource: Switched to clocksource hpet
[ 0.945712] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[ 3.036012] tsc: Refined TSC clocksource calibration: 2194.917 MHz
[ 3.036027] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1fa37107ca2, max_idle_ns: 440795258165 ns
[ 4.139385] clocksource: Switched to clocksource tsc
***@bax:~ # l /var/lib/chrony/drift && cat $_
-rw-r--r-- 1 chrony chrony 42 Dec 3 13:07 /var/lib/chrony/drift
-10.241223 0.186852
***@bax:~ # w
13:57:18 up 6 days, 4:44, 2 users, load average: 0.00, 0.00, 0.00

I think the kernel assumes "2194917000" ticks represent a second. What does the drift mean in this case? This part is not explained well enough. What really happens during that amount of ticks?

Does it mean that after that amount of ticks the time advanced by (1*1000*1000*1000)+10241 = 1000010241 ns? So the system time needs to be slowed down by the drift value? In this case the real TSC frequency would be lower, like ((2194917*1000)*(1*1000*1000*1000))/((1*1000*1000*1000)+10241) = 2194894522 Hz = 2194.894 MHz?

Or did the time advance after that amount of ticks by just (1*1000*1000*1000)-10241 = 999989759 ns? So the system time needs to be accelerated by the drift value? In this case the real TSC frequency would be higher, like ((2194917*1000)*(1*1000*1000*1000))/((1*1000*1000*1000)-10241) = 2194939478 Hz = 2194.939 MHz?

Or is perhaps some of my math backwards?
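
For reference, the two candidate calculations in this post can be reproduced with a short script. This is only a sketch of the arithmetic as written: it takes the drift magnitude as exactly 10.241 ppm (10241 ns per second), and the function and constant names are made up for illustration. It does not decide which of the two readings is correct.

```python
# Reproduce the two candidate frequencies from the post with exact arithmetic.
# Assumption: drift magnitude is taken as exactly 10.241 ppm, i.e. 10241 ns/s.
from fractions import Fraction

CALIBRATED_HZ = 2_194_917_000  # "Refined TSC clocksource calibration: 2194.917 MHz"
DRIFT_NS_PER_S = 10_241        # 10.241 ppm of one second, in nanoseconds
NS_PER_S = 1_000_000_000


def implied_frequency_hz(elapsed_ns: int) -> int:
    """Frequency implied if CALIBRATED_HZ ticks really took elapsed_ns nanoseconds."""
    return round(Fraction(CALIBRATED_HZ * NS_PER_S, elapsed_ns))


# First reading: the ticks took slightly longer than one second.
lower = implied_frequency_hz(NS_PER_S + DRIFT_NS_PER_S)
# Second reading: the ticks took slightly less than one second.
higher = implied_frequency_hz(NS_PER_S - DRIFT_NS_PER_S)

print(f"lower candidate:  {lower} Hz")
print(f"higher candidate: {higher} Hz")
```

Both results agree with the figures quoted in the post, so the arithmetic itself is consistent; the open question is only which sign applies.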

Thanks,
Olaf
Miroslav Lichvar
2018-12-03 15:44:50 UTC
Post by Olaf Hering
I think the kernel assumes "2194917000" ticks represent a second. What does the drift mean in this case? This part is not explained well enough. What really happens during that amount of ticks?
2194917000 is the number of CPU cycles in one second as measured by
the PIT timer.
Post by Olaf Hering
Does it mean that after that amount of ticks the time advanced by (1*1000*1000*1000)+10241 = 1000010241 ns? So the system time needs to be slowed down by the drift value? In this case the real TSC frequency would be lower, like ((2194917*1000)*(1*1000*1000*1000))/((1*1000*1000*1000)+10241) = 2194894522 Hz = 2194.894 MHz?
Or did the time advance after that amount of ticks by just (1*1000*1000*1000)-10241 = 999989759 ns? So the system time needs to be accelerated by the drift value? In this case the real TSC frequency would be higher, like ((2194917*1000)*(1*1000*1000*1000))/((1*1000*1000*1000)-10241) = 2194939478 Hz = 2194.939 MHz?
A negative value in chrony driftfile indicates the clock is slow and
needs to run faster in order to match the real time.

So it's the latter, but it's slightly more complicated.
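
To make the sign convention concrete, here is a minimal sketch. It assumes the first number in the drift file is the rate error of the uncorrected system clock in ppm, with gains positive, so a negative value means the clock falls behind. The function name is hypothetical and this is not code from chrony.

```python
def clock_advance_seconds(real_seconds: float, drift_ppm: float) -> float:
    """Seconds an uncorrected clock advances during real_seconds of true time.

    Assumption: drift_ppm is the clock's rate error in parts per million,
    positive when the clock gains time and negative when it loses time.
    """
    return real_seconds * (1.0 + drift_ppm * 1e-6)


DRIFT_PPM = -10.241223  # first value in /var/lib/chrony/drift from the post

one_day = 86_400.0
# How far the uncorrected clock would fall behind true time in one day.
lost = one_day - clock_advance_seconds(one_day, DRIFT_PPM)
print(f"uncorrected clock falls behind by about {lost:.3f} s per day")
```

Under that assumption the uncorrected clock would lose a bit under one second per day, which is the error chronyd compensates for by running the clock faster.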

The measured frequency of the CPU is not accurate, because the PIT is
not accurate. The nominal frequency is probably different. With some
newer CPUs the kernel can determine the nominal frequency directly and
there is no need for calibration.

Then there is a hidden frequency offset due to a limitation in the
conversion between TSC and real time in the kernel, which is up to
about 0.2 ppm.

So, in this case, if the PIT and TSC were using the same oscillator, I
think we would only know that the PIT is running about 10.0-10.4 ppm
slower. It doesn't say much about the TSC.
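
One way to arrive at a range like this is to treat the second number in the drift file as an error bound around the first. The sketch below does that for the line shown in the original post. Reading the second field as a symmetric bound is an assumption based on how the drift file format is usually described, the helper name is hypothetical, and the range quoted above may have been derived differently.

```python
def parse_drift_line(line: str) -> tuple:
    """Split a drift file line into (frequency_ppm, error_bound_ppm).

    Assumption: the line holds two whitespace-separated numbers, the
    estimated frequency error and an estimate of its error bound, in ppm.
    """
    freq_ppm, skew_ppm = (float(field) for field in line.split())
    return freq_ppm, skew_ppm


freq_ppm, skew_ppm = parse_drift_line("-10.241223 0.186852")

# A negative frequency means the clock is slow; report the magnitude range.
slow_min = -freq_ppm - skew_ppm
slow_max = -freq_ppm + skew_ppm
print(f"clock is slow by roughly {slow_min:.2f} to {slow_max:.2f} ppm")
```

With the values from the post this gives a range of roughly 10.05 to 10.43 ppm, which is in line with the approximate figures above.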
--
Miroslav Lichvar
--
To unsubscribe email chrony-users-***@chrony.tuxfamily.org
with "unsubscribe" in the subject.
For help email chrony-users-***@chrony.tuxfamily.org
with "help" in the subject.
Trouble? Email ***@chrony.tuxfamily.org.
Miroslav Lichvar
2018-12-03 16:04:39 UTC
Post by Miroslav Lichvar
Then there is a hidden frequency offset due to a limitation in the
conversion between TSC and real time in the kernel, which is up to
about 0.2 ppm.
I think this error is actually specific to the MONOTONIC_RAW clock,
which is rarely used by applications. IIRC it's not in the
REALTIME/MONOTONIC clock that chronyd is synchronizing.
--
Miroslav Lichvar
Bill Unruh
2018-12-03 16:33:22 UTC
drift = clock freq - 1 sec/sec
In chrony,
drift = clock freq - remote freq
(though I might have the sign of that wrong).
The assumption is that the remote freq has been brought into sync with UTC, which by
definition has a freq of 1 sec/sec. All measurements have uncertainties
associated with them.

I have no idea how Linux originally sets the clock freq. It does not too bad
a job of it. It may use the RTC to do so. But the whole purpose of chrony is to do a great job of it.




William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ ***@physics.ubc.ca
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
Post by Olaf Hering
I think the kernel assumes "2194917000" ticks represent a second. What does the drift mean in this case? This part is not explained well enough. What really happens during that amount of ticks?
It is the difference between the rate of the clock and 1 sec/sec.