I use Azure Kubernetes Service (AKS) in several use cases. In one customer implementation, the AKS cluster was hosted inside a private network and the worker node clocks kept drifting out of sync. The NTP daemon uses UDP port 123 to reach the network time servers that sync the system clock, so I tried adjusting the firewall and restarting the cluster. The NTP daemons were running on the nodes, but after a few days the clocks drifted again; the time sync was clearly not happening properly.
The clock inside a container is the same as the clock on the host machine, because it is controlled by that machine's kernel. So if a worker node's time is out of sync, every application running on that node is affected as well.
I checked the time on various pods and cluster nodes, and the timestamps differed from node to node. The procedure to SSH into a worker node is explained in my previous post.
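For a quick comparison you can also read the clocks from inside the cluster, without SSH. This is just a sketch: the pod and node names are placeholders, and the node check assumes a kubectl version that supports node debugging.
# pod name is a placeholder; any pod scheduled on a node reports that node's clock (assumes the image has "date")
kubectl exec my-app-pod -- date -u
# node name is a placeholder; list yours with "kubectl get nodes"
kubectl debug node/aks-nodepool1-12345678-vmss000000 -it --image=busybox -- date -u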
After more research, I found a better approach: sync the time across all cluster nodes without depending on an external NTP server.
The solution is to use chrony instead of the NTP daemon. Chrony has more advanced features than the classic NTP daemon; in particular, it can sync the VM's clock directly with the underlying Azure host's clock, which is exposed to the guest as a PTP device (/dev/ptp0). With this approach we no longer have to worry about NTP traffic getting blocked at the network layer, so it is best suited for VMs running in private or isolated networks.
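Before switching over, you can confirm on a node that the host clock is actually exposed to the guest. The device index and the "hyperv" clock name below are what Azure documents for its Linux VMs; treat them as assumptions and adjust if your node shows something different.
# the Hyper-V host clock should appear as a PTP device
ls /dev/ptp*
# on Azure hosts this is expected to print "hyperv"
cat /sys/class/ptp/ptp0/clock_name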
My AKS nodes were running Ubuntu, so I followed the steps below to install chrony on the existing cluster.
Log in to each worker node and execute the following steps (the login procedure is explained in my previous post).
Step 1: Update the repositories and install the chrony package
apt-get update
apt-get install chrony -y
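A quick sanity check that the package installed and the service unit exists (on Ubuntu the systemd unit is named chrony):
# print the installed chrony version
chronyd -v
# confirm the service unit is present and running
systemctl status chrony --no-pager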
Step 2: Take a backup of the default configuration file.
cp -pr /etc/chrony/chrony.conf /etc/chrony/chrony.conf.bk
Step 3: Truncate the default contents and replace them with the configuration below
> /etc/chrony/chrony.conf
vi /etc/chrony/chrony.conf
Update the file with the content below. The complete configuration file is pasted here.
# Welcome to the chrony configuration file. See chrony.conf(5) for more
# information about usable directives.

# This will use (up to):
# - 4 sources from ntp.ubuntu.com which some are ipv6 enabled
# - 2 sources from 2.ubuntu.pool.ntp.org which is ipv6 enabled as well
# - 1 source from [01].ubuntu.pool.ntp.org each (ipv4 only atm)
# This means by default, up to 6 dual-stack and up to 2 additional IPv4-only
# sources will be used.
# At the same time it retains some protection against one of the entries being
# down (compare to just using one of the lines). See (LP: #1754358) for the
# discussion.
#
# About using servers from the NTP Pool Project in general see (LP: #104525).
# Approved by Ubuntu Technical Board on 2011-02-08.
# See http://www.pool.ntp.org/join.html for more information.
#pool ntp.ubuntu.com        iburst maxsources 4
#pool 0.ubuntu.pool.ntp.org iburst maxsources 1
#pool 1.ubuntu.pool.ntp.org iburst maxsources 1
#pool 2.ubuntu.pool.ntp.org iburst maxsources 2

# This directive specify the location of the file containing ID/key pairs for
# NTP authentication.
keyfile /etc/chrony/chrony.keys

# This directive specify the file into which chronyd will store the rate
# information.
driftfile /var/lib/chrony/chrony.drift

# Uncomment the following line to turn logging on.
#log tracking measurements statistics

# Log files location.
logdir /var/log/chrony

# Stop bad estimates upsetting machine clock.
maxupdateskew 100.0

# This directive enables kernel synchronisation (every 11 minutes) of the
# real-time clock. Note that it can't be used along with the 'rtcfile' directive.
rtcsync

# Settings come from: https://docs.microsoft.com/en-us/azure/virtual-machines/linux/time-sync
refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
makestep 1.0 -1
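If you prefer not to edit the file interactively, the same effective configuration (only the active directives, with the comments omitted) can be written in one shot. This is just a sketch equivalent to Step 3 above:
cat > /etc/chrony/chrony.conf <<'EOF'
keyfile /etc/chrony/chrony.keys
driftfile /var/lib/chrony/chrony.drift
logdir /var/log/chrony
maxupdateskew 100.0
rtcsync
refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
makestep 1.0 -1
EOF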
Step 4: Restart the chrony service
systemctl restart chrony
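The Ubuntu package normally enables the service on installation, but it does no harm to make sure it also starts after a node reboot:
# ensure the service starts on boot and is currently running
systemctl enable chrony
systemctl is-active chrony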
Step 5: Verify the clock sync.
Wait a few minutes and check the clock sync status using the command below.
chronyc tracking
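Once chrony has selected the PHC reference clock, the tracking output should name it as the reference (it shows up as PHC0) with a small system-time offset. You can also list the sources directly; the selected source is marked with an asterisk:
# list sources with explanations of each column; the PHC0 entry should be selected
chronyc sources -v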
The complete steps are also available in this git link.
Hi, I think these changes will be removed in case the cluster is destroyed or scales out and adds a new node. Can we make it persistent, maybe using a DaemonSet?
Let me know what you think. Keep sharing, thanks!
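Regarding the persistence question above: yes, a privileged DaemonSet is a reasonable way to re-apply this setup on every node, including nodes added later by scale-out or recreated during upgrades. The manifest below is only a sketch, not a hardened implementation: the DaemonSet name, labels, and namespace are made up, the ubuntu:20.04 image and the nsenter-into-host-PID-1 approach are my assumptions, and the embedded script simply repeats Steps 1 to 4 from this post inside the host's namespaces.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: chrony-host-setup
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: chrony-host-setup
  template:
    metadata:
      labels:
        app: chrony-host-setup
    spec:
      hostPID: true
      tolerations:
      - operator: Exists
      containers:
      - name: installer
        image: ubuntu:20.04
        securityContext:
          privileged: true
        command:
        - nsenter
        - --target
        - "1"
        - --mount
        - --uts
        - --ipc
        - --net
        - --pid
        - "--"
        - bash
        - -c
        - |
          set -e
          # same as Steps 1-4 above, executed on the node itself
          apt-get update
          apt-get install -y chrony
          cp -pr /etc/chrony/chrony.conf /etc/chrony/chrony.conf.bk || true
          cat > /etc/chrony/chrony.conf <<'CONF'
          keyfile /etc/chrony/chrony.keys
          driftfile /var/lib/chrony/chrony.drift
          logdir /var/log/chrony
          maxupdateskew 100.0
          rtcsync
          refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0
          makestep 1.0 -1
          CONF
          systemctl restart chrony
          # keep the pod alive so the DaemonSet stays healthy and does not rerun the install in a loop
          sleep infinity
EOF
Because the pod enters the host's mount and PID namespaces, apt-get and systemctl act on the node itself, so every new node that joins the cluster gets the same chrony configuration automatically.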