Leap Second Bug: Worth a Double Check…
I vividly remember the impact that leap years, leap days and leap seconds can have on systems that are not prepared to handle the change in time or date. It was the 29th of February 2008 and at the time I was working for a Service Provider offering Hosted Exchange services based on Exchange 2007. All of a sudden my provisioning scripts stopped working and we could not add, remove or modify Exchange Mailboxes.
After a day of frustration working with MS Support, and dreading a full system rebuild, the problem seemed to disappear the following day…the 1st of March. After a couple of days of head scratching at Microsoft, the Exchange Engineering team realised that they hadn’t allowed for the leap day somewhere deep in the bowels of their code, which meant that no account modifications worked during the 24 hours of the leap day.
Fast forward to 2015: the Earth’s rotation continues to slow, and system administrators and operations teams need to be aware of another out of the norm situation that could affect systems and platforms. This time it’s a leap second adjustment, scheduled for the 30th of June 2015 at 23:59:60 UTC, which may cause issues for devices and operating systems that are NTP synchronised. Older Linux kernels seem to be the most affected, with most vendors releasing KB articles on the leap second impact and how to work around it.
While this is not something that will bring down the internet, all infrastructure IT professionals should be aware of it and should double check their systems to ensure there are no embarrassing time-related incidents come the 30th of June.
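To make the 23:59:60 timestamp a little more concrete, here is a minimal Python sketch showing that the standard library's `datetime` type cannot represent second 60 at all, which is the same class of assumption that trips up other software. The helper names `can_represent_leap_second` and `fold_leap_second` are made up for illustration, and folding the leap second into the next minute is only one possible workaround, not what any particular vendor does.

```python
from datetime import datetime, timedelta, timezone


def can_represent_leap_second() -> bool:
    """Return True if datetime accepts 23:59:60 (it does not in CPython)."""
    try:
        datetime(2015, 6, 30, 23, 59, 60, tzinfo=timezone.utc)
        return True
    except ValueError:
        # datetime only allows seconds in the range 0..59
        return False


def fold_leap_second(year, month, day, hour, minute, second):
    """Hypothetical helper: map second 60 onto the first second of the next minute."""
    if second == 60:
        base = datetime(year, month, day, hour, minute, 59, tzinfo=timezone.utc)
        return base + timedelta(seconds=1)
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)


if __name__ == "__main__":
    print("datetime can hold 23:59:60:", can_represent_leap_second())
    print("folded:", fold_leap_second(2015, 6, 30, 23, 59, 60).isoformat())
```

Any script that parses log or NTP timestamps around the leap second would need some handling along these lines, or it will simply throw an error on the extra second.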
ESXi and Other VMware Products:
ESX/ESXi utilizes the RFC-1589 clock model, appropriately handling leap seconds.
It is not necessary to enable Slew Mode for NTP in ESX/ESXi’s NTP client, or to otherwise work around leap seconds by disabling and re-enabling the NTP client before and after the leap second’s occurrence. For more information, see Enabling Slew Mode for NTP (2121016).
However, while the ESX/ESXi server is not expected to experience negative impact from a leap second taking place, it remains possible for Guest Operating Systems and/or running applications to experience an impact, independent of ESX/ESXi, if they are not designed to handle one. VMware recommends that customers test their complete solutions.
Details
When a leap second is added (for example, on June 30, 2015 at 23:59:60 UTC), the VMware vCloud Networking and Security (vCNS) Manager/App/Edge/Data Security appliances may become non-responsive. This issue can arise only at the time of the addition of the leap second. The issue occurs due to errors in leap second handling in the vCNS appliance kernels.
Symptoms of this non-responsive state include the following errors when attempting to connect to an affected vCNS appliance:
- API/SSH(TCP) connection not possible
- CLI input/output not possible
All versions of the vCNS appliances (vCNS Manager/App/Edge/Data Security) are affected. VMware NSX for vSphere 6.x installations may be configured to operate in a backward-compatibility mode that includes vCNS appliances. The vCNS appliances in such installations may also be affected by this issue.
Solution
If the non-responsive condition lasts longer than 30 minutes, restart the affected appliance.
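If you want a quick way to spot the "TCP connection not possible" symptom after the leap second, a rough sketch like the one below could be run against each appliance. This is not a VMware tool: the function name `tcp_reachable`, the hostname and the choice of port 22 (SSH) are all assumptions for illustration, and a successful TCP connect only suggests the appliance is answering, not that it is fully healthy.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


if __name__ == "__main__":
    # Hypothetical appliance hostname; replace with your own
    appliance = "vcns-manager.example.local"
    if tcp_reachable(appliance, 22):
        print(f"{appliance}: SSH port is answering")
    else:
        print(f"{appliance}: no response on SSH port, keep an eye on it")
```

If an appliance keeps failing a check like this for more than 30 minutes after the leap second, that lines up with the restart guidance above.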