Y2K38
When the old Unix timestamp rolls over to negative. In January 2038, on the third Tuesday (the 19th), if you were counting seconds since 1/1/1970 in a signed 32 bit number, that number becomes negative. (The high two-billion-odd values of the number representation are the negative values; it's called 'two's complement' and it's been pretty much standard for representing negative numbers in computers since the 80s. **)
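If you want to watch it happen, here's a minimal sketch in C: hold the count in a signed 32 bit integer and nudge it past the top (going via unsigned, because signed overflow is formally undefined in C):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int32_t t = INT32_MAX;               /* 2147483647 seconds = 03:14:07 UTC, 19 Jan 2038 */
    printf("last good second: %d\n", t);
    t = (int32_t)((uint32_t)t + 1);      /* wrap via unsigned; signed overflow is undefined behaviour */
    printf("one second later: %d\n", t); /* -2147483648, which read as a date is 13 Dec 1901 */
    return 0;
}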
So Y2K38 is a simple thing really. Counting up since 1/1/1970 is, coincidentally, how most operating systems know what the time is. Unix, and thence Linux and Android and Kindles and all that stuff … and iOS/OSX is kinda Unix under the hood, well at least brain-damaged NeXTSTEP under the hood – which had a fairly BSD userland on a Mach kernel. Everyone except Windows, if you want to be picky. Including most routers and firewall appliances, although a lot of the 'internet of things' devices don't run Linux; they're too small and stupid.
So all the Unix boxes die in January 2038? Well, not so fast.
Computers went to 64-bit CPUs somewhere along the line from the early 1970s to the late 2000s, depending on how fancy the computer was, and if your system defines time_t as a 64 bit integer, the timestamp doesn't go negative on a Tuesday.
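If you're curious, a short C program will tell you how wide time_t is on your own machine (on any modern 64 bit Linux you should see 64):

#include <stdio.h>
#include <time.h>

int main(void) {
    printf("time_t here is %zu bits\n", sizeof(time_t) * 8);
    return 0;
}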
The Linux Kernel project got all the time_ts changed to 64 bit for the kernel and filesystems a while back (the 64-bit time_t syscall work landed around kernel 5.6, in 2020). (To be honest, 32-bit support on Linux is getting to be a bit of a bear, really. Most major distros don't ship a 32-bit kernel any more.)
So… is Y2K38 a nothingburger?
Well, one of the actual real problems with Y2K38 is that 'software packages' (the file formats used to ship software around) for Linux all use 32 bit Unix timestamps, so the software update process will fail, because the new package will look like it's full of older files. Really old files. Is it 1901 yet?
Of course, there's no reason to worry about software updates not working... it's not like real systems need continuous security updates. Oh, but wait, they do, actually. On average, a really atrocious vulnerability that must be patched immediately (in servers) is found every month or so. Same deal with phones, only Android security updates happen mostly-quarterly, if you get them at all. Apple, of course, update on an ad-hoc basis.
(Go read my essay on Software Updates to see why that doesn't mean all your computers get updates. Or why you're doomed, depending on your point of view. As we like to say, the S in IoT stands for Security.) To cut a long story short: you aren't getting updates to most of the computers on home networks, because your printer, Wi-Fi router and suchlike almost never get updates. The smart lightbulbs likely never get updates either. Bad news: they probably need security updates.
And now: I don’t get updates, you don’t get updates, nobody gets updates.
The most disturbing place that Y2K38 hits is in software update file formats.
The software package formats for Linux use TAR (Tape ARchive) files or CPIO (CoPy In/Out) archives under the hood, both of which use 32 bit timestamps, and neither of which has meaningfully changed since the 70s. And that goes for everything from '.rpm' and '.deb' all the way to Flatpaks and Docker images. Embrace the 1970s, for you never left. (Put Hotel California on repeat, to get in the mood.)
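For the curious, here's roughly what the classic ustar (POSIX tar) header looks like as a C struct – the field names and sizes are from the spec, the comments are mine. Everything is ASCII, the numbers in octal:

struct ustar_header {
    char name[100];   /* file name */
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];    /* file size, octal ASCII */
    char mtime[12];   /* modification time: octal ASCII seconds since 1/1/1970;
                         widely parsed into a 32 bit time_t, which is where Y2K38 bites */
    char chksum[8];
    char typeflag;
    char linkname[100];
    char magic[6];    /* "ustar" */
    char version[2];
    char uname[32];
    char gname[32];
    char devmajor[8];
    char devminor[8];
    char prefix[155];
};                    /* padded out to a 512-byte block on disk */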
The thing is, every tool that touches TAR and CPIO files (and by extension .deb and .rpm files) needs updating for a new, Y2K38-resistant variant of the format. Hypothetically, virus scanners need it too... they open archive files. Be a shame if all those software updates got shredded on the way past the virus scanner on your firewall. And there's a load of testing to be done. We're talking about changing something that last changed in about 1974. And to be clear, making a new tar version that does 64 bit timestamps doesn't fix the problems inherent in package formats not being Y2K38 ready. It's just essential groundwork.
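The bug pattern to hunt for looks something like this – hypothetical code, but the shape is everywhere:

#include <stdint.h>
#include <stdlib.h>

/* the accident waiting to happen: parse the octal mtime field,
   then squeeze it into a 32 bit type */
int32_t mtime_narrow(const char *field)
{
    return (int32_t)strtol(field, NULL, 8);   /* quietly mangles anything past January 2038 */
}

/* the boring fix: keep it 64 bit all the way through */
int64_t mtime_wide(const char *field)
{
    return (int64_t)strtoll(field, NULL, 8);
}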
Then people need to start carefully testing systems running with dates set post-Y2K38. And checking that there are no bugs in how dates are handled in and out of certificates and cert requests in network protocols, because that particular ball of wax (certificates) uses a different encoding (DER and BER) from everything else in computing, because reasons. And there's a lot of networked machinery for that; even Wi-Fi has TKIP (Temporal Key Integrity Protocol), which means, yes, if you set the hardware clock too far wrong, a Linux machine can't join a Wi-Fi network. (There is a clock at either end. Not so restrictive for Windows, because Microsoft have always cared about 'works' more than 'that's how the protocol is supposed to work', and a lot of the computing world gets tested by seeing if Windows will boot, and connect to the Wi-Fi, and no other testing… ever.)
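A first smoke test is cheap, mind: ask the C library for a date on the far side of the rollover and see whether it keeps a straight face. A minimal sketch, assuming a 64 bit time_t:

#include <stdio.h>
#include <stdint.h>
#include <time.h>

int main(void) {
    time_t t = (time_t)INT32_MAX + 1;    /* one second past the 32 bit rollover */
    struct tm *when = gmtime(&t);
    if (when == NULL) {
        puts("gmtime() refused: this system is not Y2K38-safe");
        return 1;
    }
    char buf[64];
    strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S UTC", when);
    printf("one second past the rollover is %s\n", buf);  /* expect 2038-01-19 03:14:08 */
    return 0;
}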
Then, as part of our testing, people need to start running larger and larger systems-of-systems in the future. And for shits and giggles, half in the future and half not, because not every clock is set exactly the same. Where do Docker and Kubernetes, for example, touch a raw timestamp and fall over?
And in the real world, Kiribati's Line Islands chain crosses the date line first at GMT+14, with Howland Island at GMT-12. (26 hours, who'd have thunk it.)
We've got twelve years, enough for about three major software releases of everything. Three goes at a release is probably long enough to get it right, assuming people don't get AI to write it.
And to say that I'm good at estimates is an understatement. I'm the Liam Neeson of 'Taken' of estimates. That makes me utterly insufferable to work with. Three releases of everything seems prudent: one to get it mostly right, one to close the important gaps and one to put some polish on it before the epochalypse hits.
There are some flies in the ointment. It's hard to find programmers who can write C without trashing the program they're supposed to fix. The C language is largely made from fishhooks and broken glass, at scale. And back-porting fixes into old versions of things is actually super difficult and poorly paid. Commercial Linux vendors pay programmers in third-world countries to do it (they're cheap).
Coincidentally, the Red Hat kernel backports team already has a defect injection rate greater than one. (For every bugfix to the kernel they backport, they introduce or resurrect more than one bug.) So no, those aren't, as they say, the droids you're looking for. (Kernel backporting is not a glamorous or well paid job, mostly going to remote developers in the global south, who find the money good enough.)
The Rust language will not solve this problem. Golang will not solve this problem; it's existing software and it needs to be fixed, and because timestamps used to fit into integers, lots of software uses times that are not in time-like types, by accident. The small bit of good news is that components like the Linux kernel are supposedly fixed already, at least for telling the time post January 2038, so the operating system itself shouldn't fail while you're testing for other bugs. As Ronald Reagan said, trust but verify. (He was talking about checking that missile silos were empty, which is a hell of a lot more important, and the one thing he ever said that everyone can get behind.)
Personally, I have too much RSI to do programming as a job any more, and besides, working on C programs is actually a pain in the arse. But being physically unable to do the job, I feel, is the significant blocker here for me.
Oh, and as is traditional, I am actually off-grid and in the country, far from the madding crowd. Who'd have thought. Enjoy the future, but as the British say, 'Mind the gap.' I've got 13 years to get a food supply sorted… but wait, I live on a farm. 13 years to get my garden and orchard really pumping out food, in case there are supply-chain disruptions. Also fuel supply disruptions, but I have an electric backup for a vehicle, if I'm not too fussy. And thanks to NotPetya taking out Maersk back in the day, we know that yes, you can run international logistics largely offline if you try really hard.
One last, hilarious anecdote: back in the run-up to Y2K, the US Congress, that 'bastion' of common sense, were debating drafting programmers to fix US Government systems, until someone pointed out that those unwilling programmers would be sabotaging everything they touched. They dodged it that time, but didn't DOGE it recently, if you know what I mean.
Ciao.
If you wanted a feel-good story, I'm afraid that, as the saying goes, 'Sir, this is a Wendy's.'
Or maybe it’s that Y2K was actually mitigated, so it’s humanly possible to mitigate Y2K38.
We just need to get working.
(Oh no, I actually included a call to action? Call it Christmas goodwill.)
Footnotes.
**Now there should be a lot of explanatory detail about how computers do numbers here, and why I keep referring to time_t, and I really think you should already know this.
But apparently that's just plain mean.
32 bit numbers are a multiple of eight bits wide for reasons related to eight being the character size of EBCDIC, and can represent any number you like from zero to 4.3 billion, give or take. Or positive 2.1 billion down to zero, then on down to negative 2.1 billion. Approximately – they're big numbers, and seconds do not really count here. Tee hee.
EBCDIC is a way for a computer that only knows about numbers to represent letters and numbers and some punctuation. It came from IBM and is largely obsolete, having been replaced with ASCII, which only needs seven bits to do the same job, and which was in turn replaced with Unicode, which promised to represent all possible printable characters of all languages using 16 bits, has only been largely revised three times since, and now can represent every printable character, including brown poo, using an indeterminate number of bits. Nowadays the Unicode consortium adds characters annually. This just means most people get cryptic little squares and can't read each other's text.
A two's complement binary 32 bit number covers the same 4.3 billion values, but with half of them dedicated to representing negative numbers. For reasons of efficiency in later arithmetic, you make a negative number in two's complement by flipping the bits (all zeroes are now ones and vice versa) and adding one. That means that what would have been the highest value is now where we put -1. And because computer math is very simple and can ignore overflow if you want, adding one to that gets us to zero, and one more, to one. That does mean that the largest positive number and largest negative number lie adjacent to one another, but also, for ease of representation, all negative numbers have the most significant bit set. (There's also ones' complement, which was used in vintage computers from before 1970!) Conveniently, modern computer central processing units (CPUs) can tell instantly if a number has the 'high bit' set, so voilà, your computer can tell if a number is negative. (You can do math on text strings, but 'stringy math' is very slow, with the CPU reduced to literally counting on its fingers, so it's often regarded as a disaster.)
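In code, that whole footnote boils down to a few lines:

#include <stdio.h>
#include <inttypes.h>

int main(void) {
    uint32_t x = 5;
    uint32_t neg = ~x + 1u;                               /* flip the bits, add one */
    printf("bits: 0x%08" PRIX32 "\n", neg);               /* 0xFFFFFFFB - high bit set */
    printf("read as signed: %" PRId32 "\n", (int32_t)neg); /* -5 */
    return 0;
}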
Switching to unsigned times moves the problem to 2106 (2^32 seconds is about 136 years counted from 1/1/1970). A fallback plan at best.
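Or, letting the computer do the arithmetic (again assuming a 64 bit time_t to hold the answer):

#include <stdio.h>
#include <stdint.h>
#include <time.h>

int main(void) {
    time_t t = (time_t)UINT32_MAX;       /* the last second an unsigned 32 bit count can hold */
    char buf[32];
    strftime(buf, sizeof buf, "%Y-%m-%d", gmtime(&t));
    printf("unsigned 32 bit time runs out on %s\n", buf);  /* 2106-02-07 */
    return 0;
}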
Floating point numbers (the ones with decimal places, or exponents if they're really big) are handled completely differently in computers, usually using 64 bits, whence about 15 significant digits is 'good enough'. The paragon of this 'floating point' representation is either Excel spreadsheets or, if you're a programmer, web pages, wherein ECMAScript and its siblings are running. It's not terribly accurate, but nobody does anything important with a spreadsheet, right? (Excel will not only give wrong answers when adding lots of numbers, it will randomly treat cells as dates. But you can always make it worse with AI.)
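The classic demonstration, if you've never seen it:

#include <stdio.h>

int main(void) {
    printf("%.17f\n", 0.1 + 0.2);   /* prints 0.30000000000000004, not 0.3 */
    return 0;
}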
Fun fact: in some eastern-European languages, 'idiot' means a person who has no particular professional training.
time_t is the name of the type (the '_t' is for 'type') that Unix C code uses for the time. As stated earlier, it's a number, and for much of digital history it has been a 32 bit signed integer.
In a C program, the system headers define such a type with a line that goes something like:
typedef long time_t;   /* 32 or 64 bits, depending what 'long' is on your machine */
(C has some really quaint historical baggage, including originally not defining type sizes very clearly or portably, and the C preprocessor, which is a very weak macro processor – mostly, one presumes, as a reaction to its author's experiences with its evil precursor, the M4 macro processor, which came from a previous operating system project, and is alive and well and living in the dark corners of global C-programming toolchains, where it scares young programmers.)
So if this essay seems slanted towards 'subject matter experts', well, as Willy Wonka said: strike that, reverse it. If you really want to know the basics of how the entire world works, beyond this sketchy little explanation, there's Wikipedia, or perhaps secondary or tertiary schooling. None of this is new; maths is not new, Boolean logic was discovered and the book on it written in 1854, and Unix has been around since the early 1970s. Unix was, for complicated reasons, largely free and ran on relatively cheap computers, and, as people discovered, by expecting 'nothing much' except a hard disk drive from the computer, it was easy to port to new computer designs. And there was this AI boom that went bust in the early 80s, so it all came together. Computers complicated enough to have these problems with dates have existed since the 1940s. Yes, for nearly a century. We've already had Y2K; you'd think people would learn. (Alas, human decision making is deeply flawed, and mostly based on simple heuristics and/or cognitive biases.)