34 years ago, Tony Hoare gave a very interesting, and somewhat prophetic, Turing Award lecture. In case anyone’s not familiar with him, he’s one of the great pioneers of computer science. Among other things, he invented Quicksort, and the CASE statement.
He talks about his work on ALGOL compilers, and one of the things he said has been on my mind recently:
In that design I adopted certain basic principles that I believe to be as valid today as they were back then. The first principle was security: The principle that every syntactically incorrect program should be rejected by the compiler and that every syntactically correct program should give a result or an error message that was predictable and comprehensible in terms of the source language program itself. Thus no core dumps should ever be necessary. It was logically impossible for any source language program to cause the computer to run wild, either at compile time or at run time.
A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interest of efficiency on production runs. Unanimously, they urged us not to—they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law.
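To make that concrete, here is a minimal C sketch (my own, purely illustrative; the names and sizes are made up) of the difference between C’s unchecked subscripting and the kind of check Hoare’s compiler inserted on every single array access:

```c
/* Illustrative sketch only: what run-time subscript checking buys you,
 * contrasted with C's default behavior. Names here are hypothetical. */
#include <stdio.h>
#include <stdlib.h>

#define LEN 10

static int data[LEN];

/* C semantics: no check. An out-of-range index silently reads or
 * corrupts whatever happens to sit next to the array. */
int get_unchecked(int i) {
    return data[i];
}

/* What an ALGOL-style compiler inserts for you on every subscript:
 * compare against the declared bounds and stop with a comprehensible
 * error instead of letting the program "run wild". */
int get_checked(int i) {
    if (i < 0 || i >= LEN) {
        fprintf(stderr, "subscript %d out of bounds [0, %d]\n", i, LEN - 1);
        abort();
    }
    return data[i];
}

int main(void) {
    printf("%d\n", get_checked(3));   /* fine */
    printf("%d\n", get_checked(12));  /* aborts with a clear message */
    /* get_unchecked(12) would "succeed" and return garbage -- or worse. */
    return 0;
}
```

The checked version costs a comparison or two per access; Hoare’s customers decided, unanimously, that this was a price worth paying.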
He said this in 1980, about work he had done in 1960, so this was known and understood to be a good idea as far back as 50 years ago. But, of course, the programming community in general didn’t listen. Several years later, the consequences came back to bite us, in the form of the Morris Worm.
It rampaged throughout the fledgling Internet of the day, crashing an estimated 10% of all systems connected to the Internet by exploiting buffer overruns in a handful of specific UNIX programs. The author, a sleazebag by the name of Robert Morris, later claimed that he just wanted to find a way to “count the number of computers on the Internet,” but his actions put the lie to that statement. He encrypted the Worm and used rootkit techniques to hide it from the file system, and he released it from a different university than the one he attended, in an attempt to cover his tracks. A person who believes they aren’t doing anything wrong doesn’t try to hide what they’re doing, and comments in his original source code make it clear that his intention was anything but benign; he was trying to build what we call a botnet today.
And all because of buffer exploits in a handful of C programs. That really should have put us all on notice. Hoare was right, and in any sane world, the C language would have been dead by 1990.
But it didn’t happen, and those who refuse to learn from history are doomed to repeat it. So once the Internet became a big thing among the general public in the early 2000s, we ended up with a bunch of new worms that snuck into Windows systems through buffer exploits. Anyone remember Slammer? Blaster? Code Red?
Hoare was right. We should have listened.
Why has all this been on my mind lately? If you’ve been paying attention at all to Internet news, you already know: Heartbleed. History has repeated itself yet again. A buffer exploit in a widely used C library, affecting anywhere from 10% (there’s that figure again) to 66% of all servers on the Internet, depending on which estimate you listen to; security expert Bruce Schneier described it as “on a scale of 1 to 10, this is an 11.”
Hoare was right. Will we listen this time? Probably not. So it’ll happen again.
Building any software with an inherent security requirement (browsers and other network-facing software, OSes, and so on) in C, C++, Objective-C, or any other member of the C family ought to be regarded by now as an act of criminal negligence, by the programming community in general if not by the law.
Remember the minor kerfuffle when Steve Jobs died, over Richard Stallman quoting what Chicago Mayor Harold Washington had said about the corrupt former Mayor Daley: “I’m not glad he’s dead, but I’m glad he’s gone”? It was just a few days later that Dennis Ritchie, the creator of C, died, and that’s exactly how I felt about it. As one of my coworkers at WideOrbit put it, Ritchie’s true legacy to the world is the buffer overflow.
Some people say “the language is not the problem; the problem is bad programmers using it incorrectly.” But that’s not true. The guy responsible for the Heartbleed vulnerability isn’t a bad programmer. Have a look at the commit where the bug was introduced. See if you can find the problem without being told where it is.
It’s clear that this is not the work of an incompetent n00b; this is someone who really knows his way around the language. But he made a mistake, and it’s a subtle enough one that most people, even knowing beforehand that that changeset contains a severe bug and knowing what class of bug it is (a buffer exploit vulnerability), won’t be able to find it.
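For those who want the punchline anyway, here is a heavily simplified sketch of the general pattern behind this class of bug. To be clear, this is not the actual OpenSSL code; the struct, field, and function names are invented for illustration:

```c
/* A heavily simplified sketch of the *class* of bug -- not the actual
 * OpenSSL code. All names here are hypothetical. The pattern: trust a
 * length field supplied by the peer instead of the number of bytes
 * actually received. */
#include <string.h>
#include <stdlib.h>

struct heartbeat_msg {
    unsigned char *payload;   /* bytes received from the peer        */
    size_t         received;  /* how many bytes we actually received */
    size_t         claimed;   /* length the peer *claims* it sent    */
};

/* Echo the payload back to the sender. */
unsigned char *build_reply(const struct heartbeat_msg *msg, size_t *out_len) {
    unsigned char *reply = malloc(msg->claimed);
    if (!reply)
        return NULL;

    /* BUG: copies 'claimed' bytes, but only 'received' bytes are ours.
     * If claimed > received, the memcpy reads past the buffer and leaks
     * whatever neighboring memory contains -- keys, passwords, anything.
     * The fix is a single comparison: reject the message when
     * claimed > received. */
    memcpy(reply, msg->payload, msg->claimed);

    *out_len = msg->claimed;
    return reply;
}
```

The entire vulnerability lives in one missing comparison, which is exactly the kind of check a bounds-safe language performs for you whether you remember to or not.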
To err is human, but when a mistake can have consequences of this magnitude, it’s also unforgivable. That puts the language, which forgives such mistakes all too easily, fundamentally at odds with reality vis-à-vis human nature. That means something’s gotta give, and it’s not going to be reality… and Heartbleed is what happens when it finally gives.
Others claim that the C language, with its unsafe low-level direct memory access, is necessary on many low-resource systems where counting bytes and cycles still matters. To this I say, wake up and smell the 21st century. In an age of Raspberry Pis and Arduinos, when a board costing well under $100 can run a full-fledged Linux operating system with enough hardware capacity left over to play HD movies, such limited systems are a laughably obsolete excuse.
No, there’s really no excuse left for C, other than inertia. (Which, if you recall, is ultimately what ran the Titanic into that iceberg.) Can we let it and its entire misbegotten family die already? It’s 25 years overdue for its own funeral.
P.S. It’s not as though viable alternatives don’t exist. It’s worth noting that at the time the Morris Worm first brought the Internet to its knees by exploiting buffer overruns in C, Apple was already five years into the Macintosh project, which ended up defining the entire future of operating system design… in Pascal.