alexfru wrote:
The standard didn't try to contain undefined behavior in any way.
It didn't define that, say, signed integer overflow is fully contained within the (say, multiplication) operator that causes it and that the operator only produces a wrong value and there are no other ill side effects.
Early computers and compilers, though, couldn't do much in terms of code analysis and optimization, so many instances of UB seemed contained and rarely surprising. Technological progress improved that analysis and those optimizations, making UB bleed, creep and spread far beyond the simple operations causing it.
My position is that a lot of code is indeed nonconformant (almost all of it). And it has happened because people have been getting away with it. Perhaps, today they don't get away with some of the tricks of the past, but they still do with others or they rely (often, unknowingly) on implementation-specific behavior, thinking that's *the* language, whereas it's just an extension to it.
You can embrace the language standard and enable all kinds of "sanitizers" to reveal various obscure UBs in the code.
That has been exactly my position for many years. But recently I've started to question that idea. How can we claim that nobody, not even the early C programmers, ever properly knew the language? How do you define "the language"? You can argue that the language is exclusively described by the first ISO standard, but:
- How can you prove that no misinterpretation ever occurred, and disprove the theory according to which the C89 -> C99 update of the standard really changed the spirit of the language?
- Even if no misinterpretation ever occurred, how can you be sure that the wording of the first C standard correctly expressed the intention of its creators? In other words, while trying to understand the philosophy of the language, are you willing to consider as evidence not just the first ISO standard but also the code written by its creators and the opinions they expressed over time?
IMHO, formally the language is now defined solely by the ISO standard, but that doesn't prove the language hasn't fundamentally changed since its inception. What we call "C" today is no longer what "C" was meant to be by its original creators; that's ultimately my point here. I'm not absolutely sure about that theory, just... I'm starting to believe in it.
Some clues about that misinterpretation theory? Well, check Dennis Ritchie's comments in his essay on `noalias`, which, fortunately, was never included in the language, at least not in that form:
Dennis Ritchie wrote:
Let me begin by saying that I’m not convinced that even the pre-December qualifiers (`const’ and `volatile’) carry their weight; I suspect that what they add to the cost of learning and using the language is not repaid in greater expressiveness. `Volatile,’ in particular, is a frill for esoteric applications, and much better expressed by other means. Its chief virtue is that nearly everyone can forget about it. `Const’ is simultaneously more useful and more obtrusive; you can’t avoid learning about it, because of its presence in the library interface. Nevertheless, I don’t argue for the extirpation of qualifiers, if only because it is too late.
The fundamental problem is that it is not possible to write real programs using the X3J11 definition of C. The committee has created an unreal language that no one can or will actually use. While the problems of `const’ may owe to careless drafting of the specification, `noalias’ is an altogether mistaken notion, and must not survive.
He is very skeptical about `const` and `volatile`; let's not even mention `noalias`. Now look at this other statement:
Dennis Ritchie wrote:
2. Noalias is an abomination
`Noalias’ is much more dangerous; the committee is planting timebombs that are sure to explode in people’s faces. Assigning an ordinary pointer to a pointer to a `noalias’ object is a license for the compiler to undertake aggressive optimizations that are completely legal by the committee’s rules, but make hash of apparently safe programs.
He was deeply concerned about giving compilers a license to make aggressive optimizations.
Finally:
Dennis Ritchie wrote:
Noalias must go. This is non-negotiable.
It must not be reworded, reformulated or reinvented. The draft’s description is badly flawed, but that is not the problem. The concept is wrong from start to finish. It negates every brave promise X3J11 ever made about codifying existing practices, preserving the existing body of code, and keeping (dare I say it?) `the spirit of C.’
So, he wanted the ISO document to codify the existing practices and preserve the spirit of C, while other people obviously pushed in another direction. After the ISO C89 standard was released (which, in my view, was a big compromise between all the parties), people belonging to the other school of thought kept pushing harder and harder. In the end, one small battle after another, they won completely. Maybe for good, maybe for bad; I don't want to judge.
Because of that, today C is indeed much more optimizable than it was in the past.
I totally agree with that. For many things I'd agree that's for good, actually.
I just want to remark that modern C is not what the C language was meant to be, and that the theory according to which for decades the UNIX programmers (even the ones working closely with K&R) had no idea what C really was is wrong and unfair to them. It belittles their work and them as programmers.
-------------
EDIT: link to Dennis Ritchie's essay:
https://www.yodaiken.com/2021/03/19/dennis-ritchie-on-alias-analysis-in-the-c-programming-language-1988/
Second link in case the first gets broken one day:
https://www.lysator.liu.se/c/dmr-on-noalias.html