Brendan wrote:
alexfru wrote:
Brendan wrote:
c) the combination of bad language and bad compiler unnecessarily increases the number of bugs and security vulnerabilities users are exposed to for no reason whatsoever
For a long time I thought it was mainly a problem of teaching/learning the language properly, of availability of good books and articles. The language is clearly much less intuitive than others, less well defined than assembly language / CPU architecture and makes math even more inhumane.
I used to think it was mostly a problem with the programmers too (and in some cases it is). However, people with far more experience using C than I'll (hopefully) ever have are still creating the same bugs as everyone else. Mostly, it's easy to write C code that "seems to work" but doesn't comply with the language's specification and has subtle bugs that might go unnoticed for 10+ years, even if/when hundreds of C programmers look at the code.
Like I said, distractions, overload and fatigue can make even good programmers write buggy code. It's understandable.
It's also understandable that if you happen to be about the only C expert working on the project, others won't be able to point at your bugs because they aren't nearly as qualified as you are. And in big projects you wouldn't expect many people to read and meticulously review the parts not directly related to theirs.
If C was a bit more friendly, there would be more people capable of spotting C-specific bugs, and there would be fewer such bugs in the first place. I'll give you that.
But no such friendly dialect of C exists or is in common use. So, if it's C, you're stuck with its problems, and you can only help others learn it by pointing them to the right resources, reviewing their code with them and showing how you write your own code, so they have good examples to learn from.
Brendan wrote:
alexfru wrote:
Brendan wrote:
d) this is probably the single largest "root cause" of security vulnerabilities
I'm not sure if it's the largest. You have to be in a pretty paranoid mode when writing or fixing security-sensitive code, and that's not a typical mindset/mode for most software developers. We also tend to overcomplicate things to the point where it becomes hard not to miss an important edge case and not to get overwhelmed by the amount of code we're dealing with. From my experience with Windows code I can tell that missing/insufficient/incorrect checks/validation were around the top issues implementation-wise. Even decent C/C++ coders will forget to check this or that from time to time.
It's much worse than that. For a simple test, see if you can write a 100% correct/valid version of this code:
Code:
int saturatingAdd(int a, int b) {
    if(a + b > INT_MAX) return INT_MAX;
    if(a + b < INT_MIN) return INT_MIN;
    return a + b;
}
I'd be willing to bet that over half of the experienced/professional C programmers will end up with subtle bugs; especially if they're writing the code as part of their normal work and don't know it's a test/challenge.
That's probably a bad example. I wouldn't call anyone an experienced/professional C programmer if they wrote the above piece of code. Because it shows two basic problems and doesn't even touch trickier and more interesting things.
First, it shows that the programmer either isn't testing their code or isn't using compiler warnings. The condition expressions in the if statements are always false: a + b is computed in int, so (leaving overflow UB aside) its value can never compare greater than INT_MAX or less than INT_MIN, and the compiler is free to fold the comparisons away. I don't know why my gcc 4.8.2 isn't showing any warnings (-O3 -Wall -Wextra -pedantic -std=c99), but Open Watcom C/C++ 1.9 does. I'd expect clang or Microsoft's C++ compiler to be able to issue a warning here at the appropriate warning level. This probably speaks in favor of your statement that gcc is a bad compiler.
Second, it would really be strange not to know that int+int yields int, just as long+long yields long, double+double yields double and so on. The exception is when you deal with types smaller than int, e.g. short and char. This is where things become interesting and where even experienced C programmers can write nonsense. And the variant below for signed chars actually has a chance of working:
Code:
signed char saturatingAdd(signed char a, signed char b) {
    if(a + b > SCHAR_MAX) return SCHAR_MAX;
    if(a + b < SCHAR_MIN) return SCHAR_MIN;
    return a + b;
}
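To illustrate why the promotion saves this variant: a and b are promoted to int before the addition, so a + b can't overflow as long as int is wider than signed char (true on every mainstream platform). A quick sanity check of that argument (the function is the same as above, renamed by me to avoid clashing with the int version):

```c
#include <limits.h>

/* Works because a and b are promoted to int before the addition,
   so a + b is computed in int and cannot overflow when int is
   wider than signed char. */
signed char saturatingAddChar(signed char a, signed char b) {
    if (a + b > SCHAR_MAX) return SCHAR_MAX;
    if (a + b < SCHAR_MIN) return SCHAR_MIN;
    return (signed char)(a + b);
}
```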
Typical and less immediately obvious bugs are like this:
Code:
uint32_t mul32_16x16(uint16_t a, uint16_t b) {
    return a * b;
}
And like this:
Code:
uint32_t ReadAsLittleEndian32(uint8_t* p) {
    return p[0] | (p[1] << 8) | (p[2] << 16) | (p[3] << 24);
}
While everything looks fine at first glance, there's UB hiding in plain sight in both.
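The culprit in both is integer promotion: the uint16_t and uint8_t operands are promoted to (signed) int on the usual 32-bit-int platforms, so 0xFFFF * 0xFFFF and p[3] << 24 can overflow int, which is UB. One possible fix is to widen to uint32_t before the arithmetic (the _fixed names are mine, just a sketch assuming a 32-bit int):

```c
#include <stdint.h>

/* Widen to uint32_t *before* multiplying, so the multiplication
   happens in an unsigned 32-bit type instead of the signed int
   the usual arithmetic conversions would otherwise promote to. */
uint32_t mul32_16x16_fixed(uint16_t a, uint16_t b) {
    return (uint32_t)a * b;
}

/* Same idea: cast each byte to uint32_t so the shift by 24 can
   never push a bit into (or past) the sign bit of an int. */
uint32_t ReadAsLittleEndian32_fixed(const uint8_t *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```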
Your saturating add for signed ints becomes interesting when we try to avoid overflow and any other UB. Extra conditions, moving operands between the left-hand and right-hand sides, fun. Been there, done that.
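For the record, here's one shape such a version can take (a sketch of mine, not the only correct form): test against the limits before the addition, so a + b is only ever evaluated when it's known not to overflow.

```c
#include <limits.h>

/* Saturating add for int with no UB: the guards rearrange
   a + b > INT_MAX into b > INT_MAX - a (and the symmetric case),
   which never overflows given the sign checks on a. */
int saturatingAddInt(int a, int b) {
    if (a > 0 && b > INT_MAX - a) return INT_MAX; /* a + b would exceed INT_MAX */
    if (a < 0 && b < INT_MIN - a) return INT_MIN; /* a + b would go below INT_MIN */
    return a + b;                                 /* safe: no overflow possible here */
}
```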
But what people often do is something like
Code:
size_t bufsz, item1sz, item2sz, item3sz;
// get the sizes into the above variables
if (item1sz + item2sz + item3sz > bufsz)
{
    // error handling
}
// copy all items into the buffer
They fail to think about the individual overflows: item1sz + item2sz can wrap, and so can (item1sz + item2sz) + item3sz.
This becomes a larger mess when signed types get involved in size/count/index calculations.
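For what it's worth, one overflow-safe way to write that check (the function name and shape are mine, just a sketch) is to subtract each item's size from the remaining space instead of summing the sizes, since the sum is what can wrap; size_t arithmetic is well-defined, but a wrapped sum silently passes the comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true iff all three items fit in bufsz bytes.
   Each comparison happens before the subtraction, so the
   size_t arithmetic can never wrap around. */
bool items_fit(size_t bufsz, size_t item1sz, size_t item2sz, size_t item3sz) {
    if (item1sz > bufsz) return false;
    bufsz -= item1sz;                 /* space left after item 1 */
    if (item2sz > bufsz) return false;
    bufsz -= item2sz;                 /* space left after item 2 */
    return item3sz <= bufsz;
}
```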