OSDev.org

The Place to Start for Operating System Developers
All times are UTC - 6 hours




 Post subject: Re: Secure? How?
PostPosted: Thu Mar 12, 2015 1:05 pm 
Member

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

alexfru wrote:
Brendan wrote:
a) C is bad because it has too many rules that are not enforceable by the compiler


True. IOW, the compiler can help only so much with broken code.

Brendan wrote:
b) GCC is bad because it doesn't detect or report "invalid input" where it could (C's rules that are enforceable by the compiler)


Like what? I know that several years ago it didn't warn about indexing an array with an invalid (too large) index in code something like "int a[10]; a[11] = 1;". I also know that it doesn't always spot things like "a[i] = i++;" (a bit more complex expression will cause it to miss the problem). What else?


Like (where possible) detecting strict aliasing bugs, and making sure the source code actually checks the value returned by "malloc()", and array indexes, and various signed integer shifts, and whether the input/output parameters for an "extern void foo(void)" declaration actually match the definition, and probably hundreds of other corner cases.

Think of it like this. For C there's about 50 different static analysers. For just one of them (from the Coverity Wikipedia page) we get things like "the tool was used to examine over 150 open source applications for bugs; 6000 bugs found by the scan were fixed, across 53 projects". Of those maybe about 15% are bugs that the compiler could have detected but failed to mention (basically everything where a static analyser can find the bug with no "false negatives"), and the remaining bugs are problems in the C standard.

alexfru wrote:
Brendan wrote:
c) the combination of bad language and bad compiler unnecessarily increases the number of bugs and security vulnerabilities users are exposed to for no reason whatsoever


For a long time I thought it was mainly a problem of teaching/learning the language properly, of availability of good books and articles. The language is clearly much less intuitive than others, less well defined than assembly language / CPU architecture and makes math even more inhumane.


I used to think it was mostly a problem with the programmers too (and in some cases it is). However; people with far more experience using C than I'll (hopefully) ever have are still creating the same bugs as everyone else. Mostly, it's easy to write code in C that "seems to work" that doesn't comply with the language's specification and has subtle bugs that might not be noticed for 10+ years even if/when hundreds of C programmers look at the code.

alexfru wrote:
Brendan wrote:
d) this is probably the single largest "root cause" of security vulnerabilities


I'm not sure if it's the largest. You have to be pretty much in a paranoid mode when writing or fixing security-sensitive code. It's not a typical mindset/mode for most software developers. We also tend to overcomplicate things to the point at which it becomes hard to not miss an important edge case as such and not get overwhelmed by the amount of code we're dealing with. From my experience with Windows code I can tell that missing/insufficient/incorrect checks/validation were around the top issues implementation-wise. Even decent C/C++ coders from time to time will forget to check this or that.


It's much worse than that. For a simple test, see if you can write a 100% correct/valid version of this code:
Code:
    int saturatingAdd(int a, int b) {
        if(a + b > INT_MAX) return INT_MAX;
        if(a + b < INT_MIN) return INT_MIN;
        return a + b;
    }

I'd be willing to bet that over half of the experienced/professional C programmers will end up with subtle bugs; especially if they're writing the code as part of their normal work and don't know it's a test/challenge.

alexfru wrote:
Brendan wrote:
e) with a better language and better compiler the majority of security vulnerabilities could've been avoided without any significant disadvantages


You'd have to trade some performance to make C well/better defined.


Maybe, yes. It depends too much on what you change and how. For a simple example, if the C standard said signed integer overflow causes wrapping (and isn't undefined) it'd probably be faster on most CPUs.


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 Post subject: Re: Secure? How?
PostPosted: Thu Mar 12, 2015 1:15 pm 
Member

Joined: Wed Jan 06, 2010 7:07 pm
Posts: 792
Assuming integers do not overflow is actually pretty important for loop optimizations, so barring some other changes to the type system, defining it would actually make it slower.

_________________
[www.abubalay.com]


 Post subject: Re: Secure? How?
PostPosted: Thu Mar 12, 2015 2:01 pm 
Member

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Rusky wrote:
Assuming integers do not overflow is actually pretty important for loop optimizations, so barring some other changes to the type system, defining it would actually make it slower.


For C, I doubt it. More likely is that those loop optimisations rely on the assumption that array indices don't overflow.


Cheers,

Brendan

 Post subject: Re: Secure? How?
PostPosted: Thu Mar 12, 2015 7:08 pm 
Member

Joined: Wed Jan 06, 2010 7:07 pm
Posts: 792
That's what I said- several important loop optimizations rely on integers not overflowing (array indices and otherwise).

 Post subject: Re: Secure? How?
PostPosted: Thu Mar 12, 2015 7:50 pm 
Member

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Rusky wrote:
That's what I said- several important loop optimizations rely on integers not overflowing (array indices and otherwise).


Yes. After that I replied "For C, I doubt it".

Essentially I think it's unfounded nonsense; given that for C most loops either:
  • Don't use signed integers in the first place
  • Do use signed integers but it can be proven that the integer doesn't overflow anyway
  • Do use signed integers, but these hypothetical optimisations still work
  • Do use signed integers, but the integer/s are used for array indexes and therefore the compiler can assume they don't overflow regardless

Can you provide an example that's common enough to matter?


Cheers,

Brendan

 Post subject: Re: Secure? How?
PostPosted: Thu Mar 12, 2015 8:41 pm 
Member

Joined: Wed Jan 06, 2010 7:07 pm
Posts: 792
An awful lot of code does use signed int for the induction variable, and I would say any code where the assumption of non-overflow is important is also code that you can't prove much about the values (because they come in as function arguments or from user input).

As far as examples go, I have to admit I don't know enough to say. It does seem important enough to the GCC maintainers that they added -fno-strict-overflow in addition to -fwrapv, specifically to inhibit the optimizer in fewer situations: http://www.airs.com/blog/archives/120
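For illustration only, here is one commonly cited case (a sketch written for this discussion, not an example taken from the linked article; the name count_inclusive is made up). With undefined signed overflow the compiler may assume the loop below runs exactly n + 1 times for n >= 0. With wrapping semantics, n == INT_MAX would make "i <= n" always true, so the trip count is no longer known up front:

```c
/* Counts the iterations of an inclusive loop over [0, n].
 * Because signed overflow is undefined, a compiler is allowed to assume
 * that i never wraps, so for n >= 0 the loop runs exactly n + 1 times.
 * If int wrapped instead (as with -fwrapv), n == INT_MAX would make
 * "i <= n" always true and the loop would never terminate, so the
 * compiler could not treat the trip count as known.
 * Calling this with n == INT_MAX is undefined behaviour in standard C. */
long long count_inclusive(int n)
{
    long long trips = 0;
    for (int i = 0; i <= n; ++i)
        ++trips;
    return trips;
}
```

Whether this pattern is common enough to matter in practice is exactly the open question above; the sketch only shows where the assumption enters.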

 Post subject: Re: Secure? How?
PostPosted: Thu Mar 12, 2015 11:33 pm 
Member

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Rusky wrote:
As far as examples go, I have to admit I don't know enough to say. It does seem important enough to the GCC maintainers that they added -fno-strict-overflow in addition to -fwrapv, specifically to inhibit the optimizer in fewer situations: http://www.airs.com/blog/archives/120


Originally, I think GCC just assumed that signed overflow never happened and optimised accordingly. Then everyone in the world complained (examples: 1, 2, 3) because it broke a massive amount of existing "not strictly correct" code (where the programmer just assumed wrapping behaviour because that's what CPUs do and what other compilers give you) and caused severe security vulnerabilities everywhere. They added "-fno-strict-overflow" and "-fwrapv" so people can make the compiler do what it always should have done. ;)


Cheers,

Brendan

 Post subject: Re: Secure? How?
PostPosted: Fri Mar 13, 2015 12:20 am 
Member

Joined: Wed Jan 06, 2010 7:07 pm
Posts: 792
Well, they originally added -fwrapv for that purpose, which specifies wrapping behavior. However (as explained in the link in my last post) they later added -fno-strict-overflow, which removes the UB without specifying the actual resulting value, for performance reasons. There are a couple micro-examples in the article I linked that I imagine come up occasionally.

 Post subject: Re: Secure? How?
PostPosted: Fri Mar 13, 2015 1:51 am 
Member

Joined: Tue Mar 04, 2014 5:27 am
Posts: 1108
Brendan wrote:
alexfru wrote:
Brendan wrote:
c) the combination of bad language and bad compiler unnecessarily increases the number of bugs and security vulnerabilities users are exposed to for no reason whatsoever


For a long time I thought it was mainly a problem of teaching/learning the language properly, of availability of good books and articles. The language is clearly much less intuitive than others, less well defined than assembly language / CPU architecture and makes math even more inhumane.


I used to think it was mostly a problem with the programmers too (and in some cases it is). However; people with far more experience using C than I'll (hopefully) ever have are still creating the same bugs as everyone else. Mostly, it's easy to write code in C that "seems to work" that doesn't comply with the language's specification and has subtle bugs that might not be noticed for 10+ years even if/when hundreds of C programmers look at the code.


Like I said, distractions, overload and fatigue can make even good programmers write buggy code. It's understandable.

It's also understandable that if you happen to be about the only C expert working on the project, others won't be able to point at your bugs because they aren't nearly as qualified as you are. And in big projects you wouldn't expect many people to read and meticulously review the parts not directly related to theirs.

If C was a bit more friendly, there would be more people capable of spotting C-specific bugs and there would be fewer of such bugs in the first place. I give you that.

But we don't have such a friendly dialect of C in existence or common use. So, if it's C, you're stuck with its problems and you can only help others learning it by pointing to the right resources, by reviewing their code with them and showing how you write your code, so they have good examples to learn from.

Brendan wrote:
alexfru wrote:
Brendan wrote:
d) this is probably the single largest "root cause" of security vulnerabilities


I'm not sure if it's the largest. You have to be pretty much in a paranoid mode when writing or fixing security-sensitive code. It's not a typical mindset/mode for most software developers. We also tend to overcomplicate things to the point at which it becomes hard to not miss an important edge case as such and not get overwhelmed by the amount of code we're dealing with. From my experience with Windows code I can tell that missing/insufficient/incorrect checks/validation were around the top issues implementation-wise. Even decent C/C++ coders from time to time will forget to check this or that.


It's much worse than that. For a simple test, see if you can write a 100% correct/valid version of this code:
Code:
    int saturatingAdd(int a, int b) {
        if(a + b > INT_MAX) return INT_MAX;
        if(a + b < INT_MIN) return INT_MIN;
        return a + b;
    }

I'd be willing to bet that over half of the experienced/professional C programmers will end up with subtle bugs; especially if they're writing the code as part of their normal work and don't know it's a test/challenge.


That's probably a bad example. I wouldn't call anyone an experienced/professional C programmer if they wrote the above piece of code. Because it shows two basic problems and doesn't even touch trickier and more interesting things.

First, it shows that the programmer either isn't testing their code or isn't using compiler warnings. The condition expressions in the if statements are all false. I don't know why my gcc 4.8.2 isn't showing any warnings (-O3 -Wall -Wextra -pedantic -std=c99), but Open Watcom C/C++ 1.9 does. I'd expect clang or Microsoft's C++ compiler to be able to issue a warning here at the appropriate warning level. This probably speaks in favor of your statement that gcc is a bad compiler. :)

Second, it would really be strange to not know that int+int yields int just as long+long yields long and double+double yields double and so on. The exception is when you get to deal with types smaller than int, e.g. short and char. This is where things become interesting and where even experienced C programmers can write nonsense. And the below variant for signed chars could have a chance to work:
Code:
    signed char saturatingAdd(signed char a, signed char b) {
        if(a + b > SCHAR_MAX) return SCHAR_MAX;
        if(a + b < SCHAR_MIN) return SCHAR_MIN;
        return a + b;
    }


Typical and less immediately obvious bugs are like this:
Code:
    uint32_t mul32_16x16(uint16_t a, uint16_t b) {
        return a * b;
    }


And like this:
Code:
    uint32_t ReadAsLittleEndian32(uint8_t* p) {
        return p[0] | (p[1] << 8) | (p[2] << 16) | (p[3] << 24);
    }


While everything looks fine at first glance, most likely there's a UB hiding in plain sight.
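To spell out the two cases above: on a typical platform with 32-bit int, the uint16_t and uint8_t operands are promoted to (signed) int before the multiply or shift, so 0xFFFF * 0xFFFF and p[3] << 24 with the top bit set both overflow a signed int. A minimal sketch of one way to avoid that, by forcing the arithmetic into uint32_t first (the _fixed names are made up for this example):

```c
#include <stdint.h>

/* Converting one operand to uint32_t before the multiply keeps the
 * product from being computed in a 32-bit signed int. */
uint32_t mul32_16x16_fixed(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}

/* Converting each byte to uint32_t before shifting means no bit is
 * ever shifted into the sign bit of a promoted 32-bit int. */
uint32_t read_le32_fixed(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

The casts look redundant next to the return type, which is probably why they get left out so often.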

Your saturating add for signed ints becomes interesting when we try to avoid overflow and any other UB and such. Extra conditions, moving operands between left-hand and right-hand sides, fun. Been there, done that. :)

But what people often do is something like
Code:
    size_t bufsz, item1sz, item2sz, item3sz;

    // get the sizes into the above variables

    if (item1sz + item2sz + item3sz > bufsz)
    {
        // error handling
    }

    // copy all items into the buffer


They fail to think well about individual overflows in item1sz + item2sz and in sum_of_item1sz_and_item2sz + item3sz.
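One way to write that check so no intermediate sum can wrap is to subtract from the remaining space instead of adding the sizes. This is only a sketch under the assumptions of the fragment above (three items, one buffer); the function name items_fit is made up:

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when three items of the given sizes fit in a buffer of
 * bufsz bytes. Each step compares against the space that is still left
 * and then subtracts, so no addition of sizes is ever performed and no
 * intermediate value can wrap around. */
bool items_fit(size_t bufsz, size_t item1sz, size_t item2sz, size_t item3sz)
{
    if (item1sz > bufsz) return false;
    bufsz -= item1sz;               /* safe: item1sz <= bufsz */
    if (item2sz > bufsz) return false;
    bufsz -= item2sz;               /* safe: item2sz <= bufsz */
    return item3sz <= bufsz;
}
```

With the naive sum, sizes such as (size_t)-1, 2 and 2 wrap around to 3 and pass the check for any buffer of 3 bytes or more; here the first comparison already rejects them.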

This becomes a larger mess when signed types get involved in size/count/index calculations.


 Post subject: Re: Secure? How?
PostPosted: Fri Mar 13, 2015 8:05 am 
Member

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

alexfru wrote:
Brendan wrote:
It's much worse than that. For a simple test, see if you can write a 100% correct/valid version of this code:
Code:
    int saturatingAdd(int a, int b) {
        if(a + b > INT_MAX) return INT_MAX;
        if(a + b < INT_MIN) return INT_MIN;
        return a + b;
    }

I'd be willing to bet that over half of the experienced/professional C programmers will end up with subtle bugs; especially if they're writing the code as part of their normal work and don't know it's a test/challenge.


That's probably a bad example. I wouldn't call anyone an experienced/professional C programmer if they wrote the above piece of code. Because it shows two basic problems and doesn't even touch trickier and more interesting things.


I know it's buggy - the question is whether people can be expected to write a correct version. So far, nobody has had the courage to attempt this extremely simple piece of code.

For what it's worth, I don't want to attempt it either. For this case I'd be tempted to use inline assembly (where there is no undefined behaviour and where you can access the carry/overflow flags, and where it's very easy to be confident the code is correct). ;)

alexfru wrote:
And the below variant for signed chars could have a chance to work:
Code:
    signed char saturatingAdd(signed char a, signed char b) {
        if(a + b > SCHAR_MAX) return SCHAR_MAX;
        if(a + b < SCHAR_MIN) return SCHAR_MIN;
        return a + b;
    }


You've tried to avoid problems by using signed char instead of int; but even in that case are you sure your code is correct in all cases? Hint: Imagine a computer (maybe a DSP) where CHAR_BIT == 32, and where sizeof(int) == sizeof(long) == 1.

alexfru wrote:
While everything looks fine at first glance, most likely there's a UB hiding in plain sight.


Exactly. People (experienced C programmers) can't be expected to do trivial things in C correctly because the language itself makes it virtually impossible; and the only reason anything larger/more complex works is because everyone relies on "seems to work despite being technically wrong" (including relying on both undefined behaviour and implementation defined behaviour).


Cheers,

Brendan

 Post subject: Re: Secure? How?
PostPosted: Fri Mar 13, 2015 10:05 am 
Member

Joined: Wed Oct 18, 2006 3:45 am
Posts: 9301
Location: On the balcony, where I can actually keep 1½m distance
Just for the sake of trying (not tested at all):
Code:
int saturateAdd(int a, int b)
{
    if ((b > 0) && (INT_MAX - b < a)) return INT_MAX;
    if ((b < 0) && (INT_MIN - b > a)) return INT_MIN;
    return a+b;
}

_________________
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]


 Post subject: Re: Secure? How?
PostPosted: Fri Mar 13, 2015 10:46 am 
Member

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Combuster wrote:
Just for the sake of trying (not tested at all):
Code:
int saturateAdd(int a, int b)
{
    if ((b > 0) && (INT_MAX - b < a)) return INT_MAX;
    if ((b < 0) && (INT_MIN - b > a)) return INT_MIN;
    return a+b;
}


As far as I can tell, that's correct (correct result, and no undefined or implementation defined behaviour).

For my next challenge, try saturating subtraction, saturating multiplication and saturating division. :)
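For the subtraction part of that challenge, one possible sketch in the same style as the saturating addition quoted above (checked only against a handful of edge cases; the name saturateSub is made up here). The idea is the same: rearrange each comparison so that the expression being evaluated can never leave the range of int:

```c
#include <limits.h>

int saturateSub(int a, int b)
{
    /* a - b can exceed INT_MAX only when b is negative;
     * INT_MAX + b stays in range because b < 0. */
    if ((b < 0) && (a > INT_MAX + b)) return INT_MAX;
    /* a - b can go below INT_MIN only when b is positive;
     * INT_MIN + b stays in range because b > 0. */
    if ((b > 0) && (a < INT_MIN + b)) return INT_MIN;
    return a - b;
}
```

Multiplication and division need more cases (for division, INT_MIN / -1 and division by zero) and are not attempted in this sketch.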


Cheers,

Brendan

 Post subject: Re: Secure? How?
PostPosted: Fri Mar 13, 2015 11:03 am 
Member

Joined: Tue Mar 04, 2014 5:27 am
Posts: 1108
Brendan wrote:
alexfru wrote:
That's probably a bad example. I wouldn't call anyone an experienced/professional C programmer if they wrote the above piece of code. Because it shows two basic problems and doesn't even touch trickier and more interesting things.


I know it's buggy - the question is whether people can be expected to write a correct version. So far, nobody has had the courage to attempt this extremely simple piece of code.


I left it as an exercise for the others.

Brendan wrote:
alexfru wrote:
And the below variant for signed chars could have a chance to work:
Code:
    signed char saturatingAdd(signed char a, signed char b) {
        if(a + b > SCHAR_MAX) return SCHAR_MAX;
        if(a + b < SCHAR_MIN) return SCHAR_MIN;
        return a + b;
    }


You've tried to avoid problems by using signed char instead of int; but even in that case are you sure your code is correct in all cases? Hint: Imagine a computer (maybe a DSP) where CHAR_BIT == 32, and where sizeof(int) == sizeof(long) == 1.


Nope, I haven't tried avoiding that. I just said [it] could have a chance to work, implying it would be a more conventional system than mentioned in your hint. Perhaps, I should've been more clear.

Brendan wrote:
alexfru wrote:
While everything looks fine at first glance, most likely there's a UB hiding in plain sight.

Exactly. People (experienced C programmers) can't be expected to do trivial things in C correctly because the language itself makes it virtually impossible; and the only reason anything larger/more complex works is because everyone relies on "seems to work despite being technically wrong" (including relying on both undefined behaviour and implementation defined behaviour).


You have a point and it's a bit extreme. :)


 Post subject: Re: Secure? How?
PostPosted: Fri Mar 13, 2015 1:25 pm 
Member

Joined: Sat Mar 15, 2014 3:49 pm
Posts: 96
Saturation is hard to get right, and harder to get right portably. It's a great example of when C's low level is hard on even experienced practitioners.

I think it also shows how C compilers are slowly improving. Newer compilers - LLVM and GCC - are aligning builtins for lots of things in this vein (just not saturation yet).

Right now, there are branchless C recipes using only bitwise ops that you can find on the web.

Saturated arithmetic is also available via MMX and SSE intrinsics.

On the Mill all appropriate ops come in four flavours: modulo, excepting, saturating and widening. This shows our DSP roots.

When the time comes I will push for LLVM and GCC builtins for saturating arithmetic, in line with the recent alignment on overflow: http://clang.llvm.org/docs/LanguageExte ... c-builtins
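Until such builtins exist, the overflow-checking builtins can already be used to build a saturating add. A minimal sketch, assuming a compiler that provides __builtin_sadd_overflow (Clang, and GCC from version 5 on); the function name saturatingAddBuiltin is made up, and this is not portable standard C:

```c
#include <limits.h>

/* Saturating signed addition built on the compiler's checked-add builtin.
 * Assumption: __builtin_sadd_overflow is available (Clang, GCC 5+). */
int saturatingAddBuiltin(int a, int b)
{
    int sum;
    if (__builtin_sadd_overflow(a, b, &sum)) {
        /* a + b can only overflow when a and b have the same sign,
         * so the sign of a gives the direction of the overflow. */
        return (a < 0) ? INT_MIN : INT_MAX;
    }
    return sum;
}
```

The builtin reports the overflow instead of invoking undefined behaviour, so the only remaining decision is which bound to clamp to.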


 Post subject: Re: Secure? How?
PostPosted: Fri Mar 13, 2015 1:38 pm 
Member

Joined: Tue Mar 04, 2014 5:27 am
Posts: 1108
willedwards wrote:
When the time comes I will push for LLVM and GCC builtins for saturating arithmetic, in line with the recent alignment on overflow: http://clang.llvm.org/docs/LanguageExte ... c-builtins


What about the same for types like uint32_t?

