Hi,
Icee wrote:
Some replies here are missing an essential thing: in reality you don't get to analyze a snippet of code. You have to decide whether a given program (as in isolated service) is "safe" or not as a whole. This does not really boil down to proof by composition. A program composed of multiple "safe" pieces is not necessarily "safe". I am putting "safe" in irony quotes here because even providing the formal definition of what "safe" is is a huge task by itself, with PhD theses on this subject popping out every now and then.
Actually, it's the opposite. If you prove that each piece is safe on its own, then you've proven that the whole program is safe; but if some pieces can't be proven safe on their own, the whole program can still be safe (because the inputs that would make those pieces unsafe might never occur).
For a simple example:
Code:
#include <stdint.h>

int myArray[256];

int foo(uint8_t a, uint8_t b) {
    return 1000 / (a+b);
}

void bar(uint8_t a, uint8_t b, int c) {
    myArray[a+b] = c;
}

void baz(void) {
    foo(1, 2);
    bar(1, 2, 3);
}
In this case it's easy to prove that "foo()" is unsafe on its own (division by zero when a+b == 0) and easy to prove that "bar()" is unsafe on its own (array index out of bounds when a+b > 255); but the whole program is still safe because those inputs never occur.
Of course these cases can also be a code maintenance disaster, because an unsuspecting programmer could write code that does "foo(0, 0);" or "bar(200, 200, 3);" and make the whole program unsafe. It's like laying little traps for unsuspecting victims: a programmer needs to know the internal implementation details of "foo()" or "bar()" to use them, and can't just treat functions as "opaque black boxes". For larger programs being written by a team of people you want to avoid that; which means that in practice you want to insist on "all functions are safe in isolation". Shared libraries (where the internal implementation of the shared library can be changed in future) are just another example of wanting "all functions are safe in isolation" to avoid a code maintenance disaster.
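For what it's worth, "safe in isolation" versions of the functions above might look like this (a sketch; the fallback behaviour of returning 0 and silently ignoring bad indexes is my own arbitrary choice, not the only option):

```c
#include <stdint.h>

int myArray[256];

/* Safe in isolation: explicitly handles the a+b == 0 case */
int foo_safe(uint8_t a, uint8_t b) {
    if (a + b == 0) {
        return 0;            /* assumed fallback value, for illustration */
    }
    return 1000 / (a + b);
}

/* Safe in isolation: explicitly rejects out-of-range indexes */
void bar_safe(uint8_t a, uint8_t b, int c) {
    if (a + b < 256) {       /* a+b can be up to 510, so a check is needed */
        myArray[a + b] = c;
    }                        /* assumed: bad indexes are silently ignored */
}
```

With these versions a caller can treat both functions as opaque black boxes; the cost is a run-time check (and a choice of fallback behaviour) that the "unsafe in isolation" versions didn't need.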
Note that it can still be possible (using whole program analysis) to prove that a program is safe despite "unsafe in isolation" functions. In this case you still analyse functions in isolation, but instead of merely categorising each function as "proven safe" or "not proven safe" you have to use a much more elaborate/complex scheme where you determine "proven safe under these conditions" (e.g. the "foo()" in the example above would be "proven safe if a+b != 0"). This also means that you need depth first analysis, so that you can merge the conditions for child functions into the parent.
For example:
Code:
void baz(uint8_t a) {
    foo(a, 2); // Note: "foo()" is proven safe if a+2 != 0,
               // therefore this is safe when a != -2;
               // but "a" is unsigned and must be >= 0, so "a" can never be -2,
               // so this must be safe
    bar(a, 2, 3); // Note: "bar()" is proven safe if a+2 < 256,
                  // therefore this is safe when a < 254,
                  // therefore "baz()" is only safe when a < 254
}
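As a quick cross-check of the merged condition (the helper names below are my own, added purely for illustration), you can exhaustively test all 256 possible arguments:

```c
#include <stdint.h>

/* Returns 1 if calling foo(a, 2) and bar(a, 2, ...) would be safe for this
   "a" (i.e. divisor not zero, array index in bounds) */
static int baz_would_be_safe(uint8_t a) {
    int divisor = a + 2;    /* uint8_t promotes to int, so the range is 2..257 */
    int index   = a + 2;
    return (divisor != 0) && (index < 256);
}

/* Exhaustively check the claim "baz() is safe exactly when a < 254" */
static int claim_holds_for_all_inputs(void) {
    for (int a = 0; a < 256; a++) {
        if (baz_would_be_safe((uint8_t)a) != (a < 254)) {
            return 0;
        }
    }
    return 1;
}
```

Because uint8_t only has 256 possible values the brute-force check is trivial here; for wider types you'd need the symbolic conditions instead.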
This also means that you can't handle recursion (because you can't do "depth first analysis" in that case). However...
For recursion, if you assume that the function is always safe, you can then prove whether the function really is always safe (or fail to prove it). For example, this function can be proven to be always safe:
Code:
void baz(uint8_t a) {
    uint8_t temp;

    foo(a, 2);       // Proven always safe
    if(a < 100) {    // Proven always safe
        temp = a+8;  // Proven always safe
        baz(temp);   // Safe if "baz()" is always safe
    }
}
For another example, this function can not be proven to be always safe:
Code:
void baz(uint8_t a) {
    uint8_t temp;

    bar(a, 2, 3);     // Only safe if a < 254
    if(a < 100) {     // Proven always safe
        temp = a+50;  // Proven always safe
        baz(temp);    // Safe if "baz()" is always safe
    }
}
What this means is that there's still a chance of false negatives where "recursion and not proven to be always safe" is involved. However...
You can take it a step further, like this:
Code:
void baz(uint8_t a) {
    uint8_t temp;

    bar(a, 2, 3);     // Only safe if a < 254
    if(a < 100) {     // Proven always safe
        temp = a+50;  // Proven always safe
        baz(temp);    // Safe if "baz(temp)" is safe
                      // the recursive call only happens when a < 100,
                      // so temp = a+50 must be < 150
                      // Assume "baz()" is safe whenever its argument is < 254:
                      // then temp < 150 < 254, so the recursive call is safe,
                      // and the assumption is consistent with itself (a fixed point)
                      // therefore "baz()" is proven safe when a < 254
    }
}
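As an empirical cross-check (the instrumented simulation below is my own addition, purely for illustration), you can run the recursive "baz()" for every possible argument and record whether any call in the chain would go out of bounds:

```c
#include <stdint.h>

/* Simulates the recursive baz(): returns 1 if every bar() call it makes
   (directly or via recursion) stays within myArray's bounds */
static int baz_is_safe(uint8_t a) {
    if (a + 2 >= 256) {    /* bar(a, 2, ...) would index out of bounds */
        return 0;
    }
    if (a < 100) {
        /* terminates: the argument grows by 50 at each level */
        return baz_is_safe((uint8_t)(a + 50));
    }
    return 1;
}

/* Exhaustively confirm: safe exactly when the initial argument is < 254 */
static int claim_holds_for_all_inputs(void) {
    for (int a = 0; a < 256; a++) {
        if (baz_is_safe((uint8_t)a) != (a < 254)) {
            return 0;
        }
    }
    return 1;
}
```

Only the initial calls with a == 254 or a == 255 are unsafe; every argument that the recursion itself produces is below 150 and therefore harmless.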
Now...
Did you notice that we had no reason to care whether "baz()" halts or not for any of these examples? This is because, even for the "more complex/advanced than is beneficial" analysis, the halting problem is still completely irrelevant.
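For a concrete illustration (my own example, not taken from anything above): here is a function where every individual operation is trivially proven safe, while proving that it halts is an entirely separate and much harder problem:

```c
#include <stdint.h>

/* Every operation here is provably safe: no division by zero, no array
   access, and unsigned overflow is well-defined wraparound in C. Whether
   the loop terminates is a separate question (this is the Collatz
   iteration) that a safety analysis never needs to answer. */
static uint64_t collatz_steps(uint64_t n) {
    uint64_t steps = 0;

    while (n > 1) {
        if (n % 2 == 0) {
            n = n / 2;
        } else {
            n = 3*n + 1;   /* wraparound is defined behaviour for uint64_t */
        }
        steps++;
    }
    return steps;
}
```

A checker can mark every line of this "proven safe" without knowing anything about when, or if, the loop exits.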
Cheers,
Brendan