Hi,
On 32-bit 80x86 there really aren't enough registers, and there are problems with variadic functions and with functions that have more arguments than there are registers. Passing arguments in registers can make code faster, but it can also make code slower (e.g. the caller saves "in use" values from registers onto the stack before it can store arguments in those registers; then the callee pushes the arguments from registers onto the stack so it can use the registers itself). This is partly why some registers are "callee preserved" (so the caller knows it won't need to save "in use" values held in those callee preserved registers).
Don't forget that for GCC you can already tell it to use the "fastcall" calling convention (which passes the first 2 integer arguments in ECX and EDX, and the rest on the stack). For performance this is possibly a good compromise between passing too many arguments in registers (and harming performance) and passing too many arguments on the stack (and harming performance).
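As a minimal sketch of what that looks like in source (the FASTCALL macro and the sum3() function are made up for illustration; only the attribute itself is GCC's, and it only has an effect on 32-bit x86 targets):

```c
/* FASTCALL is a hypothetical helper macro, not something GCC provides.
   GCC only honours the attribute on 32-bit x86, so the macro expands
   to nothing on every other target. */
#if defined(__GNUC__) && defined(__i386__)
#define FASTCALL __attribute__((fastcall))
#else
#define FASTCALL
#endif

/* On 32-bit x86, a arrives in ECX and b in EDX; c is passed on the stack. */
FASTCALL int sum3(int a, int b, int c)
{
    return a + b + c;
}
```

Callers need no changes as long as they see the same declaration; the compiler generates the register loads at each call site.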
In theory the best approach is "no ABI", as this allows the compiler to customise/optimise the calling convention used by each function individually to suit the function itself and any callers the compiler knows about, and get the fastest code for each specific case. In practice most modern compilers already support this, but only for static functions or if/when you use whole program optimisation (and not for dynamically linked/shared libraries).
Also; by changing the ABI significantly you'll probably break the compiler's code optimisers, and will need to fix them. You're not talking about minor changes to prologue/epilogue and "function call generation" alone.
Of course you will be breaking more than just GCC (e.g. debuggers and linkers won't understand your calling convention either).
Hellbender wrote:
I was thinking that since there is no user data in the callstack (where the return address is), there are no pointers to the callstack memory in any typical code. Thus, any buffer overflow could not overwrite return address, because callstack is separated from framestack by (large number of) non-present pages. Am I missing something in this line of thought?
Instead of having buffer overflows (that can potentially corrupt data and return addresses on the stack and cause security vulnerabilities), you'll have buffer overflows (that can potentially corrupt data on a stack and cause security vulnerabilities). It doesn't prevent or solve the problem and only modifies the symptoms. Are you really sure it's worth the hassle?
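To illustrate the point, here is a rough sketch that simulates one frame on a separate data stack. The struct layout and all of the names are hypothetical; a real compiler decides how locals are laid out. There is no return address anywhere in the frame, yet an unchecked copy still tramples the neighbouring local:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout of one frame on the data stack.
   Note that it holds no return address at all. */
struct data_frame {
    char name[8];   /* buffer filled from untrusted input */
    int  is_admin;  /* unrelated local that happens to sit next to it */
};

/* Copies len bytes to the start of the frame without checking len against
   sizeof name, which is the bug. len is clamped to the size of the whole
   frame only so that this demonstration stays inside the simulated frame. */
void unchecked_fill(struct data_frame *f, const void *input, size_t len)
{
    if (len > sizeof *f)
        len = sizeof *f;
    memcpy(f, input, len);
}

/* Returns is_admin after an over-long input has been copied into the frame. */
int overflow_demo(void)
{
    struct data_frame f;
    unsigned char input[sizeof f];

    memset(&f, 0, sizeof f);              /* is_admin starts out as 0 */
    memset(input, 0x41, sizeof input);    /* "attacker controlled" bytes */
    unchecked_fill(&f, input, sizeof input);
    return f.is_admin;                    /* no longer 0 */
}
```

The return address was never at risk here, but the program's idea of "is_admin" changed anyway.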
Hellbender wrote:
My plan was something like the following pseudo-code:
Code:
oldEBP = EBP;
push(EBP);
EBP = alloc(frame_size);
memcpy(EBP, oldEBP, arguments_size);
...
free(EBP);
pop(EBP);
Which calling convention will "memcpy()" use? Will you end up with infinite recursion (because each call to "memcpy()" requires a call to "memcpy()")?
Note: It'd make more sense to use EBP as a "top of data stack" and ESP as a "top of return stack"; so that instead of calling malloc and free you can just add/subtract the size from EBP.
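A small C simulation of that idea, as a sketch only: data_top stands in for EBP, the array stands in for the data stack's memory, and the real hardware stack plays the part of the return stack. The names and the 4096 byte size are assumptions picked for the example.

```c
#include <stddef.h>

#define DATA_STACK_SIZE 4096

/* Simulated data stack; data_top plays the role of EBP and grows downwards. */
static unsigned char data_stack[DATA_STACK_SIZE];
static size_t data_top = DATA_STACK_SIZE;

/* Reserve a frame: one subtraction instead of a call to an allocator. */
void *frame_enter(size_t size)
{
    if (size > data_top)
        return NULL;            /* data stack exhausted */
    data_top -= size;
    return &data_stack[data_top];
}

/* Release the most recently reserved frame: one addition. */
void frame_leave(size_t size)
{
    data_top += size;
}

/* Number of bytes currently reserved on the data stack. */
size_t data_stack_used(void)
{
    return DATA_STACK_SIZE - data_top;
}
```

Frames have to be released in the reverse of the order they were reserved, which function call and return give you for free.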
Cheers,
Brendan