OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours

 Post subject: Theory of implementing a new ABI for X86 LongMode
Posted: Wed Nov 07, 2018 3:36 pm
Joined: Sun Jan 13, 2013 6:24 pm
Posts: 87
Location: Grande Prairie AB
I've come to a point now where I need to start thinking about an ABI. Of the ones out there, or at least those I've had exposure to, FASTCALL is the most appealing, but with a twist: only use the registers that were specifically designed to be counters, pointers to arrays and indices. Keep in mind that this is only an example. I don't want to imply that I'm saving ABI registers above the procedure frame, but rather that whatever is essential to preserve is saved outside the frame.

    RBX = Base pointer to arrays, structures or even arrays of structures.
    RCX = Counter
    RSI = Source index
    RDI = Destination index

Essentially, the stack is used only for local data and/or buffers, with this type of prologue and epilogue.
Code:
Proc:     push    rdi
          push    rsi
          push    rbx
          push    rbp
          mov     rbp, rsp             ; or maybe even ENTER ?,?

          ; ..... Body of Procedure ....

          leave                        ; mov rsp, rbp / pop rbp
          pop     rbx
          pop     rsi
          pop     rdi
          ret

This has worked very effectively: I no longer need to be concerned about the stack pointer, because LEAVE restores RSP from RBP before popping RBP.

My system is not intended to be compatible with POSIX, System V or anything else, and it has been plagued by a few tribulations where I've had to redraft, at times right from the beginning. What I'm looking for here: has anyone else designed such an OS, and what sort of ABI did you implement? Although I appreciate what M$ may have had to do for backward compatibility, using shadow space does introduce a lot of bloat.


Posted: Wed Nov 07, 2018 4:26 pm
Joined: Sat Oct 16, 2010 3:38 pm
Posts: 614
I actually condone "shadow space". The callee can spill registers into the shadow space and make arguments consecutive in memory, so that <stdarg.h> can work purely on a single pointer. The x86_64 System V solution is more bloated IMO.
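
For reference, the C side of a variadic function looks the same under either ABI; what differs is how `va_arg` locates the next argument. The following is only a minimal sketch in portable C. The name `sum_ints` is made up for illustration and nothing here is specific to any particular OS or compiler.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: sums `count` int arguments passed variadically.
 * With shadow space, the callee can spill the register arguments next
 * to the stack arguments, so va_list can be a plain pointer. Under
 * System V x86_64, va_list is a small structure that tracks the
 * register save area and the stack area separately. */
long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* the ABI decides where this is read from */
    va_end(ap);
    return total;
}
```

A call such as `sum_ints(3, 10, 20, 30)` gives 60 regardless of which convention the compiler targets; only the generated prologue and the `va_arg` expansion change.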

_________________
Glidix: An x86_64 POSIX-compliant operating system, aiming to be as optimized as possible, especially in graphics.
https://github.com/madd-games/glidix


Posted: Wed Nov 07, 2018 6:15 pm
Joined: Sun Jan 13, 2013 6:24 pm
Posts: 87
Location: Grande Prairie AB
mariuszp wrote:
I actually condone "shadow space".

Probably the most poignant word to describe that paradigm, but I suppose it was the best alternative to maintain backward compatibility.


Posted: Wed Nov 07, 2018 11:29 pm
Joined: Wed Aug 30, 2017 8:24 am
Posts: 84
I'm not sure it would be worth the effort to me. Inventing a new ABI is all well and good, but then you have to tell the compiler about it. One look at gcc's source code and I knew I wanted no part of that.

All ABIs need to make tradeoffs. The System V i386 ABI has to cope with the fact that only 8 GP regs exist, 2 of which aren't all that GP. So putting arguments on the stack was a sound decision. For x86_64, however, enough registers exist to avoid spilling to memory, which is, in general, slower than just keeping it all in regs.

Personally, I like the PowerPC ABI a hell of a lot more than the System V x86_64 ABI, as it does handle variadic functions and still passes most things in registers. In x86_64, a variadic function is treated as if the variadic arguments were normal arguments to the function. In the PowerPC ABI, however, variadic args are always pushed to the stack. Non-variadic args are put into registers 3-10 (allowing for 8 arguments in registers), or the FP regs, as appropriate. So this ABI trades consistency for easy access to variadic args.

But you still haven't described your ABI: which registers are volatile (clobbered by the callee, i.e. caller-saved), which registers are non-volatile (callee-saved), and how does argument passing work? From what you wrote, I can only conclude that RBX, RCX, RDI, RSI, RBP, and RSP are non-volatile. So RAX, RDX, and R8-R15 are all volatile? And all args are passed on the stack? That's a lot of registers to save if I want to call a function. In System V I can keep a local variable in R15, and the callees will save it if they need to clobber it. But most functions don't need to do that, so my variable never makes it into memory.


Posted: Thu Nov 08, 2018 1:07 am
Joined: Sun Jan 13, 2013 6:24 pm
Posts: 87
Location: Grande Prairie AB
NOTE: Probably should have mentioned everything is developed using assembly.

Maybe what I'm calling an ABI is a little misleading. The volatility of any register would be dictated by the procedure's intent. To clarify what I mean, consider this example, which could be thought of as STRLEN, but not necessarily looking for NULL.
Code:
        mov     rsi, WideTxt            ; Points to a wide text buffer
        mov     eax, 0x1D2A             ; Just a hypothetical terminating character
        mov     ecx, 1024               ; Maximum
        call    STRLEN

So in this case, RAX and RCX are the only two registers modified, and the callee is responsible for preserving anything else it needs to accomplish the task. Return values are in their respective registers, meaning RCX is the text buffer length. If RCX = 0 && EAX = original value, then the string was exactly the length specified by the caller; otherwise EAX = whatever character is at ESI + ECX. Now, to accomplish something like STRCAT:
Code:
        mov     rdi, DestBuff
        rep     movsw                   ; RSI and RCX are still set up from STRLEN
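
For documentation purposes, the contract above can be written down as a small C reference model. This is only a sketch under stated assumptions: the names `wide_scan_len` and `wide_copy` are hypothetical, the 16-bit unit size is inferred from the use of MOVSW, and returning the maximum when no terminator is found is an assumption, since the description does not fully pin that case down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of the STRLEN-like routine:
 *   text       ~ RSI (wide text buffer, 16-bit units assumed)
 *   terminator ~ AX  (the terminating character to look for)
 *   max_units  ~ ECX on entry (maximum number of units to scan)
 * Returns the number of units before the terminator (RCX on return).
 * Assumption: if no terminator is found, max_units is returned. */
size_t wide_scan_len(const uint16_t *text, uint16_t terminator, size_t max_units)
{
    size_t len = 0;

    assert(text != NULL);
    while (len < max_units && text[len] != terminator)
        len++;
    return len;
}

/* Model of the follow-up `rep movsw`: copy `len` units from src (RSI)
 * to dest (RDI). The caller must provide room for `len` units. */
void wide_copy(uint16_t *dest, const uint16_t *src, size_t len)
{
    assert(dest != NULL && src != NULL);
    memcpy(dest, src, len * sizeof *src);
}
```

The point of the register-based version is that the result of the scan is already sitting where the copy expects it, which in the model corresponds to passing the return value of `wide_scan_len` straight into `wide_copy`.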

So that is considerably more compact than:
Code:
        mov     ecx, 1024
        mov     rdx, WideTxt
        mov     r8, 0x1D2A
        sub     rsp, 32
        call    StrLen                  ; The function would then have to move them again, or at least R8 & RDX.
        add     rsp, 32

Obviously, at some point I will need documentation, but that is the other objective I hope to address: when writing code, what needs to be passed to the callee should be a little more intuitive, based on the target architecture.


Posted: Mon Nov 12, 2018 12:36 pm
Joined: Sat Oct 16, 2010 3:38 pm
Posts: 614
TightCoderEx wrote:
mariuszp wrote:
I actually condone "shadow space".

Probably the most poignant word to describe that paradigm, but I suppose it was the best alternative to maintain backward compatibility.


Not sure what you mean here... in what way is it done for backwards compatibility?

My understanding is that the point is that the stack looks like this:

Code:
Return Address
(Space for 4 regs)
Other arguments


and then 4 arguments are passed in registers. If the callee is a variadic function, it can "spill" them onto the stack and iterate through them using a pointer, rather than requiring a weird workaround like the SysV ABI.
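
The "iterate using a pointer" part can be modeled in plain C. This is a hedged sketch, not how any particular compiler implements `va_list`: the types and names (`slot_cursor`, `next_slot`, `sum_slots`) are hypothetical, and it assumes every argument occupies exactly one 8-byte slot once the four register arguments have been spilled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the callee's view after spilling: every
 * argument, whether it arrived in a register or on the stack,
 * occupies one consecutive 8-byte slot above the return address. */
typedef struct {
    const uint64_t *next;   /* the single pointer needed to walk the arguments */
} slot_cursor;

/* Fetch the current slot and advance, like a pointer-based va_arg. */
static uint64_t next_slot(slot_cursor *cur)
{
    return *cur->next++;
}

/* Sum `count` arguments starting at `first_slot`, i.e. the first of
 * the four spill slots, followed directly by the other arguments. */
uint64_t sum_slots(const uint64_t *first_slot, size_t count)
{
    slot_cursor cur = { first_slot };
    uint64_t total = 0;

    assert(first_slot != NULL);
    for (size_t i = 0; i < count; i++)
        total += next_slot(&cur);
    return total;
}
```

Because the four spill slots and the remaining stack arguments are adjacent, the loop does not need to know which arguments originally came from registers.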


Posted: Mon Nov 12, 2018 2:54 pm
Joined: Sun Jan 13, 2013 6:24 pm
Posts: 87
Location: Grande Prairie AB
mariuszp wrote:
Not sure what you mean here... in what way is it done for backward compatibility?

What I'm suggesting, and it probably doesn't apply to System V, is that with CDECL and STDCALL in 32-bit code, memory above EBP looks exactly the same as it does for FASTCALL once the callee has moved the registers into shadow space, other than the data being 4 bytes vs 8 bytes respectively. I don't know for sure, but this would suggest an air of compatibility on M$'s part.
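
The similarity can be checked with a little arithmetic. The helper below is a hypothetical sketch (`arg_offset` is not part of any ABI document). It assumes the usual frame-pointer prologue (`push ebp` / `mov ebp, esp`, or the 64-bit equivalent) and that every argument takes exactly one slot, which ignores cases such as 8-byte values in 32-bit code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of argument `index` (0-based) from
 * the frame pointer. Two slots sit between the frame pointer and the
 * first argument: the saved frame pointer and the return address.
 * slot_size is 4 for 32-bit CDECL/STDCALL and 8 for the 64-bit
 * convention once the registers are spilled into shadow space. */
size_t arg_offset(size_t index, size_t slot_size)
{
    assert(slot_size == 4 || slot_size == 8);
    return (2 + index) * slot_size;
}
```

Under these assumptions the first argument is at [EBP+8] in 32-bit code and at [RBP+16] in 64-bit code, and the fifth argument, the first one not passed in a register, lands at [RBP+48], so the two layouts differ only in slot size.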

