OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: force gcc to emit RIP-relative code only?
Posted: Sun Nov 20, 2016 8:23 am

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 587
I am on x86_64.

My ELF dynamic linker is part of the C library, and currently I'm using a pretty dirty way to load it.

The kernel loads libc from the initrd, and performs relocations to move it to 1GB. It then maps the code segment into the address space of every executable that uses libraries.
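As a rough illustration of the load-time work described above, here is a minimal sketch of applying base-relative relocations to an image that is being moved to a chosen address. It assumes the library's RELA table only needs its R_X86_64_RELATIVE entries patched; the names rela_entry, RELOC_X86_64_RELATIVE and apply_relative_relocs are hypothetical and are not taken from the kernel discussed in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of an ELF64 RELA entry (same fields as Elf64_Rela in <elf.h>). */
struct rela_entry {
    uint64_t r_offset;  /* where to patch, relative to the image start */
    uint64_t r_info;    /* symbol index (high 32 bits), type (low 32 bits) */
    int64_t  r_addend;  /* constant used to compute the patched value */
};

#define RELOC_TYPE(info) ((uint32_t)((info) & 0xffffffffu))
#define RELOC_X86_64_RELATIVE 8u  /* value of R_X86_64_RELATIVE in the psABI */

/*
 * Hypothetical helper: patch every R_X86_64_RELATIVE entry so that the
 * image works when mapped at load_base. Returns the number of entries
 * applied. Entries of any other type are skipped here; a real loader
 * would have to handle or reject them.
 */
size_t apply_relative_relocs(uint8_t *image, uint64_t load_base,
                             const struct rela_entry *rela, size_t count)
{
    size_t applied = 0;
    for (size_t i = 0; i < count; i++) {
        if (RELOC_TYPE(rela[i].r_info) != RELOC_X86_64_RELATIVE)
            continue;
        /* B + A: load base plus addend, stored as a 64-bit value. */
        uint64_t value = load_base + (uint64_t)rela[i].r_addend;
        memcpy(image + rela[i].r_offset, &value, sizeof value);
        applied++;
    }
    return applied;
}
```

For a load address of 1GB, load_base would be 0x40000000. The question in this thread is whether this step can be avoided for libc altogether.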

Of course, libc is compiled as position-independent code and hence uses GOT/PLT, which allows for the relocation to work.

However, libc does not refer to any external libraries at all, so the PLT/GOT are not actually needed. Is there a way to force GCC to assume that all symbols will be local after the final link, and hence only use RIP-relative addressing? This will mean that the kernel will not have to perform any relocations at all.

EDIT: I now realise that GCC seems to be using RIP-relative addressing anyway, and with "-fPIC" it just uses GOT/PLT for indirection. However, the linker still complains about bad relocation values for R_X86_64_PC32, even if all the symbols are locally visible, which does not make sense to me. Is there a way to force the static linker to resolve all relocations and not emit any for the dynamic linker, since the distance between all symbols used is known to it?


 Post subject: Re: force gcc to emit RIP-relative code only?
Posted: Sun Nov 20, 2016 3:38 pm

Joined: Fri Aug 19, 2016 10:28 pm
Posts: 360
I suppose the error is saying that R_X86_64_PC32 "can not be used when making a shared object". That is, I assume that you are trying to avoid PIC at the moment. Regretfully, I cannot offer a solution, but I can offer some insight. From your post I understand that your loader for libc cannot handle relocations. Otherwise you could coerce the linker by compiling with "-mcmodel=large". The result would use 64-bit memory references, turning the R_X86_64_PC32 relocations into type R_X86_64_64. The current complaint of the linker is that the modules may span more than 4 gigs, so patching some of the 32-bit relocations at load time may be impossible, but with 64-bit relocations this is not a problem. However, if your early-phase loader cannot apply relocations, this is not what you want. Just in case, here is some information on the subject.

You may ask, why do you need the relocations here to begin with? There are two features in Unix that necessitate relocatable self-references in libraries. Those features, with all their caveats and shortcomings, are copy relocations and resolution of symbols according to load order precedence. With load ordering, you can hook or override the calls of a library by providing identical exports in another library or the main executable. For example, a program can substitute the malloc allocator, overriding the default one exported by libc. All modules in that process will use the new allocator, not just the program's own code. Even libc, whenever calling through the official interface, will use the override. Similarly, by using "preloading", the user can inject additional libraries between the main executable and its shared object dependencies, allowing them to trace calls or substitute behavior.

Copy relocations move data residing in a shared object to the bss section of the main executable. The space there is reserved while linking the program (which makes it prone to resizing issues between library versions). The library has to be fixed up to use that copy, or it will get split-brained from the rest of the process, and so it treats its own exported symbols as relocatable. The result is that the code segment of the non-PIC program does not need to be modified at load time and thus can be shared between processes. This potentially reduces the memory footprint. In both load ordering and copy relocations, libraries have to reimport their own exported symbols, because someone else may override their locations.

This article is a critique of the performance cost of these features, which may provide you with clarifications.

Basically, you need to tell the compiler to assume that copy relocations and load order resolution will not be used to override the library's symbols. I am not personally aware of such an option, but note that even if you find one, such behavior may surprise programs not written specifically for your OS, so depending on how you plan to populate your userland, it may be a slight point of concern.

Edit: In a nutshell, the code cannot use local references as long as the compiler is trying to solve some other problem. It is trying to make possible late substitutions of global symbols.

For illustration purposes, here is a small example. If you build this code for a library:
Code:
int x() { return 1; }
int y() { return x(); }
Code:
gcc --shared -mcmodel=large libx.c -o libx.so
you can check with objdump -R libx.so that the library has relocation for its own x symbol. Now, if you build this code for the main executable:
Code:
int y(void);   /* provided by libx.so */
int x() { return 2; }
int main() { return y(); }
Code:
gcc main.c -o main -L. -lx
you can see the override in action:
Code:
> LD_LIBRARY_PATH=. ./main; echo $?
2

On the other hand, if you build this code for the library:
Code:
static int x_() { return 5; }                  /* internal implementation */
int x() __attribute__ ((weak, alias ("x_")));  /* exported name for x_ */
int y() { return x_(); }                       /* calls the internal name directly */
you can check with objdump to see that the relocation is gone. The symbol x is still exported by the library, but the code only refers to the internal implementation behind the symbol. Actually, x is now a weak symbol with default value, but that is irrelevant. With the main executable from before, the exit code will obviously be 5 rather than 2, but the important part is that the library is not relocated (for that symbol) anymore. As a side note, although as I said it is not relevant here, in Linux weak symbols are not treated specially by the dynamic linker unless the LD_DYNAMIC_WEAK environment variable is defined.

Regards,


 Post subject: Re: force gcc to emit RIP-relative code only?
Posted: Sun Nov 20, 2016 4:46 pm

Joined: Fri Aug 19, 2016 10:28 pm
Posts: 360
I checked a few executables for copy relocations, namely on CentOS 7.1. Most were actually PIC. My claim that non-PIC executables happen to be prevalent does not hold water after all. The find command was the only one I found to be compiled non-PIC.
Code:
> objdump -R "$(type -P find)" | fgrep -i copy
000000000062eaa0 R_X86_64_COPY     __progname
000000000062eab0 R_X86_64_COPY     stdout
000000000062eab8 R_X86_64_COPY     stdin
000000000062eac0 R_X86_64_COPY     __environ
000000000062eac8 R_X86_64_COPY     __progname_full
000000000062ead0 R_X86_64_COPY     stderr
> objdump -R "$(type -P ls)" | fgrep -i stdout
000000000061afb0 R_X86_64_GLOB_DAT  stdout
The OS may choose to use PIC exclusively. If not, copy relocations must be supported, or the first program that references stdout is open to undefined behavior. The API override case may be rare (I don't know really), but working around it will probably require source level changes for each affected program.
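To make the stdout case concrete, here is a minimal sketch of a program fragment that references a libc data object directly. The function name print_hello is hypothetical. The assumption, based on the objdump output above, is that when code like this is built non-PIC on a typical glibc x86_64 toolchain, the reference to stdout ends up as an R_X86_64_COPY relocation in the executable.

```c
#include <stdio.h>

/*
 * Writes a line through the libc data object `stdout`.
 * Returns 1 on success, 0 on failure.
 */
int print_hello(void)
{
    return fputs("hello\n", stdout) >= 0;
}
```

If the loader did not implement copy relocations, the executable's own slot for stdout would presumably never receive the value that libc holds, which is the undefined behavior mentioned above.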


 Post subject: Re: force gcc to emit RIP-relative code only?
Posted: Sun Nov 20, 2016 6:20 pm

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 587
No no, the kernel IS capable of relocating libc. That's what I want to change. I don't want the kernel to do this.

I guess another solution would be to make libc an executable with -export-dynamic and link it to a specific location.


 Post subject: Re: force gcc to emit RIP-relative code only?
Posted: Sun Nov 20, 2016 6:46 pm

Joined: Thu Mar 25, 2010 11:26 pm
Posts: 1801
Location: Melbourne, Australia
Quote:
However, libc does not refer to any external libraries at all, so the PLT/GOT are not actually needed. Is there a way to force GCC to assume that all symbols will be local after the final link, and hence only use RIP-relative addressing? This will mean that the kernel will not have to perform any relocations at all.
Actually I think this is not always intended to be true. An application writer may decide to override the libc version of a function and supply his own. For example a developer may supply a malloc that has error checking to find a bug. If libc.so is fully resolved when it is built this rather important feature will be lost.

_________________
If a trainstation is where trains stop, what is a workstation ?


 Post subject: Re: force gcc to emit RIP-relative code only?
Posted: Mon Nov 21, 2016 5:23 am

Joined: Fri Aug 19, 2016 10:28 pm
Posts: 360
Embarrassingly, the second article I linked above provides you with exactly the information you need. With protected visibility, the library symbols are exported, but cannot be interposed. You can use the "-Bsymbolic" linker flag to force it to treat all symbols as protected. So, when I compile my minimal library from my post thus:
Code:
gcc -shared -nostdlib -Wl,-Bsymbolic libx.c -o libx.so
the result has no remaining relocations.
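The same idea can also be sketched per symbol in the source, using GCC's visibility attribute instead of the linker flag. This is a hedged variant of the minimal library from the earlier post, assuming an ELF target and a GCC-compatible compiler; it is not taken from the linked article.

```c
/* libx.c variant: export x, but do not let other modules interpose it. */
__attribute__((visibility("protected")))
int x(void) { return 1; }

/*
 * Because x is protected, this call binds to the definition above
 * instead of going through a symbol lookup at load time.
 */
int y(void) { return x(); }
```

Built as a shared object, y would be expected to keep returning 1 even if the main executable defines its own x, which is exactly the loss of interposition discussed in this thread.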

It was already mentioned a few times, but by doing this, you will be preventing applications from "interposing" library functions, which they may expect to be able to do. Furthermore, if they attempt to do so, the application and the library will be split-brained, each with its own implementation and state for the functionality. Similarly, all the OS executables have to be compiled as PIC in order to avoid copy relocations. Otherwise, your executables will be working with stale copies of library data. Using PIC will penalize the performance of the data references. All of these caveats are unavoidable if you want to dispose of the relocations in your libc.

Regards,


 Post subject: Re: force gcc to emit RIP-relative code only?
Posted: Mon Nov 21, 2016 6:03 pm

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 587
I see.

I think a better idea would be to just write the dynamic linker as a separate executable, unrelated to libc.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 7 posts ] 

All times are UTC - 6 hours


Who is online

Users browsing this forum: No registered users and 21 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group