OSDev.org https://forum.osdev.org/ |
LD writes different LMA/VMA https://forum.osdev.org/viewtopic.php?f=1&t=32639 |
Author: | henje [ Sat Dec 16, 2017 10:16 am ] |
Post subject: | LD writes different LMA/VMA |
While working on my build system I encountered a problem when using ld. I used the following simple linker script, which only specifies a virtual address, but when I use readelf, the virtual and load addresses for .rodata are different.

Code:
OUTPUT_FORMAT("elf32-i386")
ENTRY(_start)

SECTIONS
{
    . = 0x100000;
    .text : {
        *(multiboot)
        *(.text)
    }
    .data ALIGN(4096) : {
        *(.data)
    }
    .rodata ALIGN(4096) : {
        *(.rodata)
    }
    .bss ALIGN(4096) : {
        *(.bss)
    }
}

Code:
Elf file type is EXEC (Executable file)
Entry point 0x100090
There are 4 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x001000 0x00100000 0x00100000 0x0009e 0x0009e R E 0x1000
  LOAD           0x00109e 0x0010009e 0x00102ebe 0x0000d 0x0000d R   0x1000
  LOAD           0x002000 0x00101000 0x00103e20 0x00000 0x02000 RW  0x1000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10

The code I am using is just a minimal "Hello World!" kernel. When linking with gold or lld, the virtual and load addresses of .rodata are the same. I am not sure whether my linker script is at fault, the ld invocation is, or ld is just misbehaving, but I would assume it is my script. I invoke ld like this:

Code:
ld -T linker.ld *.o -o kernel -melf_i386

(I do not know why ld is the only linker to ignore OUTPUT_FORMAT.)

Thanks for any help.
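As a side note on the ALIGN(4096) expressions in the script above: rounding an address up to a power-of-two boundary is usually computed as (addr + align - 1) & ~(align - 1). The sketch below applies that formula to the numbers from the readelf output. The helper name align_up is made up for illustration, and the reading that the third LOAD segment starts at the first 4 KiB boundary after the second one assumes the sections in between are empty.

```python
def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (address + alignment - 1) & ~(alignment - 1)


# End of the second LOAD segment in the readelf output: VirtAddr + FileSiz.
end_of_second_load = 0x0010009E + 0x0000D

# First 4 KiB boundary at or after that address.
bss_vma = align_up(end_of_second_load, 4096)

print(hex(end_of_second_load), hex(bss_vma))  # prints: 0x1000ab 0x101000
```

The result, 0x101000, matches the VirtAddr of the third LOAD segment, so the virtual addresses behave as the script asks. Only the PhysAddr column is unexpected.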
Author: | MichaelPetch [ Sat Dec 16, 2017 12:00 pm ] |
Post subject: | Re: LD writes different LMA/VMA |
It could be because you are not using a cross compiler. Maybe it is related to position-independent code. I'd recommend building an i686 cross compiler and using that to see if anything changes. It can depend on the distro you are using (different default compiler options, etc.). What command-line parameters do you use to compile the source code? Does that layout cause problems for your code? If you posted your project to GitHub (or similar) and told us what distro/OS you are building on, we might be able to say. |
Author: | henje [ Sat Dec 16, 2017 2:20 pm ] |
Post subject: | Re: LD writes different LMA/VMA |
I uploaded the relevant parts of my project to https://github.com/Henje/LD-issue-minimal-example. I use clang as a cross-compiler, but I do not see how that is relevant to linking. Moreover, when using gold and lld, LMA and VMA are equal and my code works. The ld I am using is the standard ld on my Ubuntu 17.10. It says about itself:

Code:
GNU ld (GNU Binutils for Ubuntu) 2.29.1
  Supported emulations:
   elf_x86_64
   elf32_x86_64
   elf_i386
   elf_iamcu
   i386linux
   elf_l1om
   elf_k1om
   i386pep
   i386pe
Author: | MichaelPetch [ Sat Dec 16, 2017 2:24 pm ] |
Post subject: | Re: LD writes different LMA/VMA |
The relevance of the compiler to linking is that the compiler can emit information into the object files that can alter how the linker places things in memory (alignment, etc.). You'll also get differing results if your compiler happens to default to position-independent code, as opposed to other distros and cross compilers where code isn't position independent by default (Ubuntu made this type of change around 16.04). Using a host compiler can make a difference in the output you see. Using a cross compiler can give you more consistent results for your builds in general.

Edit: It appears you are using clang (and not gcc). clang will cross compile. The problem is that your original question didn't say what tools you were using, and I incorrectly assumed gcc. |
Author: | henje [ Sat Dec 16, 2017 6:09 pm ] |
Post subject: | Re: LD writes different LMA/VMA |
It is not that I could not use the other linkers; I am just curious why there is a difference in the first place. I see your point about position-independent code, but the ld manual does not even mention the term. From the linker's perspective, only sections and symbols are of interest. From a compiler's view, PIC just disallows absolute jumps and the like. By link time, that code has all been generated already. Then again, I am no expert at PIC, so I might well be wrong. I tried linking with --nmagic, but the output did not change much. In case it helps, here is the output of objdump -x.

Code:
kernel4:     file format elf32-i386
kernel4
architecture: i386, flags 0x00000012:
EXEC_P, HAS_SYMS
start address 0x00100090

Program Header:
    LOAD off    0x000000a0 vaddr 0x00100000 paddr 0x00100000 align 2**4
         filesz 0x0000009e memsz 0x0000009e flags r-x
    LOAD off    0x0000013e vaddr 0x0010009e paddr 0x00102ebe align 2**0
         filesz 0x0000000d memsz 0x00002f62 flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rwx

Sections:
Idx Name           Size      VMA       LMA       File off  Algn
  0 .text          0000009e  00100000  00100000  000000a0  2**4
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata.str1.1 0000000d  0010009e  00102ebe  0000013e  2**0
                   CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .bss           00002000  00101000  00103e20  0000014b  2**0
                   ALLOC
  3 .comment       0000002d  00000000  00000000  0000014b  2**0
                   CONTENTS, READONLY
SYMBOL TABLE:
00100000 l    d  .text           00000000 .text
0010009e l    d  .rodata.str1.1  00000000 .rodata.str1.1
00101000 l    d  .bss            00000000 .bss
00000000 l    d  .comment        00000000 .comment
00000000 l    df *ABS*           00000000 start.o
0010009a l       .text           00000000 _stop
00103000 l       .bss            00000000 kernel_stack
00000000 l    df *ABS*           00000000 main.cpp
00000000 l    df *ABS*           00000000
001000a0 l     O .rodata.str1.1  00000000 _GLOBAL_OFFSET_TABLE_
00100090 g       .text           00000000 _start
00100070 g     F .text           0000001f init
00100010 g     F .text           0000005a _Z5printPKc

What boggles my mind is the load address that the linker calculates. I can see no relation to any code.
Author: | MichaelPetch [ Sat Dec 16, 2017 10:47 pm ] |
Post subject: | Re: LD writes different LMA/VMA |
It appears that clang emits .rodata sections whose names may have trailing characters. In the linker script you should use *(.rodata*) instead. With the LD linker you should consider aligning the Load Memory Address (to the right of the colon in a section definition) to 4K if you want the LMA and VMA to match up. If you set the VMA (the value to the left of the colon in the section definition), the LMA remains untouched. If you set both LMA and VMA in a section definition, they are set separately. In your case you want to modify your linker.ld to look like:

Code:
OUTPUT_FORMAT("elf32-i386")
ENTRY(_start)

SECTIONS
{
    . = 0x100000;
    .text : ALIGN(4096) {
        *(multiboot)
        *(.text)
    }
    .data : ALIGN(4096) {
        *(.data)
    }
    .rodata : ALIGN(4096) {
        *(.rodata*)
    }
    .bss : ALIGN(4096) {
        *(.bss)
    }
}

The linkers may create the PHDRS differently, so to see the individual sections when using LD you may want to use objdump -x kernel to view the full headers. The output is more readable than readelf's, IMHO.

Modify your linker line to add -nostartfiles and -nostdlib. We don't have C runtime initialization, nor do we have standard library support. The command could look like this:

Code:
ld -Tlinker.ld -nostartfiles -nostdlib *.o -o kernel -melf_i386

Be aware that if you are going to use C++ you will need to enhance the linker script to deal with static constructors and destructors. Your assembly code would have to loop through that data and call all the static constructors. If you ever put class objects at global scope, for example, these constructors have to be called to initialize them. Normally the startup files do that, but since we are in a freestanding environment it is up to us to do that ourselves. I believe there is a forum post or OSDev wiki page discussing this.
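To illustrate why *(.rodata) misses clang's section while *(.rodata*) catches it: ld input-section patterns are shell-style wildcards, so a pattern without a wildcard only matches that exact name. The sketch below uses Python's fnmatch module as a rough stand-in for ld's matcher (an assumption: the two are similar, not identical), with the section name .rodata.str1.1 taken from the objdump output earlier in the thread. The helper name pattern_matches is made up for illustration.

```python
from fnmatch import fnmatchcase


def pattern_matches(pattern: str, section_name: str) -> bool:
    """Case-sensitive shell-style match, standing in for ld's wildcard matching."""
    return fnmatchcase(section_name, pattern)


section = ".rodata.str1.1"  # input section name seen in the objdump output

for pattern in (".rodata", ".rodata*"):
    print(f"{pattern:10} matches {section}: {pattern_matches(pattern, section)}")
```

With the original pattern the section is not matched by any rule in the script, so the linker has to place it on its own. With the trailing * it lands in the .rodata output section as intended.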
Author: | henje [ Tue Dec 19, 2017 9:46 am ] |
Post subject: | Re: LD writes different LMA/VMA |
Thanks for your help. Setting the LMA instead of the VMA produced the right result for all linkers I tested with. As for the explanation, I am not so sure, because the LD manual states "LMA is set so the difference between the VMA and LMA is the same as the difference between the VMA and LMA of the last section" (from here). There are other options, but they boil down to "LMA is set to its VMA". This behaviour is especially weird because I tested a different LD (2.23.1), which had no problem. Also, good catch with the .rodata pattern; it kind of worked, but was not what I intended. As for the C++ part, thanks for the heads up, but in my real project I have that handled. This was just a test for a different build system and therefore had no ctor/dtor stuff.
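The manual sentence quoted above can be checked against the numbers posted earlier in the thread. The sketch below computes LMA minus VMA for each allocated section in the objdump -x output. The dictionary and the helper name lma_offsets are made up for illustration, and the addresses are copied by hand from the "Sections:" table.

```python
# (VMA, LMA) pairs copied from the "Sections:" table of the objdump -x output.
SECTIONS = {
    ".text": (0x00100000, 0x00100000),
    ".rodata.str1.1": (0x0010009E, 0x00102EBE),
    ".bss": (0x00101000, 0x00103E20),
}


def lma_offsets(sections):
    """Return LMA - VMA for every section, keyed by section name."""
    return {name: lma - vma for name, (vma, lma) in sections.items()}


for name, offset in lma_offsets(SECTIONS).items():
    print(f"{name:16} LMA - VMA = {offset:#x}")
```

Both .rodata.str1.1 and .bss come out 0x2e20 above their VMA, which is at least consistent with .bss inheriting the offset of the section before it. The calculation does not show where 0x2e20 itself comes from.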
All times are UTC - 6 hours