OSDev.org https://forum.osdev.org/ |
C compiler "junk" in binary executable? https://forum.osdev.org/viewtopic.php?f=13&t=31108 |
Page 1 of 1 |
Author: | IanSeyler [ Thu Dec 22, 2016 10:42 am ] |
Post subject: | C compiler "junk" in binary executable? |
I've been messing around with C programs for my OS and noticed that the compiled binary is much larger than I had figured it would be. Mainly, there is unknown data between the code and the data of the binary.

Compile:
Code:
gcc -c -m64 -nostdlib -nostartfiles -nodefaultlibs -fomit-frame-pointer -mno-red-zone -o helloc.o helloc.c
ld -T app.ld -o helloc.app helloc.o

helloc.c:
Code:
void b_output(const char *str);

int main(void)
{
    b_output("Hello world, from C!\n");
    return 0;
}

void b_output(const char *str)
{
    asm volatile ("call *0x00100010" : : "S"(str)); // Make sure source register (RSI) has the string address (str)
}

app.ld:
Code:
OUTPUT_FORMAT("binary")
OUTPUT_ARCH("i386:x86-64")
ENTRY(main)
SECTIONS
{
    . = 0x0000000000200000;
    .text : {
        *(.text)
        . = ALIGN(16);
    }
    .data : {
        *(.data)
        *(.rodata)
        . = ALIGN(16);
    }
    __bss_start = .;
    .bss : {
        bss = .; _bss = .; __bss = .;
        *(.bss);
    }
    end = .; _end = .; __end = .;
}

ndisasm:
Code:
00000000  4883EC08          sub rsp,byte +0x8
00000004  BF88002000        mov edi,0x200088
00000009  E80A000000        call qword 0x18
0000000E  B800000000        mov eax,0x0
00000013  4883C408          add rsp,byte +0x8
00000017  C3                ret
00000018  4883EC08          sub rsp,byte +0x8
0000001C  48893C24          mov [rsp],rdi
00000020  488B0424          mov rax,[rsp]
00000024  4889C6            mov rsi,rax
00000027  FF142510001000    call qword [0x100010]
0000002E  90                nop
0000002F  4883C408          add rsp,byte +0x8
00000033  C3                ret
00000034  662E0F1F84000000  nop word [cs:rax+rax+0x0]
         -0000
0000003E  6690              xchg ax,ax
00000040  1400              adc al,0x0
00000042  0000              add [rax],al
00000044  0000              add [rax],al
00000046  0000              add [rax],al
00000048  017A52            add [rdx+0x52],edi
0000004B  0001              add [rcx],al
0000004D  7810              js 0x5f
0000004F  011B              add [rbx],ebx
00000051  0C07              or al,0x7
00000053  089001000014      or [rax+0x14000001],dl
00000059  0000              add [rax],al
0000005B  001C00            add [rax+rax],bl
0000005E  0000              add [rax],al
00000060  A0FFFFFF18000000  mov al,[qword 0x18ffffff]
         -00
00000069  44                rex.r
0000006A  0E                db 0x0e
0000006B  10530E            adc [rbx+0xe],dl
0000006E  0800              or [rax],al
00000070  1400              adc al,0x0
00000072  0000              add [rax],al
00000074  3400              xor al,0x0
00000076  0000              add [rax],al
00000078  A0FFFFFF1C000000  mov al,[qword 0x1cffffff]
         -00
00000081  44                rex.r
00000082  0E                db 0x0e
00000083  10570E            adc [rdi+0xe],dl
00000086  0800              or [rax],al
00000088  48                rex.w
00000089  656C              gs insb
0000008B  6C                insb
0000008C  6F                outsd
0000008D  20776F            and [rdi+0x6f],dh
00000090  726C              jc 0xfe
00000092  642C20            fs sub al,0x20
00000095  66726F            o16 jc 0x107
00000098  6D                insd
00000099  204321            and [rbx+0x21],al
0000009C  0A00              or al,[rax]
0000009E  0000              add [rax],al

Hex output:
Code:
00000000 48 83 EC 08 BF 88 00 20 00 E8 0A 00 00 00 B8 00 H...... ........
00000010 00 00 00 48 83 C4 08 C3 48 83 EC 08 48 89 3C 24 ...H....H...H.<$
00000020 48 8B 04 24 48 89 C6 FF 14 25 10 00 10 00 90 48 H..$H....%.....H
00000030 83 C4 08 C3 66 2E 0F 1F 84 00 00 00 00 00 66 90 ....f.........f.
00000040 14 00 00 00 00 00 00 00 01 7A 52 00 01 78 10 01 .........zR..x..
00000050 1B 0C 07 08 90 01 00 00 14 00 00 00 1C 00 00 00 ................
00000060 A0 FF FF FF 18 00 00 00 00 44 0E 10 53 0E 08 00 .........D..S...
00000070 14 00 00 00 34 00 00 00 A0 FF FF FF 1C 00 00 00 ....4...........
00000080 00 44 0E 10 57 0E 08 00 48 65 6C 6C 6F 20 77 6F .D..W...Hello wo
00000090 72 6C 64 2C 20 66 72 6F 6D 20 43 21 0A 00 00 00 rld, from C!....

The actual program code is 0x0 - 0x33 and the string is 0x88 - 0x9D. What is the "junk" in 0x34 - 0x87? Is there a problem with my linker script?

Thanks,
Ian |
Author: | Gigasoft [ Thu Dec 22, 2016 10:46 am ] |
Post subject: | Re: C compiler "junk" in binary executable? |
I think it would be easier to answer that question if you disassembled the object file, rather than the end result. |
Author: | IanSeyler [ Thu Dec 22, 2016 11:33 am ] |
Post subject: | Re: C compiler "junk" in binary executable? |
Good idea!

objdump -D helloc.o:
Code:
helloc.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <main>:
   0: 48 83 ec 08             sub $0x8,%rsp
   4: bf 00 00 00 00          mov $0x0,%edi
   9: e8 00 00 00 00          callq e <main+0xe>
   e: b8 00 00 00 00          mov $0x0,%eax
  13: 48 83 c4 08             add $0x8,%rsp
  17: c3                      retq

0000000000000018 <b_output>:
  18: 48 83 ec 08             sub $0x8,%rsp
  1c: 48 89 3c 24             mov %rdi,(%rsp)
  20: 48 8b 04 24             mov (%rsp),%rax
  24: 48 89 c6                mov %rax,%rsi
  27: ff 14 25 10 00 10 00    callq *0x100010
  2e: 90                      nop
  2f: 48 83 c4 08             add $0x8,%rsp
  33: c3                      retq

Disassembly of section .rodata:

0000000000000000 <.rodata>:
   0: 48                      rex.W
   1: 65 6c                   gs insb (%dx),%es:(%rdi)
   3: 6c                      insb (%dx),%es:(%rdi)
   4: 6f                      outsl %ds:(%rsi),(%dx)
   5: 20 77 6f                and %dh,0x6f(%rdi)
   8: 72 6c                   jb 76 <b_output+0x5e>
   a: 64 2c 20                fs sub $0x20,%al
   d: 66 72 6f                data16 jb 7f <b_output+0x67>
  10: 6d                      insl (%dx),%es:(%rdi)
  11: 20 43 21                and %al,0x21(%rbx)
  14: 0a 00                   or (%rax),%al

Disassembly of section .comment:

0000000000000000 <.comment>:
   0: 00 47 43                add %al,0x43(%rdi)
   3: 43 3a 20                rex.XB cmp (%r8),%spl
   6: 28 55 62                sub %dl,0x62(%rbp)
   9: 75 6e                   jne 79 <b_output+0x61>
   b: 74 75                   je 82 <b_output+0x6a>
   d: 20 35 2e 34 2e 30       and %dh,0x302e342e(%rip) # 302e3441 <b_output+0x302e3429>
  13: 2d 36 75 62 75          sub $0x75627536,%eax
  18: 6e                      outsb %ds:(%rsi),(%dx)
  19: 74 75                   je 90 <b_output+0x78>
  1b: 31 7e 31                xor %edi,0x31(%rsi)
  1e: 36 2e 30 34 2e          ss xor %dh,%cs:(%rsi,%rbp,1)
  23: 34 29                   xor $0x29,%al
  25: 20 35 2e 34 2e 30       and %dh,0x302e342e(%rip) # 302e3459 <b_output+0x302e3441>
  2b: 20 32                   and %dh,(%rdx)
  2d: 30 31                   xor %dh,(%rcx)
  2f: 36 30 36                xor %dh,%ss:(%rsi)
  32: 30 39                   xor %bh,(%rcx)
  ...

Disassembly of section .eh_frame:

0000000000000000 <.eh_frame>:
   0: 14 00                   adc $0x0,%al
   2: 00 00                   add %al,(%rax)
   4: 00 00                   add %al,(%rax)
   6: 00 00                   add %al,(%rax)
   8: 01 7a 52                add %edi,0x52(%rdx)
   b: 00 01                   add %al,(%rcx)
   d: 78 10                   js 1f <.eh_frame+0x1f>
   f: 01 1b                   add %ebx,(%rbx)
  11: 0c 07                   or $0x7,%al
  13: 08 90 01 00 00 14       or %dl,0x14000001(%rax)
  19: 00 00                   add %al,(%rax)
  1b: 00 1c 00                add %bl,(%rax,%rax,1)
  1e: 00 00                   add %al,(%rax)
  20: 00 00                   add %al,(%rax)
  22: 00 00                   add %al,(%rax)
  24: 18 00                   sbb %al,(%rax)
  26: 00 00                   add %al,(%rax)
  28: 00 44 0e 10             add %al,0x10(%rsi,%rcx,1)
  2c: 53                      push %rbx
  2d: 0e                      (bad)
  2e: 08 00                   or %al,(%rax)
  30: 14 00                   adc $0x0,%al
  32: 00 00                   add %al,(%rax)
  34: 34 00                   xor $0x0,%al
  36: 00 00                   add %al,(%rax)
  38: 00 00                   add %al,(%rax)
  3a: 00 00                   add %al,(%rax)
  3c: 1c 00                   sbb $0x0,%al
  3e: 00 00                   add %al,(%rax)
  40: 00 44 0e 10             add %al,0x10(%rsi,%rcx,1)
  44: 57                      push %rdi
  45: 0e                      (bad)
  46: 08 00                   or %al,(%rax)

I'll look into what .comment and .eh_frame are for.

Thanks,
Ian |
Author: | Schol-R-LEA [ Thu Dec 22, 2016 12:35 pm ] |
Post subject: | Re: C compiler "junk" in binary executable? |
They are non-code sections used for various housekeeping details by the ELF specification (though .eh_frame is mainly for DWARF, the debugging format meant to work in conjunction with ELF). Both can be stripped out if you are certain you won't need them, but the linker won't do so automatically just because the file is linked into a binary image rather than an ELF file.

ELF64 Specification:
ELF64 spec wrote:
The comment section is reserved for revision control information. [...] The contents of a .comment section will be a sequence of NULL-terminated strings with the format of each string being:
Code:
toolname:vendor:revision:object

StackOverflow: Why GCC compiled C program needs .eh_frame section?
Quote:
First of all, the original reason for this was largely political - the people who added DWARF-based unwinding (.eh_frame) wanted it to be a feature that's always there so it could be used for implementing all kinds of stuff other than just C++ exceptions, including: [...]
Note that DWARF tables are also used for debugging, but for this purpose they do not need to be in the loadable part of the program. Using -fno-asynchronous-unwind-tables will not break debugging, because as long as -g is also passed to the compiler, the tables still get generated; they just get stored in a separate, non-loadable, strippable section of the binary, .debug_frame.

Removing Unused Functions/Dead Codes with GCC/GNU-ld
WxWidgets Wiki: Reducing Executable Size
GCC documentation page for optimization options: http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
|
All times are UTC - 6 hours |