I think you are trying to take too many steps at once. Typically the compiler generates assembly, the assembler generates an object file, and only then does a linker generate an executable file. That is three transformation processes and four file formats (if you count the source file), and each step has a completely different job to do.
So the compiler generates assembly code. That is, it generates the directives and instructions necessary to get the assembler to generate a valid object file. For the most part, the assembly file is just a textual representation of the object file, but it is still worthwhile to keep the two steps separate. For one, text output makes your compiler much easier to debug. For two, you can leave the assembler's work to the assembler and just concentrate on getting the compiler right.
The assembler will convert the assembly code into object code. For the most part this means it will generate a file in the applicable object code format. Most of those (and particularly the two mentioned here) allow multiple sections per file, and it is the assembler's job to concatenate the possibly multiple declarations of a section into a single section. Then there are address calculations. The assembler first has to pass over the code just to determine how large each directive and instruction will be, and thus where each symbol lands, and only then can it go about assembling the instructions. Another important thing is relocations. The assembler will annotate certain bytes in the object file as having to contain specific addresses, and getting those right is probably the most important part of the assembler. For example:
Code:
.section ".rodata","a",@progbits
.LC1: .asciz "Hello World!\n"
.text
movq $.LC1, %rdi
callq printf
The assembler cannot know what the address of .LC1 will be after linking, nor that of printf. So it generates the code for a 64-bit move, fills the field that will become .LC1 with zeroes, and generates a relocation entry that tells the linker to place the address of the .rodata section there (plus the offset of .LC1 within that section, which is zero here). For the call instruction, it fills the destination with zeroes, then creates a relocation entry that means those bytes should become the address of printf, minus the address of those bytes, minus four. This is because the destination of a call instruction is relative to the end of the instruction, which lies four bytes past the start of the destination field.
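To make the arithmetic concrete, here is a rough Python sketch of what the linker later does with those two relocation entries. It assumes the assembler picked the sign-extended 32-bit immediate form of the move (which, as far as I know, is what GNU as does for movq $sym, %reg), and all the addresses and function names are made up for illustration.

```python
import struct

def apply_abs32s(code, offset, symbol_addr, addend):
    """Absolute relocation: field = S + A, stored as a signed 32-bit value."""
    value = symbol_addr + addend
    if not -2**31 <= value < 2**31:
        raise ValueError("address does not fit a sign-extended 32-bit field")
    struct.pack_into("<i", code, offset, value)

def apply_pc32(code, offset, symbol_addr, addend, code_base):
    """PC-relative relocation: field = S + A - P, with P the field's own address."""
    place = code_base + offset
    struct.pack_into("<i", code, offset, symbol_addr + addend - place)

def patch_example(text_base, rodata_base, printf_addr):
    # What the assembler emitted, with both address fields zero-filled:
    #   movq $imm32, %rdi -> 48 C7 C7 <imm32>   (field at offset 3)
    #   callq rel32       -> E8 <rel32>         (field at offset 8)
    code = bytearray.fromhex("48c7c700000000" "e800000000")
    # .LC1 sits at offset 0 of .rodata, so the addend is 0.
    apply_abs32s(code, 3, rodata_base, 0)
    # Addend -4: the CPU measures the displacement from the end of the field.
    apply_pc32(code, 8, printf_addr, -4, text_base)
    return bytes(code)
```

With the code at 0x401000, .rodata at 0x402000 and printf at 0x401100 (all hypothetical), the call field becomes 0xF4, and 0x40100C + 0xF4 is indeed 0x401100.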
Finally, the linker. The linker must read the object files back in (there may be several), concatenate like sections, and generate an executable file, processing relocations along the way. Since you mentioned Windows, you are going to have to deal with dynamic linking at least a little bit.
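The "concatenate like sections" step could be sketched roughly like this in Python. Object files are modeled as plain dicts from section name to bytes, which is purely an assumption for illustration, and alignment, symbols and relocations are left out entirely.

```python
def merge_sections(objects):
    """Concatenate same-named sections from several object files.

    Returns (merged, placements): merged maps a section name to its combined
    bytes, and placements[i][name] is the offset at which object i's piece of
    that section landed in the merged section.
    """
    merged = {}
    placements = []
    for obj in objects:
        placed = {}
        for name, data in obj.items():
            buf = merged.setdefault(name, bytearray())
            placed[name] = len(buf)  # this object's piece starts here
            buf += data
        placements.append(placed)
    return merged, placements
```

The recorded offsets are what the linker needs afterwards: a symbol's final address is the section's load address, plus the placement offset of the object that defined it, plus the symbol's offset within that object's piece.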
Ethin wrote:
so the ABI doesn't really matter (at least, I'm pretty sure I can forget the ABI since all you can call are functions you've specifically declared and defined);
Well, you are going to have to make at least a few system calls. At least exit(), or ExitProcess() on Windows. So the ABI is probably still a necessity. Since the compiler knows the target platform, this ought not to be a big problem, however.
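As a small illustration of how the target platform changes what the compiler emits, here is a Python sketch that picks the location of the n-th integer or pointer argument on x86-64. The register lists are those of the System V AMD64 and Microsoft x64 calling conventions; the table layout and the function name are just my own invention, and floating-point arguments, shadow space and stack layout are ignored.

```python
# Integer/pointer argument registers, in order, for the two x86-64 conventions.
INTEGER_ARG_REGISTERS = {
    "sysv": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],  # Linux, BSD, macOS
    "win64": ["rcx", "rdx", "r8", "r9"],               # 64-bit Windows
}

def integer_arg_location(abi, index):
    """Return the register holding integer argument `index`, or "stack"."""
    regs = INTEGER_ARG_REGISTERS[abi]
    return regs[index] if index < len(regs) else "stack"
```

This is why the printf example above loads the string address into %rdi; the same call on Windows would put it in %rcx.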
Ethin wrote:
I just want to know what I absolutely need to add (excluding instructions obviously) to create a fully working binary that, unless I don't write an instruction properly, won't throw any signals or cause problems.
You must have the right headers and the right exit code somewhere after the entry symbol, and you must actually execute that exit code. What that looks like is OS-specific.
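For one concrete case, here is a Python sketch of a compiler routine that emits the smallest exit sequence I would expect to work on x86-64 Linux. The assumptions: the entry symbol is _start (the GNU ld default), the exit system call is number 60 and takes its status in %edi. The function name is hypothetical. On Windows you would instead have to import ExitProcess from kernel32.dll and call it, which is not shown here.

```python
def emit_linux_exit(status=0):
    """Return assembly text for an entry point that exits immediately.

    Assumes x86-64 Linux: syscall number 60 is exit, status goes in %edi.
    """
    if not 0 <= status <= 255:
        raise ValueError("exit status must be in the range 0..255")
    lines = [
        ".text",
        ".globl _start",
        "_start:",
        f"movl ${status}, %edi",  # first syscall argument: exit status
        "movl $60, %eax",         # syscall number for exit
        "syscall",
    ]
    return "\n".join(lines) + "\n"
```

Whatever code your compiler generates for the program itself would go between the _start label and the exit sequence.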