azblue wrote:
The A86 assembler has a file called A86MANU.TXT; the section "The 86 Instruction Set" has always been the simplest to understand in my opinion.
Additionally, sandpile.org is indispensable (specifically look at their opcode encoding and opcode groups).
+1, that file contains a metric ton of useful data. I was looking for something like that.
alexfru wrote:
You don't need many. Just the CPU manual (have both, Intel and AMD) and a way to experiment and check your understanding of the manual. For the latter, use an assembler and a disassembler. NASM (and its NDISASM) will work perfectly here. You may also want a hex file viewer and a programmer's calculator. That's all.
Start with e.g. the add instruction. Write a bunch of different variants of it like so:
Code:
; assemble: nasm -fbin file.asm -o file.bin
bits 16
add ax, bx
add ax, [bx]
add [bx], ax
add ax, [bx+di]
add [bx+di+2], ax
add [bx+di-2], ax
add [bx+di+1024], ax
add [bx+di-1024], ax
add word [bx+di-1024], 0x1234
Observe how they're encoded, either from the assembly listing or from a disassembly of the binary (e.g. "ndisasm -b 16 file.bin").
For fun try to do the reverse. Given an instruction description/encoding, try to encode it by hand and see that the disassembly of your bytes gives you the expected instruction.
Extend this to 32 bits, throw in segment override prefixes, etc.
Beware, some instructions may have alternative encodings.
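To make the hand-encoding exercise and the note about alternative encodings concrete, here is a minimal Python sketch (not from the posts above; the helper names `modrm` and `encode_add_reg_reg` are made up for illustration). It assumes 16-bit operand size and register-to-register operands only, and builds both valid encodings of the same `add`: opcode 01 puts the destination in the r/m field, opcode 03 puts it in the reg field.

```python
# Hypothetical helpers for the hand-encoding exercise; 16-bit registers only.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def modrm(mod, reg, rm):
    """Pack the three ModR/M fields (2 + 3 + 3 bits) into one byte."""
    return (mod << 6) | (reg << 3) | rm

def encode_add_reg_reg(dst, src):
    """Return both encodings of `add dst, src` for 16-bit registers."""
    d, s = REG16[dst], REG16[src]
    # 01 /r = ADD r/m16, r16: destination in r/m, source in reg
    form_01 = bytes([0x01, modrm(0b11, s, d)])
    # 03 /r = ADD r16, r/m16: destination in reg, source in r/m
    form_03 = bytes([0x03, modrm(0b11, d, s)])
    return form_01, form_03

a, b = encode_add_reg_reg("ax", "bx")
print(a.hex(), b.hex())
```

Writing either byte pair to a file and running "ndisasm -b 16" on it should show the same add ax,bx, which is a quick way to check your reading of the ModR/M tables.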
That's a smart idea. I didn't know ndisasm existed. It would be useful, though, to have a program that separates prefixes, opcodes and other bytes instead of printing them all run together.
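A rough sketch of such a field splitter in Python, as a starting point rather than a real tool. Everything here is an assumption for illustration: the function name `split_fields`, the tiny opcode table (only the ADD forms 01, 03, 81 and 83), and the restriction to 16-bit addressing. It peels off prefix bytes, then uses the ModR/M mod and rm fields to decide how many displacement bytes follow.

```python
# Hypothetical mini-splitter: 16-bit code, a handful of ADD opcodes only.
PREFIXES = {0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65, 0x66, 0x67, 0xF0, 0xF2, 0xF3}
# opcode -> immediate size in bytes at 16-bit operand size (illustrative table)
IMM_SIZE = {0x01: 0, 0x03: 0, 0x81: 2, 0x83: 1}

def hexs(data):
    return " ".join(f"{x:02x}" for x in data)

def split_fields(code):
    """Split the instruction at the start of `code` into labelled fields."""
    i = 0
    while i < len(code) and code[i] in PREFIXES:
        i += 1
    prefixes = code[:i]
    if 0x67 in prefixes:
        raise NotImplementedError("32-bit addressing is outside this sketch")
    opcode = code[i]
    if opcode not in IMM_SIZE:
        raise NotImplementedError(f"opcode {opcode:#04x} is not in the table")
    mod, rm = code[i + 1] >> 6, code[i + 1] & 0b111
    # 16-bit addressing: mod=01 -> disp8, mod=10 -> disp16,
    # mod=00 with rm=110 -> bare disp16, everything else -> no displacement
    if mod == 0b01:
        disp_len = 1
    elif mod == 0b10 or (mod == 0b00 and rm == 0b110):
        disp_len = 2
    else:
        disp_len = 0
    imm_len = IMM_SIZE[opcode]
    if opcode == 0x81 and 0x66 in prefixes:
        imm_len = 4  # operand-size override widens the immediate to 32 bits
    disp_at = i + 2
    imm_at = disp_at + disp_len
    end = imm_at + imm_len
    return {
        "prefixes": hexs(prefixes),
        "opcode": hexs(code[i:i + 1]),
        "modrm": hexs(code[i + 1:i + 2]),
        "disp": hexs(code[disp_at:imm_at]),
        "imm": hexs(code[imm_at:end]),
        "length": end,
    }

# add word [bx+di-1024], 0x1234
print(split_fields(bytes.fromhex("81 81 00 fc 34 12")))
```

The returned length tells you where the next instruction starts, so the same function can be called in a loop over a binary. Extending the table to more opcodes is mostly a matter of recording which ones take a ModR/M byte and how large their immediate is.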
Antti wrote:
I would recommend the disassembly method mentioned above for all assembly language programmers. It helps to get a grasp of instructions and their logic. No need to spend too much time inspecting the bytes, but maybe a few hours or a day? In addition to that, trying labels and seeing what values the assembler assigns to them could be enlightening, e.g. how "mov ax, my_label" or "jmp my_label" translate to bytes and how the values change when code is modified or assembler directives are used. As a more advanced topic, check how object files handle relocations.
I'll definitely have to address jumps and calls sooner or later, since VBE code jumps around a lot.
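For the relative forms, the bytes after the opcode are a signed offset from the address of the next instruction, not an absolute address, which is why a label's encoding changes when code around it moves. A minimal Python sketch, assuming 16-bit code and only the three opcodes EB (jmp short), E9 (jmp near) and E8 (call near); the function name `rel_target` is made up here:

```python
def rel_target(code, ip):
    """Return (target_ip, length) for a relative jmp/call located at `ip`.

    `code` holds the instruction bytes; 16-bit mode is assumed, so the
    target wraps around at 64 KiB.
    """
    opcode = code[0]
    if opcode == 0xEB:                      # jmp short rel8
        rel = int.from_bytes(code[1:2], "little", signed=True)
        length = 2
    elif opcode in (0xE8, 0xE9):            # call near / jmp near rel16
        rel = int.from_bytes(code[1:3], "little", signed=True)
        length = 3
    else:
        raise NotImplementedError(f"opcode {opcode:#04x} not handled")
    # The offset is relative to the instruction that follows this one.
    return (ip + length + rel) & 0xFFFF, length

# EB FE is the classic "jump to self" loop
print(rel_target(bytes([0xEB, 0xFE]), 0x0100))
```

An emulator would then set IP to the returned target (and, for E8, push the address of the following instruction first). Far jumps and indirect forms through ModR/M are not covered by this sketch.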
Getting my code to recognize the instructions is the hardest part; emulating them is easy. After all, I don't need all of them.