OSDev.org

The Place to Start for Operating System Developers
It is currently Mon Mar 18, 2019 11:36 pm

All times are UTC - 6 hours




Post new topic Reply to topic  [ 12 posts ] 
Author Message
 Post subject: BIOS Calls in Emulators
PostPosted: Sun Oct 14, 2018 10:49 am 
Offline
Member
Member
User avatar

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1009
Hi,

I'm wondering on how this works:
You write a real mode emulator and let in run in protected mode, you tell it to execute int 10h, somehow it magically does a VBE mode switch.
Can somebody explain this magic part to me?

Also what is the minimum amount of instructions I would have to emulate in order to be able to execute int 10h interrupt handling code?
Why are you doing this? Well this topic interests me and sounds like a fun project (x86 Real Mode Emulator for Mode Switching).

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Sun Oct 14, 2018 11:00 am 
Offline
Member
Member
User avatar

Joined: Wed Oct 18, 2006 3:45 am
Posts: 9280
Location: An der schönen blauen Donau, watching muppets
An emulator - such as libx86emu which was specifically written for just this - executes code pretending to be a real machine. It is a closed system - if you want it to affect the real machine, then the emulated one has to forward to the real one, often by sharing I/O ports, BIOS and video card memory ranges.

The instructions you need? No two video cards are equal. Be prepared to support at least everything a 486 has, protected mode and everything.

_________________
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Sun Oct 14, 2018 12:32 pm 
Offline
Member
Member
User avatar

Joined: Mon Jul 13, 2009 5:52 am
Posts: 82
Location: Rome
Hi Octacone,

I remember reading this older post, which contains a lot of explanation and useful information.

viewtopic.php?f=1&t=22363


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Fri Oct 19, 2018 1:22 pm 
Offline
Member
Member
User avatar

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1009
Combuster wrote:
An emulator - such as libx86emu which was specifically written for just this - executes code pretending to be a real machine. It is a closed system - if you want it to affect the real machine, then the emulated one has to forward to the real one, often by sharing I/O ports, BIOS and video card memory ranges.

The instructions you need? No two video cards are equal. Be prepared to support at least everything a 486 has, protected mode and everything.


Yup, after some deeper digging, it looks like I/O ports are the main thing that is used for mode setting internally. The interrupt code itself is just to generate the required values that need to be forwarded trough ports.
I also found out that protected mode support is not required and that all the code is 16 bit for compatibility reasons.

zity wrote:
Hi Octacone,

I remember reading this older post, which contains a lot of explanation and useful information.

viewtopic.php?f=1&t=22363


That was super useful.

I've started working on my emulator since, the thing that bothers me is, how do I handle single opcode multiple instructions?
For e.g. 0x80 can be add, adc, and, xor, or, sbb, sub, cmp.
Edit: I just discovered that opcodes are not just "randomly assigned numbers by Intel", there is a whole lot of things going on. Prefix bytes, m/r byte, SIB... a lot more that I initially thought.

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Mon Oct 22, 2018 4:58 pm 
Offline
Member
Member
User avatar

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 626
Octacone wrote:
I've started working on my emulator since, the thing that bothers me is, how do I handle single opcode multiple instructions?
For e.g. 0x80 can be add, adc, and, xor, or, sbb, sub, cmp.
Edit: I just discovered that opcodes are not just "randomly assigned numbers by Intel", there is a whole lot of things going on. Prefix bytes, m/r byte, SIB... a lot more that I initially thought.


Each opcode is one instruction (though it might have different mnemonics in assembly to make things more clear). Sometimes prefix + opcode is a different instruction; that's specified explicitly where necessary.

As for the 0x80 thing: the ModR/M byte following the 0x80 opcode has 3 unsued bits (where normally you would specify a register operand) because it doesn't need 2 register operands. For example for "XOR r/m8, imm8" the encoding is "80 /6 ib", meaning the byte 0x80 is followed by a ModR/M byte where the 3 unsued bits are assigned the value "6", and then an immediate byte. The other instructions have different values in those 3 unused bits. As such, these 3 bits are actually an extension of the opcode, and that's how you differentied them.

(I wrote an x86 assembler as part of a project once, these things do get quite confusing. Trying to actually emulate these instructions is a whole new level of complex altogether)

_________________
Glidix: An x86_64 POSIX-compliant operating system, aiming to be as optimized as possible, especially in graphics.
https://glidix.madd-games.org/


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Wed Oct 24, 2018 2:54 pm 
Offline
Member
Member
User avatar

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1009
mariuszp wrote:
Octacone wrote:
I've started working on my emulator since, the thing that bothers me is, how do I handle single opcode multiple instructions?
For e.g. 0x80 can be add, adc, and, xor, or, sbb, sub, cmp.
Edit: I just discovered that opcodes are not just "randomly assigned numbers by Intel", there is a whole lot of things going on. Prefix bytes, m/r byte, SIB... a lot more that I initially thought.


Each opcode is one instruction (though it might have different mnemonics in assembly to make things more clear). Sometimes prefix + opcode is a different instruction; that's specified explicitly where necessary.

As for the 0x80 thing: the ModR/M byte following the 0x80 opcode has 3 unsued bits (where normally you would specify a register operand) because it doesn't need 2 register operands. For example for "XOR r/m8, imm8" the encoding is "80 /6 ib", meaning the byte 0x80 is followed by a ModR/M byte where the 3 unsued bits are assigned the value "6", and then an immediate byte. The other instructions have different values in those 3 unused bits. As such, these 3 bits are actually an extension of the opcode, and that's how you differentied them.

(I wrote an x86 assembler as part of a project once, these things do get quite confusing. Trying to actually emulate these instructions is a whole new level of complex altogether)


It is a real pain, so many variations. I've been reading on this topic for days and I'm still struggling to catch up.
There are just not many resources on 8086 instruction decoding.

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Wed Oct 24, 2018 8:57 pm 
Offline
Member
Member

Joined: Sat Feb 27, 2010 8:55 pm
Posts: 122
Octacone wrote:

It is a real pain, so many variations. I've been reading on this topic for days and I'm still struggling to catch up.
There are just not many resources on 8086 instruction decoding.


The A86 assembler has a file called A86MANU.TXT; the section "The 86 Instruction Set" has always been the simplest to understand in my opinion.

Additionally, sandpile.org is indispensable (specifically look at their opcode encoding and opcode groups).


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Wed Oct 24, 2018 10:51 pm 
Offline
Member
Member

Joined: Tue Mar 04, 2014 5:27 am
Posts: 940
Octacone wrote:
It is a real pain, so many variations. I've been reading on this topic for days and I'm still struggling to catch up.
There are just not many resources on 8086 instruction decoding.


You don't need many. Just the CPU manual (have both, intel and AMD) and a way to experiment and check your understanding of the manual. For the latter use an assembler and a disassembler. NASM (and its NDISASM) will work perfectly here. You may also want a hex file viewer and a programmer's calculator. That's all.

Start with e.g. the add instruction. Write a bunch of different variants of it like so:

Code:
; assemble: nasm -fbin file.asm -o file.bin
bits 16
add ax, bx
add ax, [bx]
add [bx], ax
add ax, [bx+di]
add [bx+di+2], ax
add [bx+di-2], ax
add [bx+di+1024], ax
add [bx+di-1024], ax
add word [bx+di-1024], 0x1234


Observe (from the assembly listing or from disassembly of the binary (e.g. "ndisasm -b 16 file.bin")) how they're encoded.

For fun try to do the reverse. Given an instruction description/encoding, try to encode it by hand and see that the disassembly of your bytes gives you the expected instruction.

Extend this to 32 bits, throw in segment override prefixes, etc.

Beware, some instructions may have alternative encodings.


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Thu Oct 25, 2018 12:55 am 
Offline
Member
Member

Joined: Thu Jul 05, 2012 5:12 am
Posts: 921
Location: Finland
I would recommend the disassembly method mentioned above for all assembly language programmers. It helps to get a grasp of instructions and the their logic. No need to spend too much time inspecting the bytes but maybe a few hours or a day? In addition to that, trying labels and seeing what values the assembler sets to those could be enlightning, e.g. how "mov ax, my_label" or "jmp my_label" translate to bytes and how the values change when code is modified or assembler directives are used. As a more advanded topic, check how object files handle relocations.

_________________
Undefined behavior since 2012


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Fri Oct 26, 2018 2:45 pm 
Offline
Member
Member
User avatar

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1009
azblue wrote:
The A86 assembler has a file called A86MANU.TXT; the section "The 86 Instruction Set" has always been the simplest to understand in my opinion.

Additionally, sandpile.org is indispensable (specifically look at their opcode encoding and opcode groups).


+1, that file contains a metric ton of useful data, was looking for something like that.

alexfru wrote:
You don't need many. Just the CPU manual (have both, intel and AMD) and a way to experiment and check your understanding of the manual. For the latter use an assembler and a disassembler. NASM (and its NDISASM) will work perfectly here. You may also want a hex file viewer and a programmer's calculator. That's all.

Start with e.g. the add instruction. Write a bunch of different variants of it like so:

Code:
; assemble: nasm -fbin file.asm -o file.bin
bits 16
add ax, bx
add ax, [bx]
add [bx], ax
add ax, [bx+di]
add [bx+di+2], ax
add [bx+di-2], ax
add [bx+di+1024], ax
add [bx+di-1024], ax
add word [bx+di-1024], 0x1234


Observe (from the assembly listing or from disassembly of the binary (e.g. "ndisasm -b 16 file.bin")) how they're encoded.

For fun try to do the reverse. Given an instruction description/encoding, try to encode it by hand and see that the disassembly of your bytes gives you the expected instruction.

Extend this to 32 bits, throw in segment override prefixes, etc.

Beware, some instructions may have alternative encodings.


That's a smart idea. I didn't know ndisasm existed. Although it would be useful to have a program that could differentiate between prefixes, opcodes and other bytes, instead of having them all written together.

Antti wrote:
I would recommend the disassembly method mentioned above for all assembly language programmers. It helps to get a grasp of instructions and the their logic. No need to spend too much time inspecting the bytes but maybe a few hours or a day? In addition to that, trying labels and seeing what values the assembler sets to those could be enlightning, e.g. how "mov ax, my_label" or "jmp my_label" translate to bytes and how the values change when code is modified or assembler directives are used. As a more advanded topic, check how object files handle relocations.


I'll definitely have to address jumps and calls sooner or later, since VBE code jumps around a lot.
Getting my code to recognize the instruction is the hardest part, emulating them is easy. After all I don't need all of them.

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Mon Oct 29, 2018 10:08 am 
Offline
Member
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 166
Near and short direct jumps and calls are encoded relative to the end of the instruction. You get an opcode and then an offset, and you first set IP to the end of the instruction and then add the sign-extended operand to get the new IP.

So for instance, the following snippet:

Code:
hltloop:
  hlt
  jmp hltloop


is encoded:

Code:
F4 EB FD


That last FD being -3 when sign-extended.

Far and indirect calls and jumps encode their target absolutely. So for instance, in 16-bit mode, the code bytes

Code:
FF 27


mean

Code:
jmp [bx]


And that means: Look in memory at the word BX is pointing to and copy that into IP.

VBE code might also use software interrupts. If you don't know the function called in that case, you might also just emulate that as an indirect far call that pushes flags.


Top
 Profile  
 
 Post subject: Re: BIOS Calls in Emulators
PostPosted: Sun Nov 04, 2018 6:35 am 
Offline

Joined: Sun Nov 04, 2018 5:20 am
Posts: 4
Location: Sydney, Australia
I found this online tool quite useful for encoding / decoding x86 assembly. Works with both 32/64 bit code.

https://defuse.ca/online-x86-assembler.htm


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 12 posts ] 

All times are UTC - 6 hours


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
cron
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group