OSDev.org

All times are UTC - 6 hours




 Post subject: NASM Addressing and Raw Binaries
PostPosted: Wed Jul 19, 2017 10:30 pm 

Joined: Wed Jul 19, 2017 9:46 pm
Posts: 25
Hey everyone! Since this is my first post, I'd like to start by saying I love this community. I've been lurking for a few years on and off, and I decided to take the development plunge completely this time, since I have so much free time on my hands lately. This is the first time I've been able to get past the GDT and almost clear the IDT hurdle. I feel very enthusiastic and excited about my progress.

I'm using QEMU and developing strictly 32-bit (personal preference), using NASM as my assembler. Now I believe the way I put this disk image together is quite different from many suggestions here, and it may be part of my issue. I plan to do this all in assembly, no C whatsoever. Again, personal preference, even if it is an order of magnitude harder to do. :D

About my issue...
When loading up my IDT, I'm having issues with the section inside each entry that references the OFFSETs. I believe this is because when I compile, it is not truly one large ASM file, so the dynamic addressing/allocation in NASM is acting screwy and thus my IDT doesn't point to the proper places. It seems it's mostly properly configured because it is crashing when I press a key, so at least the interrupts are working!

Code:
%define SECTION_BASE 0x1000
PIC1         equ 0x20   ; IO base address for master PIC.
PIC2         equ 0xA0   ; IO base address for slave PIC.
PIC1_COMMAND   equ PIC1
PIC1_DATA      equ PIC1+1
PIC2_COMMAND   equ PIC2
PIC2_DATA      equ PIC2+1
PIC_EOI         equ 0x20   ; End-of-interrupt command code.


%macro ISR_OFFSETS 1
   ISRLOW%1 equ (SECTION_BASE + isr%1 - $$) & 0xFFFF      ; lower 16 bits of offset
   ISRHIGH%1 equ (SECTION_BASE + isr%1 - $$) >> 16      ; upper 16 bits of offset
%endmacro

%macro ISR_NOERRORCODE 1
   isr%1:
      cli
      push byte 0
      push byte %1
      jmp isr_common
%endmacro

%macro IDTENTRY 1
.entry%1:
   dw ISRLOW%1         ; Offset 0-15
   dw CODE_SELECTOR   ; Selector (from GDT)
   db 0            ; reserved
   db 10001110b      ; Present, Ring 0, (0) Storage, 32-bit int gate
   dw ISRHIGH%1      ; Offset 16-31
%endmacro

IDT:
IDTENTRY 0
...
IDTENTRY 33    ;keyboard
...
IDTENTRY 47
IDT_Desc:
   dw $ - IDT - 1      ; IDT size
   dd IDT         ; IDT Offset/Base
   

; Set up ISRs
ISR_NOERRORCODE 0
...
ISR_NOERRORCODE 33      ; keyboard
...
ISR_NOERRORCODE 47

; Define ISR location data for IDT Entries
ISR_OFFSETS 0
...
ISR_OFFSETS 31            ; end of built-in software interrupts
ISR_OFFSETS 32            ; PIC HARDWARE INTERRUPTS START HERE (0x20)
ISR_OFFSETS 33            ; PIC keyboard IRQ (remapped)
...
ISR_OFFSETS 47


PICmaster_Mask      dw 0
PICslave_Mask      dw 0


PIC_sendEOI:   ; send end-of-interrupt command to PIC(s)
   ; ARGS -> 1: irq #
   mov ebx, [esp + 4]      ; last irq on stack
   mov al, PIC_EOI         ; EOI is one byte; a 16-bit OUT would also write the data (mask) port
   cmp ebx, 8
   jl PIC_sendEOI.skipSlave
   mov dx, PIC2_COMMAND
   out dx, al
.skipSlave:
   mov dx, PIC1_COMMAND
   out dx, al
   ret
   

PIC_remap:      ; bh = offsetMaster, bl = offsetSlave
   ; Save masks
   in al, PIC1_DATA
   mov byte [PICmaster_Mask], al
   in al, PIC2_DATA
   mov byte [PICslave_Mask], al
   
   ; Initialization command 0x11
   mov al, 0x11
   out PIC1_COMMAND, al
   out PIC2_COMMAND, al
   
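   ; Update vector offsets (ICW2): bh = master, bl = slave
   mov al, bh
   out PIC1_DATA, al
   mov al, bl
   out PIC2_DATA, al
   
   ; Cascading (ICW3): the slave is wired to the master's IRQ2 line
   mov al, 4            ; master: bit mask of the line that has a slave attached
   out PIC1_DATA, al
   mov al, 2            ; slave: its cascade identity
   out PIC2_DATA, al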
   
   ; Additional environment information.
   mov al, 1
   out PIC1_DATA, al
   out PIC2_DATA, al
   
   ; Restore masks
   mov byte al, [PICmaster_Mask]
   out PIC1_DATA, al
   mov byte al, [PICslave_Mask]
   out PIC2_DATA, al
   
   ret


isr_common:
        ; Code doesn't even get here, so if this is erroneous, please don't mind (but tips are welcome)
   pushad               ; save state (EDI, ESI, EBP, ESP, EBX, EDX, ECX, EAX) -> 32 bytes
   
   mov ax, ds            ; ax = current data segment selector
   push eax            ; saved onto the stack (4 bytes)
   
   mov ax, DATA_SELECTOR   ; activate the ring 0 (kernel) data selector
   mov ds, ax            ; this handles calls with highest permission
   mov es, ax
   mov fs, ax
   mov gs, ax

   call isr_handler      
   pop eax               ; restore original selector to all data segments
   mov ds, ax                                  ; useful for userspace implementations WAY later
   mov es, ax
   mov fs, ax
   mov gs, ax
   
   popad            ; restore state
   add esp, 8            ; clean up extra stack variables from IRQ routines (pushed error codes and ISR numbers)
   sti               ; set interrupts
   iret
   
   
isr_handler:
        ; Code doesn't get here either, so don't mind.
   mov dword [esi], 0x00515249      ; "IRQ" <-- I did this instead of a string ptr because those don't work either
   mov dx, 0x0707
   mov bl, 0x0F
   call _screenWrite
   mov eax, [esp + 40]      ; reach back into the stack and pull out the IRQ# pushed earlier
   ;call _screenPrintDecimal
   ret


Also, here is my "kernel" that I do actual calls from... This thing is put into memory at 0x1000 from the bootloader, which I can share if you need. It has the paging setup in it (70000h to 73000h and maps out 0x0 to 0x100000 right before calling the kernel).

Code:
;NO org statement, loaded by bootloader to 0x1000
GLOBAL kernel_main
[BITS 32]
...
;includes and blah blah
...
; This is all functional. But actually pressing a key will cause a crash and instant reboot.
kernel_main:
   cld
   lidt [IDT_Desc]
   
   call _screenCLS

   pushad
   mov word [cursorOffset], 0x0A01
   mov dx, [cursorOffset]
   mov dword [esi], 0x00636465 ;"dce"
   mov bl, 0x0F  ; attrib
   call _screenWrite
   mov dword [esi], 0x00636465 ; just testing again for the cursor position updates
   mov bl, 0x4E
   call _screenWrite
   
   mov bl, 0x0A
   mov esi, szTestHello   ; <-- This isn't the problem I'm asking about per se, but it could be related
   call _screenWrite       ; Does NOT work, no matter how I try to move the pointer around.
   
   mov word [cursorOffset], dx
   popad

   
   mov bh, 0x20
   mov bl, 0x28
   call PIC_remap
   
   ; Unmask only the keyboard for now (bits !NOT! flagged are enabled IRQs)
   mov al, 0xFD      ; mask = 1111 1101 // PIC1 IRQ #1 (0 being the clock), keyboard enabled.
   out PIC1_DATA, al
   mov al, 0xFF
   out PIC2_DATA, al
   sti

        hlt


Over in my 'kernel' (as you probably read), I was trying to point the ESI register to a string pointer that was defined in KERNEL.asm and it would either crash the program or spit out garbage when it would actually move past the _screenWrite function. It seems that any defined pointers outside of the bootloader are causing errors, and this is the only constant obstacle I've faced.

And before I forget, here's my very simple compiling script... Yeh, I'm doing this on Windows (pls no bully :oops: )

Code:
nasm "BOOT.asm" -f bin -o "..\bin\boot.bin"
nasm "KERNEL.asm" -f bin -o "..\bin\kernel.bin"
dd if="..\bin\boot.bin" of="..\bin\image.img" bs=512
dd if="..\bin\kernel.bin" of="..\bin\image.img" bs=512 seek=1
"%PROGRAMFILES%\qemu\qemu-system-i386" -drive format=raw,index=0,file="..\bin\image.img"


Am I missing an important lesson on addressing/memory here? If you need any more info, please tell me.
Also, please go easy on me. :mrgreen:
Thanks in advance!

_________________
orchid: a 32-bit, flat-model, single-user operating system targeting legacy BIOS systems. Programmed entirely in Intel-x86 Assembly using NASM (compiler) and Atom (IDE).


 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Wed Jul 19, 2017 10:38 pm 

Joined: Tue Mar 06, 2007 11:17 am
Posts: 1225
Looking at your code, it is easy to see that it's not simple enough yet.

It has taken me at least 12 years just to make a very simple console kernel. It can load programs from floppy, can use the PS/2 keyboard, can use the timer, can print strings, can execute console commands, can pass command line to programs, can check if a program was fully loaded without errors by checking a header and footer signatures, but no paging or multitasking yet...

It has very clean code, which has been necessary for me to avoid getting lost in the assembly.

It will very probably save you around 10 years of random effort by showing, in an easy and working way, how to do things like that.

You can see several examples on how to print strings in its code:
BOOTCFG__v2017-06-16.zip

See how to use it, you just need DOS to launch it:
http://f.osdev.org/viewtopic.php?t=32121
http://devel.archefire.org/forum/viewtopic.php?p=4263&hl=en

_______________________________________________________________

For things like this, it is key that you focus mainly on creating the structure of everything you do in your own mind, from nothing, from your creativity.

Once you have built the structure of your own projects in your own mind, you will be able to make effective use of external information like this and adjust it to fit your structure.

But you need to keep creating your own tricks by yourself, so that you cover all of their logic and are able to extend it. It's part of your mind, so it's logical that it's best to do things this way.

_________________

YouTube:
http://youtube.com/@AltComp126/streams
http://youtube.com/@proyectos/streams

http://master.dl.sourceforge.net/projec ... 7z?viasf=1


 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Wed Jul 19, 2017 10:53 pm 

Joined: Wed Jul 19, 2017 9:46 pm
Posts: 25
~ wrote:
Looking at your code, it is easy to see that it's not simple enough yet.
...
It has very clean code. It has been necessary for me for not getting lost in the assembly code.

It will very probably help you cut on around 10 years of random efforts showing in an easy and working way how to do things like that.


Thank you for this, I completely agree.
With many of my past projects, they were built in waves of ideas that were just rushing out of my head. With it being so streamlined, it was difficult to adequately comment on the code because I was so full of ideas. Then when coming back later, I had to relearn what I had made previously and it was just tiresome. I'm sure there's a word out there for it somewhere. :D

I will consider your advice and slow it down, keep it tidy, and hope to not have to sanity-check my work 80 times every time I need to edit a source document. Thanks again for the response!

 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Thu Jul 20, 2017 6:57 am 

Joined: Thu Mar 10, 2016 7:35 am
Posts: 167
Location: Lancaster, England, Disunited Kingdom
Code:
mov dword [esi], 0x00636465 ;"dce"
mov bl, 0x0F  ; attrib
call _screenWrite



I think what you are trying to do here is make esi point to a zero-terminated string "dce" (it's actually "edc", but that's a trivial slip).
However, the code gives no clue what is in esi when this instruction is reached. What have you put in it?
I'm a bit afraid that the answer might be that you haven't initialised it at all - and if that is the case the instruction has written your zero-terminated string to whatever part of memory esi happened to be pointing to at the time.

I'm also wondering if you properly understand the difference between "mov DWORD [esi], 0x00636465" and "mov esi, 0x00636465" in more detail than just being able to say [esi] means a pointer and esi doesn't. Can you explain exactly what the processor does in these two cases and how memory is affected?

Forgive me if you do understand properly, but the code you've posted really leaves me wondering. If you cannot answer the question, "What is in esi before the instruction executes?" you really do have understanding problems.
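
A minimal sketch of the two instructions side by side, assuming a flat 32-bit data segment; the buffer address 0x60000 is only an illustrative choice:

```nasm
bits 32

    mov esi, 0x60000             ; ESI = 0x60000; no memory is touched
    mov dword [esi], 0x00636465  ; stores bytes 65 64 63 00 ("edc", 0) at 0x60000..0x60003
                                 ; (little-endian, so the low byte 0x65 lands first)
    mov esi, 0x00636465          ; ESI = 0x00636465; no memory is touched, and ESI now
                                 ; points at an unrelated address, not at a string
```

The first two lines together give a print routine a valid pointer to a zero-terminated string; the third only changes the register.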


 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Thu Jul 20, 2017 12:40 pm 

Joined: Wed Jul 19, 2017 9:46 pm
Posts: 25
MichaelFarthing wrote:
Code:
mov dword [esi], 0x00636465 ;"dce"
mov bl, 0x0F  ; attrib
call _screenWrite



...However, the code gives no clue what is in esi when this instruction is reached. What have you put in it?


In my bootloader, I've arbitrarily pointed the ESI register to address 0x60000. This is one of those concepts-in-the-works sort of things, where it's subject to change later of course (just like the rest of the project :D ). At least for now, I am mapping the stack to 0x90000, ESI to 0x60000, pages to 0x70000-0x73000, etc etc.

MichaelFarthing wrote:
I'm also wondering if you properly understand the difference between "mov DWORD [esi], 0x00636465" and "mov esi, 0x00636465" in more detail than just being able to say [esi] means a pointer and esi doesn't. Can you explain exactly what the processor does in these two cases and how memory is affected?


I appreciate your response, I truly do. But I would rather focus on the core issue I am having, as errors printing strings is a trivial-ish issue that has less priority than finishing my IDT. Although, the IDT offset variables and these data string addresses share a related issue, and that is that their index or reference in memory is not being addressed properly during runtime.

The label szStringToPrint denotes the start of the bytes I am defining. When I mov it to ESI, it is making ESI point to the address labeled as szStringToPrint, so that when I lodsb (w/d/q), it is retrieving values pointed to by those locations and incrementing accordingly, until I break the loop by catching the null-terminator.

The issue is not how I print, or how I work with ESI, but rather with defining labels outside of the bootloader. The same thing happened when creating my GDT. Instead of doing the work and lgdt after the kernel jump (which was a bad idea anyways), I had to put the GDT in the bootloader for the labels to even work properly.

Needless to say that I am very bad at articulating my knowledge anyways, and I would make a horrible teacher. So if I sound ditzy about something, feel free to offer a correction if you wish to, but understand that I am very enthusiastic about learning everything I can, not to mention the process of using that information -- and any information the more experienced of you can offer is 100% valuable to me! :)

Thanks again!

 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Thu Jul 20, 2017 5:01 pm 

Joined: Tue Mar 06, 2007 11:17 am
Posts: 1225
You could probably simply fix it by using ORG in your main kernel file.

For example, if you are loading your kernel to 0x100000 from your boot code, you could use:
Code:
org 0x100000



That will make all memory references point where they should. Without ORG, NASM's flat binary output assumes the image starts at address 0, so every absolute label reference is off by the load address.

In 64-bit mode code tends to be much more position-independent, but it's still good practice to specify the base address of the binary image.

It will nearly always be necessary when you reference data labels by absolute address.
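
A minimal sketch of the effect, assuming the 0x1000 load address mentioned in the first post rather than the 0x100000 used in the example above; the label names are illustrative:

```nasm
; Assumes the bootloader copies this image to 0x1000 and jumps to it
; in 32-bit protected mode with flat segments.
bits 32
org 0x1000                  ; where the image will live in memory

kernel_main:
    mov esi, szTestHello    ; immediate = 0x1000 + (szTestHello - $$)
.hang:
    hlt
    jmp .hang

szTestHello: db "Hello", 0
```

Assembled with `nasm -f bin`, the `mov esi` immediate becomes 0x1000 plus the label's offset in the file; without the `org` line it would be just the offset.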

 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Thu Jul 20, 2017 5:23 pm 

Joined: Wed Jul 19, 2017 9:46 pm
Posts: 25
~ wrote:
You could probably simply fix it by using ORG in your main kernel file.
...
That will make all memory references point properly where they should, instead of pointing to base address 0.


Wow, that fixed it!!! =D> Such a simple solution, yet it's one I didn't think of even trying... A bit embarrassing, to be frank.

So for anyone browsing this thread in the future who's also using NASM: if your references to labels are not working and you are assembling an ASM-only kernel with a custom bootloader, put an ORG statement matching the load address at the top of every file you assemble separately (not the files you only "%include"). I'm probably pointing out the absolute obvious, but hey, if it happened to me it could happen to anyone else. :)
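
As a rough sketch of what that looks like for the IDT specifically, assuming the kernel image is loaded at 0x1000 and that the code selector is 0x08 (the selector value and the names IDT_GATE, isr_stub, idt and idt_desc are placeholders, not taken from the real kernel):

```nasm
bits 32
org 0x1000                      ; must match the address the bootloader loads to

KERNEL_BASE   equ 0x1000        ; same value as ORG, kept as a plain number
CODE_SELECTOR equ 0x08          ; assumed ring 0 code selector from the GDT

; One 8-byte interrupt gate. NASM generally refuses to apply & or >> to a
; bare label, so the label is first turned into a plain offset with "- $$".
%macro IDT_GATE 1
    dw (KERNEL_BASE + (%1 - $$)) & 0xFFFF   ; offset bits 0-15
    dw CODE_SELECTOR                        ; selector
    db 0                                    ; reserved
    db 10001110b                            ; present, ring 0, 32-bit interrupt gate
    dw (KERNEL_BASE + (%1 - $$)) >> 16      ; offset bits 16-31
%endmacro

start:
    lidt [idt_desc]             ; operand address is resolved relative to ORG
.hang:
    hlt
    jmp .hang

isr_stub:
    iretd

idt:
    IDT_GATE isr_stub
idt_desc:
    dw idt_desc - idt - 1       ; limit = size of the table minus one
    dd idt                      ; linear base, correct only because of ORG
```

Both `lidt [idt_desc]` and `dd idt` use absolute addresses, so they are the parts that silently break when the ORG line is missing.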

Thank you so so so much, Mr. ~! I can finally move on with working out the specifics of my IDT! You rock! [-o<

 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Thu Jul 20, 2017 9:32 pm 

Joined: Wed Jul 19, 2017 9:46 pm
Posts: 25
Also, I wanted to add something real quick, for the sake of some more conversation and dialogue with the folks here. Maybe this could get added to the section in the "ISR not working" page as a small aside in the ISR not returning section...

I was finishing up the keyboard ISR handler and it was weirdly not letting me iretd from the interrupt routine. I deliberated for hours, checking constantly that my stack was in order, that I was not trashing any registers, and that my functions were passing data between each other correctly. Then, it hit me, the interrupt was returning to a hlt instruction, with nothing after it but a small data section and empty space. [-X

So, I changed the "hlt" at the end of my kernel to this:
Code:
.repeatISRTest:
   mov ecx, 500
.repThis:
   mov eax, ecx
   loop kernel_main.repThis
   jmp kernel_main.repeatISRTest
A weird way to hang up the processor, but it worked like a charm. Hope this helps somebody someday! 8)

 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Fri Jul 21, 2017 2:55 am 

Joined: Sat Mar 31, 2012 3:07 am
Posts: 4594
Location: Chichester, UK
Wouldn't it be simpler to do:
Code:
jmp .


 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Fri Jul 21, 2017 3:42 am 

Joined: Thu Aug 13, 2015 4:57 pm
Posts: 384
iansjack wrote:
Wouldn't it be simpler to do:
Code:
jmp .

Lol. Though you'll want to throw HLT in there, which both of you decided to leave out =)

Btw, what's with the moving to eax and double loop?

Presumably there was a reason why the ISR returned to a HLT, instead of looping that HLT you did something completely different?


 
 Post subject: Re: NASM Addressing and Raw Binaries
PostPosted: Sat Jul 22, 2017 11:32 am 

Joined: Wed Jul 19, 2017 9:46 pm
Posts: 25
LtG wrote:
Lol. Though you'll want to throw HLT in there, which both of you decided to leave out =)

Btw, what's with the moving to eax and double loop?

Presumably there was a reason why the ISR returned to a HLT, instead of looping that HLT you did something completely different?

It was a quick-fix sort of solution to keep the processor busy while not receiving INTs. What I realized is that doing that useless loop, I was burning up a lot of CPU resources in the interim between my interrupts. I have since refined the code to a much simpler format that is much more energy-efficient as well (and isn't completely useless :D ).
Code:
; Hang and wait for some ISRs.
   sti
.repHalt:
   call _parserCheckQueue
   hlt
   jmp kernel_main.repHalt

I have since learned that if you only HLT at the end of your main code, then after an interrupt returns, EIP points at whatever follows the HLT instruction and the CPU starts executing it, which for me was my misc data section. This loop is a much calmer, resource-efficient way to wait for IRQs and INTs to trigger command checks (such as 'did the user hit enter?').
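
To make the control flow concrete, here is a stripped-down sketch of a halt loop paired with a minimal IRQ1 handler. It assumes the master PIC has been remapped to 0x20 (so the keyboard arrives as vector 33) and leaves out the IDT setup and the queue check; the label names are illustrative:

```nasm
bits 32
org 0x1000                  ; load address used in this thread

PIC1_COMMAND equ 0x20
PIC_EOI      equ 0x20
KBD_DATA     equ 0x60       ; PS/2 controller data port

idle:
    sti
.wait:
    hlt                     ; sleeps until an interrupt arrives
    jmp .wait               ; the handler's iretd resumes here, so go back to hlt

isr33:                      ; keyboard IRQ1 after remapping
    pushad
    in  al, KBD_DATA        ; read the scancode so the controller can send the next one
    mov al, PIC_EOI
    out PIC1_COMMAND, al    ; acknowledge the interrupt at the master PIC
    popad
    iretd
```

Because the return address saved by the interrupt is the instruction after `hlt`, the `jmp` is what keeps execution from running off into whatever bytes follow.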

Also, this is relevant, because in every %include file I've used to run a command and check the input buffer, I've created super-variables (or globals, w/e you want to call them) like COMMAND_QUEUE and INPUT_BUFFER and the only way to access those variables without causing a GPF or triple-fault was to add the KERNEL_OFFSET variable, which ~ told me about earlier.

That code is executed when the kernel is done initializing everything, right at the end of the main function. I'm adding a shell command to proceed to userspace from there, and eventually I'll probably just ditch the raw shell on startup altogether for straight userspace.
