Placement of %include in my NASM bootloader affects program behavior

kushalv238
Posts: 4
Joined: Thu Apr 10, 2025 6:03 am

Post by kushalv238 »

I am learning to write a 16-bit bootloader using NASM with BIOS interrupts to print strings to the screen. I’ve created a simple print subroutine in an external file (printer.asm), and I'm using %include to bring it into my main bootloader file.

Here’s where it gets weird: depending on where I place the %include directive, I get completely different outputs.

Here's the code, showing each placement of the include and the output it produces:

Code:

org 0x7C00
bits 16

%include "printer.asm" ; placing include here prints "S" (probably garbage)

mov bx, GREETING
call print

%include "printer.asm" ; placing include here prints the GREETING message twice (Hello WorldHello World)

loop:
    jmp loop

%include "printer.asm" ; placing include here prints the GREETING message as expected (Hello World)

GREETING:
    db "Hello World", 0

times 510-($-$$) db 0
dw 0xaa55

I understand that NASM is producing a flat binary here and that memory layout matters, but I thought %include is just a textual paste. Why is it behaving this way?

Is the string or code overlapping? Is NASM putting instructions and data in conflicting places? What’s the correct way to organise includes and data to prevent this?

This is the printer.asm file (included file):

Code:

; prints the null-terminated string whose address is in BX

print:
    pusha
    mov ah, 0x0e ; tty mode

start:
    mov al, [bx]

    cmp al, 0
    je done

    int 0x10
    
    add bx, 1

    jmp start

done:
    popa
    ret

Thank you in advance!
iansjack
Member
Posts: 4792
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Post by iansjack »

The good news is that you are correct - %include pastes the file at the location of the directive.

Think about what this means. Your program is a flat binary, with no entry point defined. It starts execution at the top of the file. So, in the three cases, what is the first program instruction? And the result of this is …?

It can be instructive to write the programs out in full, inserting the instructions in the include file, and manually step through them without assembling and running on a computer.
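
As a sketch of that exercise, this is roughly what the assembler sees in the first case once printer.asm has been pasted in place of the directive. The comments are added for illustration and describe the expected control flow, not output observed on any particular machine:

```nasm
org 0x7C00
bits 16

; --- contents of printer.asm, pasted where the %include was ---
print:                  ; the BIOS jumps here: this is the first instruction executed
    pusha
    mov ah, 0x0e        ; BIOS teletype output
start:
    mov al, [bx]        ; BX was never set, so this reads from an arbitrary address
    cmp al, 0
    je done
    int 0x10
    add bx, 1
    jmp start
done:
    popa
    ret                 ; nothing called us: pops an unknown word and jumps to it
; --- end of printer.asm ---

    mov bx, GREETING    ; not reached by normal control flow
    call print

loop:
    jmp loop

GREETING:
    db "Hello World", 0

times 510-($-$$) db 0
dw 0xaa55
```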
sebihepp
Member
Posts: 232
Joined: Tue Aug 26, 2008 11:24 am
GitHub: https://github.com/sebihepp

Post by sebihepp »

Yes, %include in NASM is just a textual copy and paste.
The BIOS jumps to 0x07C0:0x0000, to 0x0000:0x7C00, or to any other segment:offset combination that resolves to physical address 0x7C00.
Remember, the BIOS doesn't set up anything for you; you need to do all of this yourself. Even the values of CS and IP can't be relied on.

Go through your code step by step and write what you think happens here in this thread. Then we can tell you where your error is. :-)
kushalv238
Posts: 4
Joined: Thu Apr 10, 2025 6:03 am

Post by kushalv238 »

Okay, so let's consider the first case, where it prints "S". I believe that is some garbage value (correct me if I am wrong). In this case, the print section is textually pasted by NASM, and since my file has no entry point, execution starts right at the top. That means it starts by executing the print section, even though it is never called, and prints some garbage value stored in BX. If this happens, I have another question: what does ret do in this case, where the section was never actually called? Also, after this, the program calls the print section again, so shouldn't it print that string too? In that case, shouldn’t the output be "SHello World" instead? Since the program runs infinitely (probably due to the loop section logic), I assume it never crashed.

If my understanding of the first case was correct, the second case, where "Hello World" is printed twice, makes sense. It's printed once from the explicit call print, and again because execution starts at the top, where the print routine gets executed directly with BX still pointing to the same message.

But then in the third case, where print is called before it’s pasted in via %include, why doesn’t the same thing happen? Shouldn’t it also end up printing twice: once from the call, and once because the print code is inserted later and somehow executed again? Is this because of the loop section?
iansjack
Member
Posts: 4792
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Post by iansjack »

Your understanding is correct.

In the first case the "ret" instruction will cause execution to jump to whatever address is on top of the stack. The processor never gets any further (in your program) than that instruction, so it never reaches the loop or the subroutine call. Hence it only prints once (random garbage: whatever bytes sit at the address that happened to be in the BX register when your program started). What happens after that "ret" is unpredictable: it could cause a crash, it could go into a loop, it could reach a point where it keeps executing "nop" instructions. The result is probably dependent upon your particular computer, the time of day, the phase of the moon, and the last time your dog had a meal (but see my disclaimer).

In the second case it calls the subroutine, returns, and then falls through into the subroutine code as if it were part of the main program. This time BX contains the address of the string you want to print, so it doesn't print random garbage. And, again, the final "ret" returns to who knows where. It never reaches the loop.

In the third case it does exactly what you want: it calls the subroutine, returns, and then reaches the loop. It never gets any further than that, so it doesn't execute the subroutine code a second time.
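
The second case is the least obvious of the three, so here is a sketch of it with printer.asm pasted in by hand. The comments are illustrative annotations of the expected control flow:

```nasm
org 0x7C00
bits 16

    mov bx, GREETING
    call print          ; 1st print: a real call, so ret comes back to the next line

; --- contents of printer.asm, pasted where the %include was ---
print:                  ; entered twice: first via the call above, then by falling through
    pusha
    mov ah, 0x0e        ; BIOS teletype output
start:
    mov al, [bx]        ; on the 2nd pass BX still holds GREETING (popa restored it)
    cmp al, 0
    je done
    int 0x10
    add bx, 1
    jmp start
done:
    popa
    ret                 ; on the 2nd pass no return address was pushed: unknown target
; --- end of printer.asm ---

loop:
    jmp loop            ; not reached by the fall-through path

GREETING:
    db "Hello World", 0

times 510-($-$$) db 0
dw 0xaa55
```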

(Disclaimer - Someone with more knowledge than me might comment on this. I'm really not sure how random the value on the top of the stack is after a reset. It may depend on the BIOS and/or it may point to a location that will cause the processor to loop rather than crash. You should really set the stack pointer, the top value of the stack, and the segment registers to values that you control before doing any other processing. That way you control what happens.)
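
A minimal sketch of that kind of setup, under some assumptions: the labels `entry`, `main` and `hang` are made up for this example (chosen so they don't clash with `start` and `done` in printer.asm), and putting the stack just below the boot sector at 0x0000:0x7C00 is one common convention, not the only valid choice. With a jump over the included code at the very top, the position of the %include no longer decides what runs first:

```nasm
org 0x7C00
bits 16

entry:
    jmp 0x0000:main     ; far jump forces CS = 0, consistent with org 0x7C00

%include "printer.asm"  ; safe here: the jump above skips over the pasted code

main:
    cli                 ; keep interrupts off while SS:SP is being changed
    xor ax, ax
    mov ds, ax          ; DS = 0 so that [bx] in print addresses GREETING correctly
    mov es, ax
    mov ss, ax
    mov sp, 0x7C00      ; assumed stack location: grows down from below the boot sector
    sti
    cld

    mov bx, GREETING
    call print          ; a real call, so the ret in print has somewhere to return to

hang:
    hlt                 ; idle until the next interrupt, then halt again
    jmp hang

GREETING:
    db "Hello World", 0

times 510-($-$$) db 0
dw 0xaa55
```

The same effect can be had without the jump by keeping every %include below the final halt loop, as in the third placement from the original question.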
eekee
Member
Posts: 932
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee

Post by eekee »

iansjack wrote: Mon Jun 09, 2025 3:15 am Your understanding is correct.

In the first case the "ret" instruction will cause execution to jump to whatever address is on top of the stack. The processor never gets any further (in your program) than that instruction, so it never reaches the loop or the subroutine call. Hence it only prints once (random garbage: whatever bytes sit at the address that happened to be in the BX register when your program started). What happens after that "ret" is unpredictable: it could cause a crash, it could go into a loop, it could reach a point where it keeps executing "nop" instructions. The result is probably dependent upon your particular computer, the time of day, the phase of the moon, and the last time your dog had a meal (but see my disclaimer).
Sounds about right, but it's the odd cases where it's consistent which catch me out. The CPU running unexpected code is an odd case where repeatability (such as printing S every time) doesn't mean anything. Some machines will boot to a consistent state, especially virtual machines, where the host may zero the RAM before allocating it to the VM. Some real RAM may, sometimes or always, start up all-0s, all-1s, or in some pattern, but it doesn't mean anything; it's just happenstance. Or, in real or virtual machines, the CPU might have landed in ROM or RAM initialized by the BIOS, but not at an address which was meant to be jumped to: data, or the middle of some subroutine. Whatever. Whenever you find the CPU has jumped or branched or returned to an unknown address which just happens to have consistent data, don't let your brain fool you into thinking the consistent symptoms mean anything. Executing an unknown address is always the problem.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie