My little EFI loader. Qemu resets on cr3 load. Something I'm missing?

joshw
Member
Posts: 50
Joined: Wed Mar 05, 2008 4:41 pm
Location: San Francisco, California, USA

Post by joshw »

Hello!

I decided to write an EFI loader with gnu-efi. About 17 years ago I rolled my own MBR bootloader and used a similar recursive page mapping scheme in the kernel. Now I've decided to upgrade to 64-bit MP.

Relevant code is posted here.

The issue is that upon switching cr3, Qemu resets (I guess triple fault). There's something it doesn't like, but I've been logging extensively and everything looks correct.

Is there something I'm missing?

My assumptions are that EFI sets most stuff up for me, and when I change the page tables it should "just work." My thoughts go to maybe the calling convention into asm is wonky, or there is more protection setup to do.

Help?? Lol

Explanation of the code:
  1. Outside virtual.c there is some minimal setup to get the relevant files into memory (initrd.img) and locate kernel.elf.
  2. load_kernel() starts the work (skip measure_kernel(), I think I can factor that out)
  3. create_page_tables() creates a master page, initializing it using map_page() to recursively map it into itself.
  4. map_virtual_address_space() takes various things and maps them in using the new page tables, keeping a running "next_page" variable:
    1. Maps the kernel by calling map_kernel(). This function maps in sections of the ELF, applying the associated page permissions.
    2. Maps stacks, with one unmapped guard page and several stack pages, per cpu (the cpu count is obtained with an EFI function abused in get_mp_info()).
    3. Maps in the frame buffer (obtained outside this module; the screen was cleared to 0x181825, part of the catppuccin color palette).
    4. Maps in the initrd image.
    5. Identity maps parts of the EFI loader state, including the code and data (applying the no-execute bit to the data pages as in other data pages I've mapped). This uses a temporary memory map from EFI that I allocate and then free.
    6. Prints out a final sanity check tracing the virtual addresses through the page tables, into the bytes of the physical pages.
  5. load_kernel() then gets the final memory map (get_memmap()), and calls enter_kernel(), which passes the memmap key to EFI ExitBootServices(), and then calls trampoline() in arch/amd64/asm.S.
    1. The parameters (page_table, stack_pointer, boot_info, kernel_entry) I assume correspond to (rdi, rsi, rdx, rcx).
    2. I:
      1. clear interrupts,
      2. set the stack pointer,
      3. make the boot info rdi so it can be passed to the kernel,
      4. set cr3 (this is where the bug is exposed), and
      5. jump to the entry point.
Thanks so much for your help!!
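
For readers following the mapping steps above, here is a minimal sketch of how a virtual address splits into the four table indices under 4-level paging with 4 KiB pages, plus the address of a level-1 entry as seen through a recursive PML4 slot. The names `split_va`, `is_canonical`, and `recursive_pte_address` are made up for illustration and are not taken from the loader, and the recursive slot number is an assumption since the thread does not say which slot is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Table indices for one virtual address (4-level paging, 4 KiB pages). */
struct va_indices {
    unsigned pml4;   /* bits 47..39 */
    unsigned pdpt;   /* bits 38..30 */
    unsigned pd;     /* bits 29..21 */
    unsigned pt;     /* bits 20..12 */
    unsigned offset; /* bits 11..0  */
};

static inline struct va_indices split_va(uint64_t va)
{
    struct va_indices ix;
    ix.pml4   = (unsigned)((va >> 39) & 0x1FF);
    ix.pdpt   = (unsigned)((va >> 30) & 0x1FF);
    ix.pd     = (unsigned)((va >> 21) & 0x1FF);
    ix.pt     = (unsigned)((va >> 12) & 0x1FF);
    ix.offset = (unsigned)(va & 0xFFF);
    return ix;
}

/* Canonical addresses have bits 63..47 all equal. */
static inline bool is_canonical(uint64_t va)
{
    uint64_t top = va >> 47;
    return top == 0 || top == 0x1FFFF;
}

/*
 * Virtual address of the level-1 entry that maps `va`, as seen through a
 * recursive PML4 slot. `slot` is hypothetical: pass whichever index the
 * PML4 uses to point back at itself.
 */
static inline uint64_t recursive_pte_address(unsigned slot, uint64_t va)
{
    assert(slot < 512);
    uint64_t addr = ((uint64_t)slot << 39) | ((va >> 9) & 0x7FFFFFFFF8ULL);
    if (addr & (1ULL << 47))
        addr |= 0xFFFF000000000000ULL; /* sign-extend to canonical form */
    return addr;
}
```

For example, `split_va(0xFFFFFF0000000000ULL).pml4` is 510, so a kernel linked at that address lives under PML4 entry 510. Checking a few addresses by hand this way is a cheap sanity test for a map_page()-style routine.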
nullplan
Member
Posts: 1895
Joined: Wed Aug 30, 2017 8:24 am

Post by nullplan »

The QEMU output tells me that right at the end, there is only a single page mapped, and it has the NX bit set. Is it possible you set your code section to NX?

Otherwise, the code you've posted seems correct, so the issue is obviously elsewhere. Why not just post it all to github so we aren't hobbled by you only being willing to show like 20 lines of your work?
Carpe diem!

Post by joshw »

Alright, here you go. You could probably run it in a GitHub codespace with `make run`, or run it in a local dev container. That's how it's configured; I'm running it on an M3 Mac. Qemu is launched with Ctrl+T bound to the Qemu monitor instead of Ctrl+A, since Ctrl+A conflicts with my tmux key combo. Otherwise, that was the full contents lol, aside from a few headers, the loader entry, and the stub for the kernel.

In the output, I'm just printing the contents of PML4 at the end (virtual.c:26), showing that page tables themselves have the NX bit set (0x80....).

Entry 0: 0x800000001DE0E003
Entry 1: 0x0
…
Entry 509: 0x0
Entry 510: 0x800000001DE48003
Entry 511: 0x800000001DE4C003
This section for the loader code:

Mapping in loader code section 0x1DE2A000 with 20 pages (identiy mapped: 0x0)...
Page 1 of 20...
Mapping page for virtual address 0x1DE2A000...
...
New entry: 0x1DE2A001 P
and this one for the kernel code:

Kernel code pages: 1
Section       0xFFFFFF0000000000:
Allocating to 0xFFFFFF0000000000
Creating 1 new pages...
Created 1 new pages at 0x1DE49000. Mapping them in...
Page 1 of 1...
Mapping page for virtual address 0xFFFFFF0000000000...
...
New entry: 0x1DE49001 P
...show that neither has the NX bit set. :/
iansjack
Member
Posts: 4792
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Post by iansjack »

Well, there’s very little that can cause an exception in the mov to cr3 instruction. So it seems most likely that there is an error in your page table causing a page fault. As you have a debug option in your Makefile, I would set a breakpoint immediately before the mov. Then use the info mem command in the QEMU monitor to display the current page mappings. Then single-step past the mov instruction and check the page mappings again. Anything obviously missing? Step again to trigger the exception (assuming the error is where you think it is) and inspect the error code on the stack and the faulting address (in cr2). Hopefully that will help to track the fault.

If a step or two after the mov doesn’t trigger an exception then you have to reevaluate your assumption about the location of the error.
Octocontrabass
Member
Posts: 5822
Joined: Mon Mar 25, 2013 7:01 pm

Post by Octocontrabass »

joshw wrote: Sun Jun 08, 2025 1:56 am In the output, I'm just printing the contents of PML4 at the end (virtual.c:26), showing that page tables themselves have the NX bit set (0x80....).
Setting the NX bit in a paging structure entry prevents code execution in any page translated using that entry. If the NX bit is set in all of your PML4 entries, you can't execute code anywhere!
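
To make that rule concrete, here is a small sketch that models a page walk as an array of entries from the top level down to the leaf and reports whether code could be fetched through it. It assumes EFER.NXE is enabled and ignores large pages and user/supervisor checks. The helper name `walk_is_executable` is hypothetical, and the two middle entries in the usage example are made-up placeholders; only the PML4 entry and the leaf value come from the output posted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT (1ULL << 0)
#define PTE_NX      (1ULL << 63)

/*
 * `entries` holds the entry used at each level of one walk, ordered from
 * PML4 down to the leaf. An instruction fetch is allowed only if every
 * level is present and no level has NX set: NX is effectively OR-ed
 * across the levels.
 */
static inline bool walk_is_executable(const uint64_t *entries, size_t levels)
{
    assert(entries != NULL);
    for (size_t i = 0; i < levels; i++) {
        if (!(entries[i] & PTE_PRESENT))
            return false; /* not mapped at this level */
        if (entries[i] & PTE_NX)
            return false; /* NX at any level blocks execution */
    }
    return true;
}
```

Fed the walk `{0x800000001DE0E003, <PDPT entry>, <PD entry>, 0x1DE2A001}`, the function returns false even though the leaf entry itself has no NX bit, because the PML4 entry does. Clearing bit 63 in the top-level entry makes the same walk executable.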

To confirm whether this is the problem, check QEMU's interrupt log ("-d int") for the exceptions leading up to the triple fault. The first exception will be a page fault caused by a privilege violation during an instruction fetch at an address where the CPU should be allowed to fetch instructions.
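
When reading the error code of that first page fault, it helps to decode its low bits. The sketch below covers only the five architectural bits 0 to 4; the names `pf_decode` and `pf_is_fetch_protection_fault` are illustrative, not from any library. A fault with both the present bit and the instruction-fetch bit set is what you would expect from fetching through an NX mapping in ring 0 (SMEP can produce the same pattern for user pages, so treat it as a strong hint rather than proof).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PF_PRESENT (1u << 0) /* 1: protection violation, 0: page not present */
#define PF_WRITE   (1u << 1) /* 1: write access, 0: read or fetch */
#define PF_USER    (1u << 2) /* 1: access came from CPL 3 */
#define PF_RSVD    (1u << 3) /* 1: reserved bit set in a paging entry */
#define PF_FETCH   (1u << 4) /* 1: instruction fetch */

struct pf_info {
    bool protection;
    bool write;
    bool user;
    bool reserved;
    bool fetch;
};

static inline struct pf_info pf_decode(uint32_t err)
{
    struct pf_info info;
    info.protection = (err & PF_PRESENT) != 0;
    info.write      = (err & PF_WRITE) != 0;
    info.user       = (err & PF_USER) != 0;
    info.reserved   = (err & PF_RSVD) != 0;
    info.fetch      = (err & PF_FETCH) != 0;
    return info;
}

/* True for "page was present, but the fetch was not permitted". */
static inline bool pf_is_fetch_protection_fault(uint32_t err)
{
    return (err & PF_PRESENT) && (err & PF_FETCH);
}

/* Small self-check showing the intended use. */
static inline void pf_example(void)
{
    struct pf_info info = pf_decode(0x11);
    assert(info.protection && info.fetch && !info.write && !info.user);
}
```

An error code of 0x11 decodes as a supervisor instruction fetch from a present page, while 0x10 would mean the fetch hit an unmapped page instead, which points at a missing mapping rather than a permission bit.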

Post by joshw »

Octocontrabass wrote: Mon Jun 09, 2025 11:35 am If the NX bit is set in all of your PML4 entries, you can't execute code anywhere!
That'll do it! I guess to prevent executing code in the page tables themselves, you still only need the bit set at the level 1 entries. Or at least just the single entry at the top.
Octocontrabass wrote: Mon Jun 09, 2025 11:35 am check QEMU's interrupt log ("-d int") for the exceptions leading up to the triple fault.
This helped me find my next bug, occurring after the very next instruction. A timer interrupt occurring just before ExitBootServices() was a red herring that stumped me for a while, though. It turned out that in my Makefile the globbed .o file with the proper entry point wasn't being included in the kernel.
iansjack wrote: Sun Jun 08, 2025 10:37 am As you have a debug option in your Makefile, I would set a breakpoint immediately before the mov.
I did try this, but moved on soon after getting the error that the .EFI file wasn't in the right format.
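
One way to express the fix discussed in this post, as a rough sketch rather than the loader's actual code: build entries that point at lower-level tables without NX, and apply NX only on leaf entries for non-executable pages, or on the single recursive slot so the page tables can't be executed through it. All three helper names are hypothetical, and the physical address used for the recursive example is a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT   (1ULL << 0)
#define PTE_WRITABLE  (1ULL << 1)
#define PTE_NX        (1ULL << 63)
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL

/* Entry pointing at a lower-level table: keep it permissive (no NX) and
 * let the leaf entries carry the real restrictions. */
static inline uint64_t make_table_entry(uint64_t table_phys)
{
    assert((table_phys & ~PTE_ADDR_MASK) == 0); /* 4 KiB aligned, in range */
    return table_phys | PTE_PRESENT | PTE_WRITABLE;
}

/* Level-1 entry for a 4 KiB page with per-page permissions. */
static inline uint64_t make_leaf_entry(uint64_t page_phys, bool writable,
                                       bool executable)
{
    assert((page_phys & ~PTE_ADDR_MASK) == 0);
    uint64_t entry = page_phys | PTE_PRESENT;
    if (writable)
        entry |= PTE_WRITABLE;
    if (!executable)
        entry |= PTE_NX;
    return entry;
}

/* Recursive slot: NX here covers everything reached through the recursive
 * window, i.e. the page tables themselves viewed as data. */
static inline uint64_t make_recursive_entry(uint64_t pml4_phys)
{
    return make_table_entry(pml4_phys) | PTE_NX;
}
```

With these helpers, `make_leaf_entry(0x1DE2A000, false, true)` produces 0x1DE2A001, the same value the loader log shows for its code page, and `make_table_entry(0x1DE0E000)` produces 0x1DE0E003 instead of the 0x800000001DE0E003 that caused the reset.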

Post by iansjack »

For future reference, there’s an entry here about debugging EFI applications with gdb.

https://wiki.osdev.org/Debugging_UEFI_a ... s_with_GDB

Post by joshw »

iansjack wrote: Tue Jun 10, 2025 11:30 pm For future reference, there’s an entry here about debugging EFI applications with gdb.

https://wiki.osdev.org/Debugging_UEFI_a ... s_with_GDB
Thank you! :D