OSDev.org
https://forum.osdev.org/

Failing very hard at 64 bit paging
https://forum.osdev.org/viewtopic.php?f=1&t=36435

Author:  firewire [ Sat Jan 11, 2020 8:41 pm ]
Post subject:  Failing very hard at 64 bit paging

I'm just trying to set up 64-bit paging. How is this even possible?

Image

The first thing main does is call pagingInit(0x10000 * 4), and then pml4[0] = (uint64_t)&page_dir_ptr_tab | 3 runs. All I did was make sure I was in protected mode and set up the stack, and now I can't set variables.

this runs:
Code:
_start:
     cli

     mov rax, cr0
     or eax, 1
     mov cr0, rax

     mov esp, stack_top

     extern svmain
     push rbx
     call svmain

then this:
Code:
extern "C" void svmain (multiboot_info_t* mbinfo) {
    pagingInit(0x1000000 * 4);

then this:
Code:
uint64_t pml4[4] __attribute__((aligned(0x20)));
uint64_t page_dir_ptr_tab[4] __attribute__((aligned(0x20)));
uint64_t page_dir[512] __attribute__((aligned(0x1000)));
uint64_t page_tab[512] __attribute__((aligned(0x1000)));
uint64_t page_tab_2[512] __attribute__((aligned(0x1000)));

void pagingInit(uintptr_t start) {
    pml4[0] = (uint64_t)&page_dir_ptr_tab | 3;  // this doesn't work

I spent an entire day trying to do this. My previous attempt crashed when trying to set a 52 bit bit field to 0.
Code:
typedef struct page {
    uint32_t present        : 1;
    uint32_t rw             : 1;
    uint32_t user           : 1;
    uint32_t write_through  : 1;
    uint32_t cache_disable  : 1;
    uint32_t accessed       : 1;
    uint32_t dirty          : 1;
    uint32_t zero           : 1;
    uint32_t global         : 1;
    uint32_t gp             : 3;
    uint64_t addr           : 52;
} __attribute__ ((packed)) page_t;

void pagingInit(uintptr_t start) {
     uintptr_t address = 0;
     uintptr_t pt_start = start & 0xFFFFF000;

     page_t *page_table = (page_t *)pt_start;

     for (int i = 0; i < 512 * 512; i++) {
         page_table[i].present    = 1;
         page_table[i].rw         = 1;
         page_table[i].addr       = address; // crashes here


compiled with:
Code:
G++:
-nostdlib -ffreestanding -g -Wall -Wextra -mno-80387 -masm=intel -fno-pie -fno-pic -static -fno-exceptions -fno-non-call-exceptions -mno-red-zone -mno-mmx -mno-sse -mno-sse2 -std=gnu++1z -fno-exceptions -fno-rtti -fno-common -O0 -IT8/include
NASM:
-felf64

It's 4:30 AM. I can't do this anymore.

Author:  iansjack [ Sun Jan 12, 2020 12:46 am ]
Post subject:  Re: Failing very hard at 64 bit paging

Why aligned(0x20)?

Author:  nullplan [ Sun Jan 12, 2020 1:29 am ]
Post subject:  Re: Failing very hard at 64 bit paging

Ooh boy, someone has a very wrong conception of paging here. Get some sleep. Then we can talk about what processor mode you are in. Because you are using multiboot, you must still be in 32-bit mode, so this snippet doesn't do anything, because you already are in protected mode:
Code:
     mov rax, cr0
     or eax, 1
     mov cr0, rax


I'm guessing you are compiling your code for 64-bit mode, and this will not work with multiboot: the CPU is still in 32-bit mode, so it decodes your 64-bit instructions as 32-bit ones and they end up doing something other than what the source says. My kernel is 64-bit, but I tell GRUB that it's a module, and I give GRUB a 32-bit loader as the kernel. The actual kernel is an ELF file, so loading it is actually simple, because I know where it is supposed to go. And I can just tell GRUB to page-align it.

The next problem is the paging structures: each one is exactly one page (4096 bytes) in size, i.e. 512 64-bit entries, and has to be page-aligned. And except for the PML4, there is more than one of each. I have a very simple allocator in the loader kernel: it allocates memory from the low memory that should be free. Reserving the first and last page of the initial 640 kB still gives you 158 pages. So what it does is:

Code:
static uintptr_t hwm = 0x1000;
void *alloc_page(void) {
    void *r = (void*)hwm;
    if (hwm == 0x9f000)
        panic("Out of conv. memory while booting");
    hwm += 0x1000;
    return r;
}
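
As a sanity check on the 158-page figure, the allocator above can be simulated on a hosted compiler. This is only a sketch, not the loader's code: sim_alloc_page and count_boot_pages are hypothetical names, and the panic is replaced by returning 0 so that the counting loop can stop.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE   0x1000u
#define FIRST_PAGE  0x1000u   /* page 0 is skipped */
#define LIMIT       0x9f000u  /* the last page below 640 kB stays reserved */

/* Hypothetical stand-in for alloc_page(): returns the address of the next
 * free page, or 0 when the pool is exhausted (the original panics here). */
static uint32_t sim_alloc_page(uint32_t *hwm)
{
    if (*hwm == LIMIT)
        return 0;
    uint32_t r = *hwm;
    *hwm += PAGE_SIZE;
    return r;
}

/* Counts how many pages the bump allocator can hand out in total. */
static unsigned count_boot_pages(void)
{
    uint32_t hwm = FIRST_PAGE;
    unsigned n = 0;
    uint32_t p;
    while ((p = sim_alloc_page(&hwm)) != 0) {
        assert((p & (PAGE_SIZE - 1)) == 0);  /* every page is 4 KiB aligned */
        n++;
    }
    return n;
}
```

count_boot_pages() returns 158: 640 kB is 160 pages, minus the two reserved ones. Since every returned address is a multiple of 0x1000, any table allocated this way is automatically page-aligned.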

Then writing an mmap function is actually pretty simple:
Code:
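uint64_t *next_level(uint64_t *this_level, size_t next_num)
{
    uint64_t *r;
    if (!(this_level[next_num] & 1)) {
        /* Entry not present: allocate and clear a new table, then link it. */
        r = alloc_page();
        memset(r, 0, 4096);
        this_level[next_num] = (uintptr_t)r | 3;
    } else
        /* Entry present: strip the flag bits to get the table address. */
        r = (void*)(uintptr_t)(this_level[next_num] & -4096ull);
    return r;
}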
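
The two bit tricks in next_level are easier to see in isolation: "| 3" sets the present and writable bits, and "& -4096ull" clears the low 12 bits to get the table address back. A small sketch, using hypothetical helper names (make_table_entry, entry_to_addr) that are not part of the loader:

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT  0x1ull
#define PTE_WRITABLE 0x2ull

/* Builds an entry pointing at a 4 KiB-aligned table, present and writable. */
static uint64_t make_table_entry(uint64_t table_phys)
{
    assert((table_phys & 0xfff) == 0);   /* tables must be page-aligned */
    return table_phys | PTE_PRESENT | PTE_WRITABLE;   /* same as "| 3" */
}

/* Recovers the table address by masking off the 12 low flag bits.
 * -4096ull is 0xfffffffffffff000. */
static uint64_t entry_to_addr(uint64_t entry)
{
    return entry & -4096ull;
}
```

Note that the mask keeps bits 12 to 63, so it only yields a clean address while the upper flag bits (such as NX in bit 63) are clear, which is the case for entries created with "| 3".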

static uint64_t *pml4;
void mmap(uint64_t physaddr, uint64_t virtaddr, size_t len, uint64_t attr)
{
    uint64_t *pdpt, *pdt, *pt;
    size_t pdpn, pdn, ptn, pn;
    if (!pml4) {
        pml4 = alloc_page();
        memset(pml4, 0, 4096);
    }
    assert((physaddr & 0xfff) == (virtaddr & 0xfff));
    len += physaddr & 4095;
    physaddr &= -4096ull;
    virtaddr &= -4096ull;
    len = (len + 4095) & -4096ul;
    for (; len; len -= 4096, virtaddr += 4096, physaddr += 4096) {
        pn = (virtaddr >> 12) & 0x1ff;
        ptn = (virtaddr >> 21) & 0x1ff;
        pdn = (virtaddr >> 30) & 0x1ff;
        pdpn = (virtaddr >> 39) & 0x1ff;

        pdpt = next_level(pml4, pdpn);
        pdt = next_level(pdpt, pdn);
        pt = next_level(pdt, ptn);
        pt[pn] = physaddr | attr | 1;
    }
}
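
To make the index arithmetic in the loop concrete, here is a standalone sketch that splits a virtual address into its four table indices, using the same shifts as mmap. The struct and the function name split_va are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct va_indices {
    unsigned pml4;    /* bits 39..47, called pdpn in mmap above */
    unsigned pdpt;    /* bits 30..38, called pdn */
    unsigned pd;      /* bits 21..29, called ptn */
    unsigned pt;      /* bits 12..20, called pn */
    unsigned offset;  /* bits 0..11, byte offset inside the page */
};

/* Splits a virtual address the way 4-level paging walks it. */
static struct va_indices split_va(uint64_t va)
{
    struct va_indices ix;
    ix.pml4   = (unsigned)((va >> 39) & 0x1ff);
    ix.pdpt   = (unsigned)((va >> 30) & 0x1ff);
    ix.pd     = (unsigned)((va >> 21) & 0x1ff);
    ix.pt     = (unsigned)((va >> 12) & 0x1ff);
    ix.offset = (unsigned)(va & 0xfff);
    return ix;
}
```

For example, 0xffffffff80000000 splits into indices 511, 510, 0, 0, and the stack base used further down (0xffffffff00000000 - 4*4096) splits into 511, 507, 511, 508.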

Now to draw it all together, the actual mapping code:
Code:
    size_t pn = eh->e_phnum;
    for (Elf64_Phdr *ph = (void*)(base + eh->e_phoff); pn; pn--, ph = (void*)((char*)ph + eh->e_phentsize)) {
        if (ph->p_type == PT_LOAD) {
            mmap(base + ph->p_offset, ph->p_vaddr, ph->p_filesz, get_attr(ph->p_flags));
            size_t n = (ph->p_memsz - ph->p_filesz) >> 12;
            if (n) {
               void *p = alloc_page();
               for (size_t i = 1; i < n; i++) alloc_page();
               mmap((uintptr_t)p, (ph->p_vaddr + ph->p_filesz + 4095) & -4096ull, n << 12, get_attr(ph->p_flags));
            }
        }
    }
    void *stack = alloc_page(); alloc_page(); alloc_page(); alloc_page();
    mmap((uintptr_t)stack, 0xffffffff00000000 - 4*4096, 4*4096, PF_WRITE | PF_NX);
    uint64_t stack_top = prepare_kernel_stack(stack, 4*4096, mb_info);
    extern char trampoline_start[], trampoline_end[];
    mmap((uintptr_t)trampoline_start, (uintptr_t)trampoline_start, trampoline_end - trampoline_start, 0);
    go64(pml4, eh->e_entry & 0xffffffff, stack_top);
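
get_attr is not shown in this post. The following is only a guess at what such a function could look like, translating ELF segment permissions into page-table bits; sketch_get_attr and the PTE_* names are hypothetical, and the real function may well differ. One thing to keep in mind: bit 63 (NX) is only valid when EFER.NXE is enabled, otherwise the CPU treats it as a reserved bit.

```c
#include <assert.h>
#include <stdint.h>

/* ELF segment permission flags (values from the ELF specification). */
#define ELF_PF_X 0x1u
#define ELF_PF_W 0x2u
#define ELF_PF_R 0x4u

/* x86-64 page-table entry bits. */
#define PTE_WRITABLE (1ull << 1)
#define PTE_NX       (1ull << 63)  /* reserved unless EFER.NXE is set */

/* Hypothetical translation of ELF p_flags into page attributes.
 * Readability needs no bit: every present page is readable. */
static uint64_t sketch_get_attr(uint32_t p_flags)
{
    uint64_t attr = 0;
    if (p_flags & ELF_PF_W)
        attr |= PTE_WRITABLE;
    if (!(p_flags & ELF_PF_X))
        attr |= PTE_NX;
    return attr;
}
```

With this mapping a read+execute text segment gets no extra bits, while a read+write data segment gets writable plus no-execute.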

And finally:
Code:
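go64:
    mov eax, cr4
    or eax, 0x20            ; CR4.PAE
    mov cr4, eax
    mov eax, [esp+4]        ; first argument: address of the PML4
    mov cr3, eax
    mov ecx, 0xc0000080     ; EFER MSR
    rdmsr
    or eax, 0x100           ; EFER.LME
    wrmsr
    mov edx, [esp + 8]      ; low 32 bits of the kernel entry point
    mov edi, [esp+12]       ; low half of stack_top
    mov esi, [esp+16]       ; high half of stack_top
    mov eax, cr0
    or eax, 0x80000001      ; CR0.PG | CR0.PE
global trampoline_start
trampoline_start:           ; this range is identity-mapped by the loader
    mov cr0, eax            ; paging on, long mode becomes active
    jmp CSEG64:continue     ; far jump into a 64-bit code segment
bits 64
continue:
    mov esp, esi            ; rsp = (high half << 32) | low half
    shl rsp, 32
    mov edi, edi            ; zero-extend edi into rdi
    or rsp, rdi
    xor eax, eax
    not eax                 ; rax = 0x00000000ffffffff
    shl rax, 32             ; rax = 0xffffffff00000000
    or rdx, rax             ; force the upper half of the entry point to all ones
    jmp rdx
global trampoline_end
trampoline_end: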
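
For readers who find the register shuffling hard to follow, the constants and the two address computations in go64 can be written out in C. This is just an illustration that runs on a normal host; join_halves and high_entry are hypothetical names, not something the loader contains.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_PAE   (1u << 5)      /* 0x20 */
#define MSR_EFER  0xc0000080u
#define EFER_LME  (1u << 8)      /* 0x100 */
#define CR0_PE    (1u << 0)
#define CR0_PG    (1u << 31)

/* Mirrors "mov esp, esi / shl rsp, 32 / mov edi, edi / or rsp, rdi":
 * rebuilds a 64-bit value from two 32-bit stack slots. */
static uint64_t join_halves(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Mirrors "xor eax, eax / not eax / shl rax, 32 / or rdx, rax":
 * the upper 32 bits of the entry point are forced to all ones. */
static uint64_t high_entry(uint32_t entry_low)
{
    return 0xffffffff00000000ull | entry_low;
}
```

One consequence of the second function: only the low 32 bits of e_entry are passed in, so this scheme assumes the kernel's entry point lies in the top 4 GB of the address space.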

Author:  ~ [ Sun Jan 12, 2020 4:23 am ]
Post subject:  Re: Failing very hard at 64 bit paging

It took me a whole year of a formal project just to write basic 32-bit paging support.

Write your code and place a CLI-HLT at every step so you can correct any problem: print the results on screen and inspect the emulator's memory to check the sanity of the structures, even if you have to recompile every time you move the CLI-HLT. This method will double your learning and your ability to finish fast and without errors, as if you were talking with the computer and it returned the real answer to what you ran.

Start by writing very basic functions for the operations you can think of first: searching for free pages, freeing reserved pages, and searching for virtual blocks of a given size (blocks the size of one page table, for example 1024 entries covering 4 MB, are the easiest to implement directly). Then add freeing of virtual blocks and zeroing out of new paging structures, so that stale data is never used (if an entry is 0 it is fully free, so you must keep that consistent at all times). The result will be a library of low-level functions that will at least end up providing malloc-like functions.

You will probably need to evaluate how malloc/realloc/free works, how to clean up if malloc/realloc fails, how to align your blocks, and whether you need to group your pages in clusters or will base your code only on single-page allocations. It could take you 3 to 6 months.

I had to learn that you do NOT have to map paging structures themselves if you want to keep things sane. Use virtual pages and remap them temporarily to the paging structures themselves, or disable/enable paging to do it.

I had to keep things extremely slow and simple: a bitmap to keep track of physical pages, manual memory detection, and the assumption that all memory is normal rather than device or reserved memory, so that I could concentrate fully on the paging problem. My code takes around 2 seconds to map a 40 MB block under Bochs on a dual-core 3 GHz machine, for example. But at least I now have an infrastructure that I can improve by replacing the slow code with better and better algorithms, and that is exactly what I want to learn: where paging can be simplified and accelerated the most in the simplest ways, along with the best tricks, such as the fact that paging structures don't need to stay mapped once you are done manipulating them.
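
A bitmap like the one described here can be very small. The sketch below is not the poster's code; the names and the NPAGES value are assumptions, and it deliberately uses the slow linear search mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES 1024u   /* hypothetical: tracks 4 MB of physical memory */

static uint8_t page_bitmap[NPAGES / 8];   /* one bit per page, 1 = in use */

static int page_in_use(size_t n)
{
    return (page_bitmap[n / 8] >> (n % 8)) & 1;
}

static void mark_page(size_t n, int used)
{
    if (used)
        page_bitmap[n / 8] |= (uint8_t)(1u << (n % 8));
    else
        page_bitmap[n / 8] &= (uint8_t)~(1u << (n % 8));
}

/* Linear search for a free page; returns its index, or -1 if none is left. */
static long alloc_phys_page(void)
{
    for (size_t n = 0; n < NPAGES; n++) {
        if (!page_in_use(n)) {
            mark_page(n, 1);
            return (long)n;
        }
    }
    return -1;
}

static void free_phys_page(size_t n)
{
    assert(n < NPAGES && page_in_use(n));   /* catch double frees */
    mark_page(n, 0);
}
```

The physical address of page n is n * 4096. Replacing the linear search with something faster later does not change the interface, which fits the "improve the existing infrastructure" approach.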

All the effort I put into paging is mostly only useful for a basic C-based allocator that implements malloc/calloc/realloc/free; that seems to be the main purpose of paging. Paging is just an array, compressed in space and time, that is friendly to any memory management algorithm in a generic way. Each entry even has an "AVL" field: three bits that the CPU leaves available to the OS (the name stands for "available", not for AVL trees), which lets you mark different contiguous blocks with a distinct value from 1 to 7 (0 is normally treated as free, or as permanent/kernel area if the other fields aren't 0).
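
The AVL field sits in bits 9 to 11 of an entry, which the CPU ignores. Tagging an entry with a value from 0 to 7 is a matter of masking and shifting; a sketch for 32-bit entries, with hypothetical helper names set_avl and get_avl:

```c
#include <assert.h>
#include <stdint.h>

#define AVL_SHIFT 9
#define AVL_MASK  (0x7u << AVL_SHIFT)   /* bits 9..11 */

/* Returns the entry with its AVL field replaced by tag (0..7). */
static uint32_t set_avl(uint32_t entry, unsigned tag)
{
    assert(tag <= 7);
    return (entry & ~AVL_MASK) | ((uint32_t)tag << AVL_SHIFT);
}

/* Extracts the AVL field from an entry. */
static unsigned get_avl(uint32_t entry)
{
    return (entry & AVL_MASK) >> AVL_SHIFT;
}
```

For example, tagging the entry 0x5003 with the value 5 gives 0x5a03, and the address and flag bits are left untouched.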

Author:  iansjack [ Sun Jan 12, 2020 6:17 am ]
Post subject:  Re: Failing very hard at 64 bit paging

The OP is using gdb to debug their code. No need for a laborious cli/hlt process when you can just single-step the code and inspect memory.

Author:  firewire [ Sun Jan 12, 2020 3:27 pm ]
Post subject:  Re: Failing very hard at 64 bit paging

Alright thanks guys. I think I'll just do the loader method.

All times are UTC - 6 hours