OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Fixed: Higher Half in C?
PostPosted: Tue Jul 03, 2018 5:02 pm 

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1134
Yeah, yeah, I know I might be annoying with all the questions, but Google apparently isn't almighty. :P

The question is simple (maybe not the answer):
How do I achieve a higher half kernel after enabling paging in C(++)? I don't want my loader to deal with paging in any way.
I tried mapping 1 MB -> 1 MB and 1 MB -> 3 GB and then jumping to some point and removing 1 MB -> 1 MB, but that didn't work.
I've heard it can be done, but how?

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader


Last edited by Octacone on Wed Jul 18, 2018 3:55 pm, edited 1 time in total.

 
 Post subject: Re: Higher Half in C?
PostPosted: Tue Jul 03, 2018 5:33 pm 

Joined: Sun Apr 05, 2015 3:15 pm
Posts: 31
"that didn't work", but you did not specify what happened.
The way you word it, it sounds like you're trying to do this all in C, but you'll want to do this in assembly, before you enter your C code.
Anyway, there's a wiki article on that: Higher_Half_x86_Bare_Bones
Is there a particular reason you want to do this in C?

_________________
osdev project, goal is to run wasm as userspace: https://github.com/kwast-os/kwast


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 12:17 am 

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
When booting, you generally have a program that loads your kernel and assorted modules into memory, a program that sets up the environment your kernel expects, and then the main program. How you organize this is entirely up to you.

Many people use a boot stub in their kernels. That is, a small program that is part of the kernel image and contains the entry point: a short assembly stub sets up a 32-bit C environment (for instance, clearing the direction flag and passing the multiboot parameters as actual C parameters) and calls the boot stub's main function. That function then allocates some memory for the page tables and fills them with an identity mapping for the boot stub plus the mapping you want for the kernel. The rest of the kernel is linked above 3 GB. So essentially something like this (untested, off the top of my head, assuming ELF):

boots.s:
Code:
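/* Must be ".data" (not ".bss"), to avoid overwriting the kernel at run time. */
.section ".data","aw",@progbits
.skip 4096
kstack:                         /* top of the 4 KiB stub stack */

.text
.align 4
mbhead:                         /* Multiboot 1 header */
.long 0x1badb002                /* magic */
.long 2                         /* flags: request memory information */
.long -0x1badb002-2             /* checksum: magic + flags + checksum == 0 */
.long 0,0,0,0,0,0,0,0,0         /* remaining header fields, unused with these flags */
.global _start
_start:
movl $kstack, %esp
cld
pushl %ebx                      /* multiboot info pointer becomes stubmain's argument */
call stubmain
1:                              /* stubmain should never return; halt if it does */
cli
hlt
jmp 1b

.global go_kernel
/* Enables paging and enters the kernel. Takes one argument: the address of the page directory. */
go_kernel:
    movl 4(%esp), %eax
    movl %eax, %cr3             /* load the page directory base */
    movl $0x80000001, %eax      /* CR0.PG | CR0.PE */
    movl %eax, %cr0
    jmp _realstart              /* kernel entry point, linked in the higher half */
.global go_kernel_end
go_kernel_end: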


Then bootc.c:
Code:
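#include <stdint.h>
#include <stdnoreturn.h>
#define PF_PRESENT 1
#define PF_WRITE 2
#define PF_USER 4

struct multiboot_info; /* only passed through here; layout comes from the Multiboot spec */

/* Assumed helper, not shown in this post: returns the physical address of a
   zeroed, page-aligned 4 KiB page. */
extern uint32_t zalloc_page(void);

extern noreturn void go_kernel(uint32_t *pagedir);
extern char go_kernel_end[];
/* Linker-script symbols: *_offs are physical addresses, *_start/*_end virtual ones. */
extern char ktext_offs[];
extern char ktext_start[];
extern char ktext_end[];
extern char kdata_offs[];
extern char kdata_start[];
extern char kdata_end[];

static uint32_t *pagedir;

/* Map len bytes at vaddr to paddr, creating page tables on demand. */
static void mmap(uint32_t vaddr, uint32_t paddr, uint32_t len, uint32_t flags)
{
    /* assuming non-PAE paging for simplicity here */
    uint32_t *pagetab;

    if (!pagedir)
        pagedir = (uint32_t *)zalloc_page();
    if (vaddr & 0xfff)
    {
        /* round the start down to a page boundary and grow the length to match */
        len += vaddr & 0xfff;
        paddr -= vaddr & 0xfff;
        vaddr &= 0xfffff000;
    }
    len = (len + 4095) & 0xfffff000u;
    while (len)
    {
        int pgd = (vaddr >> 22) & 0x3ff;
        int pgt = (vaddr >> 12) & 0x3ff;
        if (!pagedir[pgd])
        {
            pagedir[pgd] = zalloc_page() | PF_PRESENT | PF_WRITE;
        }
        pagetab = (uint32_t *)(pagedir[pgd] & 0xfffff000u);
        pagetab[pgt] = paddr | flags;
        vaddr += 4096;
        paddr += 4096;
        len -= 4096;
    }
}

void stubmain(struct multiboot_info *info)
{
    (void)info; /* not yet passed on to the kernel */
    /* whatever */
    /* identity map the code that is still running while paging gets switched on */
    mmap((uint32_t)go_kernel, (uint32_t)go_kernel, go_kernel_end - (char *)go_kernel, PF_PRESENT);
    mmap((uint32_t)ktext_start, (uint32_t)ktext_offs, ktext_end - ktext_start, PF_PRESENT);
    /* the length is end - start in virtual addresses (the original mixed in kdata_offs) */
    mmap((uint32_t)kdata_start, (uint32_t)kdata_offs, kdata_end - kdata_start, PF_PRESENT | PF_WRITE);
    go_kernel(pagedir);
}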


Then the linker script:
Code:
ENTRY(_start)
SECTIONS {
    . = 0x100000; /* load to 1MB */
    .text.early : {
        boot?.o(.text*)
        boot?.o(.rodata*)
    }
    . = ALIGN(4096);
    .data.early : {
        boot?.o(.data*)
        boot?.o(.bss*)
        boot?.o(COMMON)
    }
    . = ALIGN(4096);
    ktext_offs = .; /* physical address where the kernel text gets loaded */
    . = 0xC0000000;
    /* AT() sets the load (physical) address; still untested, like the rest */
    .text : AT(ktext_offs) {
        _stext = .;
        ktext_start = .;
        *(.text*)
        *(.rodata*)
        _etext = .;
        ktext_end = .;
    }
    . = ALIGN(4096);
    kdata_offs = ktext_offs + (. - _stext); /* physical address of the kernel data */
    .data : AT(kdata_offs) {
        kdata_start = .;
        *(.data*)
        _edata = .;
        *(.bss*)
        *(COMMON)
        _end = .;
        kdata_end = .;
    }
}


This should at least give you a starting point. The kernel then needs to contain a symbol _realstart, which should be written in assembly, set up a boot stack, and call the kernel main, which should then clear the bss section (everything between _edata and _end). Also, the multiboot info is currently not funneled through to the kernel, which you might want to rectify. And I don't know if the linker script magic does what it's supposed to: ktext_offs and kdata_offs are supposed to be the physical addresses of their respective sections.

Also, currently no stack is allocated. After turning on paging, the world only consists of the kernel data and text. Which isn't a problem if a boot stack is part of your kernel data, but since the boot stub already needs a page allocator, you might as well allocate the memory needed at the start and switch stacks within go_kernel.

Also, you are never going to get rid of all the assembly. Setting the PG bit has to happen in assembly, because between turning on paging and switching stacks you must not touch the stack at all. Unless you identity map the stack as well, which seems... weird, to say the least.
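
To make the address arithmetic in the stub's mmap function concrete, here is a small sketch of how non-PAE x86 paging splits a 32-bit virtual address. It is written as ordinary hosted C so it can be tried outside a kernel; the helper names pd_index, pt_index and page_offset are made up for this illustration and do not appear in the code above.

```c
#include <stdint.h>

/* Non-PAE x86 paging: the top 10 bits select the page directory entry,
   the next 10 bits select the page table entry, the low 12 bits are the
   offset inside the 4 KiB page. */
static inline uint32_t pd_index(uint32_t vaddr)    { return (vaddr >> 22) & 0x3ffu; }
static inline uint32_t pt_index(uint32_t vaddr)    { return (vaddr >> 12) & 0x3ffu; }
static inline uint32_t page_offset(uint32_t vaddr) { return vaddr & 0xfffu; }
```

For the two addresses in this thread, 0x00100000 lands in directory entry 0, table entry 256, while 0xC0000000 lands in directory entry 768, table entry 0. So the identity mapping and the higher-half mapping live in different page tables, and removing one does not disturb the other.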

_________________
Carpe diem!


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 4:41 am 

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1134
ndke wrote:
"that didn't work", but you did not specify what happened.
The way you word it, it sounds like you're trying to do this all in C, but you'll want to do this in assembly, before you enter your C code.
Anyway, there's a wiki article on that: Higher_Half_x86_Bare_Bones
Is there a particular reason you want to do this in C?


After I removed the 1 MB -> 1 MB mappings, it crashed: the physical address was not available. Which means I didn't do a proper jump (C code -> assembly code -> C code).
My paging code is fully written in C++ and works as expected. I can map my kernel to 3 GB, but how do I jump to it, since it's still executing in the lower half?
I know there are many articles, but none of them is what I need. They are all pre-enabling examples.
I want to do it this way because I don't want duplicate paging code, I don't want to waste space on page tables twice, and I don't want to code anything paging-related in assembly, certainly not before I have enabled everything else I need. I plan on growing my kernel, so manually adjusting all the temporary page tables to accommodate every size change is not something I want to do. NASM supports neither bitfields nor structs, which is a big no-no. I don't have any problem with writing code in assembly; it's just that some things are easier in higher-level languages. Let's not discuss my design choices.

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 4:46 am 

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1134
nullplan wrote:
When booting, you generally have a program that loads your kernel and assorted modules into memory, a program that sets up the environment your kernel expects, and then the main program. How you organize this is entirely up to you.

Many people use a boot stub in their kernels. That is, a small program that is part of the kernel image and contains the entry point: a short assembly stub sets up a 32-bit C environment (for instance, clearing the direction flag and passing the multiboot parameters as actual C parameters) and calls the boot stub's main function. That function then allocates some memory for the page tables and fills them with an identity mapping for the boot stub plus the mapping you want for the kernel. The rest of the kernel is linked above 3 GB. So essentially something like this (untested, off the top of my head, assuming ELF):

boots.s:
Code:
--snip--


This should at least give you a starting point. The kernel then needs to contain a symbol _realstart, which should be written in assembly, set up a boot stack, and call the kernel main, which should then clear the bss section (everything between _edata and _end). Also, the multiboot info is currently not funneled through to the kernel, which you might want to rectify. And I don't know if the linker script magic does what it's supposed to: ktext_offs and kdata_offs are supposed to be the physical addresses of their respective sections.

Also, currently no stack is allocated. After turning on paging, the world only consists of the kernel data and text. Which isn't a problem if a boot stack is part of your kernel data, but since the boot stub already needs a page allocator, you might as well allocate the memory needed at the start and switch stacks within go_kernel.

Also, you are never going to get rid of all the assembly. Setting the PG bit has to happen in assembly, because between turning on paging and switching stacks you must not touch the stack at all. Unless you identity map the stack as well, which seems... weird, to say the least.


Looks like you didn't understand my question. I am really sorry that you had to write all this :| I apologize if my question was not clear.
I actually already have a functioning kernel with a custom bootloader and memory management architecture.
I was asking for a way to get my kernel to the higher half after jumping to C++ code, enabling paging, and mapping the kernel to 3 GB. My paging code works just fine; it's the higher half part that I'm having trouble with.
This was already discussed on this forum and concluded to be possible, but no definitive answer was given since the question was just hypothetical.

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 4:47 am 

Joined: Sat Mar 31, 2012 3:07 am
Posts: 4591
Location: Chichester, UK
Octacone wrote:
I can map my kernel to 3 GB, but how do I jump to it since it's still executing in lower half.
You do a long jump.

You really have to do this in an assembler function. Trying to do it in C, with inline assembler, is highly unlikely to work. The reason that your Google search on doing this entirely in C produced no results is because that's not how it's done. You do all this sort of housekeeping in your boot code before you call your C kernel code.


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 4:55 am 

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1134
iansjack wrote:
Octacone wrote:
I can map my kernel to 3 GB, but how do I jump to it since it's still executing in lower half.
You do a long jump.

You really have to do this in an assembler function. Trying to do it in C, with inline assembler, is highly unlikely to work. The reason that your Google search on doing this entirely in C produced no results is because that's not how it's done. You do all this sort of housekeeping in your boot code before you call your C kernel code.


Does that mean that I can't do this:
1. My bootloader loads my kernel to 1 MB.
2. My kernel gets executed.
3. It does some stuff and then jumps to C++ code.
4. Other systems get initialized.
5. Paging gets enabled, the kernel gets simultaneously mapped to both 1 MB and 3 GB.
6. An external assembly function gets called.
7. Higher half jump happens.
8. Identity mapping gets removed.
Does this plan look okay?
Could I fix anything by making my code position independent?


Attachment: idea_from.png

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader
 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 5:19 am 

Joined: Sat Mar 31, 2012 3:07 am
Posts: 4591
Location: Chichester, UK
Much easier to

1. Load kernel
2. Set up and enable paging (with the current kernel position identity mapped)
3. Either move kernel to upper half, or just map upper half to current kernel position
4. Set your ds, ss, and sp to the appropriate new values
5. Long jump to the kernel in the upper half
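
Steps 3 and 4 come down to one fixed offset between where the kernel sits physically and where it appears in the upper half. A minimal sketch of that arithmetic, assuming the 1 MB load address and 3 GB target mentioned earlier in the thread (the macro and function names are invented for this example):

```c
#include <stdint.h>

#define KERNEL_PHYS_BASE 0x00100000u  /* load address (1 MB), from the thread */
#define KERNEL_VIRT_BASE 0xC0000000u  /* higher-half address (3 GB), from the thread */
#define KERNEL_OFFSET    (KERNEL_VIRT_BASE - KERNEL_PHYS_BASE)

/* Translate between the identity-mapped view and the higher-half view
   of the same kernel byte. Only valid for addresses inside the kernel image. */
static inline uint32_t phys_to_virt(uint32_t paddr) { return paddr + KERNEL_OFFSET; }
static inline uint32_t virt_to_phys(uint32_t vaddr) { return vaddr - KERNEL_OFFSET; }
```

Note that with a 1 MB -> 3 GB mapping the offset works out to 0xBFF00000. An offset of exactly 0xC0000000 only applies when physical address 0 is what gets mapped at 3 GB, so whichever convention is chosen has to be used consistently for ESP and for jump targets.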


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 5:59 am 

Joined: Fri Aug 07, 2015 6:13 am
Posts: 1134
iansjack wrote:
Much easier to

1. Load kernel
2. Set up and enable paging (with the current kernel position identity mapped)
3. Either move kernel to upper half, or just map upper half to current kernel position
4. Set your ds, ss, and sp to the appropriate new values
5. Long jump to the kernel in the upper half


4. What does "appropriate" mean? I know I have to add 0xC0000000 to ESP, but what about the others? I can't just change them to arbitrary values, because then they won't reference a valid kernel-mode GDT entry. How can they reference an entry while not being 0x10 (ES, FS, DS, SS, GS) or 0x08 (CS)?

I managed to increment EIP by 0xC0000000 by doing this:
Code:
; ... after paging gets enabled and everything is ID mapped + higher half mapped ...
jmp 0x08:Higher_Half

extern Test ; C function that prints "Higher Half" and removes ID mappings
Higher_Half:
    mov eax, Test
    add eax, 0xC0000000
    push eax
    ret

When I open Bochs I can see that my EIP and ESP are properly loaded, but as soon as I remove the ID mappings, Qemu freezes and Bochs says bx_dbg_read_linear: physical address not available for linear 0x000000000010234d

_________________
OS: Basic OS
About: 32 Bit Monolithic Kernel Written in C++ and Assembly, Custom FAT 32 Bootloader


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 6:03 am 

Joined: Sat Mar 31, 2012 3:07 am
Posts: 4591
Location: Chichester, UK
I can't tell you where to put your stack and data. That's part of the choices you make when designing your kernel. But wherever you put them, you need to ensure that those addresses are mapped in your page tables.


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 9:47 am 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5103
Octacone wrote:
When I open Bochs I can see that my EIP and ESP are properly loaded, but as soon as I remove the ID mappings, Qemu freezes and Bochs says bx_dbg_read_linear: physical address not available for linear 0x000000000010234d

The stack still contains many addresses that still refer to the original identity-mapped portion of memory, including return addresses, so your EIP does not stay in the higher half for long.
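
The effect can be illustrated with a toy model rather than a real stack walk. The sketch below scans an array standing in for the stack and counts the words that look like addresses inside the identity-mapped kernel image; each of those would fault once the low mapping is gone. The function name, the LOW_BASE constant and the sample values are assumptions made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define LOW_BASE 0x00100000u  /* identity-mapped load address from the thread */

/* Count stack words that fall inside [LOW_BASE, LOW_BASE + image_size).
   In a real kernel these would typically be saved return addresses and
   pointers taken before the jump to the higher half. */
static size_t count_low_half_words(const uint32_t *stack, size_t words,
                                   uint32_t image_size)
{
    size_t hits = 0;
    for (size_t i = 0; i < words; i++) {
        if (stack[i] >= LOW_BASE && stack[i] - LOW_BASE < image_size)
            hits++;
    }
    return hits;
}
```

A value such as 0x0010234d, the linear address from the Bochs message, is exactly the kind of word this would flag: a `ret` that pops it sends EIP straight back into the lower half.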

If you want to run C(++) code before you've switched to the higher half mapping, you have maybe two choices.

The sensible choice is to have a separate "startup" section linked to execute at its load address, instead of in the higher half. This sort of design is easy to port to other architectures (e.g. 64-bit) later on, but the "startup" section and the higher-half section can't reference each other's symbols. It works something like this:
1. Your bootloader loads the kernel to 1MB.
2. Your kernel's assembly entry point gets executed.
3. It does some stuff and then jumps to C++ code in the startup section.
4. The C++ code in the startup section creates the initial page tables.
5. The C++ code returns to the assembly entry point code.
6. The assembly entry point code enables paging and jumps to the main higher-half C++ code.
7. The identity mapping gets removed.
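
Steps 4 and 7 of the sensible approach can be sketched as follows. This is a hosted simulation, not kernel code: the tables are plain arrays, the "physical addresses" of the page tables are passed in as numbers, and all names (struct boot_tables, build_boot_tables, drop_identity_mapping) are hypothetical. It assumes non-PAE paging with 4 KiB pages and a kernel loaded at 1 MB that should also appear at 3 GB.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE    4096u
#define PTE_PRESENT  0x1u
#define PTE_WRITE    0x2u
#define KERNEL_PHYS  0x00100000u
#define KERNEL_VIRT  0xC0000000u

struct boot_tables {
    uint32_t dir[1024];   /* page directory */
    uint32_t low[1024];   /* page table for the identity mapping */
    uint32_t high[1024];  /* page table for the higher-half mapping */
};

/* Map kernel_pages pages of the kernel at both 1 MB and 3 GB.
   low_phys/high_phys are the physical addresses of t->low and t->high. */
static void build_boot_tables(struct boot_tables *t, uint32_t low_phys,
                              uint32_t high_phys, uint32_t kernel_pages)
{
    /* the low table is entered at index 256, so only 768 entries remain */
    assert(kernel_pages <= 768);
    for (uint32_t i = 0; i < kernel_pages; i++) {
        uint32_t phys = KERNEL_PHYS + i * PAGE_SIZE;
        uint32_t virt = KERNEL_VIRT + i * PAGE_SIZE;
        uint32_t pte = phys | PTE_PRESENT | PTE_WRITE;
        t->low[(phys >> 12) & 0x3ffu] = pte;
        t->high[(virt >> 12) & 0x3ffu] = pte;
    }
    t->dir[KERNEL_PHYS >> 22] = low_phys | PTE_PRESENT | PTE_WRITE;
    t->dir[KERNEL_VIRT >> 22] = high_phys | PTE_PRESENT | PTE_WRITE;
}

/* Step 7: remove the identity mapping. A real kernel must also flush the
   TLB afterwards (for example by reloading CR3). */
static void drop_identity_mapping(struct boot_tables *t)
{
    t->dir[KERNEL_PHYS >> 22] = 0;
}
```

The point of keeping two separate page tables is that step 7 becomes a single directory write, done from code that already runs in the higher half on a higher-half stack.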

The ridiculous choice, suggested in that screenshot you posted, is to use segmentation to wrap addresses around the 4GB mark so that virtual address 0xC0000000 maps to physical address 0x00100000 in your segments. This sort of design is specific to 32-bit x86, and it's strange enough that some CPUs might misbehave. It works something like this:
1. Your bootloader loads the kernel to 1MB.
2. Your kernel's assembly entry point gets executed.
3. It prepares a GDT with ridiculous descriptors, and loads the data and stack segments.
4. It performs a far jump to your C++ code, effectively using segmentation to put you in the higher half. (Virtual addresses are in the higher half, but linear addresses are not.)
5. Paging is initialized and enabled.
6. Some external assembly function is used to reload the segments, including CS, with flat segments. (Both virtual and linear addresses are now in the higher half.)
7. The identity mapping gets removed.
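
For the curious, the wrap-around behind the ridiculous choice is plain 32-bit arithmetic: the CPU adds the segment base to the offset and discards the carry. The sketch below computes the base needed for the 0xC0000000 -> 0x00100000 case from this thread and packs it into the standard 8-byte GDT descriptor layout. The function names are made up, and the access byte 0x9A and flags 0xC (ring 0 code, 4 KiB granularity, 32-bit) are just the usual flat-segment values, used here as an assumption.

```c
#include <stdint.h>

/* linear = base + offset, truncated to 32 bits (uint32_t wraps modulo 2^32) */
static inline uint32_t linear_address(uint32_t seg_base, uint32_t offset)
{
    return seg_base + offset;
}

/* Base that makes virtual address `virt` resolve to linear address `phys`. */
static inline uint32_t wrap_base(uint32_t virt, uint32_t phys)
{
    return phys - virt;
}

/* Pack base, 20-bit limit, access byte and 4-bit flags into a GDT descriptor. */
static uint64_t gdt_descriptor(uint32_t base, uint32_t limit,
                               uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xffffu);             /* limit bits 0..15   */
    d |= (uint64_t)(base & 0xffffffu) << 16;      /* base bits 0..23    */
    d |= (uint64_t)access << 40;                  /* access byte        */
    d |= (uint64_t)((limit >> 16) & 0xfu) << 48;  /* limit bits 16..19  */
    d |= (uint64_t)(flags & 0xfu) << 52;          /* granularity, size  */
    d |= (uint64_t)(base >> 24) << 56;            /* base bits 24..31   */
    return d;
}
```

With these helpers, wrap_base(0xC0000000, 0x00100000) gives 0x40100000, and a code descriptor with that base encodes as 0x40CF9A100000FFFF, compared with 0x00CF9A000000FFFF for the flat segment that step 6 switches back to.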


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 10:15 am 

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
Octacone wrote:
Looks like you didn't understand my question. I am really sorry that you had to write all this :| I apologize if my question was not clear.
I actually already have a functioning kernel with a custom bootloader and memory management architecture.
I was asking for a way to get my kernel to the higher half after jumping to C++ code, enabling paging, and mapping the kernel to 3 GB. My paging code works just fine; it's the higher half part that I'm having trouble with.
This was already discussed on this forum and concluded to be possible, but no definitive answer was given since the question was just hypothetical.


It's not a problem; after all, I was the one who decided to answer at such length. Anyway, what you want, then, is a point of no return. Instead of having a function that transitions to the high half, you should consider the first function in the high half as another entry point. And then you can do it in C as well. It's all just a matter of how everything is linked. If the linker places a function in the lower half, that function will be lost as soon as you unmap the low half. That's why I suggested keeping early text and main text in separate sections. Ideally, you'd want a warning on cross-references.

Incidentally, I personally use a stub loader as an actual self-contained program (written partly in C), but that is because my loader needs to be 32-bit and my main kernel is 64-bit. With that approach, the low-half part becomes its own program, and references to the high-half part are impossible unless they are explicit. However, the actual jump to the high part is done in assembly, because I need to switch operating modes and explicitly encode 64-bit instructions at some point.

_________________
Carpe diem!


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 11:38 am 

Joined: Mon Dec 28, 2015 11:11 am
Posts: 401
Though I think that the best thing to do is to make GRUB execute your own bootloader, which would find the kernel file, load it to some address, and map that address onto 0xC0000000 and just jump to the new address immediately.
You can also make your bootloader interactive later, e.g. showing boot entries from a file, for example "entries.txt" and that way completely remove the dependence on GRUB in your operating system.

I have very little experience with GRUB and I don't know how it really works, but I'm sure that the idea can be implemented efficiently.


 
 Post subject: Re: Higher Half in C?
PostPosted: Wed Jul 04, 2018 9:55 pm 

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
Lukand wrote:
Though I think that the best thing to do is to make GRUB execute your own bootloader, which would find the kernel file, load it to some address, and map that address onto 0xC0000000 and just jump to the new address immediately.
You can also make your bootloader interactive later, e.g. showing boot entries from a file, for example "entries.txt" and that way completely remove the dependence on GRUB in your operating system.

I have very little experience with GRUB and I don't know how it really works, but I'm sure that the idea can be implemented efficiently.


That seems wasteful, as GRUB is more than capable of loading arbitrary files as modules already. In my case, the stub program that sets up paging is named as the kernel and the actual kernel is named as a module. The boot stub then parses the ELF headers and maps everything.
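
As a rough idea of what "parses the ELF headers" involves, here is a sketch that walks the program headers of a 32-bit ELF image held in memory and hands every PT_LOAD segment to a callback. It is not the poster's code: the struct definitions follow the ELF32 layout, but the function names, the callback interface and the summary struct are invented for this example, and a freestanding stub would need its own memcpy and memcmp.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PT_LOAD 1u

struct elf32_ehdr {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};

struct elf32_phdr {
    uint32_t p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align;
};

typedef void (*segment_fn)(const struct elf32_phdr *ph, void *ctx);

/* Calls fn for each PT_LOAD segment. Returns the number of such segments,
   or -1 if the buffer does not look like a usable ELF32 image. */
static int for_each_load_segment(const void *image, size_t size,
                                 segment_fn fn, void *ctx)
{
    struct elf32_ehdr eh;
    if (size < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, "\177ELF", 4) != 0 || eh.e_ident[4] != 1)
        return -1;                          /* bad magic or not ELFCLASS32 */
    if (eh.e_phentsize != sizeof(struct elf32_phdr))
        return -1;
    if (eh.e_phoff > size ||
        (size - eh.e_phoff) / sizeof(struct elf32_phdr) < eh.e_phnum)
        return -1;                          /* program headers out of bounds */

    int count = 0;
    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        struct elf32_phdr ph;
        memcpy(&ph, (const unsigned char *)image + eh.e_phoff
                        + (size_t)i * sizeof ph, sizeof ph);
        if (ph.p_type != PT_LOAD)
            continue;
        if (fn)
            fn(&ph, ctx);                   /* a stub would map the segment here */
        count++;
    }
    return count;
}

/* Example callback: record the lowest virtual address and total memory size. */
struct load_summary { uint32_t lowest_vaddr; uint32_t total_memsz; };

static void summarize(const struct elf32_phdr *ph, void *ctx)
{
    struct load_summary *s = ctx;
    if (ph->p_vaddr < s->lowest_vaddr)
        s->lowest_vaddr = ph->p_vaddr;
    s->total_memsz += ph->p_memsz;
}
```

In a boot stub, the callback would presumably map p_memsz bytes at p_vaddr to wherever the module's bytes at p_offset sit in physical memory, zeroing the part beyond p_filesz.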

_________________
Carpe diem!


 
 Post subject: Re: Higher Half in C?
PostPosted: Thu Jul 05, 2018 4:58 am 

Joined: Fri Aug 19, 2016 10:28 pm
Posts: 360
Octacone wrote:
Does that mean that I can't do this:
1. My bootloader loads my kernel to 1 MB.
2. My kernel gets executed.
3. It does some stuff and then jumps to C++ code.
4. Other systems get initialized.
5. Paging gets enabled, the kernel gets simultaneously mapped to both 1 MB and 3 GB.
6. An external assembly function gets called.
7. Higher half jump happens.
8. Identity mapping gets removed.
Does this plan look okay?
Linux does something similar to what you describe. There are some details that need to be handled differently:
  1. My bootloader loads my kernel to 1 MB.
  2. My kernel gets executed.
  3. It does some stuff and then calls the lower-half C++ code.
  4. Other systems get initialized.
  5. Paging gets enabled, the kernel gets simultaneously mapped to both 1 MB and 3 GB.
  6. You return to the assembly code.
  7. You jump into the higher-half C++ code.
  8. Identity mapping gets removed.
You need to compile the "higher-half" and "lower-half" parts separately into different sets of object files and place their sections in different output ranges during your final link. They cannot call each other. If you do have common library code, you must either compile it twice to produce non-conflicting symbol names and use weakref aliases in your header files, or build an intermediate relocatable file for each set of object files, linking against a static library and using symbol hiding (probably more elegant).

Octacone wrote:
Could I fix anything by making my code position independent?
If you want to share the code between the lower-half and upper-half mapping without separating the object files, you do indeed have to make it position-independent. You have to use the -fpie option when compiling and "-shared -Bstatic -Bsymbolic -pie" when linking (maybe check this post). Then you need to offset the GOT entries before jumping to the higher half mapping.

Linux does this, but in a rather convoluted way. There is stub code which decompresses the actual kernel image. The stub is itself relocatable, and the decompressed kernel image is relocated separately. Still, I am going to illustrate how that is done for the stub code, because it is simpler. In the stub's linker script, two markers, namely _got and _egot, are placed around the .got.plt and .got sections (see here). After the stub is moved to its final location, which in your case corresponds to the point where the upper half mapping is already established but before jumping into it, the stub iterates over the range between _got and _egot and fixes the entries by adding a rebasing offset (see here).

This cannot be considered proper ELF relocation handling, but it works if your output contains only R_386_RELATIVE relocations targeting the GOT. Assuming that you are not importing symbols (which is why you use -pie and not -pic), there should either be no relocations or only relocations of this type. Namely, there shouldn't be any R_386_GLOB_DAT or R_386_JMP_SLOT relocations. You will want to assert that in your build script.
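
The GOT fix-up described here is a short loop. The sketch below models it on an ordinary array: in a real build the two pointers would be the _got and _egot linker symbols, and delta would be the difference between the address the image was linked for and the address it will run at. The function name rebase_got is made up, and this is a simplified illustration of the idea rather than a copy of what Linux does.

```c
#include <stddef.h>
#include <stdint.h>

/* Add delta to every 32-bit slot in [got, got_end).
   Only valid if every slot holds an address inside the image, i.e. the
   output contains nothing but R_386_RELATIVE-style entries in the GOT. */
static void rebase_got(uint32_t *got, uint32_t *got_end, uint32_t delta)
{
    for (uint32_t *slot = got; slot < got_end; slot++)
        *slot += delta;
}
```

For a kernel linked at 1 MB that is about to run at 3 GB, delta would be 0xBFF00000, turning a GOT entry of 0x00101000 into 0xC0001000. The loop has to run exactly once, before any higher-half code reads the GOT.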

