How can I improve my OS code? What direction should I take t

PatSanf · **Joined:** Mon Oct 13, 2014 6:39 pm **Posts:** 25

This is a very broad question, and is more of a request for feedback. I'm in the process of writing my own operating system, and have successfully implemented a physical memory manager. You can see my code here:

https://github.com/PSanf2/POS_C9

I believe my memory manager implementation will allow me to move into paging. I would like to know where I could make improvements, or where I'm making major mistakes, before I go further. Because this is my first time writing an operating system I'm learning as I go, and wish to avoid a mistake hindering my progress later.

With my OS I'm loading the kernel at the 1M mark, and using everything after the kernel for free memory. I'm not touching anything before the kernel, and writing off the wasted megabyte as "GRUB space." I've read about having your kernel set up in the "higher half" starting at the 3G address, and then using all of the memory below that, and after the kernel, for the paging. I understand that this isn't a requirement, but how much more difficult am I making things on myself by loading my kernel in the "lower half?" I'm writing my OS for a 32-bit system, and believe that the higher half setup has something to do with making my OS compatible with other programs that wish to run at lower memory addresses. Is this correct, or am I worrying about nothing? My memory manager is currently set up to start allocating memory from the first address after the kernel, and relies on external values from the linker and GRUB to determine the proper place to start. Should I rework my OS to use a higher half kernel, and have the memory manager start allocating memory from address 0, or 1M? Once the kernel is loaded can I reclaim the "GRUB space," assuming I save the multiboot header information?

For my memory manager I'm using a doubly linked list to keep track of free memory. I'm doing this for the sake of simplicity, and aware that other data structures will provide a better algorithmic complexity. GRUB provides information telling me where my kernel ends, and how much memory I have available. Each node on the list has a "header" which I use to keep track of memory. The header on the first node is 256 bytes, and the header on each subsequent node is 64 bytes. The headers contain information telling me where the previous and next nodes in memory are, where the free memory associated with the node starts, and the size of the memory that the node contains. Allocating memory involves splitting an adequately sized node of free memory, removing the allocated node from the list, and returning a pointer to the first address of the allocated memory. The header always stays with the memory, even when it's removed from the free memory list. When I free memory, I give a function a pointer to the first address of allocated memory, back it up the proper amount to get to the header, and then put the node back on the list. From there I go over the list, and compact adjacent nodes. Basically, I have this: http://wiki.osdev.org/images/a/a4/Flat_list.png

Will I be able to use my memory manager to successfully implement paging? I understand that when I enable paging the way I use memory addresses should change. Is this an entirely logical process the programmer handles, where I translate the page index and offset into a physical address, or is the translation handled by the Memory Management Unit? I understand that to implement paging I'll need to define a page directory data structure at a 4K aligned address. I plan on implementing a new memory allocation function that will ensure the address returned is aligned properly by splitting an unallocated node at the proper place. My page directory will have 1024, 32-bit, pointers to page tables, totaling 4K in size. I'm thinking that I'll allocate 4K of memory at a properly aligned address, and that the pointer I get back will be what I give to cr3 for my page directory. The page tables will each contain 1024, 32-bit, pointers to 4K sections of memory. I'm shooting for this: http://wiki.osdev.org/images/7/77/Paging_Structure.gif When creating the page table entries, do I need to allocate the 4K memory needed by each page, or do I mark them as not being present in memory and wait until I allocate the page before allocating the memory? What's involved with allocating a page? Do I simply allocate memory for the page, and set a pointer in the page table to the allocated memory? Will leaving the headers on the allocated memory segments cause issues for paging? Do the starting addressed for each page need to be aligned at a 4K address, or can each page be place wherever's convenient for the physical memory manager? I understand the overall concept of paging, but the details are lost on me at this time. What kind of functions will I need to implement for allocating and freeing pages?

max · **Posted:** Tue Oct 14, 2014 1:50 am

I haven't read your code yet, but I'll try to say something to the rest

PatSanf wrote:

With my OS I'm loading the kernel at the 1M mark, and using everything after the kernel for free memory. I'm not touching anything before the kernel, and writing off the wasted megabyte as "GRUB space." I've read about having your kernel set up in the "higher half" starting at the 3G address, and then using all of the memory below that, and after the kernel, for the paging. I understand that this isn't a requirement, but how much more difficult am I making things on myself by loading my kernel in the "lower half?" [...]

Basically, it does not matter at all. When later compiling binaries for your OS, you can tell the linker to use whatever load address you want for your executable (or even make them relocatable/position-independet). Altough a higher half kernel has some advantages.

PatSanf wrote:

For my memory manager I'm using a doubly linked list to keep track of free memory. [...]

Right now I'm only wondering how you allocate space for the nodes of your linked list before having even the physical manager set up? For paging, you will want a physical memory manager that is able to give you actual physical pages. There are different solutions for this, I for example prefer a stack allocator

PatSanf wrote:

Will I be able to use my memory manager to successfully implement paging? [...]

As I said to the last paragraph, you need to store your physical memory page-wise (often called page frames). Having headers in the pages won't make sense because whoever is using that virtual space would have to make sure not to overwrite your headers. Or they could even do so to kill your entire system.

To actually set up paging you will then use your physical memory manager (pmm) to give you pages for the directory, the tables and each of the frames. When creating a page directory, by placing the address of a table into the directory and a page into the table you make this page available in this virtual address space (sure, depending on the flags). The MMU will then translate them for you like this:
If you place the table at index 532 of the directory, and the page at index 293 of the table, the address will be:
((532 * 1024 * 0x1000) + (293 * 0x1000) = 0x85000000 + 0x00125000 = 0x85125000
The same calculation can be done vice versa, to get the table index: ((0x85125000/ 0x1000) / 1024) = 532; or to get the page index: ((0x85125000/ 0x1000) % 1024) = 293;

Greets, Max

EDIT:
in your irq.c:
You check "if (regs.int_no >= 40)" to inform the slave PIC for an EOI; assuming you have correctly remapped the PIC IRQs to start at 0x20 instead of 0, you should therefore check if the interrupt number is ">= 0x20".

Octocontrabass · **Joined:** Mon Mar 25, 2013 7:01 pm **Posts:** 5137

I will preface this by saying I'm not particularly familiar with x86 paging specifically, but I do know a lot about paging in general, since many CPUs implement it (or support an external MMU that implements it).

PatSanf wrote:

I've read about having your kernel set up in the "higher half" starting at the 3G address, and then using all of the memory below that, and after the kernel, for the paging.

This is correct, assuming you mean the 3G virtual address. The physical address of the kernel is usually 1M, since it's convenient and likely to be RAM.

PatSanf wrote:

Once the kernel is loaded can I reclaim the "GRUB space," assuming I save the multiboot header information?

You can reclaim any RAM that isn't otherwise in use. (Don't try to reclaim the EBDA. That doesn't work.)

PatSanf wrote:

Will I be able to use my memory manager to successfully implement paging?

No.

In order to properly implement paging, you have to separate the data structures that track pages from the pages themselves. In addition to the problems Max mentioned, it also makes it impossible to allocate a contiguous block of memory larger than the largest available page. That's a bit of a problem when most programs are too big to fit in a single page, and require contiguous memory for their code and data.

PatSanf wrote:

I understand that when I enable paging the way I use memory addresses should change. Is this an entirely logical process the programmer handles, where I translate the page index and offset into a physical address, or is the translation handled by the Memory Management Unit?

The translation is handled by the MMU, based on page tables set up by the operating system. All addresses in your software will be virtual addresses, and the MMU will use the page tables to determine which physical address to use. You can leave portions of the virtual address space unmapped; it's very common to do this with the page at virtual address 0 to catch null pointer dereferences. (Note that I said address space, not memory. Paging affects all memory-mapped hardware, not just RAM.)

PatSanf wrote:

When creating the page table entries, do I need to allocate the 4K memory needed by each page, or do I mark them as not being present in memory and wait until I allocate the page before allocating the memory?

Demand paging is pretty popular, although it's probably more difficult to implement.

PatSanf wrote:

Do the starting addressed for each page need to be aligned at a 4K address, or can each page be place wherever's convenient for the physical memory manager?

The pages must be aligned to addresses (physical and virtual) that are a multiple of the page size. For 4kB pages, that means 4k-aligned addresses.

PatSanf · **Joined:** Mon Oct 13, 2014 6:39 pm **Posts:** 25

Thank you for the replies. I can see now that I've incorrectly implemented my physical memory manager, and I'll need to redo it.

max wrote:

Right now I'm only wondering how you allocate space for the nodes of your linked list before having even the physical manager set up? For paging, you will want a physical memory manager that is able to give you actual physical pages. There are different solutions for this, I for example prefer a stack allocator

I have a doubly linked list. Each node on the list is defined as a data structure, and has a pointer to a place in memory along with a value for the size of the memory pointed to by the node. When I initialize the memory manager I create the initial node on the list. The node's address will be the first byte of free memory after the kernel, the memory pointer will be the first address after the node data structure, and the size will represent all of the free memory. That gets put on a list of free memory areas. When I allocate some memory, I split the first node on the list that's large enough for the allocation into a properly sized node with a header, and a node for whatever's left over. I then remove the node being allocated from list of free memory nodes, and return the pointer to the beginning of the memory allocation. When I free memory I get the pointer that was returned, back up to the beginning of the header, put the node back on the linked list, and compact adjacent nodes.

I guess this would work if I was wanting to do a simple segmented memory manager, but I'm wanting to do paging.

max wrote:

Basically, it does not matter at all. When later compiling binaries for your OS, you can tell the linker to use whatever load address you want for your executable (or even make them relocatable/position-independet). Altough a higher half kernel has some advantages.

Since I'm going to have to redo the memory manager, would it be worth the extra effort to get things set up in the higher half before I get into that again? How will that affect writing to the VGA portion of memory? Will I need to successfully get paging going before the VGA addresses will be properly mapped, or am I worrying about nothing? Will doing things in the higher half require me to do the paging in ASM, before the kernel and vga stuff come into memory, or will I be able to easily do it just in C?

Octocontrabass wrote:

In order to properly implement paging, you have to separate the data structures that track pages from the pages themselves. In addition to the problems Max mentioned, it also makes it impossible to allocate a contiguous block of memory larger than the largest available page. That's a bit of a problem when most programs are too big to fit in a single page, and require contiguous memory for their code and data.

Maybe I'm not understanding where I'm supposed to store the data structures. I'm concerned that every time I allocate memory, I'll need to store the information in the data structure somewhere, and this storage requirement may be unpredictable. This is what led me to keep the data structure with the memory segments. I'd never need to worry about reserving a large enough segment to store headers because the headers would stay with the segments. I know that paging does provide the option to use a set amount of memory to store meta data about the memory being used. Do you know of a "for dummies" kind of guide to how the paging data structures are implemented and used?

Octocontrabass wrote:

The translation is handled by the MMU, based on page tables set up by the operating system. All addresses in your software will be virtual addresses, and the MMU will use the page tables to determine which physical address to use. You can leave portions of the virtual address space unmapped; it's very common to do this with the page at virtual address 0 to catch null pointer dereferences. (Note that I said address space, not memory. Paging affects all memory-mapped hardware, not just RAM.)

I think that's one of the big things I don't understand. How will software use the virtual addresses? When a piece use user software allocates memory, how does it then make use of it? Will the software need to calculate the proper segment:offset stuff to get the proper physical address, do I need to write functions to handle that on demand, or will it automatically be done by the MMU? If I enable paging, what will happen with all of the stuff I've already written which relies on using specific physical addresses? Like, the VGA driver to write text to the screen, or my keyboard driver using a specific area of memory for an input buffer. Will I need to rewrite those to use paged memory?

max · **Posted:** Tue Oct 14, 2014 1:13 pm

PatSanf wrote:

I have a doubly linked list. Each node on the list is defined as a data structure, and has a pointer to a place in memory along with a value for the size of the memory pointed to by the node. When I initialize the memory manager I create the initial node on the list. The node's address will be the first byte of free memory after the kernel, the memory pointer will be the first address after the node data structure, and the size will represent all of the free memory. That gets put on a list of free memory areas. When I allocate some memory, I split the first node on the list that's large enough for the allocation into a properly sized node with a header, and a node for whatever's left over. I then remove the node being allocated from list of free memory nodes, and return the pointer to the beginning of the memory allocation. When I free memory I get the pointer that was returned, back up to the beginning of the header, put the node back on the linked list, and compact adjacent nodes.

This sounds actually a lot like my first physical memory manager approach I wrote for the previous version of my kernel

This can be optimized though by not storing the address of the free region in its header, but instead just calculating (address_of_header + sizeof(header)) which gives you the address of the free memory behind.
Anyway, you can't use this for paging.

PatSanf wrote:

I guess this would work if I was wanting to do a simple segmented memory manager, but I'm wanting to do paging.

Xactly.

PatSanf wrote:

Since I'm going to have to redo the memory manager, would it be worth the extra effort to get things set up in the higher half before I get into that again? How will that affect writing to the VGA portion of memory? Will I need to successfully get paging going before the VGA addresses will be properly mapped, or am I worrying about nothing? Will doing things in the higher half require me to do the paging in ASM, before the kernel and vga stuff come into memory, or will I be able to easily do it just in C?

Whether you're doing higher half or not, I'd highly recommend you to implement paging as fast as possible because it will make things a lot easier.

PatSanf wrote:

Maybe I'm not understanding where I'm supposed to store the data structures. I'm concerned that every time I allocate memory, I'll need to store the information in the data structure somewhere, and this storage requirement may be unpredictable. This is what led me to keep the data structure with the memory segments. I'd never need to worry about reserving a large enough segment to store headers because the headers would stay with the segments. I know that paging does provide the option to use a set amount of memory to store meta data about the memory being used. Do you know of a "for dummies" kind of guide to how the paging data structures are implemented and used?

There's no "for dummies", because what you want to achieve is the first step to actually being able to call your software an "OS", and not just a booting "Hello world" binary. But don't worry, at one point it will *click* and you've got it.

PatSanf wrote:

I think that's one of the big things I don't understand. How will software use the virtual addresses? When a piece use user software allocates memory, how does it then make use of it? Will the software need to calculate the proper segment:offset stuff to get the proper physical address, do I need to write functions to handle that on demand, or will it automatically be done by the MMU? If I enable paging, what will happen with all of the stuff I've already written which relies on using specific physical addresses? Like, the VGA driver to write text to the screen, or my keyboard driver using a specific area of memory for an input buffer. Will I need to rewrite those to use paged memory?

Forget the segment:offset stuff nao (until you use vm86) because you do (i hope you do) protected mode, and theres no such thing.
Okay, I'll try to explain it to you very easy.
This will not be "the only way", or "the best way", but one way that actually works.

The first thing is, as you said: you'll never know how much memory you need to manage the memory. That is why the physical memory manager must be very simple. Lets make a stack allocator: the entire physical address space 0xFFFF FFFF consists of 1024 * 1024 pages of a size of 0x1000. To store the info for possibly each of these pages, you will at maximum need 1024 * 1024 pointers, each pointing to the start of a page: therefore you have 1024 * 1024 * 4 bytes, thats 4 MiB.
So first you find a space where the stack for your pages will be; then you read the multiboot memory map that GRUB provides you and add the address of EACH free page to that stack. This will be very fast, because pushing = set address and increment pointer; popping = decrement pointer and get address.

Now we have the physical memory manager ready. So, now when it comes to paging: Once you have enabled paging, the physical address space is no more accessible, EXCEPT the parts that you have mapped. That means, if you switch to a directory that contains only zeros, the full address space from 0x00000000 - 0xFFFFFFFF will exist, but you cant access ANY of this memory.
If you do now map a page into the directory, so that for example the page at 0xA0000000 is available, then 0xA0000000 - 0xA0001000 is the only area in the address space that can actually be written to. Now you will wonder: how can the CPU still access the code of my kernel if it is not available in the address space anymore? For this, it is very usual to simply identity-map the area where your kernel binary lays (lets say its 0x100000 - 0x102000). It is also very common to just identity map the entire lower megabyte into the first address space, so it's still accessible. Why identity-map? Because your binary must be available at where its loaded (if your code is not compiled as position independent).

If you have now set it up this way, you can enable paging, and your kernel will be happily running just where it is and you will be able to access the lower memory. But everything else in memory is no more accessible! Now you might wonder, "How can I then still edit the page directory and its tables??

" - For this, see the page about recursive mapping.

Once you have come this far, I bet you'll have a better understanding how paging works

From there on it shouldn't be too far anymore: if you want to have multiple processes, then each process gets its own page directory. Be careful to keep your kernel mapped inside the process directory; otherwise an interrupt will cause a double fault because the IDT is no more accessible, which will lead to a triple fault because the double fault handler couldn't be called either.

For the process part: Due to the fact that you made a page directory for each process, each process has its own "virtual address space" for his own. The kernel is responsible for managing the address space of each process. What does it mean that each process has its own space? Lets say you have "Process A" and "Process B". Each has its own directory. Now we map a physical page (1) into the directory of Process A, lets say so that it covers adress 0xDEADF000. Now we map a different physical page (2) at the same location in the page directory of Process B. On task switch, you will now always switch page directories: when Process A has its turn, the page directory of Process A is loaded as the current directory. When Process A now writes to the virtual address 0xDEADF000, he will write to the physical page 1. Now the timer interrupt comes, and we switch to Process B; also we also load the other page directory. Process B has it's turn, and if he now writes to the address 0xDEADF000, he will write to the physical page 2 instead!

If you have now understood paging, you will still have to figure out some things yourself. Like, "Where do I map the stack of the PMM to make it available after enabling paging?" But I'll keep that to you and your fantasy.
Hope this helps

eryjus · **Posted:** Tue Oct 14, 2014 2:34 pm

PatSanf wrote:

Thank you for the replies. I can see now that I've incorrectly implemented my physical memory manager, and I'll need to redo it.

I've recently given a lot of thought to my Memory Management Layer. While that was a singular layer, I was struggling (like I think you are) trying to get my head around it. What a mess (in my head).

Then, I recently found a post by Combuster that really distilled that down for me.

PatSanf wrote:

I have a doubly linked list. Each node on the list is defined as a data structure, and has a pointer to a place in memory along with a value for the size of the memory pointed to by the node. When I initialize the memory manager I create the initial node on the list. The node's address will be the first byte of free memory after the kernel, the memory pointer will be the first address after the node data structure, and the size will represent all of the free memory. That gets put on a list of free memory areas. When I allocate some memory, I split the first node on the list that's large enough for the allocation into a properly sized node with a header, and a node for whatever's left over. I then remove the node being allocated from list of free memory nodes, and return the pointer to the beginning of the memory allocation. When I free memory I get the pointer that was returned, back up to the beginning of the header, put the node back on the linked list, and compact adjacent nodes.

I chose a bitmap to do my physical memory management layer, with a couple of additions: I maintain (initially) 16384 qwords (128K) bitmap of 4K frames in physical memory -- 1 bit per frame. I leveraged the inspiration from Velko, except I do not layer up the bitmap. Instead I chose to maintain a pointer to the last qword where I found a frame of memory free and start the next search at that point. The logic is simple: I do not assume that any memory is freed (processes are greedy) and therefore why go back and look at memory I have already determined to be fully allocated. Of course, when I reach the end of the bitmap, I cycle back around to look again and check if I have gone all the way around.

The source is here, though not fully debugged, if you want to take a look.

PatSanf · **Joined:** Mon Oct 13, 2014 6:39 pm **Posts:** 25

I've been working on my system, and hope that things are starting to come together. Hopefully I'm now on the right track.

I found a pen-pal from the forum who's working on a similar project, and he's been very helpful with brainstorming. His project is at https://github.com/ExeTwezz/Basic_OS and I've been going over his code quite a bit. Most of this message is from an email I sent him. I'm hoping to get more feedback at this point. My new physical memory manager can be found at https://github.com/PSanf2/POS_C9/blob/n ... _manager.c

I think I found a couple of posts which might help explain things about paging.

viewtopic.php?p=83300&sid=556a4a95f972519c110bf84d0930c358#p83300

viewtopic.php?p=66664&sid=a5acb0cde0dc8bb44f4e092186270565#p66664

I'm trying to use those concepts as a guide at this point.

This is what I've done so far.

On the first level there's the physical memory manager. This is responsible for allocating and freeing blocks of physical memory in multiples of 4K. This is the portion that's often required to make sure things are aligned on a 4K boundary. The physical memory manager needs to keep track of which blocks of memory are allocated/free, and will often use a bitmap, stack, or buddy system for this. This is where you need to use the memory map from GRUB to make sure the kernel will only see the proper areas of memory as being available to allocate. All of the 4K blocks of physical memory should be adjacent to each other, with no data structure information embedded between memory blocks. The information detailing which blocks are free/allocated needs to be stored separately. An easy way to accomplish this is with a stack in low memory. The stack starts at 0x90000, grows downward, and when the stack pointer is 0 the stack is full. This stack can be used to store 32-bit physical address (requiring four bytes) of 4K blocks of memory that haven't been allocated. Here's a screenshot of what I'm getting after running my memory manager initialization function: http://i.imgur.com/vB5I1V2.png I'm using a multiboot header from GRUB to get the amount of memory on the system as well as getting the memory map address, and size. I'm iterating over the memory map and getting information about each region of memory. If the region type is 1 then it's a free region of memory I can use. For each free region I start at the base address, push that to the stack, increment the address by 4096, and decrement the stack pointer by 4. Before pushing a physical memory address to the stack I check to make sure the address doesn't start in my stack, or in my kernel.

This set up is proving to have some limitations/issues. The first region of memory available is the area between 0x0 and 0x9F400. This gives me 652288 addresses of memory to play with, or 637Kb. With my stack starting at 0x90000 I should have 589824 bytes of memory for the stack, or 576Kb for my stack. The next free region of memory starts at the 1M mark, with high memory. This is 0x7EFE000 in length, giving me 126MB of memory. 126 * 1024 *1024 = 132120576 memory addresses that I need to manage (minus the memory where the kernel is, which is ~36Kb). These range from 0x100000 to 0x7FFE000. Storing only each 4096th address on the stack means I'll need to put 132120576 / 4096 = 32256 address on the stack. With each address taking 4 bytes I'll need 32256 * 4 = 129024 bytes, or 126Kb to save all the addresses. I think I'm doing something wrong here. When I look at what my stack pointer is doing, it looks wrong. For the region of high memory that gets mapped, the stack pointer starts at 0x8FF00 and finishes at 0x10FB0. This means I used 0x8FF00 - 0x10FB0 = 0x7EF50, or 520016 bytes of the stack. That's way too much! Am I doing something wrong on my stack pointer? Do I not properly understand how much memory I actually need for each of the memory addresses?

Also, I've noticed that this implementation seems to be limited to systems with only 128MB of memory. If I try to run Qemu with anything more I end up running out of space on the stack before I save all of the addresses. I know that with paging coming up, I'll need 4MB of memory to map 4GB worth of virtual addresses. If I want to run a system with more than 128MB of memory (which I eventually will) then trying to use a stack in low memory is proving to be a bad idea. Should I just move the stack and have it live in the space after my kernel, or am I barking up the wrong tree?

The following I've yet to implement, but I'm thinking about.

On the second level there will be the virtual memory manager. This uses the physical memory manager to allocate and free blocks of physical memory, calling them pages. It puts the addresses of the blocks into page tables, and puts the page tables into page directories. The virtual memory manager will also handle things like swapping. I'm thinking that paging is a form of virtual memory. To do paging you get at least one page table set up in a page directory, and you need to map your kernel into it somewhere. I'm starting to get confused about what goes on here. I know identity mapping is going to come into play.

On the third level should be something called a heap manager. Heap does not refer to a specific data type. The word is being used in the general sense to mean any data type that's used to provide allocation and freeing. The heap manager can be used to allocate memory of any size. I think it's supposed to work mainly for small allocations of memory. This is where we start seeing funky things like linked lists with headers and footers being utilized, and concepts like first fit become relevant. I don't think it matters in this so much where you store the metadata. This is mainly to be used my user level processes running on the system. This is where you'd implement functions like malloc and free. Allocating memory would involve asking the virtual memory manager to allocate some pages, and then using a linked list routine to find holes, make smaller holes, compact memory, and that sort of thing to break them up into smaller portions of memory for an allocation request.

What are you thinking of all that? I'm hoping I'm not going on a wild goose chase.

max · **Posted:** Sun Oct 19, 2014 6:43 am

PatSanf wrote:

An easy way to accomplish this is with a stack in low memory. The stack starts at 0x90000, grows downward, and when the stack pointer is 0 the stack is full. This stack can be used to store 32-bit physical address (requiring four bytes) of 4K blocks of memory that haven't been allocated. Here's a screenshot of what I'm getting after running my memory manager initialization function: http://i.imgur.com/vB5I1V2.png I'm using a multiboot header from GRUB to get the amount of memory on the system as well as getting the memory map address, and size. I'm iterating over the memory map and getting information about each region of memory. If the region type is 1 then it's a free region of memory I can use. For each free region I start at the base address, push that to the stack, increment the address by 4096, and decrement the stack pointer by 4. Before pushing a physical memory address to the stack I check to make sure the address doesn't start in my stack, or in my kernel.

No!

you don't want to overwrite the lower memory, because there is the IVT and important information that the BIOS provides to you. See the memory map but don't rely on it: use GRUB's memory map to search for a free block of memory above 1MB (behind the kernel code, for sure) that you can use for storing this information. The rest sounds good

PatSanf wrote:

This set up is proving to have some limitations/issues. The first region of memory available is the area between 0x0 and 0x9F400. This gives me 652288 addresses of memory to play with, or 637Kb. With my stack starting at 0x90000...

See above; use a memory area that GRUB tells you. You will need 1024 * 1024 * 4 bytes (thats 4MiB or 256 pages) to have enough space for 0x00000000 - 0xFFFFF000. But you can also first read the memory map, determine how much space is actually available, and then size your stack accordingly.

To the virtual stuff, try to get your kernel running when paging is enabled, so you get an idea of how paging changes stuff. To the heap thing, you'll have a kernel heap, thats where you can use all those cool algorithms, but also consider that you'll have tasks that need a heap too and how you can implement this, I wrote something about this here lately.

martinFTW · **Joined:** Sun Jun 08, 2014 10:39 am **Posts:** 15

Quote:

This means I used 0x8FF00 - 0x10FB0 = 0x7EF50, or 520016 bytes of the stack. That's way too much!

1. It seems you stumbled upon pointer arithmetics here. Decrement/increment stackptr with one instead of four, then you will only use a forth of memory for your stack.

Change:

Code:

   //vga_buffer_put_str("\nPush ");
   //vga_buffer_put_hex(addr);
   //vga_buffer_put_str(" stack_ptr=");
   //vga_buffer_put_hex((u32int) stack_ptr);
   stack_ptr -= 4;
   *stack_ptr = addr;
(...)
      vga_buffer_put_str("\nHalting.");
      for (;;) {}
   }
   u32int addr = *stack_ptr;
   stack_ptr += 4;
   return addr;
}

to

Code:

//vga_buffer_put_str("\nPush ");
   //vga_buffer_put_hex(addr);
   //vga_buffer_put_str(" stack_ptr=");
   //vga_buffer_put_hex((u32int) stack_ptr);
   stack_ptr --;
   *stack_ptr = addr;
(...)
      vga_buffer_put_str("\nHalting.");
      for (;;) {}
   }
   u32int addr = *stack_ptr;
   stack_ptr ++;
   return addr;
}

2. I don't like the idea of a stack-only PMM at all. You don't know how much free pages there are(and present-day PCs have many gigabytes), so you are very likely to have a bufferoverflow and to overwrite something in lower memory as max pointed out. This huge stack must be placed somewhere, where it does not overwrite code, data, the usual stack, memory allocated by the mmap or later by the PMM itself. I use a stack in combination with a bitmap. The stack is implemented as a static array with 1000 integers. So i don't need to care about hardcoded addresses or pointer arithmetics

. if this stack is empty my kernel will refill it with 1000 addresses from bitmap. Certainly a disadvantage of this approach is that refilling the stack can mske up a very long latency. So I should not refill the PMM stack in interupt mode, but inside a kernel thread. It would be also a good idea to have something like a watermark for the bitmap.

max · **Posted:** Sun Oct 19, 2014 12:04 pm

martinFTW wrote:

2. I don't like the idea of a stack-only PMM at all. You don't know how much free pages there are(and present-day PCs have many gigabytes), so you are very likely to have a bufferoverflow and to overwrite something in lower memory as max pointed out. This huge stack must be placed somewhere, where it does not overwrite code, data, the usual stack, memory allocated by the mmap or later by the PMM itself. I use a stack in combination with a bitmap. The stack is implemented as a static array with 1000 integers. So i don't need to care about hardcoded addresses or pointer arithmetics

. if this stack is empty my kernel will refill it with 1000 addresses from bitmap. Certainly a disadvantage of this approach is that refilling the stack can mske up a very long latency. So I should not refill the PMM stack in interupt mode, but inside a kernel thread. It would be also a good idea to have something like a watermark for the bitmap.

You do know what and how many free pages are available, because either memory probing or the multiboot memory map that your bootloader gave you can tell you. See it realistically, in normal home computers you have (normally) at most 16GB memory at the moment, this amount of memory can put on a page stack with a size of 1024*1024*16 bytes, that's only 16MiB. If you first find out how much memory is actually available, and then size your stack accordingly, then find a place to store this stack, and then put the free page addresses on it it will be fine

PatSanf · **Joined:** Mon Oct 13, 2014 6:39 pm **Posts:** 25

max wrote:

No!

you don't want to overwrite the lower memory, because there is the IVT and important information that the BIOS provides to you. See the memory map but don't rely on it: use GRUB's memory map to search for a free block of memory above 1MB (behind the kernel code, for sure) that you can use for storing this information. The rest sounds good

Basically, I should put my stack in the second free region that GRUB gave me here: http://i.imgur.com/vB5I1V2.png I also have my kernel in there, so my stack would start right after the kernel? Also, should I be making sure that the physical addresses on the stack are "page aligned" to addresses divisible by 4096? Should I only be putting addresses that are in the high memory on the stack?

max wrote:

You do know what and how many free pages are available, because either memory probing or the multiboot memory map that your bootloader gave you can tell you. See it realistically, in normal home computers you have (normally) at most 16GB memory at the moment, this amount of memory can put on a page stack with a size of 1024*1024*16 bytes, that's only 16MiB. If you first find out how much memory is actually available, and then size your stack accordingly, then find a place to store this stack, and then put the free page addresses on it it will be fine

Isn't it also true that a 32-bit x86 system can only work with 4GB of physical memory? With that said, can't I just use a 4MB stack for all the pages? If the system can only use 4GB of physical memory, low memory is unavailable, and so are various parts of high memory, then I should never fill up a 4MB stack.

~~~~

Made some changes. They can be seen here: https://github.com/PSanf2/POS_C9/blob/n ... _manager.c Here's a screenshot: http://i.imgur.com/gggcsSj.png

I have it set up so it gets the amount of memory from GRUB. I then figure out how many KB I have, divide that by 4, and get a number telling me how many entries I'll need to have on the stack. I use the address right after the end of the kernel for the low address of my stack. I add to that the size of the stack, and get the high address. I then keep the stack in that area of memory. From there, I go over the memory map, and for any free regions in high memory, I put the 4096th address on the stack. This seems to be working properly to get things set up. I've tested it with an emulator giving it 2GB of memory, and the stack comes out to being 2MB in that situation. I can't give the emulator any more memory to test it with due to the real world limitations of my box.

Should I be getting that free portion of low memory on to the stack, or can I just ignore it? Basically, just leaving low memory alone entirely. Should I be putting 4K aligned addresses on the stack at this point? If so, I was thinking I'd just make sure the base address mod 4096 is 0, and increment it in a loop until it is. If I do this, I'll be invalidating the length value provided by the memory map. I'm assuming that length is telling me how many addresses I'm dealing with in the region detailed in the memory map. Can I just decrement that when I increment the base address to keep things working properly?

max · **Posted:** Mon Oct 20, 2014 1:48 am

PatSanf wrote:

Basically, I should put my stack in the second free region that GRUB gave me here: http://i.imgur.com/vB5I1V2.png I also have my kernel in there, so my stack would start right after the kernel? Also, should I be making sure that the physical addresses on the stack are "page aligned" to addresses divisible by 4096? Should I only be putting addresses that are in the high memory on the stack?

Not the "second free region", but the first free region that matches the size that you need. Write something dynamic here.
Yes, the addresses must be page-aligned to 0x1000, otherwise you can't use them for paging (the bits in the range 0xFFF are used for flags in the dir/table).

PatSanf wrote:

Isn't it also true that a 32-bit x86 system can only work with 4GB of physical memory? With that said, can't I just use a 4MB stack for all the pages? If the system can only use 4GB of physical memory, low memory is unavailable, and so are various parts of high memory, then I should never fill up a 4MB stack.

Yes thats right, you can simply check this yourself: 32bits means the highest usable address is 0xFFFFFFFF (one hexadecimal character is a nibble) = 0xFFFFFFFF addressable bytes = 0xFFFFFFFF / 1024 / 1024 = 4 Gibibyte of memory.

PatSanf wrote:

I have it set up so it gets the amount of memory from GRUB. I then figure out how many KB I have, divide that by 4, and get a number telling me how many entries I'll need to have on the stack. I use the address right after the end of the kernel for the low address of my stack. I add to that the size of the stack, and get the high address. I then keep the stack in that area of memory. From there, I go over the memory map, and for any free regions in high memory, I put the 4096th address on the stack. This seems to be working properly to get things set up. I've tested it with an emulator giving it 2GB of memory, and the stack comes out to being 2MB in that situation. I can't give the emulator any more memory to test it with due to the real world limitations of my box.

Sounds good. But 4096th address? You should page-align-up the start and page-align-down the end address of each area to make sure you don't use unusable memory.

PatSanf wrote:

Should I be getting that free portion of low memory on to the stack, or can I just ignore it? Basically, just leaving low memory alone entirely. Should I be putting 4K aligned addresses on the stack at this point? If so, I was thinking I'd just make sure the base address mod 4096 is 0, and increment it in a loop until it is. If I do this, I'll be invalidating the length value provided by the memory map. I'm assuming that length is telling me how many addresses I'm dealing with in the region detailed in the memory map. Can I just decrement that when I increment the base address to keep things working properly?

Pull out a piece of paper, draw the memory as the memory map tells you it is, and find out yourself how you can calculate the addresses best.

JAAman · **Posted:** Mon Oct 20, 2014 8:23 am

max wrote:

PatSanf wrote:

Isn't it also true that a 32-bit x86 system can only work with 4GB of physical memory? With that said, can't I just use a 4MB stack for all the pages? If the system can only use 4GB of physical memory, low memory is unavailable, and so are various parts of high memory, then I should never fill up a 4MB stack.

Yes thats right, you can simply check this yourself: 32bits means the highest usable address is 0xFFFFFFFF (one hexadecimal character is a nibble) = 0xFFFFFFFF addressable bytes = 0xFFFFFFFF / 1024 / 1024 = 4 Gibibyte of memory.

no, actually thats not true at all... 32-bit PMode allows for 32-bit virtual address space, that is 4GB per process for virtual not physical address space, 32-bit PMode does allow access to the entire address space, and beginning with the P6 architecture (originally released as the PentiumPro in 1995) this has been at least 36 bits (allowing at least 64GB physical address space) and much larger on newer systems (hint: search for PAE in the Intel manuals)

max · **Posted:** Mon Oct 20, 2014 9:03 am

JAAman wrote:

no, actually thats not true at all... 32-bit PMode allows for 32-bit virtual address space, that is 4GB per process for virtual not physical address space, 32-bit PMode does allow access to the entire address space, and beginning with the P6 architecture (originally released as the PentiumPro in 1995) this has been at least 36 bits (allowing at least 64GB physical address space) and much larger on newer systems (hint: search for PAE in the Intel manuals)

Oh, right, I did not consider PAE, so that was wrong. Without PAE though you can calculate it like this.

PatSanf · **Joined:** Mon Oct 13, 2014 6:39 pm **Posts:** 25

max wrote:

JAAman wrote:

no, actually thats not true at all... 32-bit PMode allows for 32-bit virtual address space, that is 4GB per process for virtual not physical address space, 32-bit PMode does allow access to the entire address space, and beginning with the P6 architecture (originally released as the PentiumPro in 1995) this has been at least 36 bits (allowing at least 64GB physical address space) and much larger on newer systems (hint: search for PAE in the Intel manuals)

Oh, right, I did not consider PAE, so that was wrong. Without PAE though you can calculate it like this.

OK, that's good because that's what I'm going for. I figure I'll dive into PAE, and 64-bit stuff once I have a solid handle on legacy stuff because the topics build on each other. I'm shooting to get a DOS-type system with the current iteration of my project. I'll be happy once I can get a monotasking system that makes use of the hard drive written. I figure that at that point it'll be time to start another iteration, and look into multitasking, PAE, and 64-bit stuff. :-D

---------------------------------------------------------------------------------------------------------------------------------------------------

I hope this is it. Here's my newest revision, and some screenshots.
https://github.com/PSanf2/POS_C9/blob/m ... _manager.c
http://imgur.com/a/XiEDy
The first image shows what I currently get. The second image was of a debug run where I had it print out what was being pushed to the stack, and what the stack pointer was. The duplicated line is the result of an emulator running in an emulator being given a stop command, and so can be safely ignored. The third image was a debug run where I wanted to make sure the loop stopped where I was expecting it to.

What I have set up doesn't touch low memory. Anything in that first MB of space doesn't get put on the stack. I'm just writing off low memory because there's very little free space there, and it's safer to just leave it alone. I'm making the assumption that I'll have enough room for the stack right after the kernel, and so I'm putting my stack there. My kernel starts at the 1M mark, so it's always being put in high memory. This also puts my stack near the beginning of high memory. I'm calculating the size of the stack from the amount of system memory so it'll never be run out of space, or be too wasteful. For each free region of memory that I'm given by the memory map, I check to see if each address should go on the stack. If the address doesn't point to where my kernel is, or where my stack is, I make sure it's 4K aligned and push it to the stack.

I'm hoping that what I now have will meet all of the physical memory manager requirements that I need to implement before writing a virtual memory manager for paging. The only thing I think I might be iffy on is the assumption that I'll have at least 4MB of memory available immediately after the kernel. I'm guessing that this is a safe assumption to make. If it isn't, then I can correct it by looping over the memory map to find a guaranteed area large enough for the stack, or putting a line reading ". = . + 4M" in my linker script to eliminate the assumption (I think). What do you guys think of my assumption, possible solutions if it causes a problem, and code?

Thanks for all the help! I wouldn't be this far along without you guys.

I am getting curious about memory mapping. If I don't have the addresses that the kernel occupies on the stack will I be able to properly identity map the memory, or is that something I only need to worry about with a higher half kernel?

OSDev.org

How can I improve my OS code? What direction should I take t

Who is online