What I know:
*Paging is used for mapping physical addresses to virtual ones. Virtual address space can be greater than the physical one. CR3 contains a pointer to a page directory which contains 1024 page tables and there are 1024 pages within those page tables. MMU uses that page directory to figure out how individual addresses are mapped. CR3 must be a physical address. CR0 is used for enabling paging by setting the PG bit to 1. Addresses must be page aligned (4KB aka 0x1000). I need to identity map a first few megabytes of my kernel. Page faults occur when you try to access an address that doesn't exist.
This sounds pretty much correct to me, apart from the last part, I think it would be more accurate to say "Virtual addresses which are not mapped to a physical address result in a page fault." In addition, page faults can also occur for a few other reasons, such as writing to a read only page, or not having the required privileges to read / write to a page. The main thing you should take away from that is that you can try to access any virtual address you want as soon as you turn paging on, but the CPU will page fault if you're doing something you're not supposed to.
Also of note, the addresses needing to be page aligned is really a side effect of two things. One, the bottom 12 bits of each entry are used for something other than describing where the page tables / page mappings point to (permissions, caching flags, etc), because flag bits of some kind are helpful to have in the structure and that's a good place to put them. Two, paging only ever translates the top 20 bits of an address, so forcing all structures to be page aligned significantly simplifies the design of the paging system in the CPU, since it never needs to do anything to the bottom 12 bits of the address.
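To make the "top 20 bits are the address, bottom 12 bits are flags" point concrete, here's a rough sketch in C of packing and unpacking a 32-bit page table entry. The helper names (make_pte, pte_frame, pte_flags) are made up for illustration, and only three of the flag bits are shown; check the Intel manual for the full layout.

```c
#include <assert.h>
#include <stdint.h>

/* A few of the low flag bits of a 32-bit page table entry. */
#define PTE_PRESENT   0x001u  /* bit 0: the mapping is valid */
#define PTE_WRITABLE  0x002u  /* bit 1: writes are allowed */
#define PTE_USER      0x004u  /* bit 2: user-mode code may access it */

#define PTE_ADDR_MASK 0xFFFFF000u  /* top 20 bits: physical address */
#define PTE_FLAG_MASK 0x00000FFFu  /* bottom 12 bits: flags */

/* Hypothetical helper: combine a page-aligned physical address with flags. */
static uint32_t make_pte(uint32_t frame_phys, uint32_t flags)
{
    /* A page-aligned address has its low 12 bits clear, which is
       exactly why those bits are free to hold flags. */
    assert((frame_phys & PTE_FLAG_MASK) == 0);
    return frame_phys | (flags & PTE_FLAG_MASK);
}

/* Hypothetical helpers: split an entry back into its two parts. */
static uint32_t pte_frame(uint32_t pte) { return pte & PTE_ADDR_MASK; }
static uint32_t pte_flags(uint32_t pte) { return pte & PTE_FLAG_MASK; }
```

For example, make_pte(0x00400000, PTE_PRESENT | PTE_WRITABLE) gives 0x00400003: the address and the flags share one 32-bit word without getting in each other's way.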
What I don't know:
The difference between pages and frames. What is the exact technical order of thing that need to be done in order for paging to be enabled (identity mapping of the first few megabytes included). Where am I supposed to put my page directory? (at which address) Do I need dynamic memory allocation for paging to be enabled? What does it mean to allocate a frame and free it? Why do we allocate pages and frames? How is it done?
To enable paging, the gist of it is:
- Set up the paging structures somewhere in memory
- Load the CR3 Register with the address of said structures
- Set the Paging Enable (PG) bit in CR0
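The last two steps can be sketched roughly as below, assuming a 32-bit x86 kernel built with GCC or Clang that is already in protected mode. The function names and the KERNEL_BUILD macro are made up for illustration; the privileged instructions are guarded so the value helpers can still be compiled and checked in an ordinary program.

```c
#include <stdint.h>

#define CR0_PG 0x80000000u  /* CR0 bit 31: paging enable */

/* CR3 holds the page-aligned physical address of the page directory. */
static uint32_t cr3_value(uint32_t pd_phys)
{
    return pd_phys & 0xFFFFF000u;
}

/* New CR0 value: everything as before, plus the PG bit. */
static uint32_t cr0_with_paging(uint32_t cr0)
{
    return cr0 | CR0_PG;
}

#if defined(__i386__) && defined(KERNEL_BUILD)  /* KERNEL_BUILD is hypothetical */
/* Only valid in ring 0. The code doing this must itself be identity
   mapped, because the next instruction fetch is already translated. */
static inline void enable_paging(uint32_t pd_phys)
{
    uint32_t cr0;
    __asm__ volatile("mov %0, %%cr3" : : "r"(cr3_value(pd_phys)) : "memory");
    __asm__ volatile("mov %%cr0, %0" : "=r"(cr0));
    __asm__ volatile("mov %0, %%cr0" : : "r"(cr0_with_paging(cr0)) : "memory");
}
#endif
```

Treat this as a sketch, not a drop-in: the Intel manual lists further conditions for the other paging modes (PAE, 4M pages), which involve CR4 as well.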
Of course, the first step is the real meat of the problem, which I'll get to in a moment. You mention that you know that identity mapping needs to be done in order for paging to work. Just to clarify, one way of thinking about it is that when you turn paging on, every memory address the CPU generates now goes through a translation unit (that's what the MMU is) before it reaches the RAM electronics. So, as soon as you enable paging, the very next instruction fetch will be translated by the MMU, and if your paging structures aren't set up correctly, the CPU will page fault. At that point you're probably going to triple fault, because fetching the page fault handler goes through the same broken translation and faults as well. You probably knew this already, but I'm just trying to make sure I cover everything relevant to the stuff you don't understand.
So, in order to set up paging, you first need to set up the identity paging for any code which is running at the moment you switch address translation on, otherwise you'll run into the problems above. Once you've enabled paging, you can either map the rest of your kernel into somewhere else in the virtual address space, or you can map the kernel into its actual location so that any virtual memory accesses to the kernel make the same accesses to physical memory (AKA, identity mapping the kernel). While there's some debate about whether you should have the kernel identity mapped or moved somewhere else, it's my personal opinion that most of the arguments for either side are weak, and the best advice is that you shouldn't make the choice based on whether you're having trouble accomplishing it or not. Also, don't forget, if you move your kernel, you'll at least need to identity map the parts of it which enable paging, for the reasons described above, and then jump to your kernel's new location (this is the essence of what the "Higher half bare bones" tutorial is about, understand it thoroughly if you do decide to move your kernel.)
You seem pretty confident with the way the paging structures work, so to begin, you should probably just consider "I want my kernel to be mapped from XXX to XXX, how would I make the paging structures accomplish that?", and then create the paging structures which achieve this (drawing a diagram of the mapping
will likely help). Where you put this structure is completely 100% up to you, as long as you know where it is (in physical memory). Once you've done that, go through the wiki pages about paging and make sure you've set up the various flags and permissions correctly (if the wiki is ambiguous on something, sorry, but you'll have to consult the Intel manual to get the full details about a particular flag or field, section 4.3 of volume 3 of the manuals should help). Be sure that you're absolutely certain that you know what each bit does and why it's set the way it is (I know from experience it's extremely easy to make mistakes here, so give a double check if things aren't what they seem.)
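As a rough illustration of "I want XXX mapped to XXX", here's a sketch in C that fills one page table (which covers 4 MiB) and hooks it into the page directory. The names map_4mib, pd_index and pt_index are made up, the flags are just present + writable, and it assumes you already know the physical address where the page table lives (pt_phys).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE     0x1000u
#define ENTRIES       1024u   /* entries per page directory / page table */
#define PTE_PRESENT   0x001u
#define PTE_WRITABLE  0x002u

/* Top 10 bits of a virtual address select the page directory entry. */
static uint32_t pd_index(uint32_t vaddr) { return vaddr >> 22; }
/* The next 10 bits select the entry within that page table. */
static uint32_t pt_index(uint32_t vaddr) { return (vaddr >> 12) & 0x3FFu; }

/* Hypothetical helper: map the 4 MiB of virtual space at virt_base onto
   the 4 MiB of physical memory at phys_base, using the table pt, which
   the caller says lives at physical address pt_phys. */
static void map_4mib(uint32_t *pd, uint32_t *pt, uint32_t pt_phys,
                     uint32_t virt_base, uint32_t phys_base)
{
    assert((virt_base & 0x3FFFFFu) == 0);  /* one table covers an aligned 4 MiB */
    assert((phys_base & 0xFFFu) == 0);     /* frames are page aligned */
    assert((pt_phys & 0xFFFu) == 0);       /* so are the tables themselves */

    for (uint32_t i = 0; i < ENTRIES; i++)
        pt[i] = (phys_base + i * PAGE_SIZE) | PTE_PRESENT | PTE_WRITABLE;

    pd[pd_index(virt_base)] = pt_phys | PTE_PRESENT | PTE_WRITABLE;
}
```

Calling map_4mib(pd, pt, pt_phys, 0, 0) identity maps the first 4 MiB. Calling it with virt_base 0xC0000000 and phys_base 0x00100000 instead puts whatever sits at the 1 MiB mark in physical memory at 3 GiB in virtual memory, which is the kind of mapping a higher half kernel uses.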
Once you've set up the paging structures and enabled paging, you shouldn't notice anything different if you identity mapped, or, if you moved your kernel, you should be able to jump to your kernel's new location and continue as if it was loaded into physical memory there in the first place. Remember, the whole point of setting up this stuff with our kernel is just so that it works while we use address translation for other purposes. Enabling paging actually takes a little more than I mentioned above, section 4.1.2 of Vol. 3 of the Intel manuals describes the exact process.
In regards to pages and frames, the usual distinction (and the one the wiki uses, as far as I can tell) is that a page is a 4k chunk of the virtual address space, while a frame (or page frame) is a 4k chunk of physical memory that a page can be mapped onto. Allocating a frame means picking a physical 4k block that nothing else is using and marking it as taken, and freeing it means marking it as available again. Mapping is the separate act of editing the paging structures so that a page points at a frame. A simple example is that you want to run a new program that takes up 10k in code and 6k in R/W data, and the program runs under the assumption that it is loaded at 0x10000000. You'd allocate five frames wherever there's free physical memory, then change the paging structures so there's a read only area at 0x10000000 that's 12k long (we have to abide by page granularity, hence 12 and not 10) and a Read / Write area at 0x10003000 which is 8k long. It doesn't matter where we put the program in physical memory, because we mapped it to the right place in virtual memory. In fact, our program could be spread out in several pieces in physical memory as long as all the pieces were mapped back together in the right places again. You don't NEED dynamic memory allocation to do things with paging, but as you can see, being able to know where there's space to load the program in physical memory (so that you can later map it to the right spot it needs to be in) is going to be very helpful.
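Keeping track of where there's free physical memory can be as simple as a bitmap with one bit per 4k block. The sketch below uses "frame" in the usual sense of a page-sized block of physical memory; the names (alloc_frame, free_frame, pages_needed) and the 4 MiB size are assumptions for illustration, and a real kernel would first mark the frames holding the kernel and other reserved areas as used.

```c
#include <stdint.h>

#define PAGE_SIZE   4096u
#define NUM_FRAMES  1024u  /* hypothetical: track 4 MiB of physical memory */

/* One bit per physical frame: 1 = in use, 0 = free. */
static uint32_t frame_bitmap[NUM_FRAMES / 32];

/* Round a byte count up to whole pages (10k -> 3 pages, 6k -> 2 pages). */
static uint32_t pages_needed(uint32_t bytes)
{
    return bytes / PAGE_SIZE + (bytes % PAGE_SIZE != 0);
}

/* Find a free frame, mark it used, and return its physical address.
   Returns UINT32_MAX when every frame is taken. */
static uint32_t alloc_frame(void)
{
    for (uint32_t f = 0; f < NUM_FRAMES; f++) {
        uint32_t bit = 1u << (f % 32);
        if (!(frame_bitmap[f / 32] & bit)) {
            frame_bitmap[f / 32] |= bit;
            return f * PAGE_SIZE;
        }
    }
    return UINT32_MAX;
}

/* Mark the frame at physical address phys as free again. */
static void free_frame(uint32_t phys)
{
    uint32_t f = phys / PAGE_SIZE;
    if (f < NUM_FRAMES)
        frame_bitmap[f / 32] &= ~(1u << (f % 32));
}
```

For the example program, pages_needed(10 * 1024) is 3 and pages_needed(6 * 1024) is 2, so you'd call alloc_frame() five times and write the returned physical addresses into the page table entries for 0x10000000 through 0x10004000. The frames don't have to be next to each other.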
2. Dynamic memory allocation:
What I know:
*Dynamic memory allocation is use for allocating contiguous chunks of memory. Most common functions are malloc, free, calloc, realloc. Malloc allocates memory aka returns a pointer to it. Free frees it. Calloc is similar to malloc but with zeros. Realloc is expanding already allocated memory. In order to allocate something I need to return a pointer to it. (also I need to increment its last address) People usually use lists for this. It is complicated.
What I don't know:
How it's done. How to use lists with it? How to free memory? What does it have to do with paging? Do I need to allocate pages? Will it mess up my paging? Can it override paging stuff aka pages? How do I actually code it? I don't seem to understand how it works from point A to point B. How do I keep track of allocated addresses? What happens when I free multiple bytes but in different locations, how do I merge them?
It's super important to realise there are several "kinds" of memory management depending on the situation you're in. The things you're mentioning in the "What I know" section above sound very much like a program's heap memory, which is where it stores arrays and other dynamically sized structures at runtime. Note that this is on the scale of bytes for things such as malloc and free (eg. "I need 12 bytes for an array of 3 ints, where can I put them?"). An entirely different kind, although one you're probably going to want to consider now, is memory management on a page level. From my understanding, a process basically gets a small sandpit of memory to do its heap work (such as malloc and free). If it finds it's running out of room, it expands this sandpit area through some kind of system call to get more memory for the process (on Linux, these are the brk/sbrk and mmap calls I believe). For example, a process may start off with 16K of heap space, and if a malloc call ends up using up all of that space, it will ask the kernel for more, eg. "kernel, please give me up to 32k to use for various variables and arrays and things on the heap". These calls ask the kernel to do memory management on the scale of pages, not bytes, and this is usually done by editing the page tables, as well as various other structures the OS is using to keep track of usage and locations of stuff.
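Here's a toy sketch in C of that split between byte-sized and page-sized management. It is not how malloc is actually implemented: it's a bump allocator that never frees, a static array stands in for memory the kernel could hand out, and grow_heap is a made-up stand-in for an sbrk-style request. All names and sizes are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u
#define ARENA_MAX  (32u * 1024u)  /* hypothetical cap on the sandpit */

/* Stands in for memory the kernel could map for this process. */
static uint64_t arena[ARENA_MAX / 8];
static size_t heap_limit = 16u * 1024u;  /* bytes granted so far */
static size_t heap_used  = 0;            /* bytes handed out so far */

/* Page-level side: grant at least `bytes` more, in whole pages. */
static int grow_heap(size_t bytes)
{
    size_t extra = (bytes + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    if (extra > ARENA_MAX - heap_limit)
        return -1;             /* the "kernel" has nothing left to give */
    heap_limit += extra;
    return 0;
}

/* Byte-level side: hand out n bytes, growing the sandpit if needed. */
static void *bump_alloc(size_t n)
{
    if (n > ARENA_MAX)
        return NULL;
    n = (n + 7u) & ~(size_t)7u;  /* keep allocations 8-byte aligned */
    if (heap_used + n > heap_limit &&
        grow_heap(heap_used + n - heap_limit) != 0)
        return NULL;
    void *p = (uint8_t *)arena + heap_used;
    heap_used += n;
    return p;
}
```

A 12 byte request is served straight out of the existing 16K. Only a request that no longer fits triggers grow_heap, and that always works in multiples of 4096, which is where the page tables would come into it in a real kernel.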
It's critical to understand the differences between the various kinds of "memory" resources that are managed by the OS, and some basic theory for how this is done, and you seem to be a little confused, or at least, unaware, of some of these. That's OK, it's not easy, a lot of the terms change meaning depending on context and the kind of resource they're referring to. This seems to stem from a small problem I used to have when I was working on my OS, which was "This feature is important in an OS, therefore I should learn how to operate it." Instead, and maybe this is a bit radical, but my personal suggestion is to abandon paging for now and spend some time designing your system, especially in terms of the kernel and how it loads programs and gives them stack and heap space. A lot of areas of OSdev are based around solutions to problems which are not apparent until you're in the middle of designing a system, and so, you end up trying to learn a solution to a problem you don't have.
As an example, surely you're aware of the difference between polling the keyboard and waiting for interrupts from it. When you start off with this area, you might think "Polling works fine, the CPU is very fast, so surely polling will work fine." That's OK, until you realise that not everybody wants to write some code in their program to poll the keyboard, especially if they're not using it. It's almost as if you need a way for the hardware to signal that something happened without waiting for the active program to take care of it... Interrupts! In the same way, paging seems confusing at first, until you consider the logistics of loading and running multiple programs in the same address space, especially when their expectations of the location of data clash. If you have access to a library, I recommend you read the chapter on memory management in Andrew Tanenbaum's "Modern Operating Systems". He explains the problem from the ground up, and it made it really clear to me why paging and memory management exist in the first place, as well as outlining various methods of solving the problems that come up in those areas. If you read that, or take some time to encounter these problems yourself, I think it will help immensely with your understanding of all of the things you've mentioned in part two of your question.
Side note: While I was writing this, alexfru already provided a lot of great answers. I'm going to admit upfront that I'm not nearly as experienced or well practised as they or other members of the forum, but I think my answers are nonetheless still useful, as I try to provide a more intuitive explanation and use my own learning experiences to help others find solutions that work for them. As always, take my answers with a large grain of salt, there's no warranties, I'm just trying to help where I feel I can. This doesn't sound great, but the best way to learn about how the CPU works is the Intel manuals. Read chapter 4 of volume 3 at least a few times, take time to cross reference all the diagrams and their explanations, and maybe even take out a pencil and paper and draw some diagrams which intuitively explain to yourself what the manual is talking about. The manual is dry, not structured in the way you'd expect, and often words things in a seemingly strange way, because it's a technical document and correctness comes before readability. Just give it a shot and see how it goes, and come back with any parts from your first question you don't understand, or, for that matter, parts of the manual you don't understand. People will be surprisingly willing to help you read the manual as opposed to providing explanations of the manual.