Hi,
SpyderTL wrote:
Does that mean that, technically, you could just have 1 4KB page, and you could just swap in the requested page and at the same time "swap" in the requested page table entry and then jump back to the faulting instruction?
I know it would be slow, but technically would it work?
In the virtual address space; you must have:
- An IDT with a minimum of 15 entries (vectors 0 to 14, so that the page fault exception at vector 14 is covered)
- A TSS (at least the stack related parts of a TSS)
- A GDT with a minimum of 4 entries (excluding the "NULL" entry which can overlap something else)
- Interrupt handlers (that may be minimal stub/s that do little more than switch to the kernel's virtual address space)
- A (potentially tiny) stack that the interrupt handler/s can use
In theory (with severe performance implications) all of that can be crammed into a single 4 KiB page.
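As a rough sanity check of the "single 4 KiB page" claim, here is a minimal sketch that adds up the table sizes. It assumes long mode sizes (16-byte IDT gates, 8-byte GDT slots, a 104-byte TSS), a deliberately generous 32 IDT entries, and 6 GDT slots (NULL, 4 entries, and one extra slot because a long mode TSS descriptor occupies two). The function and constant names are hypothetical and only for illustration.

```python
PAGE_SIZE = 4096        # one 4 KiB page

# Assumed long mode structure sizes, in bytes
IDT_GATE_SIZE = 16      # one interrupt gate descriptor
GDT_SLOT_SIZE = 8       # one GDT slot (a TSS descriptor uses two slots)
TSS_SIZE = 104          # minimal TSS without an I/O permission bitmap


def fixed_table_bytes(idt_entries=32, gdt_slots=6):
    """Bytes used by the IDT, GDT and TSS (hypothetical helper)."""
    return idt_entries * IDT_GATE_SIZE + gdt_slots * GDT_SLOT_SIZE + TSS_SIZE


fixed = fixed_table_bytes()
left_over = PAGE_SIZE - fixed   # space left for interrupt stubs and a tiny stack

print(f"tables: {fixed} bytes, left for stubs and stack: {left_over} bytes")
```

Even with these generous assumptions the tables use well under a quarter of the page, which leaves the rest for the stub/s and a small stack.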
Note: If you run applications at CPL=0 then you can do it without a TSS, but all the rest is still needed and it'd still need a minimum of a single 4 KiB page.
In addition to that; consider an instruction like "push dword [0x1234FFFF]". For this instruction to succeed (without page faults); at a minimum the code at RIP, the top of the stack, and the data at address 0x1234FFFF must all be in the virtual address space at the same time. There are other "3 memory location" instructions (e.g. "movsd") but there are no "4 memory location" instructions (that I can think of). However, all of the locations may be misaligned; such that the CPU needs 2 pages to fetch the entire instruction, 2 pages to access each piece of data, etc.
This means that a minimum of 7 pages (2 for code, 2 for data, 2 for user stack or data, plus one for kernel) need to be mapped into the virtual address space to guarantee that all possible instructions can be executed successfully and all exceptions can work correctly.
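The page counting above can be sketched like this. It is only an illustration: it assumes 4 KiB pages, a maximum instruction length of 15 bytes, and 4-byte data accesses, and the helper names are made up for the example.

```python
PAGE_SIZE = 4096


def pages_touched(address, size):
    """Number of 4 KiB pages covered by `size` bytes starting at `address`."""
    first = address // PAGE_SIZE
    last = (address + size - 1) // PAGE_SIZE
    return last - first + 1


def worst_case_pages(size):
    """Largest page count over every possible offset within a page."""
    return max(pages_touched(offset, size) for offset in range(PAGE_SIZE))


# The dword at 0x1234FFFF starts on the last byte of a page, so it spans 2 pages
data_pages = pages_touched(0x1234FFFF, 4)

# Worst case: instruction bytes, two separate data/stack accesses, 1 kernel page
total = worst_case_pages(15) + 2 * worst_case_pages(4) + 1

print(data_pages, total)
```

With those assumptions the worst case comes out as 2 + 2 + 2 + 1 pages.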
criny wrote:
I want to make user process use full virtual memory space 256TB in x86_64 machine...
An alternative is to have "thread specific storage". For example, if you split the virtual address space into 1 GiB of "process space", 2 GiB of "thread space" and 1 GiB of "kernel space"; then a process with 131072 threads would end up having a total of "1 + 2*131072 GiB" of virtual memory (or slightly more than 256 TiB of virtual memory) and you wouldn't even need a 64-bit CPU to do that.
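The arithmetic can be checked with a few lines. This is just a sketch of the numbers above (1 GiB of "process space" plus 2 GiB of "thread space" per thread, with the kernel's 1 GiB not counted); the function name is hypothetical.

```python
GIB = 2**30
TIB = 2**40


def total_virtual_bytes(threads, process_gib=1, thread_gib=2):
    """Process space plus one private thread space per thread, in bytes."""
    return (process_gib + thread_gib * threads) * GIB


total = total_virtual_bytes(131072)

print(total // GIB, "GiB")        # total in GiB
print(total - 256 * TIB, "bytes over 256 TiB")
```

The result is exactly 1 GiB more than 256 TiB, even though each thread only ever sees a 4 GiB (32-bit) virtual address space.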
Cheers,
Brendan