OSDev.org


All times are UTC - 6 hours




Post subject: Single address space in Long mode
Posted: Wed Jan 06, 2010 6:46 am
Since long mode requires paging to be enabled, I guess I will have to identity map the whole system memory. What do you think of it? Good or bad idea?

Posted: Wed Jan 06, 2010 6:57 am
In my experience, unless you are doing an OS like a console kernel (single process only), complete identity mapping is a bad idea. Heck, even if there will only be one "address space", you would still want it to be contiguous, without having to worry about the many holes in the x86 physical memory space.

So, in short, BAD IDEA!

Posted: Wed Jan 06, 2010 7:02 am
/agree

Unless you have a really specific reason for doing so, I wouldn't identity map the whole of the physical memory.

Cheers,
Adam


Posted: Wed Jan 06, 2010 7:25 am
Thank you all.
So is there any alternative way to implement a flat memory model in long mode?

Posted: Wed Jan 06, 2010 10:03 am
Well, long mode is inherently "flat" (meaning there are no segments).

I think what you mean is contiguous (so there is memory from 0 to as far as you can go).
That should be relatively easy: just make a memory manager and then call MM_Allocate(0), MM_Allocate(0x1000), ... to fill the address space.
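
A minimal sketch of that loop, as a user-space model rather than real kernel code. Only the name MM_Allocate comes from the post above; the toy page table, the starting frame number and the `fill_from_zero` helper are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u
#define MAX_PAGES 1024      /* arbitrary size of the toy address space */

/* Toy "page table": virtual page number -> physical frame number, 0 = unmapped. */
static uint64_t page_table[MAX_PAGES];
static uint64_t next_free_frame = 0x100;  /* assumed first free frame (1 MiB) */

/* Hypothetical stand-in for MM_Allocate(): back one virtual page with a fresh frame. */
bool mm_allocate(uint64_t vaddr)
{
    uint64_t vpn = vaddr / PAGE_SIZE;
    /* Reject unaligned, out-of-range or already mapped addresses. */
    if (vaddr % PAGE_SIZE != 0 || vpn >= MAX_PAGES || page_table[vpn] != 0)
        return false;
    page_table[vpn] = next_free_frame++;
    return true;
}

/* Fill the address space from 0 upward, one page at a time; returns bytes mapped. */
uint64_t fill_from_zero(size_t pages)
{
    uint64_t vaddr = 0;
    for (size_t i = 0; i < pages; i++, vaddr += PAGE_SIZE)
        if (!mm_allocate(vaddr))
            break;
    return vaddr;
}
```

The virtual range ends up contiguous from 0 even though the frames behind it start at 1 MiB, which is the difference between "contiguous" and "identity mapped".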

Posted: Wed Jan 06, 2010 10:37 pm
All you have to do is identity map the amount of memory you have in the system (to avoid memory mapping the entire range and wasting space with unnecessary entries).

Posted: Wed Jan 06, 2010 11:53 pm
MessiahAndrw wrote:
All you have to do is identity map the amount of memory you have in the system (to avoid memory mapping the entire range and wasting space with unnecessary entries).

That's what I mean! LOL. In the first post, I said "the whole *system* memory", not the whole range of virtual memory.

So, any pointer made by any process will be globally correct, right?

Posted: Thu Jan 07, 2010 8:25 am
I recently found an interesting point in AMD64 Manual - Vol.2:
Code:
As a result, SYSCALL and SYSRET can take fewer than
one-fourth the number of internal clock cycles to complete than the legacy CALL and RET
instructions. SYSCALL and SYSRET are particularly well-suited for use in 64-bit mode, which
requires implementation of a paged, flat-memory model.


Are they really that fast? The manual also mentions a "paged, flat-memory model" in 64-bit mode.

Posted: Thu Jan 07, 2010 11:16 am
I presume it means ones through call gates?


Posted: Thu Jan 07, 2010 6:49 pm
Hi,

quanganht wrote:
MessiahAndrw wrote:
All you have to do is identity map the amount of memory you have in the system (to avoid memory mapping the entire range and wasting space with unnecessary entries).

That's what I mean! LOL. In the first post, I said "the whole *system* memory", not the whole range of virtual memory.


I'd assume MessiahAndrw meant "contiguously mapped" (e.g. RAM from 0x000000 to EBDA mapped to virtual addresses 0x000000 to "x", RAM from 0x00100000 to the first hole mapped to virtual addresses "x" to "x + y", etc).
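
A rough sketch of that "x", "x + y" layout, assuming the usable RAM ranges have already been read from the firmware memory map and are sorted and page aligned. The `ram_region` struct and both function names are hypothetical, not taken from any real kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* One usable RAM range as a firmware memory map might report it (hypothetical layout). */
struct ram_region {
    uint64_t phys_base;  /* physical start address */
    uint64_t length;     /* size in bytes */
    uint64_t virt_base;  /* filled in by layout_contiguous() */
};

/*
 * Assign virtual bases so the usable regions sit back to back starting at
 * virtual address 0, skipping the physical holes between them.
 * Returns the total number of bytes of virtual space used.
 */
uint64_t layout_contiguous(struct ram_region *regions, size_t count)
{
    uint64_t next_virt = 0;
    for (size_t i = 0; i < count; i++) {
        regions[i].virt_base = next_virt;
        next_virt += regions[i].length;
    }
    return next_virt;
}

/* Translate a virtual address back to physical; returns UINT64_MAX if unmapped. */
uint64_t virt_to_phys(const struct ram_region *regions, size_t count, uint64_t virt)
{
    for (size_t i = 0; i < count; i++) {
        if (virt >= regions[i].virt_base &&
            virt - regions[i].virt_base < regions[i].length)
            return regions[i].phys_base + (virt - regions[i].virt_base);
    }
    return UINT64_MAX;
}
```

With two illustrative regions, 0x0 to 0x9F000 and 0x100000 upward, the second region's virtual base becomes 0x9F000, so the hole below 1 MiB disappears from the virtual view.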

quanganht wrote:
So, any pointer made by any process will be globally correct, right?


Yes. The problem is getting multiple processes to share an address space (without segmentation), which means using position independent code. If all RAM is contiguous in the virtual address space then there's also virtual address space fragmentation issues and problems implementing certain optimisations (swap space, memory mapped files, "copy on write", etc); and no easy way to implement protection/isolation without something complex like software isolation. Any process can trash any other process, and there's no way for a process to prevent other processes from accessing sensitive data like the user's passwords.

quanganht wrote:
I recently found an interesting point in AMD64 Manual - Vol.2:
Code:
As a result, SYSCALL and SYSRET can take fewer than
one-fourth the number of internal clock cycles to complete than the legacy CALL and RET
instructions. SYSCALL and SYSRET are particularly well-suited for use in 64-bit mode, which
requires implementation of a paged, flat-memory model.


Owen wrote:
I presume it means ones through call gates?


I'd assume it means call gates.. :)

quanganht wrote:
Are they really that fast? The manual also mentions a "paged, flat-memory model" in 64-bit mode.


Yes, and no.

The SYSCALL/SYSRET instructions themselves (for at least some CPUs in some situations) probably are 4 times faster than call gates (which are typically a little faster than software interrupts). This is because a call gate (or a software interrupt) involves fetching several entries from the GDT (and IDT for software interrupts) which adds extra overhead (and potential cache misses).

However, the number of cycles taken by the SYSCALL/SYSRET instructions alone isn't the entire story. Extra code (and additional overhead) may be needed to make SYSCALL/SYSRET as secure as a call gate or software interrupt. For a specific example (and a big warning), as far as I can tell it's possible for user level code to set RSP to "kernel space" before doing SYSCALL and trick the kernel into trashing its own data. For example, try doing "mov rsp,0xFEDCBA9876543210" or "mov rsp,8" before SYSCALL...

To avoid potential problems the kernel's SYSCALL handler would need to either check RSP before it uses the caller's stack, or switch to a "known good" stack at the start of the SYSCALL handler. For an example (with numbers I made up), if a call gate costs 40 cycles, then SYSCALL might cost 10 cycles (4 times faster), and doing "mov rbp,rsp; mov rsp,[gs:safe_stack]" on entry then "mov rsp,rbp" before returning might add another 10 cycles (to make SYSCALL secure); so the end result is 20 cycles (only twice as fast as a call gate).
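
For the "check RSP" option, a sketch of the kind of range test involved might look like the following. It assumes 48-bit virtual addresses with user space in the lower canonical half; the constant and the function name are made up for illustration, and a real handler would do this in assembly before touching the stack. It only checks the address range, not whether the pages are actually mapped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: user space is the lower canonical half of a 48-bit address space. */
#define USER_SPACE_END 0x0000800000000000ULL

/* True if [rsp - needed, rsp) lies entirely within the user half. */
bool user_rsp_is_sane(uint64_t rsp, uint64_t needed)
{
    if (rsp > USER_SPACE_END)   /* kernel half or non-canonical address */
        return false;
    if (rsp < needed)           /* pushing would wrap below address 0 */
        return false;
    return true;
}
```

Both values from the warning above fail this test: 0xFEDCBA9876543210 is far outside the user half, and 8 leaves no room to push anything.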


Cheers,

Brendan

Posted: Fri Jan 08, 2010 8:35 am
Hi,

Brendan wrote:
If all RAM is contiguous in the virtual address space then there's also virtual address space fragmentation issues and problems implementing certain optimisations (swap space, memory mapped files, "copy on write", etc); and no easy way to implement protection/isolation


Well, I agree about the position independent code thing, but for now it shouldn't be a big problem. Modern CPUs are said to support it, and to run it just as fast as position dependent code.

Fragmentation can be solved by using some kind of block (slab?). So, upon an application's request, the system allocator will give it a chunk of memory, say 1MB, or even 1GB. Then the application is free to do anything with it. This is similar to the exokernel idea, where an application has its own memory allocator (internal allocator). It helps eliminate memory fragmentation, and as the application knows how to use memory in the most efficient way, global performance is increased.

Another optimization that only SAS can have is shared memory. IPC, RPC, shared data is *very* efficient in SAS because data is placed in one place only, then the owner can pass its pointer to anyone.
quanganht wrote:
any pointer made by any process will be globally correct

Similar to this http://forum.osdev.org/viewtopic.php?f=15&t=21393#p170267

Plus, SYSCALL/SYSRET requires a flat, paged memory model. That is double the speed of a call gate/software interrupt. Well, 20 cycles times some million calls is something really different :)

And about this
Brendan wrote:
and no easy way to implement protection/isolation

I doubt it. IMO we only have to look after the page tables, and mark pages with permissions corresponding to applications.

Any ideas?

PS: Thanks Brendan for a very enthusiastic reply :)

Posted: Sat Jan 09, 2010 4:16 am
Hi,

quanganht wrote:
Fragmentation can be solved by using some kind of block (slab?). So, upon an application's request, the system allocator will give it a chunk of memory, say 1MB, or even 1GB. Then the application is free to do anything with it.


That works perfectly fine for a normal OS, because a normal OS allocates space (and not RAM). For "contiguously mapped RAM" you can't give every process a large slab because you end up wasting heaps of RAM. For example, the computer I'm using now isn't doing too much, but there's about 50 processes running. If you give each process 1 GiB then you'll need to use significant amounts of swap space (but you can't implement swap space for "contiguously mapped RAM" either).

Probably the easiest way around the problem is to have (the equivalent of) a global heap; but then the RAM allocated by a single process will be scattered all over the place (not so good for cache/TLB locality). Of course that's assuming you can prevent heap corruption.

A better idea is to not use "contiguously mapped RAM" to begin with - that solves all the problems.

quanganht wrote:
Another optimization that only SAS can have is shared memory. IPC, RPC, shared data is *very* efficient in SAS because data is placed in one place only, then the owner can pass its pointer to anyone.


Boring old fashioned OSs (e.g. Unix clones) have been using shared memory (without SAS) for several decades.

The optimisation that you're thinking of means that you can pass a pointer to something in shared memory rather than passing an offset from the start of shared memory. For example (for a more conventional OS), the first process can do "offset = pointer - address_of_shared_memory" and send the offset, and the second process (the receiver) can do "pointer = offset + address_of_shared_memory"; instead of this, a SAS OS can just send the pointer. It saves you one subtraction and one addition, or about 2 cycles on a modern CPU (until the receiver has to check if the pointer it received is sane, where you need to do 2 comparisons instead of one).
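
The arithmetic above can be sketched as follows. The `shm_view` struct and the function names are hypothetical; each process is assumed to know where the shared region is mapped in its own address space and how large it is.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One process's view of a shared region: where it is mapped and how big it is. */
struct shm_view {
    unsigned char *base;
    size_t size;
};

/* Sender side: turn a pointer inside the region into an offset. */
size_t shm_ptr_to_offset(const struct shm_view *v, const void *p)
{
    return (size_t)((const unsigned char *)p - v->base);
}

/* Receiver side: one bounds check, then rebase onto the local mapping. */
void *shm_offset_to_ptr(const struct shm_view *v, size_t offset)
{
    if (offset >= v->size)
        return NULL;
    return v->base + offset;
}

/* SAS receiver: a raw pointer needs both a lower and an upper bound check. */
bool sas_ptr_is_sane(const struct shm_view *v, const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)v->base;
    return a >= lo && a < lo + v->size;
}
```

The offset version costs one subtraction, one addition and one comparison; the pointer version drops the arithmetic but needs two comparisons, which is the trade described above.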

The main advantage (the only significant advantage) that SAS has is eliminating the cost of switching virtual address spaces (e.g. TLB misses). Unfortunately this advantage is often over-estimated.

The TLBs in a modern CPU aren't large enough to cover the entire virtual address space (even with "large pages" they aren't enough to cover 4 GiB of the virtual address space). The TLBs only cover the most recently used areas of the address space. This means a SAS OS runs one process (and the TLB fills with entries for that process), then the SAS OS switches to a different process (and almost all of the TLB entries for the old process aren't used and get replaced by TLB entries for the new process, so you get the same number of TLB misses). Of course that's a "worst case".

If the OS is constantly switching between "n" processes, and if those processes use less data than the TLBs cover, then a SAS OS does avoid lots of TLB misses. That's the "best case".
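
The worst case and best case can be illustrated with a toy model. This is only a sketch under made-up assumptions: a 64-entry fully associative TLB with round-robin replacement, no tagged entries, and processes that touch a fixed set of their own pages in every time slice. Real TLBs are multi-level and behave differently; all names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 64   /* made-up size; real TLBs vary by CPU */

/* Toy fully associative TLB with round-robin replacement. */
struct tlb {
    uint64_t page[TLB_ENTRIES];
    bool valid[TLB_ENTRIES];
    size_t next;               /* next slot to replace */
};

static void tlb_flush(struct tlb *t)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        t->valid[i] = false;
    t->next = 0;
}

/* Returns true on a miss (and installs the page). */
static bool tlb_access(struct tlb *t, uint64_t page)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        if (t->valid[i] && t->page[i] == page)
            return false;
    t->page[t->next] = page;
    t->valid[t->next] = true;
    t->next = (t->next + 1) % TLB_ENTRIES;
    return true;
}

/*
 * Run `rounds` passes over `nproc` processes; in each time slice a process
 * touches `pages_each` distinct pages of its own. With flush_on_switch the
 * TLB is emptied at every task switch (separate address spaces); without
 * it the entries survive the switch (single address space).
 */
uint64_t count_tlb_misses(unsigned nproc, unsigned pages_each,
                          unsigned rounds, bool flush_on_switch)
{
    struct tlb t;
    uint64_t misses = 0;
    tlb_flush(&t);
    for (unsigned r = 0; r < rounds; r++)
        for (unsigned p = 0; p < nproc; p++) {
            if (flush_on_switch)
                tlb_flush(&t);
            for (unsigned i = 0; i < pages_each; i++)
                misses += tlb_access(&t, (uint64_t)p * pages_each + i);
        }
    return misses;
}
```

In this model, 4 processes touching 8 pages each over 10 rounds miss 32 times without flushing and 320 times with it (the best case for SAS). With 64 pages each, both variants miss 2560 times, because every slice evicts the previous process's entries anyway (the worst case).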

quanganht wrote:
quanganht wrote:
any pointer made by any process will be globally correct

Similar to this http://forum.osdev.org/viewtopic.php?f=15&t=21393#p170267

Plus, SYSCALL/SYSRET requires a flat, paged memory model. That is double the speed of a call gate/software interrupt. Well, 20 cycles times some million calls is something really different :)


SYSCALL/SYSRET doesn't require a flat paged memory model - for a 32-bit OS it'll work without paging and you can still use a limited amount of segmentation (e.g. for data segments). Long mode requires a flat paged memory model (therefore a flat paged memory model is required for SYSCALL/SYSRET in long mode).

Of course "requires a paged flat memory model" only means you can't rely on segmentation and have to use paging (but it doesn't matter *how* you use paging - it could be a monolithic OS with a virtual address space for each process, or a SAS OS, or any other alternative).

quanganht wrote:
And about this
Brendan wrote:
and no easy way to implement protection/isolation

I doubt it. IMO we only have to look after the page tables, and mark pages with permissions corresponding to applications.


"Mark pages with permissions corresponding to applications."???

If you've got a single address space for all processes, then you could mark all "user-level pages" that the currently running process shouldn't be able to access as "not present" or "supervisor" (e.g. during the task switch); but if you do that you'll need to flush/invalidate all of the TLB entries that you modify (and if you need to flush all of the TLB entries then it's easier and faster to use multiple address spaces and just change CR3 instead).
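
To make the cost concrete, here is a toy model of that per-switch permission flip. The `sas_page` struct, its `owner` field and the function name are assumptions for illustration; only the present bit (bit 0 of an x86 page-table entry) is modelled.

```c
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u   /* bit 0 of an x86 page-table entry */

/* Toy model of one user page in the shared address space. */
struct sas_page {
    uint64_t pte;      /* only the present bit is modelled */
    unsigned owner;    /* hypothetical id of the process allowed to access it */
};

/*
 * On a task switch, make only next_pid's pages present. Returns the number
 * of entries that changed; on real hardware each changed entry would also
 * need its TLB entry invalidated (e.g. with INVLPG).
 */
size_t sas_switch_to(struct sas_page *pages, size_t count, unsigned next_pid)
{
    size_t changed = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t want = (pages[i].owner == next_pid) ? PTE_PRESENT : 0;
        if ((pages[i].pte & PTE_PRESENT) != want) {
            pages[i].pte = (pages[i].pte & ~(uint64_t)PTE_PRESENT) | want;
            changed++;
        }
    }
    return changed;
}
```

The return value grows with the number of pages owned by the outgoing and incoming processes, whereas switching address spaces is a single CR3 write regardless of how many pages each process has.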

For a SAS OS, the only sane options are:
  • No security/protection at all - e.g. a games machine or something where no data matters much, or an embedded system where you can guarantee that all the code that can be run is "safe" (where all the code is extremely well tested and stored in ROM or something).
  • Software isolation. This could mean processes run inside a virtual machine - e.g. byte-code that is interpreted, or compiled while it runs (dynamic translation/JIT), or compiled when it's installed.

For these options, software isolation is probably the most interesting, but it's a massive amount of work for a small (potential) performance improvement (unless you can recycle someone else's work).


Cheers,

Brendan

Posted: Tue Jan 12, 2010 7:36 am
Brendan wrote:
For a specific example (and a big warning), as far as I can tell it's possible for user level code to set RSP to "kernel space" before doing SYSCALL and trick the kernel into trashing its own data. For example, try doing "mov rsp,0xFEDCBA9876543210" or "mov rsp,8" before SYSCALL...


Of course, no sane OS will put kernel data on the user stack, and a switch to kernel stack will always be the first thing the SYSCALL handler will do. SYSENTER has an explicit ESP stored in an MSR, but that means the scheduler must update the MSR when switching tasks.


JAL


Posted: Tue Jan 12, 2010 7:50 am
jal wrote:
but that means the scheduler must update the MSR when switching tasks.
Not necessarily. You can just have it contain a pointer to a valid temporary stack, then update ESP manually. (instead of finding out the value for ESP, then bothering the slow WRMSR with it)

Posted: Tue Jan 12, 2010 8:13 am
Do you use SAS, Combuster?
