OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Thu Apr 12, 2018 11:24 am 

Joined: Fri Oct 27, 2006 9:42 am
Posts: 1147
Location: Athens, GA, USA
It just occurs to me that rdos might be conflating 'segmentation' with the concept of a 'modified Harvard architecture'.

Just to recap on this: the Harvard architecture, named after the Harvard Mark I electromechanical computer, is a type of stored-program computer in which the instruction store is physically separate from the data store. In the Mark I, this was done for practical reasons relating to how the instructions and data were routed to the CPU and ALU (which in most early systems were also physically separate units) - instructions would go to the CPU, computation data would go to the ALU, and the CPU would tell the ALU which operation to perform.

There was no straightforward way to transfer between the two memories. This wasn't seen as a problem, because the whole idea of a stored-program system was in its infancy, and it was assumed that the program store would always be the smaller of the two. Most of the other systems of the time (the Colossus, the Atanasoff-Berry Computer, the Zuse Z3, the ENIAC, and so forth) weren't stored-program systems at all (though ENIAC was later rebuilt as one), and the importance of that approach wasn't appreciated until around 1946.

The Von Neumann architecture (after Johnny von Neumann), which arose a bit later following the 'Summer Camp' conference in 1946 and was first used in the EDVAC and EDSAC computers, is the other common way to design a stored-program computer, and became almost but not quite universal in the 1950s and later. In this design, a single memory is used for both the instructions and the data, and the instructions are capable of modifying other instructions on the fly. This was risky, but it was sometimes useful, and in some really early systems it was necessary for basic operations such as function calls (as with most things being done for the first few times, the early designers were often guessing at what would or wouldn't be needed, and made a lot of mistakes - some of which would get permanently enshrined in the systems later).

As with the Harvard architecture, this was originally just an engineering solution - with memories based on things such as mercury delay lines, and a CPU and ALU built on vacuum tubes, it was easy to re-route the signals to the right part of the system, but expensive to build two separate memories.

There was a disagreement from the start, apparently, about whether the Harvard design was safer than the Von Neumann design, but in the end, practical engineering issues trumped the questions about how safe self-modifying code was for the first and second generations of stored-program electronic computers.

By the time transistor-based systems with ferrite-core memories were making those engineering considerations moot, the Von Neumann approach had proven useful, if not necessarily as safe. Computer designers started trying to come up with ways to secure the instruction memory part of the time, while still allowing the privileged system software to load programs as needed, or even monkey-patch code (e.g., in a combination loader and linker), before locking it again in order to run the program securely.

This led to the 'Modified Harvard architecture', which is what we are really talking about when we discuss 'memory protection'. Any modern system with memory protection built into the CPU's memory management is, in effect, a Modified Harvard system (even though most introductory textbooks will still call it a 'Von Neumann' architecture). This would come to be standard on mainframes by 1970, and on minis by around 1978 or so, but wouldn't start to supplant pure Von Neumann designs in microcomputers until the late 1980s.

Paging? Segmentation? Separate matters entirely. They each solve a different set of problems from memory protection, as well as from each other. Segmentation, as I said before, is about stuffing an m-bit address space into n address lines when n < m. Paging is about moving parts of the data or instructions from a fast memory store to a slower one and back in a way that is transparent to the application programmers (that is, without having to explicitly use overlays and the like).
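
To make that segmentation point concrete, here is a minimal sketch of the 8086 real-mode version of the trick - a 20-bit physical address formed from two 16-bit values (plain C, just to show the arithmetic):

Code:
#include <stdint.h>
#include <stdio.h>

/* 8086 real mode: a 20-bit physical address is formed from a 16-bit
 * segment and a 16-bit offset, so the code only ever handles 16-bit values. */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;   /* segment * 16 + offset */
}

int main(void)
{
    /* 0xF000:0xFFF0 is the well-known reset vector location. */
    printf("F000:FFF0 -> %05X\n", real_mode_linear(0xF000, 0xFFF0)); /* FFFF0 */
    /* Different segment:offset pairs can alias the same physical address. */
    printf("FFFF:0000 -> %05X\n", real_mode_linear(0xFFFF, 0x0000)); /* FFFF0 */
    return 0;
}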

All three overlap with yet another separate idea, virtual address spaces. While a VAS is often mistakenly thought to provide additional memory protection, this is not actually the case on the x86 - it is always possible to access other memory address spaces, if the memory protection doesn't prevent it, because the separate address spaces are all built on top of either paging, or segmentation, or in the case of the x86, both at the same time. However, by default the memory protection does prevent this for all non-privileged (i.e., user) code.

While a memory protection system may need to work in conjunction with whatever other memory management sub-systems exist on a CPU, and may even piggyback on the structures they define in order to organize the memory being managed (more on this shortly), paging and segmentation are orthogonal concerns, both from memory protection and from each other. You can have memory protection without either segmentation or paging at all.

Caching adds some complexity to this, but since cache consistency is a problem anyway, those issues get resolved as part of the caching itself. Caches basically add a limited form of content-addressable memory, where the tag records which block of memory a given cache line is holding, and those cache blocks may or may not correspond to pages or segments - mind you, paging fits better, since cache blocks are of a fixed, small size, which can be mapped to similarly sized pages (hence the performance difference sometimes seen when actively manipulating segmentation on the x86). However, no current system applies tagging to the entire memory space, nor are the cache's tags accessible to the system software - they are entirely internal and managed by the hardware.
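
To illustrate the tag idea, here is a rough sketch of how a direct-mapped cache might split an address into tag, set index, and line offset - the sizes here are made up for illustration, not those of any particular CPU:

Code:
#include <stdint.h>
#include <stdio.h>

/* Hypothetical direct-mapped cache: 64-byte lines, 256 sets (16 KiB total). */
#define LINE_SIZE   64u
#define NUM_SETS    256u

int main(void)
{
    uint32_t addr = 0x0012ABCD;

    uint32_t offset = addr % LINE_SIZE;                 /* byte within the line */
    uint32_t index  = (addr / LINE_SIZE) % NUM_SETS;    /* which set to look in */
    uint32_t tag    = addr / (LINE_SIZE * NUM_SETS);    /* identifies the block */

    /* A hit means the stored tag for that set matches; which block of memory
     * a line holds has nothing to do with segment or page boundaries. */
    printf("addr %08X -> tag %X, set %u, offset %u\n", addr, tag, index, offset);
    return 0;
}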

However, since both paging and segmentation involve breaking the memory into separate blocks, and the memory protection has to do the same, the memory protection can just use the blocks defined by the other sub-system rather than having its own blocks. This works out well, because the protection system has to check the validity of every memory access, while in most implementations, both paging and segmentation are translating between the effective addresses and the physical memory locations on each and every main memory access. Since both the protection checks and the translations are necessary every time, it is easiest to do them together as much as feasible - there is no point in repeating operations that overlap so heavily.
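
As a rough sketch of what 'translate and check together' means - a hypothetical one-level page table, not the real x86 format, just to show that the permission bits live in the same entry the translation uses:

Code:
#include <stdint.h>
#include <stdbool.h>

#define PTE_PRESENT  (1u << 0)
#define PTE_WRITABLE (1u << 1)
#define PTE_USER     (1u << 2)
#define PAGE_SIZE    4096u

/* Hypothetical one-level page table entry: frame number plus permission bits. */
typedef struct {
    uint32_t frame;   /* physical frame number */
    uint32_t flags;   /* PTE_* bits            */
} pte_t;

/* Translate a virtual address, "faulting" (returning false) if the access
 * violates the permissions recorded in the very entry used for translation. */
bool translate(const pte_t *table, uint32_t vaddr, bool write, bool user,
               uint32_t *paddr_out)
{
    const pte_t *pte = &table[vaddr / PAGE_SIZE];

    if (!(pte->flags & PTE_PRESENT))           return false;  /* not mapped  */
    if (write && !(pte->flags & PTE_WRITABLE)) return false;  /* read-only   */
    if (user && !(pte->flags & PTE_USER))      return false;  /* kernel-only */

    *paddr_out = pte->frame * PAGE_SIZE + (vaddr % PAGE_SIZE);
    return true;
}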

To sum up: on the x86 in 32-bit protected mode, there is no difference whatsoever between the degree of protection one gets by actively using segments and that gotten by setting the segments to a flat virtual space. None. Period.
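
For reference, 'setting the segments to a flat virtual space' just means loading descriptors along these lines - the usual base-0, limit-4-GiB code and data entries, in the standard textbook encoding:

Code:
#include <stdint.h>

/* The usual "flat" GDT entries for 32-bit protected mode: base = 0,
 * limit = 0xFFFFF with 4 KiB granularity (i.e. 4 GiB), DPL 0.
 * Segmentation is still there, it just covers the whole address space. */
static const uint64_t gdt[] = {
    0x0000000000000000ULL,  /* null descriptor                  */
    0x00CF9A000000FFFFULL,  /* ring-0 code: base 0, limit 4 GiB */
    0x00CF92000000FFFFULL,  /* ring-0 data: base 0, limit 4 GiB */
};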

Segmentation only wins over paging if you are using a separate segment for every individual data element - as in, every variable has its own segment. Even then, the only advantage is in how well the segment size matches the object's size; you can do the same thing with pages, but since page sizes are fixed there is almost always a size mismatch.
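
To put a number on that mismatch, a quick sketch assuming 4 KiB pages (nothing x86-specific about it):

Code:
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u

int main(void)
{
    /* Protecting a single object at page granularity means rounding its
     * size up to whole pages; the slack is wasted (and unguarded) space. */
    uint32_t object_size  = 100;                                    /* bytes */
    uint32_t rounded_size = (object_size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);

    printf("object: %u bytes, protected region: %u bytes, slack: %u bytes\n",
           object_size, rounded_size, rounded_size - object_size);
    return 0;
}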

This isn't practical for either segments or paging on the x86, in any case, because of how the page tables and segment registers work. Individually protecting every object would require a radical redesign, something along the lines of... well, a capability-addressing system.

Capability-based addressing would be part of the memory protection as well, being basically a more fine-grained form of the same memory protections, except that it checks the source of the access rather than the element being accessed. Since the burden of proof is on the requester, rather than the object's state, it turns the entire approach on its head, and becomes a lot more flexible and secure.
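
As a very rough sketch of what 'the burden of proof is on the requester' might look like - purely illustrative, not any real CPU's capability format:

Code:
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical capability: an unforgeable token naming an object plus the
 * rights the holder has on it. The check looks at the capability presented,
 * not at protection state attached to the memory being accessed. */
#define CAP_READ  (1u << 0)
#define CAP_WRITE (1u << 1)

typedef struct {
    uintptr_t base;    /* start of the object */
    size_t    length;  /* size of the object  */
    uint32_t  rights;  /* CAP_* bits granted  */
} capability_t;

bool cap_access_ok(const capability_t *cap, uintptr_t addr, size_t len,
                   uint32_t needed_rights)
{
    if ((cap->rights & needed_rights) != needed_rights) return false;
    if (addr < cap->base)                               return false;
    if (len > cap->length)                              return false;
    if (addr - cap->base > cap->length - len)           return false; /* bounds */
    return true;   /* the presented capability proves the access is allowed */
}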

My guess is that this is what rdos thinks segmentation is giving the system, but if so, he is mistaken. The memory protection system doesn't really provide that at all in any current CPU design, which is a damn shame, because it would do exactly what he (and several others, myself included) seems to be looking for.

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.


Last edited by Schol-R-LEA on Fri Apr 13, 2018 9:41 am, edited 5 times in total.

 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Fri Apr 13, 2018 5:50 am 

Joined: Wed Jul 18, 2007 5:51 am
Posts: 162
Schol-R-LEA wrote:
As for those suggestions @tom9876543 made, the problem with it is that it would introduce a lot of complexity, and perhaps more importantly, demand a lot of IC real estate. The team developing the 80286 had the same problem that was one of the major factors in the 432's failure: they couldn't fit it onto a single chip without making it so large that the failure rate would make it economically infeasible ......
The suggested approach would have pushed the 80286 over the limit of transistor densities of the time


I disagree Schol-R-LEA.

My proposed 80286 does NOT have the following:
16 bit protected mode segments
limit checking
GDT
LDT
TSS
4 privilege levels
complicated transition between privilege levels
hardware task switching
instructions such as ARPL, LAR, LSL etc

My guess is that the number of transistors required for simple 32-bit paging and a "flat mode" would be similar to the number of transistors wasted on 16-bit protected mode.
So my suggestion is feasible.


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Fri Apr 13, 2018 8:58 am 

Joined: Fri Oct 27, 2006 9:42 am
Posts: 1147
Location: Athens, GA, USA
tom9876543 wrote:
Schol-R-LEA wrote:
As for those suggestions @tom9876543 made [...] The suggested approach would have pushed the 80286 over the limit of transistor densities of the time

I disagree [...] My proposed 80286 does NOT have the following:
[ ... ]
My guess is the number of transistors required for a simple 32bit paging and "flat mode", would be similar to the number of transistors wasted on 16bit protected mode.


OK, I am not a CPU designer, but if I understand correctly, all of those things together would have been dwarfed by the addition of any kind of 'simple 32-bit paging'. Memory management units were seen as very expensive, rightly so given the limits of the technology of the time.

No one designing microprocessors had yet done paging on the same die as a CPU at that point, and it is my understanding that the reason for this was because paging would have taken up as much of the die as the rest of the CPU.

Indeed, no one had even made one for any of the mainstream microchips as a co-processor, AFAIK - for example, the M68451 MMU co-processor for the M68010 was released in the same year as the 80186 and 80286. The 8086 line had pinouts to communicate with one from the start (as someone - octocontrabass I think - pointed out already), but I don't think anyone made any for the 8086 until around that time, either. I believe that there were experimental ones being made in 1980 (e.g., the Berkeley RISC I and Stanford MIPS projects were getting started around then, though I don't know if an MMU was part of the plan at that stage given that they were supposed to be student projects, and I seem to recall that an effort to put a LispM on a set of chips was going on then too), and the 432 was certainly intended to have an MMU co-processor, but no one had actually made one for sale to the best of my knowledge.

So they were expensive to include. They were also, in general, seen as unnecessary given the uses that microprocessors were expected to be put to at the time. Even companies who took the home computer market seriously, such as MOS Technology (who by then had been bought up by Commodore), Zilog, and Motorola, didn't see a need for them as an integral part of the CPU design. My understanding is that most of the industry thought Intel were crazy for trying to make one a required sub-system for the 432 (especially one implementing capabilities) - and at the time, they were right to be skeptical.

I may be wrong about this, however, so if anyone more familiar with topic can chime in, I would appreciate it.

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Fri Apr 13, 2018 7:13 pm 

Joined: Wed Jul 18, 2007 5:51 am
Posts: 162
Schol-R-LEA wrote:
tom9876543 wrote:
Schol-R-LEA wrote:
As for those suggestions @tom9876543 made [...] The suggested approach would have pushed the 80286 over the limit of transistor densities of the time

I disagree [...] My proposed 80286 does NOT have the following:
[ ... ]
My guess is the number of transistors required for a simple 32bit paging and "flat mode", would be similar to the number of transistors wasted on 16bit protected mode.


OK, I am not a CPU designer, but if I understand correctly, all of those things together would have been dwarfed by the addition of any kind of 'simple 32-bit paging'. Memory management units were seen as very expensive, rightly so given the limits of the technology of the time.

No one designing microprocessors had yet done paging on the same die as a CPU at that point, and it is my understanding that the reason for this was because paging would have taken up as much of the die as the rest of the CPU.


The Motorola MC68451 MMU seems to have used 34,000 transistors:
http://blog.ehliar.se/post/58268464354/ ... 68851-pmmu
https://patpend.net/technical/68000/68000faq.txt

Wikipedia states the 8086 had about 29,000 transistors and the 80286 had about 134,000 transistors.

It is fairly clear: the 80286 could have been an 8086 + 32-bit flat addressing mode + 32-bit MMU - roughly 29,000 + 34,000 = 63,000 transistors, well within the 80286's ~134,000 transistor budget.


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Sat Apr 14, 2018 3:13 pm 

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8338
Location: At his keyboard!
Hi,

tom9876543 wrote:
Schol-R-LEA wrote:
OK, I am not a CPU designer, but if I understand correctly, all of those things together would have been dwarfed by the addition of any kind of 'simple 32-bit paging'. Memory management units were seen as very expensive, rightly so given the limits of the technology of the time.

No one designing microprocessors had yet done paging on the same die as a CPU at that point, and it is my understanding that the reason for this was because paging would have taken up as much of the die as the rest of the CPU.


The Motorola MC68451 MMU seems to have used 34,000 transistors:
http://blog.ehliar.se/post/58268464354/ ... 68851-pmmu
https://patpend.net/technical/68000/68000faq.txt

Wikipedia states the 8086 had about 29,000 transistors and the 80286 had about 134,000 transistors.


The Motorola MC68451 MMU didn't support paging - it had 96 "variable sized blocks" (segments). 80286 protected mode had up to 16383 segments (split into "global segments" and "local segments").

tom9876543 wrote:
It is fairly clear: the 80286 could have been an 8086 + 32bit flat addressing mode + 32bit mmu.


It might be clear in hindsight, but foresight is never clear.

If you could travel back in time to when 80286 was being designed (around 1980) and tell Intel's engineers to include paging, they probably would have told you that paging isn't worth bothering with because almost nobody uses a multi-tasking OS and almost nobody cares about security. Even when Intel did include paging (80386, several years later) it was "mostly unused" for an entire decade (until Windows 95 was released).


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Sun Apr 15, 2018 1:37 pm 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 1242
Brendan wrote:
The Motorola MC68451 MMU didn't support paging - it had 96 "variable sized blocks" (segments).

It does support paging. You can set those segments to be all the same size, and use them as a TLB for a much larger translation table. In fact, it's very similar to the MIPS R4000, except the R4000 manual calls them "pages" instead of "segments".


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Sun Apr 15, 2018 3:01 pm 

Joined: Fri Aug 19, 2016 10:28 pm
Posts: 241
Octocontrabass wrote:
Brendan wrote:
The Motorola MC68451 MMU didn't support paging - it had 96 "variable sized blocks" (segments).

It does support paging. You can set those segments to be all the same size, and use them as a TLB for a much larger translation table. In fact, it's very similar to the MIPS R4000, except the R4000 manual calls them "pages" instead of "segments".
Personally, I find such an option interesting, because it could be useful in specific cases. At the same time, it would sacrifice quite a lot of flexibility.

First, it means no demand swapping of memory. Then again, you probably shouldn't swap in a well-designed system.

Also, no caching of memory mapped storage. That is - no zero-copy I/O path that provides system-wide caching. You can manually cache buffered I/O in the application, but that could be slower and has no way to respond to system-wide memory pressure. Nonetheless, it will be possible to share the allocation of file content, as long as it is kept entirely resident.

There will be fragmentation issues as well. For example, if a process grows its heap, it will have to relocate the heap segment to a larger vacancy in memory. Also, there won't be enough segments for every thread stack, so stacks will have to be heap-allocated, and thus of fixed size (the pointers to stack variables cannot be easily redirected).

That said, there will be lower latency on random access. This can be preferable to the large pages on x64, where the page sizes are either too small or too large. Alternatively, it may enable a simpler CPU design without things like out-of-order execution that mitigate the various latencies. But you will be missing OS functionality, which may ultimately cause more redundant memory traffic, or even storage device traffic.

I find this interesting nonetheless, for things like special purpose embedded CPUs or as an alternative to large pages. I cannot see it becoming part of the mainstream without paging present to supplement it.


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Mon Apr 16, 2018 4:12 am 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 1242
simeonz wrote:
Octocontrabass wrote:
Brendan wrote:
The Motorola MC68451 MMU didn't support paging - it had 96 "variable sized blocks" (segments).

It does support paging. You can set those segments to be all the same size, and use them as a TLB for a much larger translation table. In fact, it's very similar to the MIPS R4000, except the R4000 manual calls them "pages" instead of "segments".
Personally, I find such option interesting, because it could be useful in specific cases. At the same time, it would be sacrificing quite a lot of flexibility.

I think you misunderstand. The "segments" in the MC68451 are functionally equivalent to pages, and absolutely nothing like segments in x86. They are always a fixed power-of-two size, with both physical and virtual addresses that are a multiple of the size. A single virtual address space may contain as many segments as you want; when you want more than 32 segments in a single address space you'll have to swap segments in and out of the MC68451 much like how an x86 CPU must periodically swap page definitions in and out of its TLB.


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Mon Apr 16, 2018 7:32 am 

Joined: Fri Aug 19, 2016 10:28 pm
Posts: 241
Octocontrabass wrote:
A single virtual address space may contain as many segments as you want; when you want more than 32 segments in a single address space you'll have to swap segments in and out of the MC68451 much like how an x86 CPU must periodically swap page definitions in and out of its TLB.
I didn't know the number of entries was limited like that. So - does it generate a TLB miss exception that the OS handles, so you get OS-controlled TLB thrashing rather than hardware-controlled TLB thrashing? Apparently (according to Wikipedia) Itanium also had an option for this kind of thing. But that is not the same performance advantage. So, I wonder, what would be the advantage - CPU circuitry optimization, or could it benefit software in particular scenarios?


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Mon Apr 16, 2018 8:17 am 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 1242
simeonz wrote:
So - does it maybe generate TLB miss exception that the OS handles and you have OS controlled TLB thrashing rather than hardware controlled TLB thrashing.

Yep, this is exactly how the MC68451 and R4000 work.

simeonz wrote:
So, I wonder, what would be the advantage - CPU circuitry optimization or could it benefit software in particular scenarios?

The main advantage is simpler (and therefore cheaper) MMU circuitry. Hardware TLB management takes a lot more transistors than software TLB management, and the way the MC68451 does it is about as simple as you can get. The flexibility it provides can also be an advantage, since you can define your own page table format and come up with your own TLB fill algorithm. (You can also mix page sizes, but you can do that with the MC68851 too, so it's not an advantage specific to the software-controlled TLB.)

The main disadvantage is that, unless your TLB fill algorithm is very very good, it will be slower than a hardware TLB fill.
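
As a rough sketch of what a software TLB fill might look like on that sort of design - the entry format, table format, slot count, and replacement policy here are all made up for illustration:

Code:
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE  4096u
#define TLB_SLOTS  32u        /* e.g. the 32 descriptors in one MC68451        */
#define NUM_PAGES  16u        /* toy address space, just for the sketch        */

/* Hypothetical software-loaded TLB entry, in the spirit of the MC68451 / R4000
 * style discussed above: the OS picks its own page-table format and refills
 * entries by hand when the hardware takes a miss. */
typedef struct {
    uint32_t vpn;     /* virtual page number   */
    uint32_t pfn;     /* physical frame number */
    bool     valid;
} tlb_entry_t;

static tlb_entry_t tlb[TLB_SLOTS];     /* stands in for the MMU's descriptors  */
static uint32_t page_table[NUM_PAGES]; /* OS-defined: vpn -> pfn, 0 = unmapped */
static unsigned next_slot;             /* trivial round-robin replacement      */

/* Called from the miss exception: refill one TLB slot from the OS's own table,
 * or report a genuine page fault. */
static bool tlb_miss_handler(uint32_t fault_addr)
{
    uint32_t vpn = fault_addr / PAGE_SIZE;

    if (vpn >= NUM_PAGES || page_table[vpn] == 0)
        return false;                            /* real page fault */

    tlb[next_slot] = (tlb_entry_t){ .vpn = vpn, .pfn = page_table[vpn], .valid = true };
    next_slot = (next_slot + 1) % TLB_SLOTS;     /* pick the next victim slot   */
    return true;                                 /* hardware retries the access */
}

int main(void)
{
    page_table[3] = 42;                          /* map virtual page 3 -> frame 42 */
    printf("miss on page 3: %s\n", tlb_miss_handler(3 * PAGE_SIZE) ? "refilled" : "fault");
    printf("miss on page 5: %s\n", tlb_miss_handler(5 * PAGE_SIZE) ? "refilled" : "fault");
    return 0;
}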


 Post subject: Re: Memory Segmentation in the x86 platform and elsewhere
PostPosted: Tue Apr 17, 2018 4:02 am 

Joined: Fri Aug 19, 2016 10:28 pm
Posts: 241
Octocontrabass wrote:
The main advantage is simpler (and therefore cheaper) MMU circuitry. Hardware TLB management takes a lot more transistors than software TLB management, and the way the MC68451 does it is about as simple as you can get. The flexibility it provides can also be an advantage, since you can define your own page table format and come up with your own TLB fill algorithm. (You can also mix page sizes, but you can do that with the MC68851 too, so it's not an advantage specific to the software-controlled TLB.)

The main disadvantage is that, unless your TLB fill algorithm is very very good, it will be slower than a hardware TLB fill.
Understood. Thanks for clarifying.

