OSDev.org

All times are UTC - 6 hours




Post subject: Re: Best processor for 32-bit [rd]OS - a.k.a RDOS-OS is best
Author: rdos
Posted: Wed Apr 18, 2012 12:03 pm

More results:

2-core Intel Atom (at 3 GHz):
near: 24.4 million calls per second
gate: 2.4 million calls per second
syscall (load ss:esp): 1.3 million calls per second
syscall (don't load ss:esp): 3.8 million calls per second
syscall (load GS but not ss:esp): 3.4 million calls per second

The dual-core Atom is even more broken in its support for loading SS. When the SYSENTER interface is used in conjunction with loading ss:esp, performance drops to about half that of the call gate (1.3 vs. 2.4 million calls per second), while when ss:esp is not loaded, performance is about 60% higher than the call gate (3.8 million). General segment register loads are not very costly here either; it is SS specifically that has a lousy implementation.

Just to prove that Brendan is wrong about SYSENTER/SYSEXIT being equal to a near call, I also tested removing the far return from the code. That results in 7.6 million calls per second, which means that on the Intel Atom the far return takes about the same amount of time as the rest of the code.
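The far-return estimate follows from simple arithmetic on the measured rates. Below is a minimal sketch of that arithmetic, assuming the benchmark is a tight loop so that the average time per call is just the reciprocal of the call rate. The function and variable names are illustrative only and are not taken from RDOS.

```python
def ns_per_call(million_calls_per_second: float) -> float:
    """Average time per call in nanoseconds, assuming a tight benchmark loop."""
    return 1_000.0 / million_calls_per_second  # 1e9 ns divided by (rate * 1e6)


# Rates measured on the 2-core Atom, in million calls per second.
with_far_return = 3.8     # SYSENTER path, ss:esp not loaded
without_far_return = 7.6  # same path with the far return removed

total_ns = ns_per_call(with_far_return)
rest_ns = ns_per_call(without_far_return)
far_return_ns = total_ns - rest_ns  # time attributable to the far return

print(f"whole path: {total_ns:.1f} ns")
print(f"rest only:  {rest_ns:.1f} ns")
print(f"far return: {far_return_ns:.1f} ns")
```

Since the rate doubles when the far return is removed, the far return accounts for half of the per-call time, i.e. as much as everything else on that path.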


Author: turdus
Posted: Wed Apr 18, 2012 12:12 pm

Brendan wrote:
The only difference between the "sysenter" and "alternative sysenter" method is that the former loads a different SS:ESP while the latter doesn't.

May I suggest reading the opinion of SYSENTER's creator on the subject: http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET


Author: rdos
Posted: Wed Apr 18, 2012 12:38 pm

Just for an interesting comparison, here are the results for the processor that has the fastest call gate implementation:

6-core AMD Phenom (at 2.8 GHz):
near: 44.7 million calls per second
gate: 12.0 million calls per second
syscall (load ss:esp): 11.1 million calls per second
syscall (don't load ss:esp): 19.4 million calls per second
syscall (load GS but not ss:esp): 18.7 million calls per second
syscall (no far ret or ss:esp load): 29.8 million calls per second

Even on this processor, the trend is similar. Here too, using SYSENTER and manually loading ss:esp is slower than a call gate (11.1 vs. 12.0 million calls per second). Loading a general segment register has a small impact, and the far return takes about half as long as the rest of the SYSENTER code.

I think it can be concluded that the primary segmentation issue on modern processors is changing the SS register. Since little of the RDOS kernel-mode code manipulates the stack, it would be a good idea to use a flat SS selector in kernel mode on modern processors. That in itself, in conjunction with SYSENTER/SYSEXIT, could provide a considerable speed-up of syscalls.

In the absence of using the stack for parameters and variables, proper protection of the thread stack can be achieved like this:
1. Allocate 3 linear pages for the stack
2. Set the lower and upper page as invalid so it page-faults when referenced in kernel
3. Set the initial ESP to base + 0x2000
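The three steps above can be sketched as an address-layout model. This is only an illustration of which addresses would fault, assuming 4 KiB pages and a downward-growing stack; it does not touch real page tables, and the base address and helper names are hypothetical.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages on 32-bit x86


def stack_layout(base: int) -> dict:
    """Describe the 3-page stack region starting at linear address `base`."""
    if base % PAGE_SIZE:
        raise ValueError("base must be page-aligned")
    return {
        "lower_guard": (base, base + PAGE_SIZE),                      # marked invalid
        "usable": (base + PAGE_SIZE, base + 2 * PAGE_SIZE),           # mapped stack page
        "upper_guard": (base + 2 * PAGE_SIZE, base + 3 * PAGE_SIZE),  # marked invalid
        "initial_esp": base + 2 * PAGE_SIZE,                          # base + 0x2000
    }


def faults(layout: dict, address: int) -> bool:
    """True if touching `address` lands in one of the two guard pages."""
    guards = (layout["lower_guard"], layout["upper_guard"])
    return any(lo <= address < hi for lo, hi in guards)


layout = stack_layout(0x8000_0000)  # hypothetical base address
esp = layout["initial_esp"]

print(hex(esp))
print(faults(layout, esp - 4))              # first push lands in the usable page
print(faults(layout, esp))                  # access above the stack top hits the upper guard
print(faults(layout, esp - PAGE_SIZE - 4))  # a push past the usable page hits the lower guard
```

With the initial ESP at base + 0x2000, the first push writes just below the upper guard page, so overflow in either direction ends in a page fault instead of silent corruption.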


Author: Brendan
Posted: Wed Apr 18, 2012 1:36 pm

Hi,

turdus wrote:
Brendan wrote:
The only difference between the "sysenter" and "alternative sysenter" method is that the former loads a different SS:ESP while the latter doesn't.

May I suggest reading the opinion of SYSENTER's creator on the subject: http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET


Please understand that RDOS's "sysenter method" is a lot of baggage that happens to use SYSENTER, and that RDOS's "alternative sysenter method" is slightly less baggage that also happens to use SYSENTER. At no point (including where RDOS calls it syscall) do any of these use or refer to the SYSCALL/SYSRET instructions.

For what is better, an instruction that isn't supported can't be better than an instruction that is supported. For RDOS (which is limited to 32-bit and 16-bit code) this gives the following cases:
  • "Pentium II or later" Intel CPU: SYSCALL isn't supported for 32-bit code (even if it's a modern CPU that supports SYSCALL in 64-bit code). SYSENTER is the only usable option.
  • Recent AMD CPU: Both SYSCALL and SYSENTER can be used for 32-bit code, and the difference between them will be negligible (especially when you add RDOS's baggage).
  • Less recent (32-bit only) AMD CPU: SYSENTER isn't supported. SYSCALL is the only option.
  • Older AMD CPUs and "Pentium Pro or earlier" Intel CPU: Neither SYSENTER nor SYSCALL is supported.

If you think about this, for 32-bit code (e.g. RDOS), SYSENTER is supported on a lot more CPUs than SYSCALL, and therefore SYSENTER is a lot more important than SYSCALL. Support for SYSCALL would help on the less recent (32-bit only) AMD CPUs (but it's easiest to pick the low hanging fruit first). ;)
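The four cases above amount to a small decision table. Here is a hedged sketch of it, assuming the kernel has already worked out two booleans for the running CPU (for example from CPUID; the detection itself is not shown). The function name, the flag names and the call gate fallback for the last case are assumptions for illustration only.

```python
def pick_entry_mechanism(has_sysenter: bool, has_syscall_32: bool) -> str:
    """Choose a kernel-entry mechanism for 32-bit code.

    has_syscall_32 means SYSCALL is usable from 32-bit code, which per the
    list above is not the case on Intel CPUs even when they support SYSCALL
    in 64-bit code.
    """
    if has_sysenter:
        return "sysenter"   # supported on the most CPUs, so it is tried first
    if has_syscall_32:
        return "syscall"    # less recent (32-bit only) AMD CPUs
    return "call gate"      # assumed fallback when neither instruction exists


cases = {
    "Pentium II or later Intel": (True, False),
    "recent AMD": (True, True),
    "less recent 32-bit only AMD": (False, True),
    "older AMD / Pentium Pro or earlier": (False, False),
}
for cpu, flags in cases.items():
    print(f"{cpu}: {pick_entry_mechanism(*flags)}")
```

Preferring SYSENTER when both are available is a design choice in this sketch; as noted above, the difference between the two on recent AMD CPUs should be negligible.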


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Author: Brendan
Posted: Wed Apr 18, 2012 2:09 pm

Hi,

rdos wrote:
I think it can be concluded that the primary segmentation issue on modern processors is changing the SS register. Since little of the RDOS kernel-mode code manipulates the stack, it would be a good idea to use a flat SS selector in kernel mode on modern processors. That in itself, in conjunction with SYSENTER/SYSEXIT, could provide a considerable speed-up of syscalls.


For all segment register loads the CPU needs to fetch data from the L1 data cache (or worse) to get to the GDT/LDT entry and then do protection checks. Accessing L1 cache alone probably costs about 12 cycles. For DS, ES, FS, GS segment loads the CPU can use things like out-of-order execution and register renaming to hide the performance problem; so these segment loads seem to suck less. For CS loads the CPU can't hide the performance problem - the CPU has to wait for the CS load to complete before it can fetch the next instruction. For SS loads I'd assume similar restrictions (e.g. all calls/returns/pushes/pops need to wait for the earlier segment load to complete).

Basically, all segment register loads suck, potentially including (e.g.) loading DS in code where all/most of the following instructions depend on DS, but in some cases the CPU can hide the suckage. Call gates suck twice as much (as the CPU has to fetch the gate's descriptor before it can start fetching the code descriptor). Both SYSENTER and SYSCALL avoid the need to fetch data from the L1 data cache (or worse) and most of the protection checks, and therefore have far less impact on a typical CPU's out-of-order execution pipeline (they'd still cause a temporary blockage, but the blockage is cleared a lot sooner). The same would apply for SYSEXIT/SYSRET compared to "RETF".


Cheers,

Brendan



Author: rdos
Posted: Wed Apr 18, 2012 2:41 pm

Brendan wrote:
Hi,

turdus wrote:
Brendan wrote:
The only difference between the "sysenter" and "alternative sysenter" method is that the former loads a different SS:ESP while the latter doesn't.

May I suggest reading the opinion of SYSENTER's creator on the subject: http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET


Please understand that RDOS's "sysenter method" is a lot of baggage that happens to use SYSENTER, and that RDOS's "alternative sysenter method" is slightly less baggage that also happens to use SYSENTER. At no point (including where RDOS calls it syscall) do any of these use or refer to the SYSCALL/SYSRET instructions.

For what is better, an instruction that isn't supported can't be better than an instruction that is supported. For RDOS (which is limited to 32-bit and 16-bit code) this gives the following cases:
  • "Pentium II or later" Intel CPU: SYSCALL isn't supported for 32-bit code (even if it's a modern CPU that supports SYSCALL in 64-bit code). SYSENTER is the only usable option.
  • Recent AMD CPU: Both SYSCALL and SYSENTER can be used for 32-bit code, and the difference between them will be negligible (especially when you add RDOS's baggage).
  • Less recent (32-bit only) AMD CPU: SYSENTER isn't supported. SYSCALL is the only option.
  • Older AMD CPUs and "Pentium Pro or earlier" Intel CPU: Neither SYSENTER nor SYSCALL is supported.

If you think about this, for 32-bit code (e.g. RDOS), SYSENTER is supported on a lot more CPUs than SYSCALL, and therefore SYSENTER is a lot more important than SYSCALL. Support for SYSCALL would help on the less recent (32-bit only) AMD CPUs (but it's easiest to pick the low hanging fruit first). ;)


Cheers,

Brendan


Add to that that for older CPUs (for instance the AMD Geode or similar) there is absolutely no reason to use SYSENTER/SYSCALL, since at that time the performance of segmentation hadn't started to degrade. Even where such CPUs support SYSENTER or SYSCALL, the instructions would be pointless and slower. The AMD Geode does support SYSENTER, but it would be useless since the call gate speed is excellent and SYSENTER would be slower even without the SS reload.


Author: turdus
Posted: Thu Apr 19, 2012 2:43 am

Brendan wrote:
Please understand that RDOS's "sysenter method" is a lot of baggage that happens to use SYSENTER, and that RDOS's "alternative sysenter method" is slightly less baggage that also happens to use SYSENTER. At no point (including where RDOS calls it syscall) do any of these use or refer to the SYSCALL/SYSRET instructions.

It's not the exact instruction that's interesting, but the comparison of ways to enter kernel mode. One (SYSENTER) uses stack and segment manipulation (RDOS' "sysenter" method), the other (SYSCALL, like RDOS' "alternative sysenter" method) doesn't.


Author: rdos
Posted: Fri Apr 20, 2012 9:15 am

OK, so now the exception handling logic in the kernel can handle 32-bit stacks, and so can the debugger and the panic debugger, so I should soon be able to carry on with switching to a 32-bit stack in the kernel. :D

OTOH, I will probably not transition the kernel to a 32-bit code segment. It's just too much work for too little benefit. That would once more break the exception handlers (segment register pushes have a different size in 32-bit mode, so I'd have to check each and every one of them to make sure nothing breaks). I might think about it if I have a lot of time and a working processor emulator with a debugger, as I think that would be required.


Author: rdos
Posted: Fri Apr 20, 2012 2:40 pm

Now I can single-step the sysenter instruction. :D


Author: rdos
Posted: Wed May 02, 2012 3:26 pm

A new winner:

2-core Intel Core Duo (at 3 GHz):
near: 51.6 million calls per second
gate: 13.4 million calls per second

This is the processor that has the fastest call gate performance, and it lags only slightly in near call performance (behind the i5).


Author: rdos
Posted: Thu May 03, 2012 2:11 am

A summary of the results (sorted by call gate performance):

2-core Intel Core Duo (at 3 GHz):
near: 51.6 million calls per second
gate: 13.4 million calls per second
sysenter: 10.5 million calls per second

6-core AMD Phenom (at 2.8 GHz):
near: 44.7 million calls per second
gate: 12.0 million calls per second
sysenter: 16.8 million calls per second

Intel i5 (at 2.9 GHz):
near: 56.2 million calls per second
gate: 7.2 million calls per second

2-core AMD Athlon I (at 1 GHz):
near: 19.7 million calls per second
gate: 7.0 million calls per second

Portable Intel Core Duo (at 2.13 GHz):
near: 35.4 million calls per second
gate: 6.7 million calls per second

AMD Geode (at 500 MHz):
near: 5.9 million calls per second
gate: 4.0 million calls per second

1-core AMD Athlon (at 1.2 GHz):
near: 15.3 million calls per second
gate: 3.8 million calls per second

1-core Intel Celeron (at 2.66 GHz):
near: 16.3 million calls per second
gate: 3.0 million calls per second

2-core AMD E-300 portable (at 1.2 GHz):
near: 20.0 million calls per second
gate: 2.7 million calls per second

2-core Intel Atom (at 3 GHz):
near: 24.4 million calls per second
gate: 2.4 million calls per second
sysenter: 3.6 million calls per second

Intel Celeron (at 400 MHz):
near: 5.8 million calls per second
gate: 1.8 million calls per second
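
Since the clock speeds in this summary differ by a factor of more than seven, it can help to normalise the figures to clock cycles per call. The sketch below does only that arithmetic on the numbers listed above. It assumes each CPU actually ran at the stated clock during the test, and the resulting cycle counts include whatever loop overhead the benchmark has. The names are illustrative.

```python
# (name, clock in MHz, near Mcalls/s, gate Mcalls/s), copied from the summary
RESULTS = [
    ("2-core Intel Core Duo", 3000, 51.6, 13.4),
    ("6-core AMD Phenom", 2800, 44.7, 12.0),
    ("Intel i5", 2900, 56.2, 7.2),
    ("2-core AMD Athlon I", 1000, 19.7, 7.0),
    ("Portable Intel Core Duo", 2130, 35.4, 6.7),
    ("AMD Geode", 500, 5.9, 4.0),
    ("1-core AMD Athlon", 1200, 15.3, 3.8),
    ("1-core Intel Celeron", 2660, 16.3, 3.0),
    ("2-core AMD E-300 portable", 1200, 20.0, 2.7),
    ("2-core Intel Atom", 3000, 24.4, 2.4),
    ("Intel Celeron", 400, 5.8, 1.8),
]


def cycles_per_call(clock_mhz: float, million_calls_per_second: float) -> float:
    """Clock cycles per call: (MHz * 1e6) / (Mcalls/s * 1e6)."""
    return clock_mhz / million_calls_per_second


# Re-sort by cycles per call-gate call, cheapest first.
by_gate_cycles = sorted(RESULTS, key=lambda r: cycles_per_call(r[1], r[3]))

for name, mhz, near, gate in by_gate_cycles:
    print(f"{name:26s} near {cycles_per_call(mhz, near):6.1f}"
          f"  gate {cycles_per_call(mhz, gate):7.1f} cycles")
```

Measured this way, the Geode needs the fewest cycles per call gate (500 / 4.0 = 125), which fits the earlier remark that its call gate speed is excellent, while the Atom needs the most (3000 / 2.4 = 1250).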

