OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: using Process Context Identifiers on x86-64
PostPosted: Tue Nov 10, 2015 6:28 am 

Joined: Mon Dec 16, 2013 6:50 pm
Posts: 27
Hi,

I'm trying to implement the PCID feature in my OS. My CPU does support PCID but doesn't support INVPCID. In my mind, that doesn't make any sense. How would that work exactly? My initial thought was that, when a process is created, I should invalidate all TLB entries associated with its PCID in case the PCID was in use by another, now dead, process.

Am I misinterpreting something? Is it really possible to have support for PCID and not INVPCID?

But what am I supposed to do with no such instruction? Flush the entire TLB? Is this really the way to do it?

Thank you.


 
 Post subject: Re: using Process Context Identifiers on x86-64
PostPosted: Tue Nov 10, 2015 7:14 am 

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

xmm15 wrote:
I'm trying to implement the PCID feature in my OS. My CPU does support PCID but doesn't support INVPCID. In my mind, that doesn't make any sense. How would that work exactly? My initial thought was that, when a process is created, I should invalidate all TLB entries associated with its PCID in case the PCID was in use by another, now dead, process.

Am I misinterpreting something? Is it really possible to have support for PCID and not INVPCID?


It doesn't make that much sense to me either, but Intel gave PCID and INVPCID separate feature flags for a reason, so a CPU can report one without the other.
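
As a rough illustration of the two separate flags, here is a minimal C sketch that decodes them from CPUID register values. PCID is reported in CPUID.01H:ECX bit 17 and INVPCID in CPUID.(EAX=07H,ECX=0):EBX bit 10. The macro and function names are made up for this example, and the register values are assumed to have been read with CPUID elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 17 reports PCID support. */
#define CPUID_1_ECX_PCID       (UINT32_C(1) << 17)
/* CPUID.(EAX=07H,ECX=0):EBX bit 10 reports INVPCID support. */
#define CPUID_7_0_EBX_INVPCID  (UINT32_C(1) << 10)

/* leaf1_ecx: the ECX value returned by CPUID leaf 01H (hypothetical parameter). */
static bool cpu_has_pcid(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & CPUID_1_ECX_PCID) != 0;
}

/* leaf7_ebx: the EBX value returned by CPUID leaf 07H, subleaf 0 (hypothetical parameter). */
static bool cpu_has_invpcid(uint32_t leaf7_ebx)
{
    return (leaf7_ebx & CPUID_7_0_EBX_INVPCID) != 0;
}
```

A kernel would probably check both at boot and pick a fallback path when `cpu_has_pcid()` is true but `cpu_has_invpcid()` is false, which is the situation described in this thread.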

xmm15 wrote:
But what am I supposed to do with no such instruction? Flush the entire TLB? Is this really the way to do it?


Fortunately...

Intel wrote:
INVLPG. This instruction takes a single operand, which is a linear address. The instruction invalidates any TLB entries that are for a page number corresponding to the linear address and that are associated with the current PCID. It also invalidates any global TLB entries with that page number, regardless of PCID (see Section 4.10.2.4). INVLPG also invalidates all entries in all paging-structure caches associated with the current PCID, regardless of the linear addresses to which they correspond.


Also...

Intel wrote:
MOV to CR3. The behavior of the instruction depends on the value of CR4.PCIDE:
  • If CR4.PCIDE = 0, the instruction invalidates all TLB entries associated with PCID 000H except those for global pages. It also invalidates all entries in all paging-structure caches associated with PCID 000H.
  • If CR4.PCIDE = 1 and bit 63 of the instruction’s source operand is 0, the instruction invalidates all TLB entries associated with the PCID specified in bits 11:0 of the instruction’s source operand except those for global pages. It also invalidates all entries in all paging-structure caches associated with that PCID. It is not required to invalidate entries in the TLBs and paging-structure caches that are associated with other PCIDs.
  • If CR4.PCIDE = 1 and bit 63 of the instruction’s source operand is 1, the instruction is not required to invalidate any TLB entries or entries in paging-structure caches.
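
The three cases quoted above can be written down as a small C model. This is only a sketch of the quoted rules, not kernel code: `cr3_value()` builds a MOV-to-CR3 source operand and `cr3_write_flushes()` says which PCID (if any) the write is required to invalidate. All names and the address mask are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR3_NOFLUSH    (UINT64_C(1) << 63)            /* bit 63 of the source operand */
#define CR3_PCID_MASK  UINT64_C(0xFFF)                /* bits 11:0 hold the PCID when CR4.PCIDE = 1 */
#define CR3_ADDR_MASK  UINT64_C(0x000FFFFFFFFFF000)   /* assumed: bits 51:12, page-table base */

/* Build a CR3 source operand from a PML4 physical address, a PCID and the no-flush choice. */
static uint64_t cr3_value(uint64_t pml4_phys, uint16_t pcid, bool noflush)
{
    uint64_t v = (pml4_phys & CR3_ADDR_MASK) | (pcid & CR3_PCID_MASK);
    return noflush ? (v | CR3_NOFLUSH) : v;
}

/*
 * Model of the quoted rules. Returns true if the write must invalidate the
 * non-global TLB entries of one PCID and stores that PCID in *flushed_pcid.
 * Returns false if no invalidation is required (bit 63 set with PCIDE = 1).
 */
static bool cr3_write_flushes(bool cr4_pcide, uint64_t src, uint16_t *flushed_pcid)
{
    if (!cr4_pcide) {
        *flushed_pcid = 0;          /* everything is tagged PCID 000H */
        return true;
    }
    if (src & CR3_NOFLUSH)
        return false;               /* bit 63 = 1: nothing has to be invalidated */
    *flushed_pcid = (uint16_t)(src & CR3_PCID_MASK);
    return true;                    /* bit 63 = 0: flush the PCID being loaded */
}
```

Note that in this reading it is bit 63 being clear, not set, that causes the flush of the PCID being loaded.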


This means you've basically got 3 choices:
  • Modify CR4 (e.g. disable and then re-enable either global pages or PCID) to invalidate everything for all PCIDs (including TLB entries for global pages)
  • Use INVLPG to invalidate the TLB entries for one page in the current PCID (plus any global entries for that page, and all paging-structure cache entries for the current PCID)
  • Reload CR3 (with bit 63 set and not clear) to invalidate all TLB entries (except for global pages) for the PCID being loaded (and not other PCIDs)

Of course the INVPCID instruction gives you even more options, which would be better for performance (if it was supported).


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 
 Post subject: Re: using Process Context Identifiers on x86-64
PostPosted: Tue Nov 10, 2015 7:26 am 

Joined: Mon Dec 16, 2013 6:50 pm
Posts: 27
Hmmm, I missed the part about bit 63 in CR3. That should work just fine. So basically, an INVPCID could be emulated with:

mov $TARGET_PCID | (1<<63),%rbx
mov %cr3,%rax
cli
mov %rbx,%cr3
mov %rax,%cr3
sti

This assumes that the currently executing code resides in a global page.
I could probably do the emulation in the #UD handler.


 
 Post subject: Re: using Process Context Identifiers on x86-64
PostPosted: Tue Nov 10, 2015 1:26 pm 

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

xmm15 wrote:
Hmmm, I missed the part about bit 63 in CR3. That should work just fine. So basically, an INVPCID could be emulated with:

mov $TARGET_PCID | (1<<63),%rbx
mov %cr3,%rax
cli
mov %rbx,%cr3
mov %rax,%cr3
sti

This assumes that the currently executing code resides in a global page.
I could probably do the emulation in the #UD handler.


Don't forget that INVPCID has 4 different types:
    Type 0: Invalidate one address for one PCID (unless that address is in a global page). Can't be emulated exactly. Best case might be INVLPG, which only affects the current PCID and also wipes that PCID's paging-structure caches.
    Type 1: Invalidate all TLB entries for one PCID (except for global pages). Could be emulated by reloading CR3 with bit 63 set.
    Type 2: Invalidate all TLB entries for all PCIDs (including global pages). Could be emulated by modifying CR4.
    Type 3: Invalidate all TLB entries for all PCIDs (except global pages). Can't be emulated exactly. Best case would be to reload CR3 with bit 63 set to switch to a temporary virtual address space (and wipe all non-global TLBs for all PCIDs except one), then switch back to the original virtual address space in the same way (to wipe non-global TLBs that weren't already wiped).

Note that for some of these (type = 0 and type = 1) you need to switch virtual address spaces whenever the current PCID isn't the one in the "INVPCID Descriptor" (e.g. load CR3 with bit 63 clear) and then switch back after.
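
For reference, the four types and the "INVPCID Descriptor" mentioned above could be represented in C roughly like this. The descriptor is 128 bits: the PCID in bits 11:0, reserved bits 63:12 that must be zero, and the linear address in bits 127:64. The type, struct and function names are invented for this sketch, and `invpcid_emulation_needs_switch()` only encodes the note above about types 0 and 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four INVPCID invalidation types (enum names are hypothetical). */
enum invpcid_type {
    INVPCID_INDIVIDUAL_ADDRESS   = 0,  /* one address, one PCID, not globals */
    INVPCID_SINGLE_CONTEXT       = 1,  /* one PCID, not globals */
    INVPCID_ALL_INCLUDING_GLOBAL = 2,  /* all PCIDs, globals too */
    INVPCID_ALL_EXCEPT_GLOBAL    = 3   /* all PCIDs, globals kept */
};

/* 128-bit descriptor read from memory by INVPCID. */
struct invpcid_desc {
    uint64_t pcid;   /* bits 11:0 = PCID, bits 63:12 reserved (zero) */
    uint64_t addr;   /* linear address, only meaningful for type 0 */
};

static struct invpcid_desc invpcid_make_desc(uint16_t pcid, uint64_t linear_addr)
{
    struct invpcid_desc d = { (uint64_t)(pcid & 0xFFF), linear_addr };
    return d;
}

/*
 * Types 0 and 1 name one specific PCID. An emulation built on INVLPG or
 * MOV to CR3 would have to switch address spaces first whenever that PCID
 * is not the current one.
 */
static bool invpcid_emulation_needs_switch(enum invpcid_type type,
                                           uint16_t desc_pcid,
                                           uint16_t current_pcid)
{
    bool targets_one_pcid = (type == INVPCID_INDIVIDUAL_ADDRESS ||
                             type == INVPCID_SINGLE_CONTEXT);
    return targets_one_pcid && desc_pcid != current_pcid;
}
```

A #UD-based emulation would presumably decode the type and descriptor, then dispatch on these cases.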


Cheers,

Brendan



 
 Post subject: Re: using Process Context Identifiers on x86-64
PostPosted: Tue Nov 10, 2015 5:08 pm 

Joined: Mon Dec 16, 2013 6:50 pm
Posts: 27
Yes, thank you. I only need one mode for now so I'll keep it simple.

Thanks


 
 Post subject: Re: using Process Context Identifiers on x86-64
PostPosted: Tue Nov 10, 2015 7:00 pm 

Joined: Mon Dec 16, 2013 6:50 pm
Posts: 27
For those interested, I documented my experience with PCID here: http://www.dumaisnet.ca/index.php?artic ... 6b3fbe37e7


 
 Post subject: Re: using Process Context Identifiers on x86-64
PostPosted: Wed Jun 20, 2018 1:17 am 

Joined: Fri Nov 17, 2017 7:02 am
Posts: 20
xmm15 wrote:
Hmmm, I missed the part about bit 63 in CR3. That should work just fine. So basically, an INVPCID could be emulated with:

mov $TARGET_PCID | (1<<63),%rbx
mov %cr3,%rax
cli
mov %rbx,%cr3
mov %rax,%cr3
sti

This assumes that the currently executing code resides in a global page.
I could probably do the emulation in the #UD handler.


Sorry, but I don't follow your and Brendan's points here, even after reading Intel's SDM:
"If CR4.PCIDE = 1 and bit 63 of the instruction’s source operand is 0, the instruction invalidates all TLB entries associated with the PCID specified in bits 11:0 of the instruction’s source operand except those for global pages. It also invalidates all entries in all paging-structure caches associated with that PCID. It is not required to invalidate entries in the TLBs and paging-structure caches that are associated with other PCIDs.
If CR4.PCIDE = 1 and bit 63 of the instruction’s source operand is 1, the instruction is not required to invalidate any TLB entries or entries in paging-structure caches."

My understanding is: "If CR4.PCIDE = 1 and bit 63 of the value loaded into CR3 is 0 (not 1), the load invalidates all TLB entries associated with the PCID specified in bits 11:0."

If my understanding is correct, the code above should be changed so that it does not set bit 63 of the CR3 value when it needs to invalidate the TLB entries of the switched-out process.

Thanks,
-Tao
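
Following the reading above, a minimal C sketch of the two CR3 values such a flush would write: the first load has bit 63 clear so the target PCID's non-global entries are invalidated, and the second has bit 63 set so that switching back does not throw away the current PCID's entries. The names are hypothetical. Keeping the current page-table base during the first load is an assumption of this sketch (the snippet quoted above loads only the PCID bits), and it does not address translations that might be cached under the target PCID between the two writes.

```c
#include <stdint.h>

#define CR3_NOFLUSH    (UINT64_C(1) << 63)            /* bit 63 of the source operand */
#define CR3_PCID_MASK  UINT64_C(0xFFF)                /* bits 11:0 = PCID */
#define CR3_ADDR_MASK  UINT64_C(0x000FFFFFFFFFF000)   /* assumed: bits 51:12, page-table base */

/* The two values to write to CR3, in order (hypothetical helper type). */
struct cr3_pair {
    uint64_t flush_write;    /* bit 63 clear: invalidates the target PCID */
    uint64_t restore_write;  /* bit 63 set: returns without invalidating */
};

static struct cr3_pair plan_pcid_flush(uint64_t current_cr3, uint16_t target_pcid)
{
    struct cr3_pair p;
    uint64_t base = current_cr3 & CR3_ADDR_MASK;

    /* Same page tables, target PCID, bit 63 = 0, so that PCID gets flushed. */
    p.flush_write = base | (target_pcid & CR3_PCID_MASK);

    /* Original base and PCID, bit 63 = 1, so nothing has to be flushed. */
    p.restore_write = (current_cr3 & (CR3_ADDR_MASK | CR3_PCID_MASK)) | CR3_NOFLUSH;
    return p;
}
```

The actual writes would still need interrupts disabled around them, as in the original snippet.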


 