OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Page Fault after invalidating non-accessed pages
Posted: Tue Jun 06, 2017 7:44 pm

Joined: Mon Jun 05, 2017 6:09 am
Posts: 5
I ran into a very bizarre issue last night. It was a bit tricky to debug, but eventually I tracked it down.
It came from my paging_remap_page() function, which caused a page fault even though a physical address was properly mapped at the faulting virtual address.

The problem was caused by this:
Code:
pte->address = (paddr & SMALL_PAGE_MASK) | (pte->flags & ~SMALL_PAGE_MASK);
paging_invalidate_page(vaddr & SMALL_PAGE_MASK);

And, the fix was an incredibly minor change:
Code:
pte->address = (paddr & SMALL_PAGE_MASK) | (pte->flags & ~SMALL_PAGE_MASK);

if(pte->accessed)
{
    paging_invalidate_page(vaddr & SMALL_PAGE_MASK);
}

Apparently invlpg behaves poorly when asked to invalidate a page that hasn't actually been accessed.
I haven't been able to find anything in the literature that documents this nuance, and I was wondering whether anyone else has run into the issue.

I'll also leave this here in case anyone else runs into it.


 Post subject: Re: Page Fault after invalidating non-accessed pages
Posted: Tue Jun 06, 2017 8:35 pm
Member

Joined: Wed Jul 10, 2013 9:11 am
Posts: 51
Do you find it more likely that you discovered an undocumented feature that's trivial enough to be independently discovered or that there is a bigger bug in your program and this "fix" made it work by coincidence?

At least give us a rundown of your debugging process (what your initial problem was and how you came to the conclusion that X was the problem and Y was the fix) and the value of SMALL_PAGE_MASK.

FWIW I could not reproduce this.


 Post subject: Re: Page Fault after invalidating non-accessed pages
Posted: Tue Jun 06, 2017 8:44 pm

Joined: Mon Jun 05, 2017 6:09 am
Posts: 5
goku420 wrote:
Do you find it more likely that you discovered an undocumented feature that's trivial enough to be independently discovered or that there is a bigger bug in your program and this "fix" made it work by coincidence?

At least give us a rundown of your debugging process (what your initial problem was and how you came to the conclusion that X was the problem and Y was the fix) and the value of SMALL_PAGE_MASK.

FWIW I could not reproduce this.

SMALL_PAGE_MASK is just 0xFFFFF000, used to mask away the bottom 12 bits (flags) from a physical address.
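For anyone following along, here is a minimal sketch of what that mask does to a 32-bit page-table entry, assuming 4 KiB pages and an entry stored as a plain uint32_t. The helper names pte_compose, pte_frame and pte_flags are made up for illustration and are not part of the code posted above.

```c
#include <stdint.h>

#define SMALL_PAGE_MASK 0xFFFFF000u  /* top 20 bits: physical frame address */

/* Hypothetical helper: take the frame from paddr and keep the low 12
 * flag bits of the existing entry. */
static uint32_t pte_compose(uint32_t paddr, uint32_t old_entry)
{
    return (paddr & SMALL_PAGE_MASK) | (old_entry & ~SMALL_PAGE_MASK);
}

/* Hypothetical helpers: split an entry back into its two parts. */
static uint32_t pte_frame(uint32_t entry) { return entry & SMALL_PAGE_MASK; }
static uint32_t pte_flags(uint32_t entry) { return entry & ~SMALL_PAGE_MASK; }
```

With these, pte_compose(0x00123456, 0xABCDE003) drops the low bits of the new address and the old frame, keeping only frame 0x00123000 and flags 0x003.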
My debugging process consisted of testing this across several VMs and 2 physical machines (both Dell, P4 era). It only occurred on the physical hardware.

I debugged this by comparing the loaded pages between the VM and the hardware, and found no discrepancy.
There were no notable differences in the values loaded into registers, and the error code included in the interrupt was 0 (no flags set).
Finally, I set up a minimal test case: map an available physical address to 0xD0000000, call invlpg, and then read from it (which caused the mentioned fault).

What made me suspect invlpg was the failing unit test. I have 2 tests that call paging_remap_page, but one of them did so without accessing the mapped page first.


 Post subject: Re: Page Fault after invalidating non-accessed pages
Posted: Tue Jun 06, 2017 9:03 pm
Member

Joined: Sat Dec 27, 2014 9:11 am
Posts: 901
Location: Maadi, Cairo, Egypt
Without seeing more of your code, I can't guess. But INVLPG should work with both accessed and un-accessed pages, and it does in my OS. You shouldn't check whether the page has been accessed; that is not a fix and will only cause problems on other hardware. You should invalidate the TLB every time you modify the page directory/table. Since you mention the error only occurred on physical hardware, this is far more likely a caching problem (just guessing).
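A rough sketch of the unconditional pattern, written so it can be compiled and run on a host machine rather than in a kernel. Here tlb_invalidate_page is a stand-in that only records the call; in a real kernel it would execute invlpg. The names remap_page and tlb_invalidate_page are hypothetical, and treating the entry as a plain uint32_t is an assumption about the original code.

```c
#include <stdint.h>

#define SMALL_PAGE_MASK 0xFFFFF000u
#define PTE_ACCESSED    0x020u   /* bit 5 of an x86 page-table entry */

/* Host-side stand-in (assumption). A kernel build would instead run
 *   __asm__ volatile("invlpg (%0)" : : "r"(vaddr) : "memory");
 * Here we only record what would have been invalidated. */
static uint32_t last_invalidated;
static unsigned invalidate_count;

static void tlb_invalidate_page(uint32_t vaddr)
{
    last_invalidated = vaddr;
    invalidate_count++;
}

/* Point the entry at a new frame, keep its flag bits, and always
 * invalidate, whether or not the accessed bit is set. */
static void remap_page(uint32_t *pte, uint32_t vaddr, uint32_t paddr)
{
    *pte = (paddr & SMALL_PAGE_MASK) | (*pte & ~SMALL_PAGE_MASK);
    tlb_invalidate_page(vaddr & SMALL_PAGE_MASK);
}
```

The point of the sketch is only that the invalidation does not depend on PTE_ACCESSED: an entry that was never touched still gets its TLB entry flushed after the write.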

Tell us more: what is the error code of the page fault? What is in CR0, CR2 and CR4?
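For reference, a small sketch of how the low bits of the x86 page-fault error code can be decoded. The struct and function names (pf_info, decode_pf_error) are invented for this example. An error code of 0, as reported earlier in the thread, decodes to a supervisor-mode read of a page the CPU considered not present; CR2 then holds the faulting linear address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded bits. */
struct pf_info {
    bool protection;    /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;         /* bit 1: 1 = write access, 0 = read */
    bool user;          /* bit 2: 1 = fault raised in user mode */
    bool reserved_bit;  /* bit 3: 1 = reserved bit set in a paging structure */
    bool instr_fetch;   /* bit 4: 1 = fault on an instruction fetch */
};

static struct pf_info decode_pf_error(uint32_t err)
{
    struct pf_info info;
    info.protection   = (err & 0x01u) != 0;
    info.write        = (err & 0x02u) != 0;
    info.user         = (err & 0x04u) != 0;
    info.reserved_bit = (err & 0x08u) != 0;
    info.instr_fetch  = (err & 0x10u) != 0;
    return info;
}
```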

_________________
You know your OS is advanced when you stop using the Intel programming guide as a reference.

