OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 1:08 am
Member

Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
Looks like you had another pagefault before that one. When debugging, always start with the first error. What's the very first pagefault you get?

Also what instruction is at 0xffffffff8010203a? What function is it? Is it really code? etc.

_________________
https://github.com/kiznit/rainbow-os


 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 1:35 am
Member

Joined: Sat Feb 20, 2021 3:11 pm
Posts: 93
kzinti wrote:
Looks like you had another pagefault before that one. When debugging, always start with the first error. What's the very first pagefault you get?


kzinti wrote:
Also what instruction is at 0xffffffff8010203a? What function is it? Is it really code? etc.

I used the objdump -d on the kernel and got the instruction located there - it is the iretq returning from the routine that reloaded page tables
Code:
ffffffff80102037 <arch_loadPageTables>:
ffffffff80102037:       0f 22 df                mov    %rdi,%cr3
ffffffff8010203a:       c3                      retq


Last edited by rpio on Sun Jan 23, 2022 4:34 am, edited 1 time in total.

 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 6:52 am
Member

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
ngx wrote:
The address in CR2 can't be linear, I certainly think it is physical as in my case it is
The address in CR2 is always a linear one. Having a weird address in that register might shed some light on why you're getting a page fault.
ngx wrote:
I used the objdump -d on the kernel and got the instruction located there
This means you haven't mapped your code correctly, so as soon as you set the new page table, the next instruction can't be fetched. Make sure that you map your kernel (at least this function and its variables) at the same address as it's mapped in the old table. (In your case the PTE for ffffffff80102000 should point to the same physical page in the new table as in the old one, and the stack must be the same too to get the return address correctly.)
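To see which entries are involved for the page mentioned above, the linear address can be split into its four table indices. This is only a minimal sketch in C, assuming 4-level paging with 4 KiB pages; `pt_indices`, `is_canonical` and `split_linear` are names made up for this illustration and are not taken from ngx's kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Index selected at each level of a 4-level (48-bit) page table walk. */
struct pt_indices {
    unsigned pml4, pdpt, pd, pt;
};

/* A 48-bit linear address is canonical when bits 63:47 are all equal. */
static int is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the upper 17 bits */
    return top == 0 || top == 0x1ffff;
}

/* Split a linear address into the 9-bit index used at each level. */
static struct pt_indices split_linear(uint64_t addr)
{
    struct pt_indices ix;

    assert(is_canonical(addr));
    ix.pml4 = (unsigned)((addr >> 39) & 0x1ff);
    ix.pdpt = (unsigned)((addr >> 30) & 0x1ff);
    ix.pd   = (unsigned)((addr >> 21) & 0x1ff);
    ix.pt   = (unsigned)((addr >> 12) & 0x1ff);
    return ix;
}
```

For 0xffffffff80102000 this gives PML4 index 511, PDPT index 510, PD index 0 and PT index 258, so those are the entries that have to lead to the same physical page in both the old and the new tables.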

Cheers,
bzt


 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 11:21 am
Member

Joined: Sat Feb 20, 2021 3:11 pm
Posts: 93
bzt wrote:
This means you haven't mapped your code correctly, so as soon as you set the new page table, the next instruction can't be fetched. Make sure that you map your kernel (at least this function and its variables) at the same address as it's mapped in the old table. (In your case the PTE for ffffffff80102000 should point to the same physical page in the new table as in the old one, and the stack must be the same too to get the return address correctly.)


The thing you said about the stack got me thinking - if the faulty instruction is iretq, which stores the return address on top of the stack, and the last exception has a physical address in CR2 - then maybe I just haven't mapped the stack and that is why an error occurs?

So now I have a couple of questions about this
- Could I read the SP register to get the address where the stack is stored, if physical then I would just map it to a virtual, if virtual then map it to a physical?
- Does the SP contain virtual/linear or physical address?
- If physical - is there a need for the stack to be mapped into virtual memory - the CPU operates on physical addresses and the stack is not accessed directly, but using the pop and push instructions instead?

Also:
When I dump the registers with the qemu monitor command "info registers" (after first turning off the page table reloading in the kernel, so that the OS runs without double faulting), the address in SP is - 0xFFFFFFFF801116a8

The address in SP when I just use mov to get it is - 0xFFFFFFFF801116a0

The address in SP when I get it from exception(using qemu -d int) before reloading page tables - 0000000007ef4f38
After reloading tables it is - ffffffff80111630

What could be wrong with the SP?


Last edited by rpio on Mon Apr 12, 2021 11:54 am, edited 2 times in total.

 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 11:50 am
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
ngx wrote:
The thing you said about the stack got me thinking - if the faulty instruction is iretq, which stores the return address on top of the stack, and the last exception has a physical address in CR2 - then maybe I just haven't mapped the stack and that is why an error occurs?
No, IRETQ only reads from the stack. If the stack were faulty, the CPU would invoke the double fault handler. You would never get to the place you got to with a faulty stack.

ngx wrote:
- Does the SP contain virtual/linear or physical address?
SP contains the linear address of the stack. Common solution is to map the stack to a predetermined address and reset SP after enabling paging. You must discard all stack references from before the address space switch anyway, so might as well do it properly.

_________________
Carpe diem!


 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 1:02 pm
Member

Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
Just pointing out that the code you disassembled shows that the page fault is occurring on "retq", not on "iretq".

The stack looks fine, it's your code that is not mapped properly. The fact that the page fault happens on the very next instruction after you load CR3 gives it away. It doesn't matter what that instruction actually is.

_________________
https://github.com/kiznit/rainbow-os


 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 1:08 pm
Member

Joined: Sat Feb 20, 2021 3:11 pm
Posts: 93
kzinti wrote:
Just pointing out that the code you disassembled shows that the page fault is occurring on "retq", not on "iretq".

The stack looks fine, it's your code that is not mapped properly. The fact that the page fault happens on the very next instruction after you load CR3 gives it away. It doesn't matter what that instruction actually is.


The "iretq" was probably a typo on my part.
How could I look at page mappings - is there any way to dump page tables for debugging?


 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 1:24 pm
Member

Joined: Sat Feb 20, 2021 3:11 pm
Posts: 93
nullplan wrote:
No, IRETQ only reads from the stack.

Yes, sorry - wrong wording, as I meant that the return address is stored on the stack

nullplan wrote:
If the stack were faulty, the CPU would invoke the double fault handler. You would never get to the place you got to with a faulty stack.

The stack made by the boot loader is not faulty, but after I reload the page tables it (the stack) could break

nullplan wrote:
SP contains the linear address of the stack. Common solution is to map the stack to a predetermined address and reset SP after enabling paging. You must discard all stack references from before the address space switch anyway, so might as well do it properly.

In my case I give it the address of a 16 KiB list and it makes a stack out of it (probably just loads it into SP)

The other thing that makes me think it is a stack issue is that the second-to-last exception is a page fault and the last one is a double fault


 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 2:52 pm
Member

Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
The double fault happens because you aren't handling the page fault.

_________________
https://github.com/kiznit/rainbow-os


 Post subject: Re: Paging resources and wrong code
Posted: Mon Apr 12, 2021 3:38 pm
Member

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
ngx wrote:
Does the SP contain virtual/linear or physical address?
Unless specifically stated otherwise in the Intel manual, once the MMU is turned on, all instructions and all registers use linear addresses.

ngx wrote:
How could I look at page mappings - is there any way to dump page tables for debugging?
Read the wiki page on kernel debugging. With qemu, start a monitor and use "info tlb", but that's very uncomfortable to use. There's a good reason why I suggested to get bochs working, because there the "page" command in the built-in debugger will do exactly what you want.
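If neither emulator is handy, the same lookup can be done in software by walking the tables by hand. The following is only a sketch under stated assumptions: 4-level paging, and all of physical memory readable through one pointer (`phys_base`, which in a real kernel would be something like a direct-map or identity-map base). `translate`, `phys_base` and the `PTE_*` names are hypothetical and not taken from ngx's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT   0x1ULL
#define PTE_HUGE      0x80ULL               /* PS bit in PDPT and PD entries */
#define PTE_ADDR_MASK 0x000ffffffffff000ULL /* physical address, bits 51:12 */

/*
 * Walk the tables rooted at cr3 for linear address addr.
 * phys_base is an assumed pointer through which physical address 0
 * (and everything above it) can be read as 64-bit words.
 * Returns 1 and stores the physical address in *out if addr is mapped,
 * otherwise returns 0.
 */
static int translate(const uint64_t *phys_base, uint64_t cr3,
                     uint64_t addr, uint64_t *out)
{
    uint64_t table = cr3 & PTE_ADDR_MASK;

    assert(phys_base != NULL && out != NULL);

    /* shift takes the values 39, 30, 21, 12: PML4, PDPT, PD, PT */
    for (int shift = 39; shift >= 12; shift -= 9) {
        size_t index = (size_t)((addr >> shift) & 0x1ff);
        uint64_t entry = phys_base[(table >> 3) + index];

        if (!(entry & PTE_PRESENT))
            return 0;                       /* hole at this level */

        /* 1 GiB or 2 MiB page: the walk ends early */
        if (shift > 12 && shift < 39 && (entry & PTE_HUGE)) {
            uint64_t page_mask = (1ULL << shift) - 1;
            *out = (entry & PTE_ADDR_MASK & ~page_mask) | (addr & page_mask);
            return 1;
        }

        /* strip the flag bits before using the entry as an address */
        table = entry & PTE_ADDR_MASK;
    }

    *out = table | (addr & 0xfff);
    return 1;
}
```

Calling it once with the old CR3 value and once with the new one for the faulting address (0xffffffff8010203a in this thread) and comparing the two results would show whether the code page is mapped the same way in both sets of tables.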

Cheers,
bzt


 Post subject: Re: Paging resources and wrong code
Posted: Tue Apr 13, 2021 1:34 am
Member

Joined: Sat Feb 20, 2021 3:11 pm
Posts: 93
bzt wrote:
ngx wrote:
Does the SP contain virtual/linear or physical address?
Unless specifically stated otherwise in the Intel manual, once the MMU is turned on, all instructions and all registers use linear addresses.

ngx wrote:
How could I look at page mappings - is there any way to dump page tables for debugging?
Read the wiki page on kernel debugging. With qemu, start a monitor and use "info tlb", but that's very uncomfortable to use. There's a good reason why I suggested to get bochs working, because there the "page" command in the built-in debugger will do exactly what you want.

Cheers,
bzt


Thanks for your help. Unfortunately, from what I have heard, bochs does not support UEFI, so I will not be able to run my OS on it. Does it support amd64 or only x86, and does it support UEFI?

But the good thing is that I managed to fix the page tables and no faults occur now. The problem was very stupid, and I fixed it just by swapping the places where the initialization and the flag setting happen. For some reason I forgot that the flag bits have to be ignored when an entry is used as a pointer, and I had it like this, for page tables of all levels (not only 3 and 4):
Code:
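l4[offset] = palloc            // allocate the next-level table
setPageFlags(l4[offset])       // sets the low flag bits inside the entry
l3 = l4[offset]                // bug: l3 still carries the flag bits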

So I was initializing the next-level page table pointer not with a pointer to a table, but with a pointer to a table that had its first 2 bits set, which completely messed up the array
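One way to avoid this class of bug is to keep "entry" and "table address" as separate values and always mask when converting from one to the other. This is a hedged sketch, not ngx's actual fix; `make_entry`, `entry_to_table` and the `PTE_*` constants are names invented for the example, and the mask assumes the usual x86-64 layout with the physical address in bits 51:12.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT   0x1ULL
#define PTE_WRITABLE  0x2ULL
#define PTE_ADDR_MASK 0x000ffffffffff000ULL /* physical address, bits 51:12 */

/* Build an entry that points at a 4 KiB-aligned table or frame. */
static uint64_t make_entry(uint64_t table_phys, uint64_t flags)
{
    /* the address must not overlap the flag bits */
    assert((table_phys & ~PTE_ADDR_MASK) == 0);
    return table_phys | flags;
}

/* Recover the address of the next-level table from an entry. */
static uint64_t entry_to_table(uint64_t entry)
{
    return entry & PTE_ADDR_MASK;           /* drop the flag bits */
}
```

With a table at physical address 0x5000 and the present and writable flags set, the entry is 0x5003. Using 0x5003 directly as the table pointer would make every index land 3 bytes past the real slot, which matches the "messed up array" described above; `entry_to_table` returns 0x5000 again.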


 Post subject: Re: Paging resources and wrong code
Posted: Tue Apr 13, 2021 9:27 am
Member

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
ngx wrote:
Unfortunately, from what I have heard, bochs does not support UEFI, so I will not be able to run my OS on it. Does it support amd64 or only x86, and does it support UEFI?
Yes, bochs supports all operating modes (real, protected and long mode as well). Officially a UEFI BIOS isn't supported, but multiple forum members have reported that they got it working. I'm sure if you open a topic with "using UEFI with bochs", someone will be able to help you with that. (I can't, because it takes more than half a minute to boot TianoCore, while all the other boot methods finish in less than a second. That is a huge difference, especially in rapid development-test cycles. So I avoid UEFI as much as I can, and if I have to, I only use qemu+TianoCore, VB+UEFI and real hardware for testing. But most of the time I just don't care; I prefer coreboot or legacy BIOS for everyday OS testing as those are lightning fast. My loader is written in a way that by the time the kernel gains control, it doesn't matter at all what kind of firmware was used to load it.)

ngx wrote:
But the good thing is that I managed to fix the page tables and no faults occur now
Glad to hear, well done!

Cheers,
bzt

