OSDev.org

The Place to Start for Operating System Developers
All times are UTC - 6 hours




 Post subject: Paging Problem on some HW: triple fault on enabling
PostPosted: Wed Apr 25, 2018 3:22 am 
Joined: Sat Sep 05, 2015 2:10 am
Posts: 11
Location: Italy
Hi everyone, I'm an Italian student (so excuse me if my English is poor). I'm rewriting my very basic OS, trying to improve it a lot, and I decided to implement x86_64 for the first time.
I'm currently working on the bootloader and writing it in assembly, so my environment consists of NASM, Bochs, and other universal tools like a hex editor, running on Linux Mint 18.3 'Sylvia' on an HP P6 Pavilion (2011) laptop. Everything was working very well: I was able to enter long mode, and I identity mapped the first 16 GB of RAM using PSE (to give me access to the whole memory until I write a proper memory manager). I then kept programming and started setting up the environment of my OS (IRQ handlers, exception handlers, my system call, basic I/O), and then decided to test it on real hardware (the same laptop I'm working on), booting from a USB stick seen as a hard disk. It triple faulted. I tried a different laptop of the same series, with the same result. I then tried different PCs, and there everything was fine.
I began trying to sort out which instruction was causing trouble, and I found that it was the MOV CR0,EAX that enables paging. So I started debugging my paging setup and rewrote it as a very simple identity mapping of the first 2 MB, without PSE, as seen in tutorials, but the problem persists. I tried a lot of little changes, even flipping the page global enable (PGE) bit, but I can't find the problem. I also cannot catch and handle the exception: my handlers never seem to get called (to be sure of this I put hangs in the exception handlers) and the CPU resets immediately, as it would if I set a CR3 that is not page-aligned.
This is driving me mad, and I'm running out of ideas on how to debug this, so I thought I'd ask for help.
Here's my original paging setup code (now disabled):
Code:
TABLES CREATION:
;                MOV             EDI,2000h                       ;Set up paging
;                MOV             CR3,EDI
;                MOV             ECX,4000h
;                XOR             EAX,EAX
;PGT_CLR_LP:     MOV             [EDI],AL                        ;Cleaning pages i'm going to use
;                LOOP            PGT_CLR_LP
;                MOV             EDI,2000h
;                MOV     DWORD   [EDI],3003h                     ;Setting PDPT0 in PML4T[0]
;                ADD             EDI,1000h
;                MOV     DWORD   [EDI],4003h                     ;Setting PDT0 in PDPT0[0]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],5003h                     ;Setting PDT1 in PDPT0[1]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],6003h                     ;Setting PDT2 in PDPT0[2]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],101003h                   ;Setting PDT3 in PDPT0[3]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],102003h                   ;Setting PDT4 in PDPT0[4]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],103003h                   ;Setting PDT5 in PDPT0[5]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],104003h                   ;Setting PDT6 in PDPT0[6]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],105003h                   ;Setting PDT7 in PDPT0[7]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],106003h                   ;Setting PDT8 in PDPT0[8]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],107003h                   ;Setting PDT9 in PDPT0[9]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],108003h                   ;Setting PDTA in PDPT0[A]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],109003h                   ;Setting PDTB in PDPT0[B]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],10A003h                   ;Setting PDTC in PDPT0[C]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],10B003h                   ;Setting PDTD in PDPT0[D]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],10C003h                   ;Setting PDTE in PDPT0[E]
                ;ADD             EDI,08h
                ;MOV     DWORD   [EDI],10D003h                   ;Setting PDTF in PDPT0[F]
;                MOV             EDI,4000h
;                MOV             EBX,00000083h
;                MOV             ECX,00000600h
;                PUSH            DBG32_4
;                CALL            DEBUG
;PGT_IDENTITY:   MOV             [EDI],EBX                       ;Setting up tables to identity map
;                ADD             EBX,200000h
;                ADD             EDI,0008h
;                LOOP            PGT_IDENTITY;

;                MOV             EDI,101000h
;                MOV             EBX,0C0000083h
;                MOV             ECX,1A00h
;                MOV             EDX,00h
;PGT_ID2:        MOV             [EDI],EBX
;                ADD             EDI,04h
;                MOV             [EDI],EDX
;                ADD             EBX,200000h
;                JO              PGT_IDO
;PGT_ID2C:       ADD             EDI,0004h
;                LOOP            PGT_ID2
;                JMP             LONGYS
;PGT_IDO:        ADD             EDX,01h
;                MOV             EBX,00h
;                JMP             PGT_ID2C
SWITCHING TO LONG MODE:
LONGYS:         PUSH            DBG32_5
                CALL            DEBUG
                PUSH            DBG32_6
                CALL            DEBUG   
                MOV             EAX,CR4                         ;Switch to LONG MODE
                OR              EAX,110000b                   ;Setting PAE (bit 5) and PSE (bit 4)
                MOV             CR4,EAX
                PUSH            DBG32_7
                CALL            DEBUG
                MOV             ECX,0C0000080h                  ;Asking for EFER MSR (0xC0000080h)
                RDMSR                                           
                OR              EAX,100000000b                  ;Setting LM-bit (bit 8).
                WRMSR
               
                PUSH            DBG32_8
                CALL            DEBUG
                PUSH            DBG32_9
                CALL            DEBUG
                MOV             EAX,CR0
                OR              EAX,80000001h                   ;Setting Paging(bit31)
                MOV             CR0,EAX                         ;<---- CRASHES HERE
                LGDT            [GDT64]                         ;LOADING GDT
                LIDT            [IDT64]                         ;Loading IDT
                JMP             08h:LONG_MODE                   ;Jumping to a 64 bit segment
                JMP             64_ERR


I know that's dodgy; it was only a temporary solution, but it worked.
This is how I'm currently doing it:
Code:
Setting up tables:
                MOV             EDI,2000h                       ;PML4 at 0x2000
                MOV             CR3,EDI                         ;Setting CR3
                XOR             EAX,EAX                         
                MOV             ECX,4000h
CLR_PAG:        MOV             [EDI],AL                        ;Wiping pages 2000h,3000h,4000h,5000h
                LOOP            CLR_PAG
                MOV             EDI,CR3
                MOV     DWORD   [EDI],3003h                     ;Setting PDPT0 in PML4[0]
                ADD             EDI,1000h
                MOV     DWORD   [EDI],4003h                     ;Setting PDT0 in PDPT0[0]
                ADD             EDI,1000h
                MOV     DWORD   [EDI],5003h                     ;Setting PT0 in PDT0[0]

                MOV             EBX,0003h                       ;Identity mapping first 512 pages in PT0
                MOV             ECX,0200h
                MOV             EDI,5000h
FILL_PT:        MOV             [EDI],EBX
                ADD             EBX,1000h
                ADD             EDI,08h
                LOOP            FILL_PT
Switching to long mode:
LONGYS:         PUSH            DBG32_5
                CALL            DEBUG
                PUSH            DBG32_6
                CALL            DEBUG   
                MOV             EAX,CR4                         ;Switch to LONG MODE
                OR              EAX,10100000b                   ;Setting PGE(bit7) and PAE(bit5)
                MOV             CR4,EAX
                PUSH            DBG32_7
                CALL            DEBUG
                MOV             ECX,0C0000080h                  ;Asking for EFER MSR (0xC0000080h)
                RDMSR                                           
                OR              EAX,100000000b                  ;Setting LM-bit (bit 8).
                WRMSR
               
                PUSH            DBG32_8
                CALL            DEBUG
                PUSH            DBG32_9
                CALL            DEBUG
                MOV             EAX,CR0
                OR              EAX,80000001h                   ;Setting Paging(bit31)
                MOV             CR0,EAX                         ;<---- CRASHES HERE
                LGDT            [GDT64]                         ;Loading GDT
                LIDT            [IDT64]                         ;Loading IDT
                JMP             08h:LONG_MODE                   ;Jumping to a 64 bit segment
                JMP             NO_64


It crashes on the highlighted instruction. I should add that I have exception handlers for both 32 and 64 bits, and they are tested; that the OS checks that the CPU is capable of everything it uses (long mode, PAE, PSE, extended CPUID, etc.); that I'm sure the problem is that instruction, because I put a hang before it and then after it; and that the PCs I'm testing on are theoretically capable of doing all of this. I really don't understand why this problem exists only on some HP laptops. I really hope that you guys can help me sort it out... any kind of suggestion is welcome.
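
For reference, the binary and hex literals in the listings above can be cross-checked against the architectural bit positions. The following is a minimal sketch in C rather than NASM, so that it can be compiled and run on a host; the macro and function names are hypothetical and only the bit positions come from the comments in the listings.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions named in the listings' comments. */
#define CR0_PE   (1u << 0)   /* protected mode enable (already set in 32-bit mode) */
#define CR0_PG   (1u << 31)  /* paging enable */
#define CR4_PSE  (1u << 4)   /* page size extension */
#define CR4_PAE  (1u << 5)   /* physical address extension */
#define CR4_PGE  (1u << 7)   /* page global enable */
#define EFER_MSR 0xC0000080u /* MSR number loaded into ECX before RDMSR */
#define EFER_LME (1u << 8)   /* long mode enable */

/* Hypothetical helpers mirroring the three OR instructions of the current code. */
uint32_t cr4_for_long_mode(uint32_t cr4)
{
    return cr4 | CR4_PAE | CR4_PGE;   /* OR EAX,10100000b */
}

uint32_t efer_low_for_long_mode(uint32_t eax)
{
    return eax | EFER_LME;            /* OR EAX,100000000b */
}

uint32_t cr0_with_paging(uint32_t cr0)
{
    return cr0 | CR0_PG | CR0_PE;     /* OR EAX,80000001h */
}
```

Under these definitions the original listing's 110000b is PAE plus PSE, and the current listing's 10100000b is PGE plus PAE, which agrees with the comments next to each instruction.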

_________________
You learn more by a single triple fault than by reading the whole Intel specification...


 Post subject: Re: Paging Problem on some HW: triple fault on enabling
PostPosted: Thu Apr 26, 2018 12:39 am 
Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
Your CLR_PAG loop is failing to advance EDI. Therefore you are not clearing 0x4000 bytes to 0, merely setting the byte at address 0x2000 to zero 0x4000 times. I would advise setting ECX to a quarter of its current value and replacing that loop with "REP STOSD" and be done with it.
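
The difference can be modelled on a host. This is a rough sketch in C, not the poster's code: the function names are hypothetical, the buffer stands in for the 0x4000 bytes at 0x2000, and the REP STOSD model assumes the direction flag is clear and flat segments are in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_BYTES 0x4000u  /* four 4 KiB pages: 2000h..5FFFh */

/* Model of the posted loop: EDI never changes, so only one byte is written. */
void clear_as_posted(uint8_t *edi, size_t ecx)
{
    while (ecx--)
        *edi = 0;                       /* MOV [EDI],AL / LOOP CLR_PAG */
}

/* Model of REP STOSD with EAX = 0: store 4 bytes, advance EDI by 4, ECX times. */
void clear_rep_stosd(uint8_t *edi, size_t ecx_dwords)
{
    const uint32_t eax = 0;
    while (ecx_dwords--) {
        memcpy(edi, &eax, sizeof eax);
        edi += sizeof eax;
    }
}

/* Count the bytes that are still non-zero after a clear. */
size_t count_nonzero(const uint8_t *p, size_t n)
{
    size_t left = 0;
    for (size_t i = 0; i < n; i++)
        left += (p[i] != 0);
    return left;
}
```

Filling the buffer with a non-zero pattern first makes the effect visible: the posted loop leaves all but one byte untouched, while the REP STOSD model with ECX = 0x1000 clears the whole area.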

I am not sure how that would cause a triple fault, though. Maybe a PF or a GPF if the processor was unhappy about something, OK, but you claim to catch these faults. And even if not, I would expect a DF exception, not a triple fault. Maybe IDTR is not set up correctly?

BTW, if you are in 32-bit mode already, why are you writing in assembly, and not in C?


 Post subject: Re: Paging Problem on some HW: triple fault on enabling
PostPosted: Thu Apr 26, 2018 1:26 am 
Joined: Sat Mar 31, 2012 3:07 am
Posts: 4591
Location: Chichester, UK
The same mistake is made in the FILL_IDT loop. So the page table will be completely invalid. As soon as paging is enabled there will be a (page?) fault which will try to run an exception handler at a completely random address. Bingo! - triple fault.

This sort of error would be obvious if the code was written in C.


 Post subject: Re: Paging Problem on some HW: triple fault on enabling
PostPosted: Thu Apr 26, 2018 2:15 am 
Joined: Sat Sep 05, 2015 2:10 am
Posts: 11
Location: Italy
OK, thank you. These were silly mistakes, I'll fix them. I'll answer you properly when I have a break.

UPDATE:
Thanks iansjack for replying, but in the FILL_PT loop there is an ADD EDI,08h, so I don't think that's the problem.
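
To sanity-check that layout, here is a small host-side model in C of the same four tables (PML4 at 0x2000, PDPT at 0x3000, PD at 0x4000, PT at 0x5000) with a software walk over them. It is only a sketch: the array, helper names, and masks are hypothetical stand-ins, not part of the bootloader. Note that the sketch writes full 64-bit entries, whereas the listing stores only the low dword of each 8-byte entry with MOV [EDI],EBX, so the high dword depends on the tables having been zeroed beforehand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLES_BASE 0x2000u  /* PML4 at 2000h, then PDPT, PD, PT */
#define PAGE        0x1000u
#define PTE_P       0x1u     /* present */
#define PTE_RW      0x2u     /* writable */
#define ADDR_MASK   0x000FFFFFFFFFF000ull

/* Hypothetical stand-in for physical memory 2000h..5FFFh. */
static uint64_t tables[4 * PAGE / sizeof(uint64_t)];

static uint64_t *table_at(uint64_t phys)
{
    return &tables[(phys - TABLES_BASE) / sizeof(uint64_t)];
}

/* Same layout as the listing: one PT identity mapping the first 2 MiB. */
void build_identity_2mib(void)
{
    memset(tables, 0, sizeof tables);                /* every 8-byte entry starts at zero */
    table_at(0x2000)[0] = 0x3000 | PTE_P | PTE_RW;   /* PML4[0] -> PDPT */
    table_at(0x3000)[0] = 0x4000 | PTE_P | PTE_RW;   /* PDPT[0] -> PD   */
    table_at(0x4000)[0] = 0x5000 | PTE_P | PTE_RW;   /* PD[0]   -> PT   */
    for (uint64_t i = 0; i < 512; i++)               /* one 8-byte PTE per 4 KiB page */
        table_at(0x5000)[i] = (i * PAGE) | PTE_P | PTE_RW;
}

/* Walk the four levels; returns 0 and sets *phys, or -1 if an entry is not present. */
int translate(uint64_t virt, uint64_t *phys)
{
    uint64_t table = TABLES_BASE;
    for (int shift = 39; shift >= 12; shift -= 9) {
        uint64_t entry = table_at(table)[(virt >> shift) & 0x1FF];
        if (!(entry & PTE_P))
            return -1;
        table = entry & ADDR_MASK;   /* next table, or the final page frame */
    }
    *phys = table | (virt & (PAGE - 1));
    return 0;
}
```

With these tables, any address below 0x200000 translates to itself and the first address past 2 MiB is reported as not present, which matches what a stride of 8 bytes per entry in FILL_PT should produce.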



Last edited by micccy on Thu Apr 26, 2018 7:04 am, edited 1 time in total.

 Post subject: Re: Paging Problem on some HW: triple fault on enabling
PostPosted: Thu Apr 26, 2018 2:50 am 
Joined: Sat Sep 05, 2015 2:10 am
Posts: 11
Location: Italy
OK, so: I'll fix the CLR_PAG loop. That was a silly mistake that I didn't notice, and it probably didn't cause me problems because in Bochs and on some machines that memory area might already be zero. Also, I don't access memory above 2 MB until some lines later, where I read the ACPI tables. The fact that I can handle the page fault that happens there, together with the IDT check I did by entering "info idt" in the Bochs debugger, makes me think that I set up the IDT properly for 64-bit long mode. I also tried to cause exceptions artificially (like reading memory at 0xFFFFFFFFFFFFFFFF or doing XOR EBX,EBX / DIV BL) and they are handled properly, except for exceptions that happen between the moment I set the LM bit and the moment I've jumped to a 64-bit code segment (maybe because I don't have compatibility mode exception handlers). It also seems that a failure while enabling paging locks up all the memory, causing subsequent page faults and then the triple fault, but I'm not sure, so I'll check it out.
And to answer your question, there is no particular reason why I'm writing everything in assembly. I just like assembly a lot, and I thought it would be fun to write everything in it (even if it surely isn't the best choice). Thank you, I'll do some tests.



 Post subject: Re: Paging Problem on some HW: triple fault on enabling
PostPosted: Thu Apr 26, 2018 10:20 am 
Joined: Sat Sep 05, 2015 2:10 am
Posts: 11
Location: Italy
nullplan wrote:
I am not sure how that would cause a triple fault, though. Maybe a PF or a GPF if the processor was unhappy about something, OK, but you claim to catch these faults. And even if not, I would expect a DF exception, not a triple fault. Maybe IDTR is not set up correctly?


FIXED:

I don't really understand why, but the processor on these machines doesn't like going into long mode before a proper long mode GDT and long mode IDT are set up. To fix everything it was enough to move the LGDT and LIDT instructions before (and not after) the MOV CR0,EAX that enables paging.
Code:
THIS WORKS WELL:

                PUSH            DBG32_8
                CALL            DEBUG
                PUSH            DBG32_9
                CALL            DEBUG

                LGDT            [GDT64]                         ;Loading 64bit GDT
                LIDT            [IDT64]                         ;Loading 64bit IDT
                MOV             EAX,CR0
                OR              EAX,80000001h                   ;Enabling Paging(bit31)
                MOV             CR0,EAX
               
                JMP             08h:LONG_MODE                   ;Jumping to a 64 bit code segment
                JMP             NO_64

AND THIS TRIPLE FAULTS:
 
                PUSH            DBG32_8
                CALL            DEBUG
                PUSH            DBG32_9
                CALL            DEBUG

                MOV             EAX,CR0
                OR              EAX,80000001h                   ;Enabling Paging(bit31)
                MOV             CR0,EAX

                LGDT            [GDT64]                         ;Loading 64bit GDT
                LIDT            [IDT64]                         ;Loading 64bit IDT
               
                JMP             08h:LONG_MODE                   ;Jumping to a 64 bit code segment
                JMP             NO_64



Also, I can now tell that no exceptions are fired in the first case. Can anybody explain why this is happening? It seems very dodgy and overprotective to build a processor that immediately crashes on a mode change if it doesn't have the proper GDT and IDT already loaded. Also, this seems to happen only with HP Pavilion laptops, but not on Bochs, other PCs, or even HP Pavilion desktops.



 Post subject: Re: Paging Problem on some HW: triple fault on enabling
PostPosted: Thu Apr 26, 2018 11:44 pm 
Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
Well, what is a poor processor to do? For some reason it is checking the validity of the GDT on switch to long mode. That does go against the architecture manual, but hey, this is Intel.

So, when it sees a broken GDT, what will it do? Cause a GPF. In long mode, while the IDTR still points to the 32-bit GPF handler. So that fails, and causes a DF. Again, while the IDTR points to the protected mode IDT, although the processor is in long mode. So that faults again, and there's your triple fault.
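
That chain can be written down as a tiny decision function. This is a deliberately simplified toy model in C with hypothetical names; it ignores the exact exception classes and only captures the escalation described above (first fault, then DF, then reset).

```c
#include <assert.h>
#include <stdbool.h>

enum outcome {
    HANDLED,               /* the first handler (e.g. GPF or PF) runs */
    DOUBLE_FAULT_HANDLED,  /* delivery failed, the DF handler (vector 8) runs */
    TRIPLE_FAULT_RESET     /* DF delivery failed too: shutdown, seen as a reset */
};

/*
 * Each flag says whether the CPU can actually deliver that vector through
 * the IDT that IDTR points to in the current mode. A protected mode IDT
 * read as long mode gates would make both flags false.
 */
enum outcome deliver_fault(bool first_gate_usable, bool df_gate_usable)
{
    if (first_gate_usable)
        return HANDLED;
    if (df_gate_usable)
        return DOUBLE_FAULT_HANDLED;
    return TRIPLE_FAULT_RESET;
}
```

In this model, deliver_fault(false, false) is the situation described in this post, and it is the only input that ends in a reset without any handler running.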

_________________
Carpe diem!

