OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Cyclical GP fault on iretq of timer interrupt in 64 bit OS
Posted: Sun Aug 12, 2018 3:39 am
I'm trying to write a 64-bit OS. It throws a general protection fault (GP) on the iretq from the timer interrupt handler, then repeatedly throws more GPs from the iretq of the GP handler.

I know this because my generic handler prints the ISR number on the serial port, and it goes 32, 13, 13, 13, ...

The error code for the first GP is 0x10, which is my data segment selector.
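
For reference, a GP error code that names a selector is laid out as follows in the Intel manuals: bit 0 is EXT, bit 1 is IDT, bit 2 is TI, and bits 3-15 are the descriptor index. A minimal Python sketch of that decoding is below. The function name is made up for illustration and is not part of the code discussed in this thread.

```python
def decode_selector_error_code(code):
    """Split an x86 selector-style exception error code into its fields."""
    return {
        "ext": bool(code & 0x1),        # bit 0: caused by an event external to the program
        "idt": bool(code & 0x2),        # bit 1: index refers to an IDT gate
        "ti": bool(code & 0x4),         # bit 2: 1 = LDT, 0 = GDT (only meaningful when idt is 0)
        "index": (code >> 3) & 0x1FFF,  # bits 3-15: index into the descriptor table
        "selector": code & 0xFFF8,      # the error code with the three flag bits cleared
    }


# The error code reported for the first GP in this thread.
fields = decode_selector_error_code(0x10)
print(fields)
```

With 0x10 the index field comes out as 2 and all three flag bits are clear, so the fault refers to GDT entry 2, i.e. selector 0x10.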

I'm debugging it in qemu, so I can see quite a bit. Here's the situation at the iretq from the timer handler:
Code:
    (gdb) disas isr_common,isr_head_2                                               
    Dump of assembler code from 0x8189 to 0x81c4:                                   
       0x0000000000008189 <isr_common+0>:   callq  0x8125 <sayN100>                 
       0x000000000000818e <isr_common+5>:   cmp    $0x20,%eax                       
       0x0000000000008191 <isr_common+8>:   jl     0x81a8 <isr_common.no_more_acks> 
       0x0000000000008193 <isr_common+10>:  cmp    $0x30,%eax                       
       0x0000000000008196 <isr_common+13>:  jge    0x81a8 <isr_common.no_more_acks> 
       0x0000000000008198 <isr_common+15>:  cmp    $0x28,%al                         
       0x000000000000819a <isr_common+17>:  jl     0x81a2 <isr_common.ack_master>   
       0x000000000000819c <isr_common+19>:  push   %rax                             
       0x000000000000819d <isr_common+20>:  mov    $0x20,%al                         
       0x000000000000819f <isr_common+22>:  out    %al,$0xa0                         
       0x00000000000081a1 <isr_common+24>:  pop    %rax                             
       0x00000000000081a2 <isr_common.ack_master+0>:        push   %rax             
       0x00000000000081a3 <isr_common.ack_master+1>:        mov    $0x20,%al         
       0x00000000000081a5 <isr_common.ack_master+3>:        out    %al,$0x20         
       0x00000000000081a7 <isr_common.ack_master+5>:        pop    %rax             
       0x00000000000081a8 <isr_common.no_more_acks+0>:      cmp    $0x24,%ax         
       0x00000000000081ac <isr_common.no_more_acks+4>:      pop    %rax             
       0x00000000000081ad <isr_common.no_more_acks+5>:      pop    %rax             
    => 0x00000000000081ae <isr_common.end+0>:       iretq                           
       0x00000000000081b0 <isr_head_0+0>:   pushq  $0x55    ;DUMMY ERROR CODE                         
       0x00000000000081b2 <isr_head_0+2>:   mov    $0x0,%eax                         
       0x00000000000081b7 <isr_head_0+7>:   push   %rax                             
       0x00000000000081b8 <isr_head_0+8>:   jmp    0x8189 <isr_common>               
       0x00000000000081ba <isr_head_1+0>:   pushq  $0x55    ;DUMMY ERROR CODE                                                 
       0x00000000000081bc <isr_head_1+2>:   mov    $0x1,%eax                         
       0x00000000000081c1 <isr_head_1+7>:   push   %rax                             
       0x00000000000081c2 <isr_head_1+8>:   jmp    0x8189 <isr_common>

The dump also shows a couple of the "isr_head" stubs that are entered in the IDT. Each one pushes a dummy error code where the CPU doesn't supply one, pushes its vector number, and jumps to isr_common.

The stack looks correct to me:
Code:
    (gdb) bt                                   
    #0  0x00000000000081ae in isr_common.end ()
    #1  0x0000000000008123 in LongMode.Nirv () 
    #2  0x0000000000000010 in ?? ()             
    #3  0x0000000000000216 in ?? ()             
    #4  0x0000000000015000 in Pd ()             
    #5  0x0000000000000010 in ?? ()             
    #6  0x000000b8e5894855 in ?? ()             
    #7  0x78bf00000332e800 in ?? ()             
    #8  0x000003e3e8000000 in ?? ()                       

where:
Code:
    0x0000000000008122 <LongMode.Nirv+0>:        hlt                           
    0x0000000000008123 <LongMode.Nirv+1>:        jmp    0x8122 <LongMode.Nirv>

For completeness, here are the registers and the top of the stack:
Code:
    (gdb) info registers                             
    rax            0x55     85                       
    rbx            0x80000011       2147483665       
    rcx            0xc0000080       3221225600       
    rdx            0x3f8    1016                     
    rsi            0xb      11                       
    rdi            0x3fc    1020                     
    rbp            0x0      0x0                       
    rsp            0x14fd8  0x14fd8 <Pd+36824>       
    r8             0x0      0                         
    r9             0x0      0                         
    r10            0x0      0                         
    r11            0x0      0                         
    r12            0x0      0                         
    r13            0x0      0                         
    r14            0x0      0                         
    r15            0x0      0                         
    rip            0x81ae   0x81ae <isr_common.end>   
    eflags         0x97     [ CF PF AF SF ]           
    cs             0x8      8                         
    ss             0x10     16                       
    ds             0x10     16                       
    es             0x10     16                       
    fs             0x10     16                       
    gs             0x10     16
   
    (gdb) x/32xg 0x14f00                                               
    0x14f00 <Pd+36608>:     0x0000000000841f0f      0x000000841f0f2e66
    0x14f10 <Pd+36624>:     0x00841f0f2e660000      0x1f0f2e6600000000
    0x14f20 <Pd+36640>:     0x2e66000000000084      0x0000000000841f0f
    0x14f30 <Pd+36656>:     0x000000841f0f2e66      0x00841f0f2e660000
    0x14f40 <Pd+36672>:     0x1f0f2e6600000000      0x2e66000000000084
    0x14f50 <Pd+36688>:     0x0000000000841f0f      0x000000841f0f2e66
    0x14f60 <Pd+36704>:     0x00841f0f2e660000      0x1f0f2e6600000000
    0x14f70 <Pd+36720>:     0x2e66000000000084      0x0000000000841f0f
    0x14f80 <Pd+36736>:     0x000000841f0f2e66      0x00841f0f2e660000
    0x14f90 <Pd+36752>:     0x1f0f2e6600000000      0x2e66000000000084
    0x14fa0 <Pd+36768>:     0x0000000000000020      0x0000000000008144
    0x14fb0 <Pd+36784>:     0x0000000080000011      0x0000000000000020
    0x14fc0 <Pd+36800>:     0x0000000000000020      0x0000000000000020
    0x14fd0 <Pd+36816>:     0x0000000000000055      0x0000000000008123
    0x14fe0 <Pd+36832>:     0x0000000000000010      0x0000000000000216
    0x14ff0 <Pd+36848>:     0x0000000000015000      0x0000000000000010


Now I'll let it run to the GP handler head:
Code:
    (gdb) break isr_head_13                                 
    Breakpoint 3 at 0x8236                                   
    (gdb) c                                                 
    Continuing.                                             
                                                             
    Breakpoint 3, 0x0000000000008236 in isr_head_13 ()       
    (gdb) bt                                                 
    #0  0x0000000000008236 in isr_head_13 ()                 
    #1  0x0000000000000010 in ?? ()                         
    #2  0x00000000000081ae in isr_common.no_more_acks ()     
    #3  0x0000000000000008 in ?? ()                         
    #4  0x0000000000000097 in ?? ()                         
    #5  0x0000000000014fd8 in Pd ()                         
    #6  0x0000000000000010 in ?? ()                         
    #7  0x0000000000000055 in ?? ()                         
    #8  0x0000000000008123 in LongMode.Nirv ()               
    #9  0x0000000000000010 in ?? ()                         
    #10 0x0000000000000216 in ?? ()                         
    #11 0x0000000000015000 in Pd ()                         
    #12 0x0000000000000010 in ?? ()

After the usual frame (SS, RSP, RFLAGS, CS and the return address), the CPU pushed the error code 0x10. The interesting thing is that my dummy error code from the timer (0x55) is back from the dead.
That slot had already been popped before the first iretq, and I didn't push it this time:
Code:
    (gdb) disas isr_head_13                                   
    Dump of assembler code for function isr_head_13:         
    => 0x0000000000008236 <+0>:     mov    $0xd,%eax         
       0x000000000000823b <+5>:     push   %rax               
       0x000000000000823c <+6>:     jmpq   0x8189 <isr_common>

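I guess that's just the 16-byte stack alignment the CPU performs on interrupt entry, which I don't control. The stack was 16-byte aligned before the timer went off, but the CPU pushed an odd number of quadwords (five), so rsp was no longer aligned at the iretq and the CPU skipped over the stale 0x55 slot when it took the GP.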
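
The alignment arithmetic can be checked with a short Python sketch. It assumes delivery at the same privilege level with no IST stack switch, in which case the CPU in 64-bit mode aligns RSP down to 16 bytes and then pushes SS, RSP, RFLAGS, CS, RIP and, for some vectors, an error code. The function names are made up for illustration; the addresses are the ones from the dumps above.

```python
QWORD = 8


def long_mode_exception_frame(old_rsp, has_error_code):
    """Return (new_rsp, layout) for an exception taken without a stack switch.

    layout maps each pushed item to the address it is stored at.
    """
    aligned = old_rsp & ~0xF  # the CPU aligns RSP down to 16 bytes first
    names = ["SS", "RSP", "RFLAGS", "CS", "RIP"]
    if has_error_code:
        names.append("error code")
    layout = {}
    addr = aligned
    for name in names:
        addr -= QWORD
        layout[name] = addr
    return addr, layout


def skipped_slots(old_rsp):
    """Addresses left untouched between the aligned RSP and the old RSP."""
    return list(range(old_rsp & ~0xF, old_rsp, QWORD))


# Timer IRQ: rsp was 0x15000 and no error code is pushed.
timer_rsp, _ = long_mode_exception_frame(0x15000, has_error_code=False)

# GP at the iretq: rsp was 0x14fd8 and an error code is pushed.
gp_rsp, gp_layout = long_mode_exception_frame(0x14FD8, has_error_code=True)

print(hex(timer_rsp), hex(gp_rsp), [hex(a) for a in skipped_slots(0x14FD8)])
```

Under those assumptions the GP frame starts below 0x14fd0, and the single skipped slot is 0x14fd0, which is where the memory dump shows the old 0x55.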

So why would it crash? The Intel docs say that GP with a selector means it tried to pop something out of range, but I see no such problem.

Any help much appreciated.


 Post subject: Re: Cyclical GP fault on iretq of timer interrupt in 64 bit
Posted: Sun Aug 12, 2018 5:51 am
Hi,

adrianmay wrote:
So why would it crash? The Intel docs say that GP with a selector means it tried to pop something out of range, but I see no such problem.


It's very likely that the stack is messed up when you IRETQ (e.g. you forgot to POP something), causing the CPU to complain because (e.g.) the value for CS or SS it's trying to load from the stack isn't where the CPU thinks it should be.

If that's the problem, then it's extremely unlikely that the compiler generated wrong code, which means the problem is most likely in your assembly stubs (and in the common interrupt handler, if that's also in assembly).

Would you mind posting the original assembly source code for the stubs (and the common interrupt handler, if that's also in assembly), so we can see the whole thing (and not just fragments excluding "not taken" branches)?

Note that this looks wrong:

Code:
    (gdb) disas isr_head_13                                   
    Dump of assembler code for function isr_head_13:         
    => 0x0000000000008236 <+0>:     mov    $0xd,%eax         
       0x000000000000823b <+5>:     push   %rax               
       0x000000000000823c <+6>:     jmpq   0x8189 <isr_common>

...because it modifies RAX before pushing it, which trashes the original RAX value from the interrupted code. That can't cause a GPF by itself, though.


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 Post subject: Re: Cyclical GP fault on iretq of timer interrupt in 64 bit
Posted: Sun Aug 12, 2018 6:10 am
adrianmay wrote:
The stack looks correct to me:
Code:
    #2  0x0000000000000010 in ?? ()             

That looks an awful lot like your data selector was in CS when the IRQ occurred, which might explain the initial GPF.

Try running your OS in Bochs. Bochs logs a lot of detail by default, including which protection check is causing each GPF. It's often enough to pinpoint the issue, but if not, you can post the log here along with a link to your code.
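
The observation about CS can be checked against the dump in the first post by reading the five quadwords at rsp as an iretq frame. Below is a minimal Python sketch. The names IretqFrame and decode_iretq_frame are made up for illustration; the values are copied from the x/32xg and "info registers" output.

```python
from collections import namedtuple

# Order in which iretq pops its frame in 64-bit mode, lowest address first.
IretqFrame = namedtuple("IretqFrame", "rip cs rflags rsp ss")


def decode_iretq_frame(qwords):
    """Interpret five consecutive stack quadwords as an iretq frame."""
    return IretqFrame(*qwords[:5])


# Quadwords at 0x14fd8 .. 0x14ff8, taken from the x/32xg dump.
stack_at_iretq = [0x8123, 0x10, 0x216, 0x15000, 0x10]
frame = decode_iretq_frame(stack_at_iretq)

HANDLER_CS = 0x08  # cs reported by "info registers" inside the handler

print(f"return RIP = {frame.rip:#x}")
print(f"return CS  = {frame.cs:#x} (handler runs with CS = {HANDLER_CS:#x})")
print(f"return SS  = {frame.ss:#x}")
```

The saved CS slot holds 0x10, the same value as SS, while the handler itself runs with CS = 0x08.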


 Post subject: Re: Cyclical GP fault on iretq of timer interrupt in 64 bit
Posted: Mon Aug 13, 2018 5:35 am
Indeed, it was because I had the data segment selector on the stack where the code segment selector should have been. When I put 8: in front of an earlier jump, making it a far jump that loads CS, the problem went away.
Thanks everybody!
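
Why iretq rejects a data selector in the CS slot can be illustrated with a simplified Python sketch of the descriptor-type check. The GDT contents below are hypothetical (the thread never shows the actual GDT), and the check ignores privilege levels, the TI bit and the L/D bits, so treat it as an approximation of what the CPU does rather than a full model.

```python
def is_code_descriptor(descriptor):
    """True if a 64-bit GDT entry describes a present code segment."""
    access = (descriptor >> 40) & 0xFF  # access byte: P, DPL, S, type
    present = bool(access & 0x80)       # P bit
    non_system = bool(access & 0x10)    # S bit: 1 = code or data segment
    executable = bool(access & 0x08)    # type bit 3: 1 = code, 0 = data
    return present and non_system and executable


# Hypothetical GDT, NOT taken from the thread: null, 64-bit code, data.
EXAMPLE_GDT = [
    0x0000000000000000,  # selector 0x00: null descriptor
    0x00209A0000000000,  # selector 0x08: access 0x9A (present, code), L bit set
    0x0000920000000000,  # selector 0x10: access 0x92 (present, data, writable)
]


def iretq_would_accept_cs(selector, gdt):
    """Simplified check: the return CS must name a present code segment."""
    index = selector >> 3  # assumes TI = 0, i.e. a GDT selector
    return 0 < index < len(gdt) and is_code_descriptor(gdt[index])


print(iretq_would_accept_cs(0x08, EXAMPLE_GDT))
print(iretq_would_accept_cs(0x10, EXAMPLE_GDT))
```

With a layout like this one, selector 0x08 passes the check and selector 0x10 does not, which is consistent with the GP error code naming 0x10.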

