OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: [Solved] Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 4:30 am 
Joined: Sat Jun 04, 2016 8:52 am
Posts: 4
Hi!

I'm writing a kernel for amd64 long mode. So far I'm only doing the basic bootstrapping (interrupts, APIC, page tables, etc.).

I've started to see a problem though - I sometimes get strange page faults and trigger "impossible" assertions in my logic, maybe every 5th time I run through my kernel code (I'm using Bochs). Most of the time, everything seems to run exactly as I'd expect. I think I've isolated it down to the following:
- I only get the errors when compiling with -O0, never with -O2.
- I only get the errors when I'm running with interrupts enabled, never with interrupts disabled.

So my main suspicion is that I have some kind of memory/register corruption going on that depends on the timing of my interrupt handling relative to the rest of my code.

The thing is, no matter how much I've stared at my interrupt code or compared it to other sources online, I can't find anything that seems wrong. So that's why I'm hoping someone here has more wisdom and can tell if I'm doing something that would cause issues. :?
Oh, and the only interrupt that seems to be triggered (in both the normal case and when I get my assertions) is INT32, which would be the IRQ 0 timer interrupt, so no strange stuff going on there.

Attaching the relevant parts of my isr setup, in asm and C++.


Attachments:
int.cpp [779 Bytes]
intentries.S [2.09 KiB]


Last edited by cadaker on Wed Nov 07, 2018 12:24 pm, edited 1 time in total.
 Post subject: Re: Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 5:45 am 
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5103
The System V ABI for x64 includes a red zone at the top of the stack. Are your interrupt handlers clobbering it?
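For context, a minimal sketch of why the red zone matters here. The System V x86-64 ABI lets a leaf function use the 128 bytes below RSP without moving RSP, while an interrupt that arrives without a stack switch makes the CPU push its frame at the 16-byte-aligned RSP and below, which is the same memory. The model below only does the address arithmetic; all names are hypothetical, and it ignores exceptions that additionally push an error code.

```cpp
#include <cstdint>

// System V x86-64 ABI: the 128 bytes below RSP belong to the current function.
constexpr std::uint64_t kRedZoneSize = 128;

// In 64-bit mode the CPU pushes SS, RSP, RFLAGS, CS and RIP (8 bytes each).
constexpr std::uint64_t kInterruptFrameSize = 5 * 8;

// The CPU aligns RSP down to 16 bytes before pushing the interrupt frame.
constexpr std::uint64_t interrupt_frame_top(std::uint64_t rsp) {
    return rsp & ~std::uint64_t{0xF};
}

// Lowest address the CPU writes when no stack switch takes place.
constexpr std::uint64_t interrupt_frame_bottom(std::uint64_t rsp) {
    return interrupt_frame_top(rsp) - kInterruptFrameSize;
}

// Number of red-zone bytes [rsp - 128, rsp) overwritten by the pushed frame.
constexpr std::uint64_t red_zone_bytes_clobbered(std::uint64_t rsp) {
    const std::uint64_t zone_bottom = rsp - kRedZoneSize;
    const std::uint64_t top = interrupt_frame_top(rsp);
    const std::uint64_t bottom = interrupt_frame_bottom(rsp);
    const std::uint64_t lo = bottom > zone_bottom ? bottom : zone_bottom;
    return top > lo ? top - lo : 0;
}
```

With rsp = 0x1008 the frame lands in [0xFD8, 0x1000), entirely inside the red zone [0xF88, 0x1008). This would fit the -O0 symptom if the unoptimized code keeps locals of leaf functions in that area, though that is an assumption about the attached code, not something the thread confirms.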


 Post subject: Re: Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 6:17 am 
Joined: Sat Jun 04, 2016 8:52 am
Posts: 4
Oh, I hadn't considered that. That's a good suggestion to look into, thanks!


 Post subject: Re: Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 6:35 am 
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5103
In case it helps, the usual solutions are to either disable the red zone (don't forget libgcc if you're using GCC) or have interrupts switch to a new stack.


 Post subject: Re: Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 8:07 am 
Member

Joined: Mon Sep 03, 2018 2:25 am
Posts: 66
You could switch to a new stack on an interrupt using an interrupt stack table; https://os.phil-opp.com/double-fault-exceptions/ is a tutorial that implements it. I know it is in Rust, but it shouldn't be too hard to convert it to C/C++.


 Post subject: Re: Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 12:23 pm 
Joined: Sat Jun 04, 2016 8:52 am
Posts: 4
Not getting any issues anymore after rebuilding with -mno-red-zone, so this was very probably the issue.

Also realized that I'm not linking in libgcc, so I should get some proper interrupt stacks in soon, I guess, and fix this properly.

Thanks a lot everyone!


 Post subject: Re: [Solved] Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 1:46 pm 
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
Ugh, ISTs... I dislike them. You can register up to 7 interrupt stacks (the IST field in an IDT entry is 3 bits wide, and 0 has a special meaning), but you are going to want to handle dozens if not hundreds of interrupts. Not to speak of exceptions. So you can't use one stack for each interrupt. So what then? You also can't nest two interrupts that use the same IST, as the second one would clobber the first one's stack. Whereas good old stack switching just adds the new frame to the stack.
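To make the 3-bit field concrete, here is a hedged sketch of a 64-bit IDT gate descriptor. The field layout follows the architecture manuals; the struct and function names are hypothetical, and 0x8E (present, DPL 0, 64-bit interrupt gate) is just one common choice of attributes.

```cpp
#include <cstdint>

// One 16-byte entry of the long-mode IDT. All fields are naturally aligned,
// so no packing attribute is needed.
struct IdtGate {
    std::uint16_t offset_low;   // handler address bits 0..15
    std::uint16_t selector;     // kernel code segment selector
    std::uint8_t  ist;          // bits 0..2: IST index, upper bits must be 0
    std::uint8_t  type_attr;    // present bit, DPL, gate type
    std::uint16_t offset_mid;   // handler address bits 16..31
    std::uint32_t offset_high;  // handler address bits 32..63
    std::uint32_t reserved;     // must be 0
};
static_assert(sizeof(IdtGate) == 16, "long-mode IDT gates are 16 bytes");

// ist_index 0 means "no IST": the CPU keeps the current stack, or loads RSP0
// from the TSS on a privilege change. Values 1..7 select IST1..IST7.
constexpr IdtGate make_interrupt_gate(std::uint64_t handler,
                                      std::uint16_t selector,
                                      std::uint8_t ist_index) {
    return IdtGate{
        static_cast<std::uint16_t>(handler & 0xFFFF),
        selector,
        static_cast<std::uint8_t>(ist_index & 0x7),  // only 3 bits exist
        0x8E,  // present, DPL 0, 64-bit interrupt gate
        static_cast<std::uint16_t>((handler >> 16) & 0xFFFF),
        static_cast<std::uint32_t>(handler >> 32),
        0};
}
```

Every gate that names the same index shares one stack top, which is where the nesting problem described above comes from.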

I would seriously suggest rebuilding libgcc with special kernel options (-mno-red-zone because interrupts can always occur in kernel mode, -mcmodel=kernel in order to use the correct relocations for code that will run at -2GB, -msoft-float to prevent the use of floating point registers or SSE in kernel mode) and applying those to your kernel as well. This way, there is no need for ISTs. Well, almost no need. Some exceptions can happen at any time.

Now, I have decided to ditch ISTs entirely, and therefore also need to ditch the "syscall" instruction, as that one does not switch stacks. But if you choose to use syscall, you will have times when you are at CPL0 with an invalid RSP value. And for those you might want to consider ISTs. But then you have to be careful not to nest anything. Which can be a challenge, for NMIs for instance, as those are triggered externally, beyond your control. The CPU doesn't recognize further NMIs until the next "iret", but for one, this could be the iret from an exception handler, from an exception caused while handling the NMI, and for two, this could be an iret from the firmware executed in system management mode. Which you couldn't see, because SMM is invisible to you.
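On the "-2GB" remark for -mcmodel=kernel: that code model assumes the kernel is linked into the top 2 GiB of the 64-bit address space, so every symbol address equals a sign-extended negative 32-bit value. A small sketch of that range check, with hypothetical names:

```cpp
#include <cstdint>

// Lowest address of the top 2 GiB of the 64-bit address space, i.e. -2 GiB
// when the address is read as a signed 64-bit integer.
constexpr std::uint64_t kKernelModelBase = 0xFFFFFFFF80000000ULL;

// True if addr lies in the range the kernel code model expects symbols in.
constexpr bool in_kernel_code_model_range(std::uint64_t addr) {
    return addr >= kKernelModelBase;
}
```

A kernel linked at, say, 0xFFFFFFFF80100000 passes this check, while one linked at the start of the canonical higher half (0xFFFF800000000000) does not and would need a different code model.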

Another canonical example is double fault exceptions. But why though? All exceptions can happen at any time, anyway.

_________________
Carpe diem!


 Post subject: Re: [Solved] Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 3:13 pm 
Joined: Sat Jun 04, 2016 8:52 am
Posts: 4
Hmm, yeah, reading up on this, I guess you're right. I mean, I'd still need the IST setup to be interrupt safe w.r.t. red-zones in userspace, but there really doesn't seem to be any good way to be interrupt-safe in this way inside the kernel. How... disappointing.


 Post subject: Re: [Solved] Interrupt handler bug?
PostPosted: Wed Nov 07, 2018 11:33 pm 
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
No, you don't need them for that. Userspace typically runs at CPL3, whereas any interrupts run at CPL0. So the stack is already switched to the CPL0 stack if a privilege change occurs. You only need to set up the TSS. The red zone only becomes important for signal handling.
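As a rough illustration of "set up the TSS": on an interrupt from CPL3 to CPL0 the CPU loads RSP from the RSP0 field of the 64-bit TSS, so the user-mode red zone is never touched. The sketch below assumes GCC or Clang (for the packed attribute); the struct and function names are hypothetical, and loading the TSS (GDT descriptor plus ltr) is left out.

```cpp
#include <cstdint>

// Layout of the 64-bit Task State Segment (104 bytes). The 64-bit fields sit
// at offsets that are not multiples of 8, hence the packed attribute.
struct __attribute__((packed)) Tss64 {
    std::uint32_t reserved0;
    std::uint64_t rsp0;        // stack loaded on a switch to CPL0
    std::uint64_t rsp1;
    std::uint64_t rsp2;
    std::uint64_t reserved1;
    std::uint64_t ist[7];      // IST1..IST7, unused in this sketch
    std::uint64_t reserved2;
    std::uint16_t reserved3;
    std::uint16_t iomap_base;  // offset of the I/O permission bitmap
};
static_assert(sizeof(Tss64) == 104, "64-bit TSS is 104 bytes");

// Builds a TSS whose only meaningful entry is the CPL0 stack top.
constexpr Tss64 make_tss(std::uint64_t kernel_stack_top) {
    Tss64 tss{};
    tss.rsp0 = kernel_stack_top;
    // Pointing past the end of the TSS means "no I/O permission bitmap".
    tss.iomap_base = sizeof(Tss64);
    return tss;
}
```

With only rsp0 filled in and all ist entries left at zero, interrupts from userspace get a clean kernel stack, while interrupts taken in kernel mode keep using the current stack, which is why the kernel itself still has to be built without a red zone.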

_________________
Carpe diem!

