OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Sat Mar 20, 2010 7:46 am

Joined: Sat Oct 17, 2009 4:32 am
Posts: 21
It cost me six days to debug and fix this, but it was really worth the effort.

It's interesting that no OSDev resource has discussed this important topic before. On further inspection, a large share of hobby x86-64 kernels seem to be at risk of having their leaf functions' stack frames silently overwritten when an interrupt fires in the right place.

Now to the story: somehow the monotonic PIT interrupts, triggered every millisecond, badly corrupted my kernel state. At first I thought the handler code might have corrupted the kernel stack, but reducing it to merely acknowledging the local APIC:
Code:
      push   %rax
      movq   $(VIRTUAL(APIC_PHBASE) + APIC_EOI), %rax
      movl   $0,(%rax)
      pop    %rax
      iretq
led to the same buggy behaviour.
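
For readers who keep this part of the handler in C, here is a minimal sketch of the same acknowledgement. The function name and the apic_base parameter are made up for illustration (the caller has to supply wherever the local APIC register page is mapped); only the 0xB0 offset of the EOI register is architectural.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural offset of the End-Of-Interrupt register inside the
 * local APIC's memory-mapped register page. */
#define APIC_EOI_OFFSET 0xB0u

static_assert(APIC_EOI_OFFSET % sizeof(uint32_t) == 0,
              "APIC registers are accessed as aligned 32-bit words");

/* Acknowledge the interrupt currently in service.
 * apic_base is hypothetical: it must point at the virtual address
 * where the kernel has mapped the local APIC registers. */
static inline void apic_send_eoi(volatile uint32_t *apic_base)
{
    /* Writing 0 to the EOI register signals end-of-interrupt. */
    apic_base[APIC_EOI_OFFSET / sizeof(uint32_t)] = 0;
}
```

Note that this does nothing about the bug itself: the CPU has already written to the interrupted stack before the first instruction of any handler runs.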

It was weird. Once I enabled interrupts and programmed the PIT to fire at a high rate, things went insane: random failed assert()s and page-fault exceptions were triggered all over the place. I even minimized the handler further by ditching the IOAPIC and using the PIC in Automatic EOI mode. This led to the architecturally minimal x86 IRQ handler:
Code:
      iretq
but nothing really changed: the same ugly symptoms prevailed.

After days and days of disassembly and hex dumps, I found that GCC generated this assembly for memcpy() at -O0:
Code:
      ffffffff80109c88:       55                      push   %rbp
      ffffffff80109c89:       48 89 e5                mov    %rsp,%rbp
      [snip]

      /* Bochs magic breakpoint */
      ffffffff80109caa:       66 87 db                xchg   %bx,%bx

      /* Our manual software interrupt */
      ffffffff80109cad:       cd f0                   int    $0xf0

      /* Failing code, especially the last line */
      ffffffff80109caf:       48 8b 45 f8             mov    -0x8(%rbp),%rax
      ffffffff80109cb3:       0f b6 10                movzbl (%rax),%edx
      ffffffff80109cb6:       48 8b 45 f0             mov    -0x10(%rbp),%rax
      ffffffff80109cba:       88 10                   mov    %dl,(%rax)
Notice something? Look again, especially at how the instructions access the stack. Yes, GCC kept parts of the leaf function's local state below the stack pointer. When the interrupt was soft-triggered, the CPU rightfully pushed the interrupted context (%ss, %rsp, %rflags, %cs, %rip) onto the stack, which meant that parts of the kernel state got corrupted through this implicit CPU stack usage!
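
For reference, a sketch in C of what the CPU pushes in 64-bit mode when an interrupt is delivered without a stack switch. The struct and function names are made up for illustration. The layout itself (five quadwords, %rip at the lowest address, stack pointer aligned down to 16 bytes before the push) is the architectural behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame pushed by the CPU on interrupt delivery in 64-bit mode, listed
 * from the lowest address (the handler's initial %rsp) upward.
 * Exceptions that carry an error code push one extra quadword below. */
struct interrupt_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

static_assert(sizeof(struct interrupt_frame) == 40, "five 8-byte slots");
static_assert(offsetof(struct interrupt_frame, rip) == 0, "rip is lowest");
static_assert(offsetof(struct interrupt_frame, ss) == 32, "ss is highest");

/* Lowest address written when the frame lands on the interrupted stack
 * (same privilege level, no IST): %rsp is first aligned down to a
 * 16-byte boundary, then the 40-byte frame is pushed below it. */
static inline uint64_t frame_base(uint64_t interrupted_rsp)
{
    return (interrupted_rsp & ~UINT64_C(0xF)) - sizeof(struct interrupt_frame);
}
```

If %rbp in the memcpy() listing was 16-byte aligned, the frame would cover %rbp-40 through %rbp-1, so -0x8(%rbp) would end up holding the saved %ss and -0x10(%rbp) the saved %rsp. That would fit the loads that fail right after the int $0xf0.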

Now, the generated code is certainly not interrupt-safe. Scanning the AMD64 ABI document for any paragraphs that mention the stack revealed the reason: it's the red zone. The red zone is a 128-byte area just below the stack pointer that the x86-64 ABI reserves for the current function and declares off-limits to signal and interrupt handlers. Leaf functions may keep their entire stack frame there. Other functions may use it for temporary data too, but anything that has to survive a call must be 'reserved' beforehand by moving the stack pointer further down. The ABI promise holds in user space; in ring 0 the CPU pushes onto the current stack regardless.
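
A toy model in C may make the collision easier to see. This is only a simulation under stated assumptions: the "stack" is a byte array, sp stands in for %rsp, and every name and every value in the pushed frame is a placeholder invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { RED_ZONE_SIZE = 128, FRAME_SIZE = 5 * 8 };

/* Toy downward-growing stack: sp is a byte offset into mem and plays
 * the role of %rsp. */
struct toy_stack {
    uint8_t mem[256];
    size_t  sp;
};

/* A leaf function keeping a local in the red zone: it stores below sp
 * without moving sp, like "mov %rax,-0x8(%rbp)" with %rbp == %rsp. */
static void leaf_store_local(struct toy_stack *s, uint64_t value)
{
    assert(s->sp >= (size_t)RED_ZONE_SIZE && s->sp <= sizeof s->mem);
    memcpy(&s->mem[s->sp - 8], &value, sizeof value);
}

static uint64_t leaf_load_local(const struct toy_stack *s)
{
    uint64_t value;
    memcpy(&value, &s->mem[s->sp - 8], sizeof value);
    return value;
}

/* Same-privilege interrupt followed by iretq: the CPU aligns the stack
 * pointer down to 16 bytes and pushes five quadwords below it. iretq
 * restores sp afterwards, but the overwritten bytes stay overwritten. */
static void interrupt_and_return(struct toy_stack *s)
{
    /* Placeholder frame contents: rip, cs, rflags, rsp, ss. */
    uint64_t frame[5] = { 0xffffffff80109cafull, 0x08, 0x202,
                          (uint64_t)s->sp, 0x10 };
    size_t base = (s->sp & ~(size_t)0xF) - FRAME_SIZE;

    memcpy(&s->mem[base], frame, sizeof frame);
}
```

Storing a local, calling interrupt_and_return(), and loading the local again returns the placeholder %ss value instead of what was stored, while sp itself is unchanged. That is the "silent" part of the corruption.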

All that was needed to fix the bug, like a magic pill, was instructing GCC not to use this x86-interrupts-unsafe zone:
Code:
      -mno-red-zone
Note that the bug was much easier to trigger at -O0 due to -O0's heavy stack usage. At -O2 and -O3, the bug was triggered far less frequently.
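
To see the effect of the flag for yourself, a small leaf function along the lines of the memcpy() above can be compiled both ways and disassembled. The function name and the file name leaf.c are placeholders; the gcc and objdump options in the comment are the standard ones.

```c
#include <stddef.h>

/* Compare the disassembly of copy_bytes with and without the flag:
 *
 *   gcc -O0 -c leaf.c               && objdump -d leaf.o
 *   gcc -O0 -mno-red-zone -c leaf.c && objdump -d leaf.o
 *
 * It is a leaf function (it calls nothing), so with the red zone
 * enabled GCC is free to address its locals below %rsp without
 * adjusting %rsp first. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}
```

With the red zone enabled, the -O0 output typically has no sub from %rsp in the prologue; with -mno-red-zone you should see %rsp moved down before the locals are stored, so an interrupt frame lands below them.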

Everything became sane afterwards: the heavy test cases now work well while the PIT is firing rapidly, at all optimization levels. I would really like to thank Brendan for advising me to investigate the issue further with the Bochs single-stepping debugger when I was stuck :mrgreen:

_________________
For latest news, please check my homepage and my blog.
—Darwish


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Sat Mar 20, 2010 9:20 am
Member

Joined: Sun Jan 11, 2009 7:41 pm
Posts: 89
Very good article. Maybe put it in the wiki?
I was lucky. After reading the ABI, I added the -mno-red-zone flag to g++. You know why? I just followed Linux, and didn't really know the difference between "red zone" and "no red zone" :mrgreen:


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Sat Mar 20, 2010 10:43 am
Member

Joined: Tue Dec 15, 2009 6:36 pm
Posts: 44
Is this kernel or user code? If user code, are you not using a separate stack for ring 0?


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Sat Mar 20, 2010 12:09 pm

Joined: Sat Oct 17, 2009 4:32 am
Posts: 21
nedbrek wrote:
Is this kernel or user code? If user code, are you not using a separate stack for ring 0?

That's purely kernel context; I don't have user-space support yet.

 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Sat Mar 20, 2010 12:23 pm

Joined: Sat Oct 17, 2009 4:32 am
Posts: 21
torshie wrote:
Very good article. Maybe put it in wiki?

Added here, with extra flags needed to disable emitting SSE ops.

 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Sun Mar 21, 2010 12:18 am
Member

Joined: Sun Jan 11, 2009 7:41 pm
Posts: 89
Darwish wrote:
torshie wrote:
Very good article. Maybe put it in wiki?

Added here, with extra flags needed to disable emitting SSE ops.

You can enable SSE ops by setting the OSFXSR bit (bit 9) of CR4. Of course, you then need to save the SSE registers before entering an interrupt handler. :mrgreen:
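
A sketch of the control-register values involved, in C. The helper names are invented for the example, and the helpers only compute the new values: actually loading them needs mov to %cr0 and %cr4 from ring 0, which is left out here. The bit positions are the architectural ones.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_MP          (UINT64_C(1) << 1)   /* monitor coprocessor */
#define CR0_EM          (UINT64_C(1) << 2)   /* x87 emulation, must be clear for SSE */
#define CR4_OSFXSR      (UINT64_C(1) << 9)   /* OS supports FXSAVE/FXRSTOR */
#define CR4_OSXMMEXCPT  (UINT64_C(1) << 10)  /* OS handles SIMD FP exceptions */

static_assert(CR4_OSFXSR == 0x200, "OSFXSR is bit 9 of CR4");

/* New CR0 value with emulation off and coprocessor monitoring on;
 * all other bits are preserved. */
static inline uint64_t cr0_for_sse(uint64_t cr0)
{
    return (cr0 & ~CR0_EM) | CR0_MP;
}

/* New CR4 value with SSE state management and SIMD exceptions enabled;
 * all other bits are preserved. */
static inline uint64_t cr4_for_sse(uint64_t cr4)
{
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}
```

Enabling SSE this way is separate from the red zone problem: -mno-red-zone is still needed for kernel code either way.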


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Sun Mar 21, 2010 12:42 am
Member

Joined: Fri May 16, 2008 7:13 pm
Posts: 301
Location: Hanoi, Vietnam
Great post! This is the first time I've heard about the 'red zone'.

_________________
"Programmers are tools for converting caffeine into code."


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Mon Dec 16, 2013 6:57 pm
Member

Joined: Mon Dec 16, 2013 6:50 pm
Posts: 27
Arrrrrrrggh!!! This has driven me insane for the past two weeks.

Yes, I know that I am resurrecting an old thread, but this one needs to be pinned for everyone to see. That red-zone thing drove me insane, and I only found out about it when I disassembled a leaf function. To my surprise there was no sub xxx,%rsp at the beginning of the function. I wasn't sure what terms to type into Google, but I finally found the answer (because Google did return quite a lot of results for "why gcc doesn't subtract anything from rsp to create stack frame"; we're not in 1997 anymore).


But let me ask this: how can interrupts guarantee that this 128-byte red zone will be safe? The CPU will obviously clobber the first 8 bytes when saving RIP before jumping to the handler, right?


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Mon Dec 16, 2013 7:20 pm
Member

Joined: Tue Dec 25, 2007 6:03 am
Posts: 734
Location: Perth, Western Australia
It's standard to disable the red zone in kernel code, either that or you make sure to provide the red zone before calling the C code.

_________________
Kernel Development, It's the brain surgery of programming.
Acess2 OS (c) | Tifflin OS (rust) | mrustc - Rust compiler
Currently Working on: mrustc


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Mon Dec 16, 2013 8:31 pm
Member

Joined: Mon Dec 16, 2013 6:50 pm
Posts: 27
thepowersgang wrote:
either that or you make sure to provide the red zone before calling the C code.


I'm not exactly sure what you mean. How would you do that? By changing the value of RSP before calling the C code?
Even then, what happens if an interrupt is generated once you are in the C code? I don't think it would be any less dangerous, am I right?

The only way I can think of right now is running the handlers at a different ring level than the rest of the code, so that the handlers would have their own stack. But since the handlers usually run in ring 0, that would mean there is absolutely no other option than to disable the red zone with the compiler flag for the kernel code. Unless there is a way to tell the CPU to subtract 128 from RSP when delivering an interrupt.


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Mon Dec 16, 2013 9:07 pm
Member

Joined: Tue Dec 25, 2007 6:03 am
Posts: 734
Location: Perth, Western Australia
Sorry, actually, I'd forgotten what the red zone really was. You disable the red zone in kernel code exactly because of this (the CPU will write within this zone when an interrupt fires).

 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Tue Dec 17, 2013 6:05 pm
Member

Joined: Wed Mar 21, 2012 3:01 pm
Posts: 930
xmm15 wrote:
Arrrrrrrggh!!! This has driven me insane for the past two weeks.


I share your pain. This drove me insane for three months. Surely you know by now to compile your kernel code and kernel libc code (if such exists) with -mno-red-zone. Note that if you are using libgcc, it is built with the red zone enabled, so there's an infinitesimal chance that some odd call into it blows up, though such calls rarely happen on x86_64. You may well wish to actually implement the red zone, though it may be difficult because the x86_64 architecture is silly. I do remember some interrupt stack switching mechanism that may be exploited for your needs.
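
The stack switching mechanism is presumably the Interrupt Stack Table (IST): a 64-bit IDT gate with a non-zero IST index makes the CPU load %rsp from the matching slot in the TSS before pushing the frame, even when the interrupt arrives in ring 0. A rough sketch of such a gate in C follows. The struct, macro and function names are made up for illustration; the 16-byte field layout is the architectural one.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit IDT gate descriptor. */
struct idt_gate64 {
    uint16_t offset_low;   /* handler address bits 0..15 */
    uint16_t selector;     /* kernel code segment selector */
    uint8_t  ist;          /* bits 0..2: IST index, 0 = no stack switch */
    uint8_t  type_attr;    /* present bit, DPL, gate type */
    uint16_t offset_mid;   /* handler address bits 16..31 */
    uint32_t offset_high;  /* handler address bits 32..63 */
    uint32_t reserved;
};

static_assert(sizeof(struct idt_gate64) == 16, "IDT gates are 16 bytes");

#define GATE_PRESENT      0x80u
#define GATE_TYPE_INTR64  0x0Eu  /* 64-bit interrupt gate */

/* Build a ring-0 interrupt gate that switches to IST stack ist_index
 * (1..7), or stays on the current stack when ist_index is 0. */
static inline struct idt_gate64
make_interrupt_gate(uint64_t handler, uint16_t code_selector, uint8_t ist_index)
{
    struct idt_gate64 g = {
        .offset_low  = (uint16_t)(handler & 0xFFFF),
        .selector    = code_selector,
        .ist         = (uint8_t)(ist_index & 0x7),
        .type_attr   = GATE_PRESENT | GATE_TYPE_INTR64,
        .offset_mid  = (uint16_t)((handler >> 16) & 0xFFFF),
        .offset_high = (uint32_t)(handler >> 32),
        .reserved    = 0,
    };
    return g;
}
```

The catch is that an IST slot always points at the same address, so a second interrupt through the same slot while the first is still being handled overwrites the earlier frame. That is one reason most kernels simply build with -mno-red-zone instead.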


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Thu Jan 07, 2016 10:34 am

Joined: Wed Jan 06, 2016 5:20 am
Posts: 1
thepowersgang wrote:
Sorry, actually, I'd forgotten what the red zone really was. You disable red zone in kernel code exactly because of this (the CPU will write within this zone when an interrupt fires)


Where do you place the code? Sorry if it's a dumb question but I'm a struggling newb lol.


Last edited by GloverTex on Sat Feb 12, 2022 2:21 pm, edited 8 times in total.

 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Thu Jan 07, 2016 10:52 am

Joined: Mon Dec 28, 2015 7:36 pm
Posts: 9
GloverTex wrote:
All that was needed to fix the bug, like a magic pill, was instructing GCC not to use this x86-interrupts-unsafe zone:

Code:
      -mno-red-zone


Where do you place the code? Sorry if it's a dumb question but I'm a struggling newb lol.


It's a compiler flag. You need to pass it as a parameter to gcc.


 Post subject: Re: Long-mode Kernels and the AMD64 ABI 'Red Zone'
Posted: Fri Jan 08, 2016 5:14 pm
Member

Joined: Sun Sep 19, 2010 10:05 pm
Posts: 1074
This has come up several times in the past. It's actually mentioned on the Calling Conventions page, in a footnote. But I'm all for adding more references to it to the Wiki.

edit: just realized how old this thread was.. :)

_________________
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott

