Entering a different ring 3 task after an interrupt.

ryukoposting · **Joined:** Thu Nov 07, 2019 2:17 pm **Posts:** 11

I have my timed interrupt working (as well as the other important interrupts and exceptions). I have structures for storing and scheduling tasks. I can get into ring 3, after setting up the GDT and TSS. I cannot figure out how to get from one ring 3 task to another. My timed interrupt is wrapped with this macro:

Code:

#define CONTEXT_SWITCH_ENTRY(exn) \
  Process *exception_##exn##_handler(CoreDump cd); \
  extern void exception_##exn##_entry(); \
  asm ( \
    ".globl exception_" #exn "_entry\n" \
    "exception_" #exn "_entry: \n" \
    "    pushl $0x0\n" \
    "    pusha\n" \
    "    pushw %ds\n" \
    "    pushw %es\n" \
    "    pushw %fs\n" \
    "    pushw %gs\n" \
    "    pushw %ss\n" \
    "    pushw %ss\n" \
    "    pushw %ss\n" \
    "    pushw %ss\n" \
    "    pushw %ss\n" \
    "    popw %ds\n" \
    "    popw %es\n" \
    "    popw %fs\n" \
    "    popw %gs\n" \
    "    mov %esp, %eax\n" \
    "    call exception_" #exn "_handler\n" \
    "    test %eax, %eax\n" \
    "    jz no_switch\n" \
    "    int $0x3\n" /* this is where we need to switch to the new process. what the heck do I do? */ \
    "    iret\n"\
    "no_switch:\n" \
    "    popw %gs\n" \
    "    popw %gs\n" \
    "    popw %fs\n" \
    "    popw %es\n" \
    "    popw %ds\n" \
    "    popa\n" \
    "    add $0x4, %esp\n" \
    "    iret\n");

The exception handler itself looks like this:

Code:

CONTEXT_SWITCH_ENTRY(0x20)
Process *exception_0x20_handler(CoreDump cd)
{
  pic_sendeoi(0);
  
  Process *current = sched_get_current_process();
  Process *switchto = 0;
  
  if (current) {
    sched_tick();
    readyqueue_add(current);
    Process *n = readyqueue_pop();
    if (current != n) {  // i.e. a different task now has the earliest deadline, so we need to change tasks
      switchto = n;
      // save all registers.
      // current->context.tss doesn't actually point to a TSS that is registered with the machine. I'm just
      // using the same type of struct to hold context data.
      current->context->tss.gs = cd.gs;
      current->context->tss.fs = cd.fs;
      current->context->tss.es = cd.es;
      current->context->tss.ds = cd.ds;
      current->context->tss.edi = cd.edi;
      current->context->tss.esi = cd.esi;
      current->context->tss.ebp = cd.ebp;
      current->context->tss.ebx = cd.ebx;
      current->context->tss.edx = cd.edx;
      current->context->tss.ecx = cd.ecx;
      current->context->tss.eax = cd.eax;
      current->context->tss.eip = cd.eip;
      current->context->tss.cs = cd.cs;
      current->context->tss.eflags = cd.eflags;

      // + 16 for CS, EIP, EFLAGS, exception code
      current->context->tss.esp = cd.esp + 16;
      current->context->tss.ss = cd.ss;
      
      sched_set_current_process(switchto);  // just sets an internal variable, doesn't do anything else
    }
  }
  return switchto;  // if going back to the same task, switchto is null. if switching tasks, points to the Process structure of the new task.
}

So, in essence, the interrupt pushes all registers and pushes segment registers to form the CoreDump structure (which I have verified is being formed correctly). Then, it calls the C function.

In the C function, we check to see if we're even in ring 3 yet (sched_get_current_process will return null if we haven't actually gotten into ring 3 yet). Next, it re-adds the currently running process to the priority queue. It then pops the highest-priority task from the queue. I have verified that the priority queue is working correctly. If the popped task is the same one we were just running, we don't set switchto; we just return null and do nothing else.

If the popped task is different, then we need to switch to the new task. I set switchto to point to the new process' Process struct. Then, I save the info in cd to the current task's process structure (effectively saving the current process' context), then set the scheduler's internal "current process" variable to the new process.

The return value of the C function allows the assembly to see if we're just returning to the previous process, or if a context switch needs to occur. The problem is, I have no idea how to make the actual context switch happen. I tried just copying the context data for the new process into 'cd' in the C function, but this just caused a GP fault at the iret. What actually needs to happen for the CPU to iret into a different ring 3 process?

nullplan · **Joined:** Wed Aug 30, 2017 8:24 am **Posts:** 1604

Usually you want a stack switch. Here is a simple model: Every task gets its own kernel stack. On task switch, the new task's kernel stack top is written into ESP0 of the current CPU's TSS. No task is ever suspended in user mode; they all enter kernel mode sooner or later, by force if necessary.
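To make the ESP0 part concrete, here is a minimal sketch that can be compiled and poked at in user space. The names tss_min, ktask and tss_use_kernel_stack are made up for illustration, and the TSS is cut down to its first three fields; a real 32-bit TSS has many more.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down TSS: only the leading fields the CPU reads
 * when an interrupt arrives in ring 3. A real 32-bit TSS is much larger. */
struct tss_min {
    uint32_t prev_task_link;
    uint32_t esp0; /* kernel stack pointer loaded on ring 3 -> ring 0 entry */
    uint32_t ss0;  /* kernel stack segment loaded on ring 3 -> ring 0 entry */
};

/* Hypothetical per-task record: each task owns one kernel stack. */
struct ktask {
    uint32_t kstack_top; /* address just past the highest byte of the stack */
};

/* Point the CPU's TSS at the kernel stack of the task about to run.
 * The next interrupt taken in ring 3 then lands on that task's own stack. */
static void tss_use_kernel_stack(struct tss_min *tss,
                                 const struct ktask *next,
                                 uint16_t kernel_ss)
{
    assert(next->kstack_top != 0); /* a task without a kernel stack is a bug */
    tss->esp0 = next->kstack_top;
    tss->ss0 = kernel_ss;
}
```

Since a task that runs in user mode has an empty kernel stack, ESP0 is always the stack top, never some saved intermediate value.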

The first-level interrupt handlers only save the volatile registers (EAX, ECX, and EDX), as well as the segment registers, before switching the segments over to kernel mode. ESP is saved by the CPU, and EBP, EBX, EDI, and ESI are callee-saved registers, according to the ABI.

Before returning to userspace, all interrupt handlers (if they are returning to userspace) and the system call handler check for a task flag that says the task should be scheduled out. If set, they all call a function to do that. The timer interrupt then only needs to set that flag. The function to switch tasks then only needs to save the nonvolatile registers (EBX, EBP, EDI, and ESI), switch stack to the next task's kernel stack, restore registers and return.
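A rough sketch of that flag protocol, runnable in user space. The bit value of TI_TIMEOUT, the cut-down struct task with only the flag field, and the functions timer_tick, before_return_to_user and schedule are all assumptions for illustration; schedule() merely stands in for "pick the next task and switch to it".

```c
#include <assert.h>
#include <stddef.h>

#define TI_TIMEOUT 0x1u /* assumed bit value; only the flag's name is given */

/* Cut-down task record: just the flag word. */
struct task {
    unsigned task_flags;
};

static struct task *current_task; /* hypothetical "current task" pointer */
static int schedule_calls;        /* demo counter, not part of the scheme */

/* Timer interrupt body: request a reschedule, nothing more. */
static void timer_tick(void)
{
    if (current_task != NULL)
        current_task->task_flags |= TI_TIMEOUT;
}

/* Stand-in for the real scheduler: it would pick the next task and switch
 * to it. Here it only clears the flag and counts the call. */
static void schedule(void)
{
    current_task->task_flags &= ~TI_TIMEOUT;
    schedule_calls++;
}

/* Called by every interrupt handler and by the system call handler
 * immediately before returning to user mode. */
static void before_return_to_user(void)
{
    if (current_task != NULL && (current_task->task_flags & TI_TIMEOUT))
        schedule();
}
```

The point of the split is that the timer handler never touches any stack itself; the switch always happens at one well-defined place, in kernel mode, on the outgoing task's kernel stack.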

Code:

struct task {
    [...]
    unsigned task_flags;
    void *kstack_top;
    void *kstack_bot;
    [...]
};

extern void raw_switch_task(struct task *current, struct task *next);

Code:

.globl raw_switch_task
raw_switch_task:
    pushl %ebp
    movl %esp,%ebp
    subl $12, %esp
    movl %ebx, -4(%ebp)
    movl %esi, -8(%ebp)
    movl %edi, -12(%ebp)
    movl 8(%ebp), %eax
    movl 12(%ebp), %ecx
    movl %esp, KSTACK_BOT(%eax)
    movl KSTACK_BOT(%ecx),%esp
    movl KSTACK_TOP(%ecx),%edx
    /* now somehow get %edx into current_tss.esp0. I can do that like this: */
    movl %edx, %gs:CPU_TSS+TSS_ESP0

    pushl %ecx
    call sched_set_current_task
    addl $4,%esp

    movl (%esp),%edi
    movl 4(%esp),%esi
    movl 8(%esp),%ebx
    movl 12(%esp),%ebp
    addl $16, %esp
    ret

Code:

void switch_task(struct task *current, struct task *next) {
    current->task_flags &= ~TI_TIMEOUT;
    raw_switch_task(current, next);
    /* when we get here, another task switched to "current". */
}

Bit of a head-bender, but it means the volatile registers are only saved close to the top of the kernel stack (once), then there might be several call frames and maybe even an interrupt frame, and then there are the nonvolatile registers, and that's how all suspended tasks look. You might notice that this means you can create a new task by filling the new kernel stack with convenient values for "restoration".

Code:

int new_kernel_task(void (*main)(void*), void *arg) {
    struct task *new = ...
    [...]
    uint32_t *ks = new->kstack_top;
    ks -= 7;
    new->kstack_bot = ks;
    ks[3] = 0; /* mark stack frame as initial with 0 EBP */
    ks[4] = (uint32_t)main;
    ks[5] = (uint32_t)kernel_task_exit;
    ks[6] = (uint32_t)arg;
}

This means the "ret" in raw_switch_task jumps to the given entry point, with ESP pointing at the address of the kernel task cleanup function (which serves as the entry point's return address), followed by the argument on the stack.
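If the layout is hard to picture, the following user-space sketch builds the same seven-word frame in a plain array and replays what the epilogue of raw_switch_task would load from it. The helper names (build_initial_frame, replay_epilogue, struct restored) are hypothetical, and the three unused register slots are zeroed here, which new_kernel_task does not bother to do.

```c
#include <assert.h>
#include <stdint.h>

/* What the CPU state looks like once the epilogue has run. */
struct restored {
    uint32_t edi, esi, ebx, ebp;
    uint32_t ret_to;   /* address the final "ret" jumps to */
    uint32_t main_ret; /* return address the entry point sees at 0(%esp) */
    uint32_t main_arg; /* first argument the entry point sees at 4(%esp) */
};

/* Fill the top 7 words of a stack the way new_kernel_task does.
 * Returns the index of the saved stack pointer (kstack_bot) in words[]. */
static unsigned build_initial_frame(uint32_t *words, unsigned nwords,
                                    uint32_t entry, uint32_t exit_fn,
                                    uint32_t arg)
{
    assert(nwords >= 7);
    unsigned bot = nwords - 7;
    words[bot + 0] = 0;       /* EDI slot, value irrelevant */
    words[bot + 1] = 0;       /* ESI slot, value irrelevant */
    words[bot + 2] = 0;       /* EBX slot, value irrelevant */
    words[bot + 3] = 0;       /* EBP = 0 marks the initial frame */
    words[bot + 4] = entry;   /* popped by "ret" */
    words[bot + 5] = exit_fn; /* return address of the entry point */
    words[bot + 6] = arg;     /* argument of the entry point */
    return bot;
}

/* Replay the epilogue: four loads, "addl $16, %esp", then "ret". */
static struct restored replay_epilogue(const uint32_t *words, unsigned bot)
{
    struct restored r;
    const uint32_t *sp = &words[bot];
    r.edi = sp[0];
    r.esi = sp[1];
    r.ebx = sp[2];
    r.ebp = sp[3];
    sp += 4;            /* addl $16, %esp */
    r.ret_to = *sp++;   /* ret pops the entry point */
    r.main_ret = sp[0];
    r.main_arg = sp[1];
    return r;
}
```

Calling build_initial_frame on a 16-word array and feeding the result to replay_epilogue shows the entry point in ret_to, the cleanup function in main_ret and the argument in main_arg, which is exactly the state a normally called C function expects.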

Or a new user task:

Code:

int new_user_task(uint32_t entry, uint32_t start_sp)
{
    struct task *new = ...
    [...]
    uint32_t *ks = new->kstack_top;
    ks -= 10;
    new->kstack_bot = ks;
    ks[4] = (uint32_t)user_start_iret;
    ks[5] = entry;
    ks[6] = USER_CS;
    ks[7] = USER_INIT_EFLAGS;
    ks[8] = start_sp;
    ks[9] = USER_DS;
}

Code:

user_start_iret:
    movw $USER_DS, %ax
    movw %ax, %ds
    movw %ax, %es
    movw %ax, %fs
    movw %ax, %gs
    iret
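The five words at ks[5] through ks[9] are the frame that a privilege-changing iret pops: EIP, CS, EFLAGS, ESP, SS, in that order. A small sketch of that frame as a struct follows. The selector values 0x1B and 0x23 assume user code and data descriptors at GDT indices 3 and 4, and 0x202 assumes you want interrupts enabled in user mode; your GDT and flags may differ. The names iret_frame, make_user_frame and selector_rpl are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame popped by iret when returning to a less privileged ring,
 * lowest address first. */
struct iret_frame {
    uint32_t eip;
    uint32_t cs;
    uint32_t eflags;
    uint32_t esp;
    uint32_t ss;
};

/* Assumed values, not taken from the thread: */
#define EXAMPLE_USER_CS     0x1Bu  /* GDT index 3, RPL 3 */
#define EXAMPLE_USER_DS     0x23u  /* GDT index 4, RPL 3 */
#define EXAMPLE_USER_EFLAGS 0x202u /* IF set, plus the always-one bit 1 */

/* Requested privilege level: the low two bits of a selector. */
static unsigned selector_rpl(uint32_t selector)
{
    return selector & 3u;
}

/* Build the frame for a first entry into user mode. */
static struct iret_frame make_user_frame(uint32_t entry, uint32_t start_sp)
{
    struct iret_frame f;
    f.eip = entry;
    f.cs = EXAMPLE_USER_CS;
    f.eflags = EXAMPLE_USER_EFLAGS;
    f.esp = start_sp;
    f.ss = EXAMPLE_USER_DS;
    /* iret only performs the stack switch when CS selects ring 3 */
    assert(selector_rpl(f.cs) == 3 && selector_rpl(f.ss) == 3);
    return f;
}
```

If either selector has an RPL other than 3, or the frame is short by the ESP and SS words, the iret will not do what you want, which is a common source of the GP fault mentioned in the question.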

OSDev.org
