OSDev.org

The Place to Start for Operating System Developers
It is currently Fri Mar 29, 2024 2:55 am

All times are UTC - 6 hours




Post new topic Reply to topic  [ 12 posts ] 
Author Message
 Post subject: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Tue Apr 05, 2022 12:06 pm 
Offline
Member
Member

Joined: Thu Sep 27, 2018 5:10 pm
Posts: 28
Location: Turkey
I am trying to read and understand Brendan's Multi-tasking Tutorial before trying to implement my own kernel threads.

I was able to follow the tutorial, until I got to this version of PIT handler;

Code:
uint64_t time_slice_remaining = 0;

uint64_t time_since_boot = 0;
uint64_t time_between_ticks = 1000000;  // 1000 Hz = 1 ms between ticks = 1000000 nanoseconds between ticks

void PIT_IRQ_handler(void) {
    thread_control_block * next_task;
    thread_control_block * this_task;

    lock_stuff();
    time_since_boot += time_between_ticks;

    // Move everything from the sleeping task list into a temporary variable and make the sleeping task list empty

    next_task = sleeping_task_list;
    sleeping_task_list = NULL;

    // For each task, wake it up or put it back on the sleeping task list

    while(next_task != NULL) {
        this_task = next_task;
        next_task = this_task->next;

        if(this_task->sleep_expiry <= time_since_boot) {
            // Task needs to be woken up
            unblock_task(task);
        } else {
            // Task needs to be put back on the sleeping task list
            task->next = sleeping_task_list;
            sleeping_task_list = task;
        }
    }

    // Handle "end of time slice" preemption

    if(time_slice_remaining != 0) {
        // There is a time slice length
        if(time_slice_remaining <= time_between_ticks) {
            schedule();
        } else {
            time_slice_remaining -= time_between_ticks;
        }
    }

    // Done, unlock the scheduler (and do any postponed task switches!)

    unlock_stuff();
}


I will explain what I think is happening step by step so you can point out mistakes in my reasoning.

This function is called PIT_IRQ_handler, so I suppose it is called from an interrupt gate assembly wrapper. So, at the beginning of function interrupts are disabled by CPU. Also, PIC is also disabled waiting for "End Of Interrupt" command. Since EOI is not seen here, I will also assume it is sent by assembly wrapper after PIT_IRQ_handler is returned.

First thing this function does is to call lock_stuff() function. This function disabled interrupts and postpones task switching. It is implemented as below;

Code:
void lock_stuff(void) {
#ifndef SMP
    CLI();
    IRQ_disable_counter++;
    postpone_task_switches_counter++;
#endif
}


Disabling interrupts does nothing in this case because it is already disabled, but it doesn't cause any harm.

Next, function unblocks any sleeping threads. This part has nothing to do with my question, so I skip to next part.

In the next part, if current thread used up all its timeslice, schedule function is called. This is the version of schedule function that is supposed to be called at that point in tutorial.

Code:
void schedule(void) {
    thread_control_block * task;

    if(postpone_task_switches_counter != 0) {
        task_switches_postponed_flag = 1;
        return;
    }

    if( first_ready_to_run_task != NULL) {
        task = first_ready_to_run_task;
        first_ready_to_run_task = task->next;
        switch_to_task(task);
    } else if( current_task_TCB->state == RUNNING) {
        // Do nothing, currently running task can keep running
    } else {

        // Currently running task blocked and there's no other tasks

        task = current_task_TCB;                           // "task" is the task that we're borrowing while idle
        current_task_TCB = NULL;                           // Make sure other code knows we're not actually running the task we borrowed
        uint64_t idle_start_time = get_time_since_boot();  // For future power management support

        // Do nothing while waiting for a task to unblock and become "ready to run".  The only thing that is going to update
        // first_ready_to_run_task is going to be from a timer IRQ (with a single processor anyway), but interrupts are disabled.
        // The timer must be allowed to fire, but do not allow any task changes to take place.  The task_switches_postponed_flag
        // must remain set to force the timer to return to this loop.


        do {
            STI();          // enable interrupts to allow the timer to fire
            HLT();          // halt and wait for the timer to fire
            CLI();          // disable interrupts again to see if there is something to do
        } while( (first_ready_to_run_task == NULL);

        // Stop borrowing the task

        current_task_TCB = task;                         

        // Switch to the task that unblocked (unless the task that unblocked happens to be the task we borrowed)

        task = first_ready_to_run_task;
        first_ready_to_run_task = task->next;
        if(task != current_task_TCB) {
            switch_to_task(task);
        }
    }
}


Since lock_stuff() was called earlier, this function will just set task_switches_postponed_flag and return immediately. However, unlock_stuff() from PIT_IRQ_handler() will be called next. This is where my confusion begins. This is how it is implemented.

Code:
void unlock_stuff(void) {
#ifndef SMP
    postpone_task_switches_counter--;
    if(postpone_task_switches_counter == 0) {
        if(task_switches_postponed_flag != 0) {
            task_switches_postponed_flag = 0;
            schedule();
        }
    }
    IRQ_disable_counter--;
    if(IRQ_disable_counter == 0) {
        STI();
    }
#endif
}


Since task_switches_postponed_flag was set earlier, unlock_stuff() function will call schedule() and it will most probably call switch_task() function.

switch_task() function will return from another thread. However, we are yet to return from PIT_IRQ_handler. If next task we are switching to was called from PIT_IRQ_handler, things might work out fine. But if it is not, we won't be able to send EOI to PIC, so we can't preempt no more. It is possible for threads to block themselves to sleep for some time, so it is a real possibility.

Another problem I am having is this part in unblock_stuff() function;

Code:
    IRQ_disable_counter--;
    if(IRQ_disable_counter == 0) {
        STI();
    }


This enables interrupts again before we return from IRQ handler. This seems problematic as it is supposed be enabled after iret called from interrupt handler. However, since PIC is waiting for EOI anyways, this might actually be harmless. What do you think about this?

Looking for your input,

Best regards


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Tue Apr 05, 2022 12:53 pm 
Offline
Member
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
Wow, I would not have expected Brendan's code to make the same mistake I see so many beginners make: Intermingling IRQ handling and task handling. These two things ought to be separate.

The points you raised are valid. The code does wrongly enable interrupts in the wrong place. It appears the counter was supposed to counteract this, but failed to adjust for the effect of interrupt gates. No matter, it would be better if he had primitives that read out the flags register before doing a cli, and restored that register instead of unconditionally enabling interrupts.
Code:
static inline unsigned irqdisable(void) {
  unsigned flags;
  asm volatile ("pushfd; cli; popl %0" : "=r"(flags) :: "memory");
  return flags;
}

static inline void irqrestore(unsigned flags) {
  asm("pushl %0; popfd" :: "r"(flags) : "memory", "cc");
}
That way you could nest things more elegantly. Come to think of it, he actually needs a spinlock mechanism to actually protect the variables he wants to protect, because the whole thing will not work as soon as a second CPU gets into the mix.

It is true that Brendan's code does not emit EOI until later, however, with interrupts enabled, there is the potential for stack overflow even from this small window (PIC can re-interrupt before CPU reaches iret instruction). And in modern PCs, PIC is by far not the only interrupt source, which also opens the door to stack overflow.

_________________
Carpe diem!


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Tue Apr 05, 2022 3:17 pm 
Offline
Member
Member

Joined: Thu Sep 27, 2018 5:10 pm
Posts: 28
Location: Turkey
nullplan wrote:
Intermingling IRQ handling and task handling. These two things ought to be separate.


That is what I thought too. But, I also don't see a way to do preemptive multitasking outside timer interrupts.

nullplan wrote:
No matter, it would be better if he had primitives that read out the flags register before doing a cli, and restored that register instead of unconditionally enabling interrupts.


I re-read relevant section again, and he actually mentions that option, but opts for using a counter to make it look like a lock so we can change it later when we support multiple CPU's.

nullplan wrote:
because the whole thing will not work as soon as a second CPU gets into the mix.


To be fair, it was supposed to be for single-CPU only.

That aside, I am still trying to figure out a reasonable explaination to why this code should work because everything else is very carefully thought out to have an obvious bug.

One possible explanation I could think of is that, EOI is sent before calling PIT_IRQ_handler and interrupts are supposed to be enabled without returning from interrupt handler because everything else depends on it.


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Wed Apr 06, 2022 11:59 am 
Offline
Member
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
yasar11732 wrote:
That is what I thought too. But, I also don't see a way to do preemptive multitasking outside timer interrupts.
Separate out the concepts. First of all, I'd have the timer mechanism and the callback be separate. I'd have a priority queue of timed events, ordered by deadline, with the timer dynamically set to the smallest time, and then when it has arrived, just run all the expired callbacks. A cyclic event is then just re-inserted. This uncouples the timer from the timed event, and eases adoption of newer or better timers (e.g. HPET, LAPIC timer). So that way you would handle the PIT interrupt by handling all expired timers, then setting the PIT to fire on the smallest deadline, then EOI (and you insert a new timer by resetting the PIT if need be). And what the timers do is written somewhere else.

Then, for the actual scheduling, I would just add a task flag. Have a flags field in the task struct. You need one anyway to distinguish kernel and user tasks, or to distinguish FPU tasks. When the timer triggers, just set a "timeout" flag. This is handled in the code that returns from interrupt back to userspace. It checks to see if the return is to userspace, and if so and the timeout flag is set, it clears the timeout flag and then calls schedule(). This gets you out of the perilous interrupt state: Since you can return to userspace only on the highest level of the stack, that code cannot have interrupted other kernel code, and therefore can take locks and allocate memory and schedule processes. It also ensures that any interrupt controller is out of interrupt state before scheduling.

And finally for timekeeping: Just use a hardware counter. No need to count timer interrupts in software.
yasar11732 wrote:
One possible explanation I could think of is that, EOI is sent before calling PIT_IRQ_handler and interrupts are supposed to be enabled without returning from interrupt handler because everything else depends on it.
Another possibility is that the chances of something going wrong in reality are slim. There is a theoretical possibility of a stack overflow, but for real conditions to cause such a thing would require an interrupt storm, and unless such a thing happened, it would just not be noticed.

_________________
Carpe diem!


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Wed Apr 06, 2022 12:45 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5103
nullplan wrote:
There is a theoretical possibility of a stack overflow, but for real conditions to cause such a thing would require an interrupt storm, and unless such a thing happened, it would just not be noticed.

I wouldn't call it theoretical. Linux stopped allowing nested interrupt handlers because someone encountered real conditions that could cause a stack overflow.


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Thu Apr 07, 2022 4:02 am 
Offline
Member
Member

Joined: Wed Oct 01, 2008 1:55 pm
Posts: 3192
Octocontrabass wrote:
nullplan wrote:
There is a theoretical possibility of a stack overflow, but for real conditions to cause such a thing would require an interrupt storm, and unless such a thing happened, it would just not be noticed.

I wouldn't call it theoretical. Linux stopped allowing nested interrupt handlers because someone encountered real conditions that could cause a stack overflow.


I just find this to be evidence that Linux (and Windows) nested interrupt structure is too complicated and some people are unable to write decent IRQ handlers that might cause stack overflows.

Generally speaking, an IRQ handler should only remove the interrupt condition and signal wakeup conditions for server threads. I see no reason why this would need to be run with interrupts disabled.


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Thu Apr 07, 2022 8:58 am 
Offline
Member
Member

Joined: Sun Jun 23, 2019 5:36 pm
Posts: 618
Location: North Dakota, United States
I am curious... Is there a better tutorial than Brendans available that anyone knows about discussing the implementation of multitasking? Or is Brendans still the "best"/"preferred" reference? I've always wondered how a system that uses (say) lottery scheduling would function under preemption...


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Thu Apr 07, 2022 9:53 am 
Offline
Member
Member

Joined: Thu Sep 27, 2018 5:10 pm
Posts: 28
Location: Turkey
I have another question about the same tutorial, it is about this function:

Code:
;C declaration:
;   void switch_to_task(thread_control_block *next_thread);
;
;WARNING: Caller is expected to disable IRQs before calling, and enable IRQs again after function returns

switch_to_task:

    ;Save previous task's state

    ;Notes:
    ;  For cdecl; EAX, ECX, and EDX are already saved by the caller and don't need to be saved again
    ;  EIP is already saved on the stack by the caller's "CALL" instruction
    ;  The task isn't able to change CR3 so it doesn't need to be saved
    ;  Segment registers are constants (while running kernel code) so they don't need to be saved

    push ebx
    push esi
    push edi
    push ebp

    mov edi,[current_task_TCB]    ;edi = address of the previous task's "thread control block"
    mov [edi+TCB.ESP],esp         ;Save ESP for previous task's kernel stack in the thread's TCB

    ;Load next task's state

    mov esi,[esp+(4+1)*4]         ;esi = address of the next task's "thread control block" (parameter passed on stack)
    mov [current_task_TCB],esi    ;Current task's TCB is the next task TCB

    mov esp,[esi+TCB.ESP]         ;Load ESP for next task's kernel stack from the thread's TCB
    mov eax,[esi+TCB.CR3]         ;eax = address of page directory for next task
    mov ebx,[esi+TCB.ESP0]        ;ebx = address for the top of the next task's kernel stack
    mov [TSS.ESP0],ebx            ;Adjust the ESP0 field in the TSS (used by CPU for for CPL=3 -> CPL=0 privilege level changes)
    mov ecx,cr3                   ;ecx = previous task's virtual address space

    cmp eax,ecx                   ;Does the virtual address space need to being changed?
    je .doneVAS                   ; no, virtual address space is the same, so don't reload it and cause TLB flushes
    mov cr3,eax                   ; yes, load the next task's virtual address space
.doneVAS:

    pop ebp
    pop edi
    pop esi
    pop ebx

    ret                           ;Load next task's EIP from its kernel stack


This is the code that switches between tasks. First it saves some registers to stack, it saves stack pointer and cr3 to task's structure. Next, it changes stack for next tasks stack, pop saved registers and exits.

I don't see flags register being saved somewhere. Wouldn't it cause problems if, for example, a task is stopped after 'cmp' and before 'jz'. Wouldn't other tasks be able to alter flags registers while there are running?


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Thu Apr 07, 2022 10:09 am 
Offline
Member
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
yasar11732 wrote:
I don't see flags register being saved somewhere. Wouldn't it cause problems if, for example, a task is stopped after 'cmp' and before 'jz'. Wouldn't other tasks be able to alter flags registers while there are running?
flags is a volatile register. See, this code is called synchronously, so it doesn't have to save everything. And at the asynchronous boundary, namely the interrupt handler, you already are saving everything. In particular, flags is part of the interrupt frame, and restored by iret.

_________________
Carpe diem!


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Thu Apr 07, 2022 11:48 am 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5103
rdos wrote:
Generally speaking, an IRQ handler should only remove the interrupt condition and signal wakeup conditions for server threads. I see no reason why this would need to be run with interrupts disabled.

There can be dozens or even hundreds of interrupt sources. If interrupt handlers can be interrupted, your stack must be big enough for all interrupt handlers to nest at the same time.


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Thu Apr 07, 2022 12:25 pm 
Offline
Member
Member

Joined: Wed Oct 01, 2008 1:55 pm
Posts: 3192
Octocontrabass wrote:
rdos wrote:
Generally speaking, an IRQ handler should only remove the interrupt condition and signal wakeup conditions for server threads. I see no reason why this would need to be run with interrupts disabled.

There can be dozens or even hundreds of interrupt sources. If interrupt handlers can be interrupted, your stack must be big enough for all interrupt handlers to nest at the same time.



Not really. The IRQ stub will enable interrupts, but it won't do EOI until after all handlers have run. This means only higher priority IRQs can run. Besides, highly loaded IRQs should be running on their own cores.


Top
 Profile  
 
 Post subject: Re: Brendan's multi-tasking tutorial - Preemptive Scheduling
PostPosted: Thu Apr 07, 2022 3:06 pm 
Offline
Member
Member

Joined: Wed Oct 01, 2008 1:55 pm
Posts: 3192
yasar11732 wrote:

This is the code that switches between tasks. First it saves some registers to stack, it saves stack pointer and cr3 to task's structure. Next, it changes stack for next tasks stack, pop saved registers and exits.

I don't see flags register being saved somewhere. Wouldn't it cause problems if, for example, a task is stopped after 'cmp' and before 'jz'. Wouldn't other tasks be able to alter flags registers while there are running?


Generally, I think it is a bad idea and a mistake to leave the register set of a thread on an interupt stack. Registers should be saved & loaded from the thread control block. This also makes it easier to implement a kernel debugger, and it decouples scheduling from IRQs.

The code also lacks the actual scheduling. Typically, when an IRQ occurs, it might result in preemption of a thread, or switching to a high priority server thread. There is seldom a well-known thread to run next. That was the problem with hardware task-switching too, which assumed a switch from one thread to another with a call or jmp, but it didn't work well since there is a requirement for logic between having saved the state of the current thread and loading the state of the next (scheduling). Running the scheduler / task-switcher with interrupts disabled is a bad idea too. It should have some kind of lock instead, and run with interrupts enabled.

My scheduler has one dedicated stack per core that it will switch to after having saved the current thread. It can then run the scheduler function on a known-valid stack. The stack of the next thread to run is loaded as part of loading it's register context.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 12 posts ] 

All times are UTC - 6 hours


Who is online

Users browsing this forum: Bing [Bot], Google [Bot] and 137 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group