Brendan's multi-tasking tutorial

8infy · **Joined:** Sun Apr 05, 2020 1:01 pm **Posts:** 183

Hi, I'm currently trying to implement blocking for my scheduler and I'm using Brendan's tutorial, but I'm a bit confused about his ideas here:

https://wiki.osdev.org/Brendan%27s_Multi-tasking_Tutorial

Quote:

Step 4: Task States and "Schedule()" Version 2
For a real OS (regardless of which scheduling algorithm/s the scheduler uses) most tasks switches are caused because the currently running task has to wait for something to happen (data received from a different task or storage devices/file system or network, or time to pass, or for user to press a key or move the mouse, or for a mutex to become available, etc), or happen because something that a task was waiting for happened causing a task to unblock and preempt the currently running task.

To prepare for this, we want to add some kind of "state" to each task so that it's easy to see what a task is doing (e.g. if it's waiting and what it's waiting for).

For task's that aren't waiting; I find it easier (later, when you add support for blocking and unblocking tasks) to have a "running" state (when the task is currently using the CPU) and a separate "ready to run" state. The idea is that when a task is given CPU time it is removed from the scheduler's list of "ready to run" tasks and its state is changed from "ready to run" to "running"; and when a task stops using CPU time either it's state is changed back to "ready to run" and it is put back on the scheduler's list of "ready to run" tasks, or its state is changed to "waiting for something" and it is not put back on the scheduler's list of "ready to run" tasks.

Note that making this change will involve changing the scheduler's list of tasks from a circular linked list into a "non-circular" linked list; and (because we've been using a singly linked list and have been using the "current_task_TCB" variable as a temporary "start of linked list" variable) you will need to add two new global variables - one for the start of the linked list and one for the end of the linked list. Initially both of these variables will be NULL because there's no tasks on the list, when the first task is added both of the variables will be set to the first task's information (when there is only one task on the list it must be the first and last task), and when more tasks are added they just get appended to the list (like "last_task->next_task = new_task; last_task = new_task;").

After creating the new "first and last task on the scheduler's ready to run list" variables; update the function that initialises multi-tasking so that it sets the initial task's state to "running" (and delete any older code that put the initial task on the scheduler's list of tasks); and then update the code that creates new kernel tasks so that the new task's state is set to "ready to run" just before the new task is added to the scheduler's "ready to run" list.

The next change is to update the low-level "switch_to_task()" function, so that if and only if the previously running task is still in the "running" state the previously running task's state is changed from "running" to "ready to run" and the previously running task is put back on the scheduler's list of "ready to run" tasks; and then set the state of the next task to be given CPU time to "running" (regardless of what state it was in before, and without removing it from any list).

Finally; the "schedule()" function needs to be changed so that it removes the task from the "ready to run" list before doing a task switch, making it something like this:

void schedule(void) {
if( first_ready_to_run_task != NULL) {
thread_control_block * task = first_ready_to_run_task;
first_ready_to_run_task = task->next;
switch_to_task(task);
}
}
Note that if there's only one task you don't really want to do any task switches (because there's no other tasks to switch to), and in this case the task will be "running" and the list of "ready to run" tasks will be empty. The previous version of "schedule()" was a little less efficient (it would've done a task switch from one task to the same task).

In this step he uses the first_ready_to_run_task variable for scheduling, however, he only sets it once when creating the initial task. Isn't it always going to be null after that since we never update it? Or do we have to check if it's null when creating a new task and then put that task in it or not? How is this supposed to work?
Basically why I am confused is that this pipeline makes 0 sense to me: enqueue the first task -> IRQ happens -> task switches to itself since it's marked as first_ready_to_run -> first ready to run now points to null (and always will be) -> we create a new task and append it to the end -> nothing happens since first_ready_to_run is still null.
What am I missing here? Also he says that switch_to_task puts the current task back to the queue, and it does that by doing a simple append at the end as well right (doesn't seem helpful)?

nullplan · **Joined:** Wed Aug 30, 2017 8:24 am **Posts:** 1604

You found a bug. Nurse!

More seriously, the issue disappears entirely if you make the list of processes into a doubly-linked circular list. That way, adding and removing nodes is constant-time, and the switching instruction would not set the head pointer of the process list to NULL. Ever.

8infy · **Joined:** Sun Apr 05, 2020 1:01 pm **Posts:** 183

nullplan wrote:

You found a bug. Nurse!

More seriously, the issue disappears entirely if you make the list of processes into a doubly-linked circular list. That way, adding and removing nodes is constant-time, and the switching instruction would not set the head pointer of the process list to NULL. Ever.

I've been trying to make this work for like 5 hours today, my brain is too weak to comprehend why it constantly breaks, e.g if 1 process sleeps it works ok, if 2 sleep everything breaks, interrupts get randomly disabled even though iret should clearly always reenable them, scheduler queue ends up empty randomly, the same threads end up being in the sleeping queue twice, restarting qemu sometimes randomly makes everything work (as if there was a race condition even though interrupts are disabled all the way until I enqueue all 3 threads), even with extensive logging i still can't figure it out. I'm starting to think my approach is completely wrong.

Should I just switch to a doubly linked circular list? So to block the current thread I would pop it off the ready list and put it in a circular sleeping list, and then pop it off there when its done and insert back into the ready list right? Do you think it's a better approach?

nullplan · **Joined:** Wed Aug 30, 2017 8:24 am **Posts:** 1604

8infy wrote:

I've been trying to make this work for like 5 hours today, my brain is too weak to comprehend why it constantly breaks, e.g if 1 process sleeps it works ok, if 2 sleep everything breaks, interrupts get randomly disabled even though iret should clearly always reenable them, scheduler queue ends up empty randomly, the same threads end up being in the sleeping queue twice, restarting qemu sometimes randomly makes everything work (as if there was a race condition even though interrupts are disabled all the way until I enqueue all 3 threads), even with extensive logging i still can't figure it out. I'm starting to think my approach is completely wrong.

The exact type of list structure is not going to be a panacea; this sounds like you have more basic problems, like your interrupt handling or something. I have frequently found it helpful in such cases to (1) take a break and come back later (your head already hurts from banging it against a wall all day, so you are not going to bring your A-game anymore), and (2) step back and analyze if the sequence of log messages is anticipated. And if not, where things start to go wrong.

8infy wrote:

Should I just switch to a doubly linked circular list? So to block the current thread I would pop it off the ready list and put it in a circular sleeping list, and then pop it off there when its done and insert back into the ready list right? Do you think it's a better approach?

Sounds half good. I don't see a reason to have a list of sleeping processes (if you want a list of all processes, maintain one separately). If you want to block a process, you are usually waiting for an event, optionally with timeout, and optionally interruptible. You already need a way to find a process by PID alone, no matter its state, so you already need a list of all processes, somehow. (If you are writing something approaching POSIX compatible, at least.) For scheduler purposes, have a circular doubly linked list of ready processes that is never empty (you can always idle). To block a process, remove it from the ready list. To unblock it, add it to the list again. Simple as. Scheduling advances the head pointer to the next node, and if you remember to add new nodes always behind the head pointer, no process can starve no matter how many processes get ready (More exactly: Once queued, a process is N nodes in front of the head pointer, and no event will ever increase the number of nodes between the head and the newly ready node, and a lot of events decrease that number). Since the list is circular, you will never reach the end, and since it's never empty, you can always do something when scheduling, even if it is just wasting time.

8infy · **Joined:** Sun Apr 05, 2020 1:01 pm **Posts:** 183

nullplan wrote:

8infy wrote:

I've been trying to make this work for like 5 hours today, my brain is too weak to comprehend why it constantly breaks, e.g if 1 process sleeps it works ok, if 2 sleep everything breaks, interrupts get randomly disabled even though iret should clearly always reenable them, scheduler queue ends up empty randomly, the same threads end up being in the sleeping queue twice, restarting qemu sometimes randomly makes everything work (as if there was a race condition even though interrupts are disabled all the way until I enqueue all 3 threads), even with extensive logging i still can't figure it out. I'm starting to think my approach is completely wrong.

The exact type of list structure is not going to be a panacea; this sounds like you have more basic problems, like your interrupt handling or something. I have frequently found it helpful in such cases to (1) take a break and come back later (your head already hurts from banging it against a wall all day, so you are not going to bring your A-game anymore), and (2) step back and analyze if the sequence of log messages is anticipated. And if not, where things start to go wrong.

8infy wrote:

Should I just switch to a doubly linked circular list? So to block the current thread I would pop it off the ready list and put it in a circular sleeping list, and then pop it off there when its done and insert back into the ready list right? Do you think it's a better approach?

Sounds half good. I don't see a reason to have a list of sleeping processes (if you want a list of all processes, maintain one separately). If you want to block a process, you are usually waiting for an event, optionally with timeout, and optionally interruptible. You already need a way to find a process by PID alone, no matter its state, so you already need a list of all processes, somehow. (If you are writing something approaching POSIX compatible, at least.) For scheduler purposes, have a circular doubly linked list of ready processes that is never empty (you can always idle). To block a process, remove it from the ready list. To unblock it, add it to the list again. Simple as. Scheduling advances the head pointer to the next node, and if you remember to add new nodes always behind the head pointer, no process can starve no matter how many processes get ready (More exactly: Once queued, a process is N nodes in front of the head pointer, and no event will ever increase the number of nodes between the head and the newly ready node, and a lot of events decrease that number). Since the list is circular, you will never reach the end, and since it's never empty, you can always do something when scheduling, even if it is just wasting time.

Thanks man, that's some good advice. Definitely switching to circular list. About blocking a process: let's say I pop it off the ready queue, where does it go then? I need to check whether it's time to wake it up every tick anyways right (at least for sleeping processes)? So I have to have some sort of a place to keep blocked processes or something along those lines. Even for mutexes and other blocking things. What is an appropriate solution for this?

nullplan · **Joined:** Wed Aug 30, 2017 8:24 am **Posts:** 1604

8infy wrote:

Thanks man, that's some good advice. Definitely switching to circular list. About blocking a process: let's say I pop it off the ready queue, where does it go then? I need to check whether it's time to wake it up every tick anyways right (at least for sleeping processes)? So I have to have some sort of a place to keep blocked processes or something along those lines. Even for mutexes and other blocking things. What is an appropriate solution for this?

That is not what I'd do. I have a priority queue of all timers in the system, ordered by deadline. Whenever a new timer is queued up, the timer interrupt is set up to go off when the next timer expires. Each timer has a callback for what should happen on expiry, and the callback unblocks the process again.

Mutexes and semaphores are implemented in terms of futexes. Futexes have two major operations: Wait and wake. For waiting, the futex list is locked, then it is ensured that the futex has an expected value, and if so, a node is aded to the waiting queue for that futex, and the task is put in interruptible sleep mode. On wake, one of the elements is popped off and the task is made ready again. Right now, I use a circular linked list for those to implement a queue, so that the first element popped off is the eldest in the list. This list can be empty, however, and this makes the implementation a bit more difficult.

In all cases, the blocked processes are not part of a "blocked process" collection, but a more specialized list depending on mechanism.The timer mechanism is used for all kinds of time out, for example.

8infy · **Joined:** Sun Apr 05, 2020 1:01 pm **Posts:** 183

nullplan wrote:

8infy wrote:

Thanks man, that's some good advice. Definitely switching to circular list. About blocking a process: let's say I pop it off the ready queue, where does it go then? I need to check whether it's time to wake it up every tick anyways right (at least for sleeping processes)? So I have to have some sort of a place to keep blocked processes or something along those lines. Even for mutexes and other blocking things. What is an appropriate solution for this?

That is not what I'd do. I have a priority queue of all timers in the system, ordered by deadline. Whenever a new timer is queued up, the timer interrupt is set up to go off when the next timer expires. Each timer has a callback for what should happen on expiry, and the callback unblocks the process again.

Mutexes and semaphores are implemented in terms of futexes. Futexes have two major operations: Wait and wake. For waiting, the futex list is locked, then it is ensured that the futex has an expected value, and if so, a node is aded to the waiting queue for that futex, and the task is put in interruptible sleep mode. On wake, one of the elements is popped off and the task is made ready again. Right now, I use a circular linked list for those to implement a queue, so that the first element popped off is the eldest in the list. This list can be empty, however, and this makes the implementation a bit more difficult.

In all cases, the blocked processes are not part of a "blocked process" collection, but a more specialized list depending on mechanism.The timer mechanism is used for all kinds of time out, for example.

I see, so your timer is always single shot, right? That's probably better than constant ticks, but it sounds more complex, because you have to take multiple factors into account, like time slice for the current process, time left to sleep for another process and possibly other things.

8infy · **Joined:** Sun Apr 05, 2020 1:01 pm **Posts:** 183

nullplan wrote:

8infy wrote:

I've been trying to make this work for like 5 hours today, my brain is too weak to comprehend why it constantly breaks, e.g if 1 process sleeps it works ok, if 2 sleep everything breaks, interrupts get randomly disabled even though iret should clearly always reenable them, scheduler queue ends up empty randomly, the same threads end up being in the sleeping queue twice, restarting qemu sometimes randomly makes everything work (as if there was a race condition even though interrupts are disabled all the way until I enqueue all 3 threads), even with extensive logging i still can't figure it out. I'm starting to think my approach is completely wrong.

The exact type of list structure is not going to be a panacea; this sounds like you have more basic problems, like your interrupt handling or something. I have frequently found it helpful in such cases to (1) take a break and come back later (your head already hurts from banging it against a wall all day, so you are not going to bring your A-game anymore), and (2) step back and analyze if the sequence of log messages is anticipated. And if not, where things start to go wrong.

8infy wrote:

Should I just switch to a doubly linked circular list? So to block the current thread I would pop it off the ready list and put it in a circular sleeping list, and then pop it off there when its done and insert back into the ready list right? Do you think it's a better approach?

Sounds half good. I don't see a reason to have a list of sleeping processes (if you want a list of all processes, maintain one separately). If you want to block a process, you are usually waiting for an event, optionally with timeout, and optionally interruptible. You already need a way to find a process by PID alone, no matter its state, so you already need a list of all processes, somehow. (If you are writing something approaching POSIX compatible, at least.) For scheduler purposes, have a circular doubly linked list of ready processes that is never empty (you can always idle). To block a process, remove it from the ready list. To unblock it, add it to the list again. Simple as. Scheduling advances the head pointer to the next node, and if you remember to add new nodes always behind the head pointer, no process can starve no matter how many processes get ready (More exactly: Once queued, a process is N nodes in front of the head pointer, and no event will ever increase the number of nodes between the head and the newly ready node, and a lot of events decrease that number). Since the list is circular, you will never reach the end, and since it's never empty, you can always do something when scheduling, even if it is just wasting time.

Ok I just got up and spent about half an hour to rewrite a few functions in the scheduler and now blocking works flawlessly (with circular queue). What I couldn't do in 5 hours yesterday I did in 30 minutes today, lmao. Your advice worked, thanks.

OSDev.org

Brendan's multi-tasking tutorial

Who is online