OSDev.org

The Place to Start for Operating System Developers
 Post subject: Race condition when blocking/unblocking thread
PostPosted: Tue Mar 26, 2019 3:00 pm 

Joined: Mon Jul 13, 2009 5:52 am
Posts: 99
Location: Denmark
Hi all,

I have a relatively straightforward problem, but I am not exactly sure how to solve it.

I have started implementing threads in my kernel; in particular, I am now trying to add threads to my ATA driver. When a read/write request arrives from another thread, my code does the following:

  • The ATA driver thread is unblocked because a read/write request has arrived
  • The ATA driver thread initialises the DMA transfer and then it blocks
  • When the DMA transfer has finished, the IRQ handler unblocks the ATA driver thread again

However, at least in QEMU, the IRQ handler fires almost immediately, which means that sometimes the ATA thread never manages to execute thread_block() before the IRQ handler has tried to unblock it. As a consequence, thread_block() is called after thread_unblock(), which means that the ATA thread blocks forever.

I must be approaching this in the wrong way, because the problem is quite generic. How do I handle situations like this properly?


 Post subject: Re: Race condition when blocking/unblocking thread
PostPosted: Tue Mar 26, 2019 3:55 pm 

Joined: Sun Feb 20, 2011 2:01 pm
Posts: 110
The first thing that comes to mind is that you shouldn't be working at the thread level, as that's asking for race conditions. You want the scheduler to deal with this.
Instead, synchronise on an object, in the style of the NT kernel's KeWaitForSingleObject or UEFI's WaitForEvent. This could be as simple as a semaphore, which will persist the signal until it is consumed.
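
For illustration, here is a minimal counting-semaphore sketch. The names (sem_t, spinlock_acquire(), thread_block_on(), thread_unblock_one()) are placeholders, not any particular kernel's API; the point is that the count persists a signal posted before the waiter arrives, so the wake-up cannot be lost.

Code:
typedef struct {
    spinlock_t lock;    /* protects count and the wait list */
    int        count;   /* signals posted but not yet consumed */
} sem_t;

/* Callable from the IRQ handler. */
void sem_post(sem_t *s)
{
    spinlock_acquire(&s->lock);
    s->count++;
    thread_unblock_one(s);          /* no-op if nobody is waiting yet */
    spinlock_release(&s->lock);
}

/* Called by the ATA thread. */
void sem_wait(sem_t *s)
{
    spinlock_acquire(&s->lock);
    while (s->count == 0) {
        /* thread_block_on() must drop the lock and block as one
         * atomic step, or a post could slip into the gap and the
         * wake-up would be lost. */
        thread_block_on(s, &s->lock);
        spinlock_acquire(&s->lock);
    }
    s->count--;
    spinlock_release(&s->lock);
}

Even if sem_post() runs before the ATA thread reaches sem_wait(), the non-zero count makes the wait return immediately instead of blocking forever.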

_________________
Whoever said you can't do OS development on Windows?
https://github.com/ChaiSoft/ChaiOS


 Post subject: Re: Race condition when blocking/unblocking thread
PostPosted: Tue Mar 26, 2019 4:13 pm 

Joined: Wed Mar 09, 2011 3:55 am
Posts: 509
You might give the ATA driver a pending-work queue. Whenever the thread is blocked and work is added to the queue, the thread is woken up. It then checks its queue, sees that there is an item in it, performs the needed task, and checks the queue again. If the queue is empty, it blocks again; if not, it deals with the next item.

So the flow of an I/O request might look like this:

  • Userspace calls the filesystem driver to write to a file.
  • The filesystem driver calls an add-to-queue() function to add work to the ATA driver queue.
  • add-to-queue() takes the lock on the work queue.
  • add-to-queue() adds the work item to the ATA driver queue.
  • add-to-queue() is done with the queue and releases the lock.
  • add-to-queue() unblocks the ATA thread.
  • The ATA thread executes the check-queue() function.
  • check-queue() takes the lock on the work queue.
  • check-queue() finds the I/O request in the queue.
  • check-queue() releases the lock on the work queue.
  • The ATA thread sets up the DMA transfer.

On QEMU, with a very fast DMA transfer you might see this:

  • The DMA transfer completes almost instantly; the IRQ handler calls add-to-queue() to add work to the ATA driver queue.
  • add-to-queue() takes the lock on the work queue.
  • add-to-queue() adds the work item to the ATA driver queue.
  • The ATA thread executes check-queue().
  • check-queue() finds the work queue lock held and spins until it is released.
  • add-to-queue() is done with the queue and releases the lock.
  • check-queue() takes the lock on the work queue.
  • check-queue() finds the DMA-complete item in the queue.
  • add-to-queue() attempts to wake the ATA thread, which is already running.
  • check-queue() releases the lock on the work queue.
  • The ATA thread sends the data to the filesystem driver.
  • The ATA thread executes check-queue().
  • check-queue() takes the lock on the work queue.
  • check-queue() finds no work.
  • check-queue() releases the lock on the work queue.
  • The ATA thread blocks pending new work.

If the DMA transfer completes less quickly, you might see this:

  • The ATA thread executes check-queue().
  • check-queue() takes the lock on the work queue.
  • The DMA transfer completes; the IRQ handler calls add-to-queue() to add work to the ATA driver queue.
  • add-to-queue() finds the work queue lock held and spins until it is released.
  • check-queue() finds no work.
  • check-queue() releases the lock on the work queue.
  • add-to-queue() takes the lock on the work queue.
  • add-to-queue() adds the work item to the ATA driver queue.
  • add-to-queue() is done with the queue and releases the lock.
  • The ATA thread blocks pending new work.
  • add-to-queue() wakes the ATA thread.
  • The ATA thread executes check-queue().
  • check-queue() takes the lock on the work queue.
  • check-queue() finds the DMA-complete item in the queue.
  • check-queue() releases the lock on the work queue.
  • The ATA thread sends the data to the filesystem driver.
  • The ATA thread executes check-queue().
  • check-queue() takes the lock on the work queue.
  • check-queue() finds no work.
  • check-queue() releases the lock on the work queue.
  • The ATA thread blocks pending new work.

On real hardware, with a long delay before the DMA transfer completes, you might see this:

  • The ATA thread executes check-queue().
  • check-queue() takes the lock on the work queue.
  • check-queue() finds no work.
  • check-queue() releases the lock on the work queue.
  • The ATA thread blocks pending new work.
  • The DMA transfer completes; the IRQ handler calls add-to-queue() to add work to the ATA driver queue.
  • add-to-queue() takes the lock on the work queue.
  • add-to-queue() adds the work item to the ATA driver queue.
  • add-to-queue() is done with the queue and releases the lock.
  • add-to-queue() wakes the ATA thread.
  • The ATA thread executes check-queue().
  • check-queue() takes the lock on the work queue.
  • check-queue() finds the DMA-complete item in the queue.
  • check-queue() releases the lock on the work queue.
  • The ATA thread sends the data to the filesystem driver.
  • The ATA thread executes check-queue().
  • check-queue() takes the lock on the work queue.
  • check-queue() finds no work.
  • check-queue() releases the lock on the work queue.
  • The ATA thread blocks pending new work.

Implemented as I've suggested, as long as check-queue() and add-to-queue() always hold the lock while touching the work queue, and as long as add-to-queue() always wakes the thread after adding an item, the worst that should happen is that the ATA thread blocks just as add-to-queue() adds an item to its queue but before add-to-queue() attempts to wake it; the wake-up then arrives a moment later and unblocks it. Since add-to-queue() is executed by the IRQ handler, check-queue() and add-to-queue() should have a short, bounded execution time, so that the IRQ handler never waits long for the lock; a plain spinlock here works but is probably suboptimal. A sketch of the two functions is below.
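
To make that concrete, here is a rough sketch of the pair. The primitives (irq_save()/irq_restore(), thread_unblock(), list_append(), list_pop_head(), and a thread_block_and_release() that parks the thread and drops the lock atomically) are assumptions, not the OP's actual API.

Code:
/* Called by the filesystem driver or by the IRQ handler. */
void add_to_queue(struct queue *q, struct work_item *item)
{
    int flags = irq_save();        /* safe both in and out of IRQ context */
    spinlock_acquire(&q->lock);
    list_append(&q->items, item);
    thread_unblock(q->worker);     /* no-op if the worker is already runnable */
    spinlock_release(&q->lock);
    irq_restore(flags);
}

/* The ATA thread's main loop: check-queue() plus the actual work. */
void ata_thread_main(struct queue *q)
{
    for (;;) {
        int flags = irq_save();
        spinlock_acquire(&q->lock);
        struct work_item *item = list_pop_head(&q->items);
        if (!item) {
            /* Park while still holding the lock; the primitive marks
             * the thread blocked before releasing the lock and
             * restoring interrupts, so no wake-up can fall in the gap. */
            thread_block_and_release(&q->lock, flags);
            continue;
        }
        spinlock_release(&q->lock);
        irq_restore(flags);
        handle_work(item);         /* e.g. set up the DMA transfer */
    }
}

Waking under the lock, and blocking only through a primitive that releases the lock atomically, is what closes the "blocked just after the wake" window for good.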


 Post subject: Re: Race condition when blocking/unblocking thread
PostPosted: Tue Mar 26, 2019 11:13 pm 

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1604
zity wrote:
However, at least in QEMU, the IRQ handler fires almost immediately, which means that sometimes the ATA thread never manages to execute thread_block() before the IRQ handler has tried to unblock it. As a consequence, thread_block() is called after thread_unblock(), which means that the ATA thread blocks forever.

I must be approaching this in the wrong way, because the problem is quite generic. How do I handle situations like this properly?


My first thought was: "This looks like a job for a semaphore." But more generally, if you want to call thread_unblock() from an IRQ handler, then you have to CLI from the last time you check that the IRQ has not arrived until at least thread_block() has made the changes that thread_unblock() will pick up. This way the IRQ handler can't run at an inopportune time. I'd suggest handing thread_block() an address and an expected value (or writing another function that does that... maybe thread_block_if_equal()?). That version of thread_block() would then CLI, take a spinlock (maybe a global one), check that the value at the address still equals the expected value, and if so, add the current thread to the list of wakable threads. Then it can release the lock and STI again before blocking until the thread is woken.

The IRQ handler must then take the same spinlock before changing the value (I'm thinking your request structure has a "done" field, which the IRQ handler sets to 1). Then it can unblock the thread and release the spinlock.
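
A sketch of both sides, with invented names (wait_lock, current_thread, scheduler_mark_blocked(), schedule()):

Code:
static spinlock_t wait_lock;                 /* shared with the IRQ handler */

void thread_block_if_equal(volatile int *addr, int expected)
{
    cli();                                   /* keep the IRQ handler out */
    spinlock_acquire(&wait_lock);
    if (*addr != expected) {                 /* IRQ already arrived: don't sleep */
        spinlock_release(&wait_lock);
        sti();
        return;
    }
    scheduler_mark_blocked(current_thread);  /* the change thread_unblock() picks up */
    spinlock_release(&wait_lock);
    sti();
    schedule();                              /* stop running; returns once woken */
}

/* IRQ handler side: flip the value and wake the waiter under the same lock. */
void ata_irq_handler(struct request *req)
{
    spinlock_acquire(&wait_lock);
    req->done = 1;
    thread_unblock(req->waiter);             /* harmless if the thread never slept */
    spinlock_release(&wait_lock);
}

The ATA thread would call thread_block_if_equal(&req->done, 0). If the IRQ fires between sti() and schedule(), thread_unblock() has already marked the thread runnable, so schedule() lets it carry on instead of sleeping.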

The spinlock is needed for SMP safety. If you do not support SMP, nor intend to in the future, you do not need it. But I always think it is better to design with parallelism in mind, rather than patching it in later.

By the way: if you export the above mechanism to userspace, you basically have a futex mechanism, which is a really powerful tool for letting userspace handle its synchronization needs itself.

_________________
Carpe diem!


 Post subject: Re: Race condition when blocking/unblocking thread
PostPosted: Wed Mar 27, 2019 8:00 am 

Joined: Mon Jul 13, 2009 5:52 am
Posts: 99
Location: Denmark
Thanks for all the comments, I have been working on a solution today.

My ATA driver does have a work queue; in fact, in my original post, step 1 happens because something was appended to that queue.

The current solution consists of two new functions in the scheduler:

  • thread_wait(mutex*) - this function replaces thread_block()
  • thread_signal(mutex*) - this function replaces thread_unblock()

When either of these functions is called, I disable interrupts and take a spinlock stored in the mutex. When thread_signal() is called, it either unblocks the waiting thread or simply increments a variable in the mutex. Conversely, when thread_wait() is called, it either blocks the thread (if the variable in the mutex has not been incremented) or simply returns, because thread_signal() has already been called.
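
Roughly, the sketch below is what I mean; the field names and the thread_block_and_release() primitive (which parks the thread and drops the lock/interrupt state atomically) are illustrative rather than my exact code.

Code:
typedef struct {
    spinlock_t lock;
    int        pending;   /* signals that arrived with no waiter present */
    thread_t  *waiter;    /* thread currently blocked in thread_wait() */
} mutex;

void thread_signal(mutex *m)
{
    int flags = irq_save();          /* disable interrupts, take the lock */
    spinlock_acquire(&m->lock);
    if (m->waiter) {
        thread_unblock(m->waiter);   /* a thread is waiting: wake it */
        m->waiter = NULL;
    } else {
        m->pending++;                /* nobody waiting yet: remember it */
    }
    spinlock_release(&m->lock);
    irq_restore(flags);
}

void thread_wait(mutex *m)
{
    int flags = irq_save();
    spinlock_acquire(&m->lock);
    if (m->pending > 0) {
        m->pending--;                /* signal already arrived: return at once */
        spinlock_release(&m->lock);
        irq_restore(flags);
        return;
    }
    m->waiter = current_thread;
    /* Park and release lock/interrupt state in one atomic step. */
    thread_block_and_release(&m->lock, flags);
}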

I hope this is a reasonable way of doing it.


 Post subject: Re: Race condition when blocking/unblocking thread
PostPosted: Wed Mar 27, 2019 10:05 am 

Joined: Wed Oct 01, 2008 1:55 pm
Posts: 3192
I have a very similar mechanism in kernel space. It has three APIs: WaitForSignal, Signal(thread), and ClearSignal. A variable in the thread control structure, p_signal, is 1 if the thread has been signalled (i.e. Signal has been called). WaitForSignal is implemented in the scheduler and ensures that race conditions cannot occur even if Signal is called from an IRQ (which is allowed). In the multiprocessor version, Signal/WaitForSignal use spinlocks, while in the single-processor version they can be implemented with cli/sti. These functions guarantee that a signal will never be missed, even if the IRQ fires at the same time as WaitForSignal is called.
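
For comparison, a minimal single-processor sketch of that trio using cli/sti; the names and the thread-state handling are illustrative, not the poster's actual code.

Code:
void Signal(thread_t *t)
{
    cli();
    t->p_signal = 1;                 /* persists even if t isn't waiting yet */
    if (t->state == BLOCKED_ON_SIGNAL)
        make_runnable(t);
    sti();
}

void WaitForSignal(void)
{
    cli();
    while (!current_thread->p_signal) {
        current_thread->state = BLOCKED_ON_SIGNAL;
        schedule_and_sti();          /* assumed: switches away, re-enabling interrupts */
        cli();
    }
    current_thread->p_signal = 0;    /* consume the signal */
    sti();
}

void ClearSignal(void)
{
    cli();
    current_thread->p_signal = 0;
    sti();
}

Because p_signal is set inside the same cli section that checks the thread state, a Signal() from an IRQ either finds the thread already blocked and wakes it, or has set p_signal early enough that the while loop never sleeps.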

