Mutex Implementation.

astrocrep · Post by **astrocrep** » Thu Jun 21, 2007 2:43 pm

Ok, so I have done some more homework and come up with the following mutex implementation in some wierd vb/c pseudo code...

Please keep in mind, I am looking for opinions on it, not code correctness.

Basically I have a Double linked list. Each Item in the linked list coresponds to a Resource type, mostlikely a constant definition. Then each resource would have a linked list of guys wanting to use it... FiFo.

ie:

Code: Select all

void printf(...)
{
  MUTEX_LOCK(MUTEX_CONSOLEOUT);
  ...
  ...
  ...
  MUTEX_UNLOCK(MUTEX_CONSOLEOUT);
}

Also that MOVE_TO_WAIT Basically stops the thread where its at and moves it to the BLOCKED_QUEUE.

Code: Select all

struct mutex_item {
	mutex_item * next;
	mutex_item * last;
	long thread_id;
}

struct mutex_type {
 mutex_type * next;
 mutex_type * last;
 long mutex_id;
 mutex_item * head;
}

void MUTEX_LOCK(long pMUTEXT_ID)
{
	mutex_type *my_mutex_type;
	mutex_item *my_mutex_item;

	//STOP INTs
	disable()

	//Go through all of the mutex types in the primary linked list
	for each mutex_type in the mutex_type linked list
	  if the mutex_type.mutex_id = pMUTEXT_ID then
		  my_mutex_type = this mutex_type
		  break the loop

    //If this mutex does not yet have an entry, create one.
    if my_mutex_type is empty then
		my_mutex_type = kmalloc(sizeof(mutex_type))
		add my_mutex_type to the end of the mutex_type linked list

	//By now we should have a valid my_mutex_type
	//If the mutex is free (head = 0) means nothing is in queue
	if my_mutex_type.head == 0 then
		//Create a new item for the mutex queue and load it in there
		my_mutex_item = kmalloc(sizeof(mutex_item))
		my_mutex_item.thread_id = GetCurrentThreadID()
		my_mutex_item.next = 0
		my_mutex_item.last = 0
		my_mutex_type.head = my_mutex_item

		//START INTs
		enable()

		return
	else
		//So there is already a line for this mutex
		//Lets find the end of the line
		for each mutex_item in my_mutex_type.head linked list
			if the mutex_item.next = 0 then
				//Now lets add the entry for this thread
				my_mutex_item = kmalloc(sizeof(mutex_item))
				my_mutex_item.thread_id = GetCurrentThreadID()
				my_mutex_item.next = 0
				my_mutex_item.last = the last mutex_item in the linked list
				my_mutex_type.head = my_mutex_item
				//So now we wait until our time has come.
				MOVE_TO_WAIT(my_mutex_item.thread_id)
				//START INTs
				enable()
}

void MUTEX_UNLOCK(long pMUTEXT_ID)
{
	mutex_type *my_mutex_type;
	mutex_item *my_mutex_item;

	//STOP INTs
	disable()

	//Go through all of the mutex types in the primary linked list
	for each mutex_type in the mutex_type linked list
	  if the mutex_type.mutex_id = pMUTEXT_ID then
		  my_mutex_type = this mutex_type
		  break the loop

    //Sanity check
	if my_mutex_type.head <> 0 then 
		my_mutex_item = my_mutex_type.head
		//Sanity check again
		if my_mutex_item.thread_id <> GetCurrentThreadID() then
			PANIC("OHHH CRUDE!!!")
		//Remove me from the queue
		my_mutex_type.head = my_mutex_item.next
		my_mutex_item = my_mutex_type.head
		//The next call places it at the HEAD of the ready queue to ensure that it will be the next go to access this resource
		JUMP_TO_READY(my_mutex_item.thread_id)
	
	//START INTs
	enable()
}

Thanks,
Rich

jnc100 · Post by **jnc100** » Thu Jun 21, 2007 3:41 pm

Stopping interrupts IS a form of mutual exclusion. When you need to use that idea to implement mutual exclusion you're in a bad way. Think of what happens when you have more that one processor. Disabling interrupts on one won't stop another altering the memory you're reading/writing. The x86 provides for what you're trying to do: look up the LOCK prefix and something like BTS or CMPXCHG.

Regards,
John.

Combuster · Post by **Combuster** » Thu Jun 21, 2007 4:31 pm

While BTS/BTR (Test-and-set) and CMPXCHG (Compare-and-swap) are a means to implement mutual exclusion, it is not necessarily a part of it. Rather, they are more commonly used for lock-free or wait-free code. You should read up on the Bakery algorithm. (although, using something like LOCK XADD can make its implementation a lot much easier for being an atomic ticket dispenser)

Note that several complex atomic instructions are not available on the 386 - XADD and CMPXCHG were added with the 486 and CMPXCHG8B was introduced with the Pentium.

Back to the suggestion, Building a linked list is basically moving the problem: instead of having threads fight for an object, you make them fight for the end of the queue that is set up in front of it. The fight that you are trying to resolve will not be any different there.
And as said before, cli/sti will break on multiprocessor systems so I suggest you use something that is known to work. Not doing mutual exclusion properly is bound to give untracable random problems later on.

astrocrep · Post by **astrocrep** » Thu Jun 21, 2007 5:44 pm

Is they actual code examples of using the Bakery algorithm?

My assembler is only good enough to get me here... Using those extra op-codes is out of my current bounds...

So... what would be the "right" and "best" way to implement mutual exclusion... I cant seem to wrap my head around it.

Thanks,
Rich

os64dev · Post by **os64dev** » Fri Jun 22, 2007 1:04 am

astrocrep wrote:So... what would be the "right" and "best" way to implement mutual exclusion... I cant seem to wrap my head around it.

Use any form with the LOCK prefix, look in the intel/amd manuals which instructions are allowed with lock. This gives you the basis for mutexes, semaphores and spinlocks. And as a bonus they will be multiprocessor safe.

These documents provide more detail.
Spinlocks Parts 1
Spinlocks Parts 2
Spinlocks Parts 3

Combuster · Post by **Combuster** » Fri Jun 22, 2007 4:18 am

Bakery algorithm using fetch-and-increment (LOCK XADD on x86s)

Code: Select all

int entries = 0;
int exits = 0;

void lock()
{
     int ticket = fetch_and_increment(&entries);
     while (ticket != exits)
     {
          schedule(); // or anything else you can do while waiting for your turn
     }
}

void unlock()
{
     exits = exits + 1;
}

The original bakery algorithm works with just read/write atomicity, but you'll have to keep track of all threads for each lock. Its on wikipedia: http://en.wikipedia.org/wiki/Lamport's_bakery_algorithm (sorry, phpbb refuses to link normally)

Actually, any read-modify-write instruction is interesting when it has become atomic, but lockfree and waitfree programming is hardly beginners material, unless you are looking for a challenge

astrocrep · Post by **astrocrep** » Fri Jun 22, 2007 6:05 am

os64dev wrote:
astrocrep wrote:So... what would be the "right" and "best" way to implement mutual exclusion... I cant seem to wrap my head around it.
Use any form with the LOCK prefix, look in the intel/amd manuals which instructions are allowed with lock. This gives you the basis for mutexes, semaphores and spinlocks. And as a bonus they will be multiprocessor safe.

These documents provide more detail.
Spinlocks Parts 1
Spinlocks Parts 2
Spinlocks Parts 3

Thats all well in good, but in the first tutorial...

But they disable interrupts...

Also the queued mutex approach I got from two different resources. Including an OS book, though I added the enable / disable. Maybe that would be a better approach for semaphors (the pseudo code I wrote above?)

Thanks,
Rich

os64dev · Post by **os64dev** » Fri Jun 22, 2007 7:07 am

But they disable interrupts...

Disabling interrupts is not allways a very bad thing. jnc100 just pointed out that just disabling interrupts is not enough in a multiprocessor environment. You also need the LOCK statement making the mutex multiprocessor safe.

keep in mind that the example used in the document is for scheduling a new process which requires the interrupts to be disables. The real mutex code is partly the int test_and_set (int new_value, int *lock_pointer); function.

and the spinlock is done by

Code: Select all

while (test_and_set (1, &threads_waiting_to_run_lock)) {}

which is your mutex lock code.

and the unlock is simply threads_waiting_to_run_lock=0;

if you want to make a class Mutex is is something like:

Code: Select all

class Mutex {
  protected:
    Mutex(){}
    int     _lock;
    char *  _name;
  public:
    Mutex(char *name) : _name(name), _lock(0) {
    }
    void lock(void) {
        while (test_and_set (1, &_lock{}
    }
    void unlock(void) {
        _lock = 0;
    }
}

the code for test_and_set is in the spinlock part 1 document.

astrocrep · Post by **astrocrep** » Fri Jun 22, 2007 7:23 am

I don't have classes...

So would I have to do something like

Code: Select all

volitale char mutex_list[16]; //Init all to zero
int mutex_printf = 1
int mutex_malloc = 2
int mutex_something = 3
int mutex_somethingelse = 4

void printf()
{
 lock(mutexes[mutex_printf]);
 ...
 preform printf
 ...
 unlock(mutexes[mutex_printf]);
}

Thanks,
Rich

os64dev · Post by **os64dev** » Fri Jun 22, 2007 7:28 am

that would be:

Code: Select all

void mutex_lock(int * lock) {
    while (test_and_set (1, lock)) {} 
}

void mutex_unlock(int * lock) {
    *lock = 0;
}

Combuster · Post by **Combuster** » Fri Jun 22, 2007 11:12 am

the main problem with the test-and-set implementation is that of starvation:

If you have one program that continuously calls print statements, it is rather likely that it will hold the lock when the task get scheduled, once it does, the other threads are stuck until the original thread is rescheduled, after which it can release the lock and retake it and be scheduled again with the lock taken. This can be disastrous for other applications. Hence, we normally require a mutex to fulfill the "no-starvation" requirement. The bakery algorithm solves it by giving threads tickets and make them wait for their turn. Dekker's algorithm uses politeness to have the most recent threads make way for newcomers instead of putting up a fight.

In case you wonder wether it will actually occur, I once managed to make my kernel lock up completely due to starvation issues since I managed to sync the interrupt timing exactly to the point where the lock was taken. (Bochs is suddenly _very_ accurate when it comes to clock cycles </sarcasm>

)

astrocrep · Post by **astrocrep** » Fri Jun 22, 2007 11:41 am

Well actually, I was thinking about implementing my mutex logic above with the set and lock below.

My thinking is:

The test and lock will only be used for the MUTEX_LOCK I originally wrote instead of disabling and enabling interrupts.

This way, if I have 5 threads all trying to print to the screen, they would be queued up in the order they tried to request.

I see two benefits to this.

1.) "Waiting" Processes do not sit in an indefinite for loop for longer then it takes them to check the mutex. After they can check it, they either are given permission to run, or required to enter a "blocked queue". Meaning that the primary printf function will get 100% usage against other threads trying to printf and thus can complete faster.

2.) As soon as he frees the mutex (this is actually done by the printf function and not the thread itself) The next thread waiting for printf access is popped onto the head of the ready queue.

Seems like a solid setup to me no?

Thanks,
Rich

Combuster · Post by **Combuster** » Fri Jun 22, 2007 12:20 pm

like I said above, the queue is only moving the problem. If the test-and-set can't stop the fight at the front door, it also can not stop the fight at the queue's entrance.

astrocrep · Post by **astrocrep** » Fri Jun 22, 2007 12:58 pm

Combuster wrote:Bakery algorithm using fetch-and-increment (LOCK XADD on x86s)

Code: Select all

int entries = 0;
int exits = 0;

void lock()
{
     int ticket = fetch_and_increment(&entries);
     while (ticket != exits)
     {
          schedule(); // or anything else you can do while waiting for your turn
     }
}

void unlock()
{
     exits = exits + 1;
}

Code I still do something like this?

Code: Select all

volitale char mutex_list_entries[16]; //Init all to zero 
volitale char mutex_list_exits[16]; //Init all to zero 
int mutex_printf = 1 
int mutex_malloc = 2 
int mutex_something = 3 
int mutex_somethingelse = 4 

void printf() 
{ 
 lock(mutexes[mutex_printf]); 
 ... 
 preform printf 
 ... 
 unlock(mutexes[mutex_printf]); 
}

This leaves me with a couple concerns first:
What would fetch and increment look like. Second can I still generalize like abouve. Thrid how can I "watch" the ceiling on these entry / exits, and correct them to keep them in bounds. Lastly... What exactly does schedule() do? Just through it back onto the ready queue?

Thanks,
Rich

Combuster · Post by **Combuster** » Fri Jun 22, 2007 2:30 pm

fetch and increment
Atomic operation that does the following: Get a number from memory, increment it, store it, and return the original value

Its implementation is architecture-dependent. For x86, you should RTFM and look up 'XADD' (and while you're at it, read on the BTS, BTR and CMPXCHG instructions as well; jnc100 didn't suggest those because of nothing).

Generalisation
The idea is good. However, a char can hold only 256 different values and as such can at best only provide atomicity among that number of threads. Also, using several predimensioned arrays is horribly bad programming practice. Instead use a struct, and create one together with each set of functions requiring the mutual exclusion.

Bounds
The thing is, a register is of limited size. If you increment it often enough it will wrap around. Hence after a while a ticket will be given that is equal to one handed out before. However, in order to have that cause a problem, there must be two threads holding the same ticket. Since a thread can only hold one ticket, the number of threads must exceed the amount of unique numbers that can be generated by the register. If you use an 32-bit int, that'd require 4 billion threads using the same lock at once.

Schedule
Since we have to wait, we can be polite and give up the time-slice for other processes to use. On uniprocessor systems, the thread holding the lock is currently in the ready list and needs to be scheduled before it can release the lock so the turn can be passed on to the next thread.