OSDev.org

The Place to Start for Operating System Developers
It is currently Fri Mar 29, 2024 1:26 am

All times are UTC - 6 hours




Post new topic Reply to topic  [ 81 posts ]  Go to page 1, 2, 3, 4, 5, 6  Next
Author Message
 Post subject: OS Support For Transactional Memory
PostPosted: Thu May 03, 2007 12:34 am 
Offline
Member
Member
User avatar

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Lately I've been thinking about transactional memory (the wikipedia page has a very good explanation of transactional memory).

I don't think transactional memory is suitable for situations where you need good performance under high contention, but IMHO transactional memory would work very well for software that doesn't expect much contention. It would make it much easier to write thread-safe code, and it would make a very nice addition to an OS (especially considering that a lot of application software doesn't really have high contention).

What I'm considering is supporting transactional memory directly in the OS's linear memory manager (instead of as a special compiler extension, like researchers currently seem to be doing). This would make it easier to use for any language (without modifying compilers, etc to suit), allow object files created from different compilers/languages to be linked together without worrying about conflicting transactional memory support, and even allow transactional memory to be used in assembly language without serious hassles.

Basically a process would have an area of it's address space set aside as a transactional memory area, and the kernel would provide a function to start a transaction and another function to end a transaction (or commit the transaction). If the "commitTransaction()" function fails software would retry the transaction.

The kernel would also keep track of a nesting level, so that if a thread calls the "startTransaction()" function N times and then calls the "commitTransaction()" N times, then only the last call to "commitTransaction()" could result in changes being comitted to the process' address space..

Any thoughts?


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 03, 2007 8:01 am 
Offline
Member
Member
User avatar

Joined: Thu Mar 08, 2007 11:08 am
Posts: 670
Actually, if I've not misunderstood anything, transactional memory would be especially suited for situations where locking would be too expensive, because there would be either lots of overhead from fine-grained locking, or lots of contention to a few global locks.

I'm not quite convinced that it makes sense to do transactional memory on page granularity though, because unless you can localize the changes to a few pages, you'll end up with lots of false conflicts if unrelated objects modified by unrelated transactions happen to reside on the same pages.

For some special situations might be useful, but the applications taking advantage of it will need to deal with the details anyway, at least to some extent, so I'm not sure what advantages you gain my involving kernel. I don't think equating an object with a page is realistic, just like equating malloc with page allocator isn't realistic.

Kernel support might be useful on platforms that can't support atomic test-and-set on hardware, but how many general purpose systems don't? And unless I've misunderstood something, test-and-set is more or less all that you need to do atomic commit to shared memory.

_________________
The real problem with goto is not with the control transfer, but with environments. Properly tail-recursive closures get both right.


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 03, 2007 9:02 am 
Offline
Member
Member
User avatar

Joined: Tue Oct 17, 2006 6:06 pm
Posts: 1437
Location: Vancouver, BC, Canada
mystran wrote:
Actually, if I've not misunderstood anything, transactional memory would be especially suited for situations where locking would be too expensive, because there would be either lots of overhead from fine-grained locking, or lots of contention to a few global locks.


No, Brendan is right -- under high contention, transactional memory will cause a lot of re-tries. It is not meant to solve a performance problem, but rather a complexity problem.

--disclaimer--
The next part will discuss implementation concepts and techniques, but not details. In other words, I'm not going to talk about how hardware transactional memory would be implemented on a particular processor. So don't use this as a reason to reject the idea of an "advanced topics" forum.

And now back to the topic at hand...

Quote:
I'm not quite convinced that it makes sense to do transactional memory on page granularity though


I don't think current prototypes of HTM use the paging system (at least not primarily). AFAIK, they use the multiprocessor cache coherence protocol to keep track of conflicts and so on. Yes, this means that HTM requires direct hardware support and doesn't really "exist" yet commercially. :)

Personally I prefer software transactional memory for its flexibility, but to do it properly (i.e. -- not using paging) you really need intervention from the compiler, since it has to emit the proper read and write barriers... basically it instruments the code within an "atomic block" so that it can keep track of transactions.

To put it simply, I don't think transactional memory is really the OS' responsibility. HTM is up to the hardware (although this probably requires some kernel support) and STM is up to the compiler and language run-time environment.

_________________
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!


Top
 Profile  
 
 Post subject: Re: OS Support For Transactional Memory
PostPosted: Thu May 03, 2007 10:46 am 
Offline
Member
Member
User avatar

Joined: Wed Oct 18, 2006 3:45 am
Posts: 9301
Location: On the balcony, where I can actually keep 1½m distance
Brendan wrote:
The kernel would also keep track of a nesting level, so that if a thread calls the "startTransaction()" function N times and then calls the "commitTransaction()" N times, then only the last call to "commitTransaction()" could result in changes being comitted to the process' address space..

well, then what do you do when you try to commit on top level and you detect a conflict - how would the nested calls know that things have not been committed while the call returned true? (or only being able to redo part of the commit) I think nesting causes more trouble than it solves.

_________________
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie
[ My OS ] [ VDisk/SFS ]


Top
 Profile  
 
 Post subject: Re: OS Support For Transactional Memory
PostPosted: Thu May 03, 2007 11:10 am 
Offline
Member
Member
User avatar

Joined: Tue Oct 17, 2006 11:33 pm
Posts: 3882
Location: Eindhoven
Combuster wrote:
Brendan wrote:
The kernel would also keep track of a nesting level, so that if a thread calls the "startTransaction()" function N times and then calls the "commitTransaction()" N times, then only the last call to "commitTransaction()" could result in changes being comitted to the process' address space..

well, then what do you do when you try to commit on top level and you detect a conflict - how would the nested calls know that things have not been committed while the call returned true? (or only being able to redo part of the commit) I think nesting causes more trouble than it solves.


If I recall correctly, lockless algorithms work by not doing anything permanent until the commit. They won't know, then. The concept is:

Read state
Do work
If state equals old state, write result of work
If it didn't, try again

That means that if your part of the work wasn't stored, you won't know because you'll just be invoked again. The functions used for this should be "pure" functions with respect to other variables except for perhaps a retry-counter or such - they should hold no state and not touch anything but the explicit state.


Top
 Profile  
 
 Post subject: Re: OS Support For Transactional Memory
PostPosted: Thu May 03, 2007 11:52 am 
Offline

Joined: Thu May 03, 2007 11:27 am
Posts: 11
[quote]If I recall correctly, lockless algorithms work by not doing anything permanent until the commit. They won't know, then. The concept is:[/quote]

The vast difference between lockless algorithms (LA) and sw transactions is that LA's make progress for at least one process (at least if they are wait free). In both cases, you definitely gain less complexity in most cases.

SW transactions has the same problems as optimistic locking in databases; they lead to high rollback rates on hotspots. Alas, an OS is littered with hotspots :) Of course, in the case there is no hot spots, you can us Copy-Modify-Update.

That's my take, at least.


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 03, 2007 7:14 pm 
Offline
Member
Member
User avatar

Joined: Tue Nov 09, 2004 12:00 am
Posts: 843
Location: United States
So this optimistic locking in databases, transactional memory, and copy-modify-update are essentially the same technique.

Candy wrote:
Read state
Do work
If state equals old state, write result of work
If it didn't, try again


Oddly enough optimistic locking could slow down transactions with a shared object I would imagine since you have to perform the comparison (or in relation to the size of the object) versus a exchange.

Let C be the time of a comparison operation on one unit of object size.
Let S be the object size.
Let M be the time of a test and set operation.
Let T be the number of threads.

CS > M (I would imagine this would almost always be true.)
CS*T = M*T

Colonel Kernel wrote:
No, Brendan is right -- under high contention, transactional memory will cause a lot of re-tries. It is not meant to solve a performance problem, but rather a complexity problem.


I can imagine the thread constantly spinning between reading, processing, and attempting a commit.

The only way I could think of doing it off the top of my head would be a example in C here being more relevant to Brandon's original post.
http://kmcguire.jouleos.galekus.com/dok ... d=kmcguire

And as you guys know that little peanut sized particle up there I have only works half the time... never the less the example uses a linked list of people with support for family members to be added and such not operations. However each operation uses a tmLog and then uses tmCommit to actually record the changes. I figured you could use no blocking at all and just move full speed ahead using some sort of callback mechanism for error recovery in the case a commit failed.

I dunno.. Nothing I would investigate deeper myself unless it actually looked viable for usage somewhere, but maybe just a idea for someone else to get inspired over.

@Brandon:

But you specified it being implemented directly in the "OS's linear memory manager". I am wondering more about the details of how this could work such as using page faults or something?

_________________
Projects


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 03, 2007 7:57 pm 
Offline

Joined: Thu May 03, 2007 11:27 am
Posts: 11
Kevin McGuire wrote:
So this optimistic locking in databases, transactional memory, and copy-modify-update are essentially the same technique.


On the surface, yes.

But comparing optimistic locking to CMU or transactional memory isn't really fair.

Why? Because databases in a sense have a much easier job, because they have subsystems ensuring consistency and isolation - and they only deal with data.

The things you lock in an operating system isn't just data, you also protect access to I/O ports and other resources that makes it impossible to provide isolation (in a db, a transaction can be inconsistent while it executes, but no other transaction generally observes that => the I in ACID). Many actions are irreversible (i.e. undoing sending a packet, writing to port that starts a device motor, whatever). Hence we still need locks.

In short, I think lockfree ADT's are fine, but I don't see a general transaction mechanism happening before we have hardware support at most levels.


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 03, 2007 9:22 pm 
Offline
Member
Member
User avatar

Joined: Tue Oct 17, 2006 6:06 pm
Posts: 1437
Location: Vancouver, BC, Canada
grimpy wrote:
In short, I think lockfree ADT's are fine, but I don't see a general transaction mechanism happening before we have hardware support at most levels.


Hardware support for pulling packets back off the wire or un-writing blocks from disk? ;) Some actions are irrevocable, IMO. This doesn't invalidate the usefulness of TM though -- but it suggests that STM at least should prevent system calls and other "unknowable, possibly irrevocable" actions from being done in an atomic block.

@kmcguire: I found a really good paper on transactional memory, but I seem to have misplaced it. If I find it, I'll post it (or a link to it).

_________________
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 03, 2007 11:33 pm 
Offline
Member
Member
User avatar

Joined: Tue Oct 17, 2006 11:33 pm
Posts: 3882
Location: Eindhoven
Colonel Kernel wrote:
grimpy wrote:
In short, I think lockfree ADT's are fine, but I don't see a general transaction mechanism happening before we have hardware support at most levels.


Hardware support for pulling packets back off the wire or un-writing blocks from disk? ;) Some actions are irrevocable, IMO. This doesn't invalidate the usefulness of TM though -- but it suggests that STM at least should prevent system calls and other "unknowable, possibly irrevocable" actions from being done in an atomic block.

@kmcguire: I found a really good paper on transactional memory, but I seem to have misplaced it. If I find it, I'll post it (or a link to it).


I would expect a transactional memory with any non-memory output to cache the "write" to each of the outputs until the transaction is approved and then sends them all out. I'm not sure how you would do that atomically as they're separate actions but if I'm thinking correctly, the only device that requires atomicity is memory.

Hmm... On the other hand, files too...


Top
 Profile  
 
 Post subject:
PostPosted: Thu May 03, 2007 11:44 pm 
Offline
Member
Member
User avatar

Joined: Tue Oct 17, 2006 6:06 pm
Posts: 1437
Location: Vancouver, BC, Canada
Candy wrote:
I would expect a transactional memory with any non-memory output to cache the "write" to each of the outputs until the transaction is approved and then sends them all out. I'm not sure how you would do that atomically as they're separate actions but if I'm thinking correctly, the only device that requires atomicity is memory.

Hmm... On the other hand, files too...


There are already concurrency mechanisms in place for files... why extend something that's supposed to be for memory to other things when it's not needed? Especially when it adds so much complexity. It reminds me of finalizers in Java & C#. Freeing files and sockets is what deterministic destruction is for! Extending STM to non-memory resources is just as crazy as extending GC to non-memory resources IMO...

About that paper I was going to look for... I can't find it. :( I suspect it was on Intel's research site, which seems to have moved. If you Google "software transactional memory", you'll find a ton of links though. Unforunately many of the examples are in Haskell and are therefore beyond the reach of mere mortals (I was taught by a prof who was one of the inventors of Haskell, and now I hate the language with a passion. ](*,) )

_________________
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!


Top
 Profile  
 
 Post subject:
PostPosted: Fri May 04, 2007 8:02 am 
Offline

Joined: Thu May 03, 2007 11:27 am
Posts: 11
Colonel Kernel wrote:
Hardware support for pulling packets back off the wire or un-writing blocks from disk? ;)


With proper device queuing and, say, SATA support for transaction logs, this is not as sci-fi as it seems.

That said, I don't think it will happend in my lifetime.

So we'll have a mix of lock-free algos and locks. There's no practical away around it IMO.


Top
 Profile  
 
 Post subject:
PostPosted: Fri May 04, 2007 8:58 am 
Offline
Member
Member
User avatar

Joined: Tue Oct 17, 2006 6:06 pm
Posts: 1437
Location: Vancouver, BC, Canada
grimpy wrote:
With proper device queuing and, say, SATA support for transaction logs, this is not as sci-fi as it seems.


I think the only way for this to work in general is for someone to invent time travel. :) Consider what would happen if a transaction tried to send something over the network, then do a blocking receive waiting for the reply, then take some action based on the content of the reply. I can't see how this could be revocable.

_________________
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!


Top
 Profile  
 
 Post subject:
PostPosted: Fri May 04, 2007 9:45 am 
Offline

Joined: Thu May 03, 2007 11:27 am
Posts: 11
Colonel Kernel wrote:
grimpy wrote:
With proper device queuing and, say, SATA support for transaction logs, this is not as sci-fi as it seems.


I think the only way for this to work in general is for someone to invent time travel. :) Consider what would happen if a transaction tried to send something over the network, then do a blocking receive waiting for the reply, then take some action based on the content of the reply. I can't see how this could be revocable.


Thinking outside the box, and assuming all hardware and software is written for the new fancy Transactional Computer Programming TCP (tm) :), it could work.

Of course, it would mean the receiving end would need to understand and participate in a distributed transaction, potentially with compensating transactions.

Yes, it's highly unlikely, but probably doable in a near-perfect world.


Top
 Profile  
 
 Post subject:
PostPosted: Fri May 04, 2007 10:30 am 
Offline
Member
Member
User avatar

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

mystran wrote:
I'm not quite convinced that it makes sense to do transactional memory on page granularity though, because unless you can localize the changes to a few pages, you'll end up with lots of false conflicts if unrelated objects modified by unrelated transactions happen to reside on the same pages.


I'm not convinced that (for software with low contention) page granular STM won't give better performance than fine grained "compiler based STM".

For an example, consider a large transaction that reads from 1000 addresses spread across 10 pages and writes to 100 addresses spread across 2 pages. In this case "compiler based STM" has all the overhead for 1000 reads and 100 writes, while "OS based STM" has the overhead for 10 reads and 2 writes. Also, "OS based STM" can make use of existing mechanisms built into the CPU (page faults, accessed & dirty flags) to keep track of accesses, while "compiler based STM" needs to insert additional code at every access.

mystran wrote:
Personally I prefer software transactional memory for its flexibility, but to do it properly (i.e. -- not using paging) you really need intervention from the compiler, since it has to emit the proper read and write barriers... basically it instruments the code within an "atomic block" so that it can keep track of transactions.

To put it simply, I don't think transactional memory is really the OS' responsibility. HTM is up to the hardware (although this probably requires some kernel support) and STM is up to the compiler and language run-time environment.


OS based STM wouldn't be less flexible - it needs to be optional (as not all processes would use STM) and therefore wouldn't prevent compiler based STM from being used.

But, to do STM properly (i.e. not forcing people to use a specific compiler or a specific library, not causing problems with linking code from different compilers, and perhaps providing a transperant upgrade path to future HTM), you really want intervention from the linear memory manager... ;)

The only real question is whether or not the performance, convenience and other advantages of OS based STM justify the additional time and complexity involved in designing and implementing it. If it takes ages to implement and performance is poor, then it'd be much less worthwile.

There are also other considerations here - I'm always looking for things my OS could do that other OSs can't (even if it's just for eventual marketting purposes), and AFAIK it would be the first time anyone has ever implemented STM using the CPU's paging mechanism (making the OS itself interesting for research purposes).


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 81 posts ]  Go to page 1, 2, 3, 4, 5, 6  Next

All times are UTC - 6 hours


Who is online

Users browsing this forum: No registered users and 30 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group