OSDev.org https://forum.osdev.org/ |
|
Modelling asynchronous event handling https://forum.osdev.org/viewtopic.php?f=15&t=30868 |
Page 1 of 1 |
Author: | Schol-R-LEA [ Sat Oct 01, 2016 9:34 pm ] |
Post subject: | Modelling asynchronous event handling |
In light of the things currently going on in the 'POSIX under Microkernels' thread, I think we need to step back - again - and consider a specific, well-defined example of an asynchronously event-driven operation working on an indeterminate number of threads which communicate with each other. I want to propose such an example, which I call naive mapreduce. Despite its name, it is not a form, simplified or otherwise, of an actual Mapreduce algorithm; it does, however, simulate certain aspects of its operation. For the purposes of this particular discussion, we will start with the assumption that there will always be at least one CPU available for any thread when it is started or awakened from a waiting cycle, and that we are using kernel threads (to be exact: that the thread creation, wait, IPC, and termination are performed through system calls, either explicit or implicit, that all scheduling is by the kernel scheduler, and that threads can be scheduled to run on any free CPU). These conditions can be changed to explore different issues, but anyone discussing a variant should say so explicitly; for now, let's keep to fairly idealized assumptions about the CPU scheduling to avoid contention over which conditions apply. The Naive Mapreduce problem, broadly speaking, is to fingerprint all the possible links reachable from a given hypertext document (or unspecified format), using one thread or process per document found during the tree-building process. The specific approach I want to suggest we discuss would be:
This gives a framework to begin discussing the subject in a relatively formal manner. The process is highly inefficient, and the example probably could be simplified, but it is, I think, a good place to work from. |
Author: | Brendan [ Sat Oct 01, 2016 10:23 pm ] |
Post subject: | Re: Modelling asynchronous event handling |
Hi, Schol-R-LEA wrote: The specific approach I want to suggest we discuss would be: By specifying an implementation (and not just specifying the behaviour), you tie my hands. For a simple example; the first thing I'd want to do is establish a pool of worker threads. Otherwise I'd be too worried about "fork bomb" (an exponentially increasing number of threads where the only thing limiting worst case growth rate is networking). More specifically; I'd be tempted to want "1 + number worker threads == number of CPUs on this computer" for each computer; where a worker thread only communicates with its master in the same process, and masters (different processes on different computers) communicate with each other. This allows for a "table of already seen links and their states" for each process that is shared by master and its worker threads; while using "shared nothing" between masters (where masters send messages to each other to update each other's tables). Of course for my model, a single (master?) thread can handle fetching an unlimited number of pages by itself, so I'd mainly be using worker threads to spread the load of processing pages across available CPUs (and possibly "lazy spawning" where it only spawns a new worker thread if/when processing pages becomes a bottleneck, while keeping under the "1 + number worker threads == number of CPUs on this computer" maximum). Cheers, Brendan |
Author: | Schol-R-LEA [ Sun Oct 02, 2016 10:34 am ] |
Post subject: | Re: Modelling asynchronous event handling |
Brendan wrote: By specifying an implementation (and not just specifying the behaviour), you tie my hands. On the one hand, this is a good point, and I didn't really intend it as such, at least not in the long run; I was presenting it as sketch of an example solution. OTOH, I did specifically write this as a simplified, naive, and flawed solution, because I wanted to start with a relatively abstract analysis (hence the stipulation about unlimited CPU availability) to isolate the issue that are inherent in the problem of asynchronous event handling, before addressing more practical concerns. I was aiming for an abstract problem space, along the lines of "Producers and Consumers" or "Dining Philosophers" (or at least "Byzantine Generals", which is more broadly defined but still abstract), as I wasn't able to find such a standard treatment for general asynchronous IPC (though it is likely I simply overlooked it). This isn't as simplified as I would like, in fact, and I am concerned that this will interfere with the ability to isolate the questions involved. On the gripping hand, the main goal here is to find some problem that we can all study and tear apart, so for the time being, I think we should start with a specific design and each independently implement that, then review the implications of the specific solutions. We can look at other designs later as we progress through this. |
Page 1 of 1 | All times are UTC - 6 hours |
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group http://www.phpbb.com/ |