jvff wrote:
Hi,
I just wanted to ask for some opinions about the viability of an idea I had some time ago. It's based on the same principle as Unix pipes, but implemented at a lower level.
I think it'd be quite viable; I'm working on a similar idea myself.
Quote:
All source-code objects (programs, libraries, etc.) are implemented and designed on the pipe metaphor. So every "object" (for lack of a better term; it could be a module, component, function, etc.) runs with a pointer to its entry stream(s) and a pointer to its exit stream(s) (and their respective lengths). This way, all the code does is process streams.
Identical to my design, except that I view a stream as continuous, so you can't (normally) request its length, but you can test for the end of the stream.
I also intend to allow code to convert things to streams of other things, and streams of things to one other thing, so you can view opening a file as convert(file, stream&lt;byte&gt;) and archiving files as convert(stream&lt;file&gt;, file). The latter isn't supported yet. Also, there's a bit of a performance problem with the indirection, since nothing is buffered and you can only process a single item at a time.
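As a rough illustration of the convert(file, stream&lt;byte&gt;) idea, here's a minimal sketch (class and method names are mine, not from either implementation): wrap any std::istream as a continuous byte stream with no up-front length, only an end-of-stream test, matching the "continuous stream" view above.

```cpp
#include <cassert>
#include <istream>
#include <optional>
#include <sstream>

// Hypothetical sketch: view any std::istream (an open file, a string,
// etc.) as a continuous stream of bytes. You cannot ask for its length;
// you can only pull the next byte and detect when the stream has ended.
class ByteStream {
public:
    explicit ByteStream(std::istream& source) : in(source) {}

    // Returns the next byte, or std::nullopt once the stream has ended.
    std::optional<unsigned char> next() {
        char c;
        if (in.get(c)) return static_cast<unsigned char>(c);
        return std::nullopt;
    }

private:
    std::istream& in;
};
```

The one-item-at-a-time `next()` call also shows the performance problem mentioned above: every byte pays the full indirection cost unless a buffer sits in between.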
Quote:
IMHO it's an oversimplification of how computer systems work today (i.e. input/output machines), and perhaps more flexible. I think the cdecl calling convention on IA-32 (i.e. the default C calling convention on x86) has indirect stream processing built in, but only for input streams. The input pointer is kept in esp (i.e. the stack), and the caller is responsible for its cleanup, hence printf(char *str, ...) is possible. However, cdecl doesn't support output streams, as it only allows a single return value (the return could be a pointer to a stream, but that still leaves the question of where the stream size is returned, possibly as the first item in the stream).
I think it's a simplification, nothing more than that. There are so many things you can model with this concept that it could be the next great thing in programming.
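The quoted cdecl observation can be demonstrated with standard varargs (this example is mine, just illustrating the point): a variadic function reads its arguments as an indirect input stream off the stack, and the caller, who knows how much it pushed, cleans up afterwards.

```cpp
#include <cassert>
#include <cstdarg>

// A variadic function treats its arguments as an input stream: each
// va_arg call pulls the next item. The caller pushed the arguments and
// is responsible for cleaning them up, as cdecl prescribes.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);  // next item from the argument "stream"
    va_end(args);
    return total;
}
```

Note the asymmetry the quote points out: the argument stream flows in freely, but the result is still a single return value.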
Quote:
Anyway, stream-oriented programming (SOP; please mind that this term is a placeholder for the idea described here, and if real SOP exists outside this thread, I have no knowledge of it) allows some neat features, such as linking modules to create, for example, programs or libraries, and perhaps aspect-oriented programming (AOP): in a registry or central server, replace a specified module with another module that pre-formats data, links to the original module, and can also perform post-formatting.
Still sounds the same as my design.
Quote:
Drawback #1: who will handle the output stream? The OS? The parent module? The child module? This could result in inefficiencies. For example, a module, when "called", doesn't really know the size of the result, so it can't preallocate some area and use it without guessing the output size. It could use the stack, but that could result in stack overflows (which could possibly be solved by buffers, but who handles those?).
I intend to solve this with buffers, handled by the compiler (as template code that's semi-automatically inserted in between). The buffers also let you specify whether or not you want to jump immediately to the next step of processing, and if done correctly you could run each minuscule bit of processing on a separate core. That scales really well (although the system I've tested it on only had HyperThreading, so not much improvement).
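A minimal sketch of the kind of buffer a compiler could insert between two modules (all names are mine; the post doesn't show its template code): a bounded queue that decouples the producing module from the consuming one, so each stage can run on its own thread or core.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <thread>
#include <vector>

// Hypothetical inter-module buffer: a bounded FIFO. The upstream module
// pushes, the downstream module pops, and each can run on its own thread.
template <typename T>
class StageBuffer {
public:
    explicit StageBuffer(std::size_t cap) : cap(cap) {}

    void push(T item) {
        std::unique_lock<std::mutex> lk(m);
        not_full.wait(lk, [&] { return q.size() < cap || closed; });
        q.push_back(std::move(item));
        not_empty.notify_one();
    }

    // Returns nullopt once the upstream stage has closed and drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lk(m);
        not_empty.wait(lk, [&] { return !q.empty() || closed; });
        if (q.empty()) return std::nullopt;
        T item = std::move(q.front());
        q.pop_front();
        not_full.notify_one();
        return item;
    }

    void close() {  // upstream signals end of stream
        std::lock_guard<std::mutex> lk(m);
        closed = true;
        not_empty.notify_all();
        not_full.notify_all();
    }

private:
    std::mutex m;
    std::condition_variable not_empty, not_full;
    std::deque<T> q;
    std::size_t cap;
    bool closed = false;
};
```

The capacity parameter is what lets you tune whether the next stage starts "immediately" (small buffer) or processes in larger batches (large buffer).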
Quote:
Drawback #2: micro-streams could pose inefficiencies. For example, consider an extremist SOP model which implements an itoa function as a module. The stream is so small it fits in the CPU's registers. What's the point of allocating memory for the stream data? Of course, stream data could be useful if you're streaming integers to the module, but in most cases it isn't. Somehow, someone must catch micro-streams and optimize them (again, in extreme SOP there could be a stream to handle stream creation, but that sounds stupid).
This is the case where I consider the model breaks down, because it gets in the way too much; it should either be replaced with more efficient modules that do more, or you accept the loss of speed. Using buffers and mass processing, you can get this pretty quick.
Quote:
Drawback #3: streams are limited by their static nature. Some modules are more flexible if they are implemented as state machines. They become more like object instances in object-oriented programming (OOP). However, this can be achieved simply by streaming the state along with the stream data. But this would still require "buffer modules" to store the state for future use.
Instantiate modules explicitly with their own state, so each has its state and known input/output connections (which it can use immediately). No OO implicitness for me.
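A tiny sketch of what "explicitly instantiated module with its own state" could look like (the class is hypothetical, not from either codebase): the state lives in the instance, not in some implicit buffer module streamed alongside the data.

```cpp
#include <cassert>

// Hypothetical stateful module: each instance carries its own state (a
// running total) explicitly, and processes one input item per step.
class RunningSum {
public:
    int step(int input) {
        total += input;   // the state is updated in place...
        return total;     // ...and the output depends on it
    }
    int state() const { return total; }

private:
    int total = 0;
};
```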
Quote:
Drawback #4: type safety; there is none. A stream is just data. Interpretation of this data is strictly the responsibility of the module. If the stream isn't properly formatted, the result is bad data in the best case, a system crash in the medium case, and uncontrolled system reconfiguration in the worst case. This could at least be protected against with AOP, where verifiers are placed before critical modules. But AOP could still be used to corrupt outgoing data.
Encode the type as part of the name and only allow (at a level of your choice) the connections that work. Disallow all others. If you know something works, link the module into the type system twice and tell it that it has two names.
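One way to read this reply in C++ terms (my sketch, not the poster's actual mechanism): carry the item type in the stream's type, so only matching connections compile, and use an alias to give the same stream a second name when you know a connection works.

```cpp
#include <cassert>
#include <vector>

// Hypothetical typed stream: the item type is part of the stream's type.
template <typename T>
struct Stream {
    std::vector<T> items;
};

// connect() only compiles when both ends agree on T; any mismatched
// connection is rejected statically, i.e. "disallow all others".
template <typename T>
void connect(Stream<T>& out, Stream<T>& in) {
    in.items.insert(in.items.end(), out.items.begin(), out.items.end());
}

// "Link the module in the type system twice": an alias gives the same
// stream type a second name, usable wherever the first is.
using TextStream = Stream<char>;
```

With this scheme, `connect(Stream<int>&, Stream<char>&)` simply fails to compile, which is the compile-time analogue of the AOP verifiers mentioned in the quote.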
Quote:
Drawback #5: multi-threading safety. Obviously, independent streams can be processed in parallel. However, if modules become state machines (à la OOP), their states could be compromised. The most probable solution would be to have buffers save states and to restrict buffers to single threads. Synchronization would simply be controlled streaming of thread buffers.
Lock-free buffers are very usable: you can use a fairly large buffer with multiple threads that keep tabs on what they should and shouldn't be doing. They should probably be implemented somewhat like coroutines.
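For the single-producer/single-consumer case (one module feeding the next), a lock-free buffer can be as simple as a ring with two atomic indices. This is a generic sketch of the technique, not code from the linked repository:

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Lock-free SPSC ring buffer: one producer thread pushes, one consumer
// thread pops; no mutex needed. Holds at most N-1 items (one slot is
// sacrificed to distinguish full from empty).
template <typename T, std::size_t N>
class SpscRing {
public:
    bool push(const T& item) {  // returns false when the ring is full
        std::size_t h = head.load(std::memory_order_relaxed);
        std::size_t next = (h + 1) % N;
        if (next == tail.load(std::memory_order_acquire)) return false;
        buf[h] = item;
        head.store(next, std::memory_order_release);
        return true;
    }

    std::optional<T> pop() {  // returns nullopt when the ring is empty
        std::size_t t = tail.load(std::memory_order_relaxed);
        if (t == head.load(std::memory_order_acquire)) return std::nullopt;
        T item = buf[t];
        tail.store((t + 1) % N, std::memory_order_release);
        return item;
    }

private:
    std::array<T, N> buf{};
    std::atomic<std::size_t> head{0}, tail{0};
};
```

The producer only writes `head` and the consumer only writes `tail`, which is what makes the design safe without locks; going beyond one producer and one consumer needs a different (and much trickier) structure.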
Quote:
Drawback #6: parallel stream processing. Stream processing allows the removal of some iterations (by simply streaming all the data). Some iterations can be executed in parallel. The coder could transform an iteration into n streams, allowing them to be executed in n threads. Alternatively, at a higher level, an iteration stream can be transformed into a single stream, with threads created by the system, where each thread handles a part of the stream. This is more dynamic, but again it's hard to define who controls thread generation (a higher-level central stream system? A module? The user?).
Thread creation is implicit and (IMO) system-defined. The modules indicate whether they carry state from one entry to the next (and should of course be made so that they don't). That way you can detect in advance which modules are executed most and which threads are overloaded, and then add more threads to those parts, or split an overloaded thread into two threads, each covering half the modules.
Quote:
Drawback #7: stream funneling. The problem here is modules with multiple input streams. This could be dangerous if one of the streams isn't ready. Another problem arises from funneling standards: what if we feed only a single stream to a multi-stream module? When is it best to have a module allow multiple inputs if it can be designed for a single input?
I've defined a basic_filter that implicitly assumes there's one input and one output. That just requires you to test that the modules all keep working with full outputs and inputs, and you must ensure that for a module with more than one input, the lowest throughput comes from the entropy input. I've noticed that nearly all modules with more than one input use the others as auxiliary inputs: for random numbers, encryption streams, and so on. They all have a primary stream as well, which carries the actual data (I'm thinking of the Vernam cipher here). If your encryption key stream is as fast as the data stream, that's no problem at all.
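The Vernam example can be sketched as a two-input filter (this function is my illustration, not the repository's basic_filter): the primary input is the data stream, the auxiliary input is the key stream, and overall throughput is bounded by whichever runs slower, normally the entropy input.

```cpp
#include <cassert>
#include <string>

// Two-input filter sketch: XOR the primary data stream with the
// auxiliary key stream (Vernam cipher). A true Vernam key is as long as
// the data; the key is repeated here only to keep the example short.
std::string vernam(const std::string& data, const std::string& key) {
    std::string out(data.size(), '\0');
    for (std::size_t i = 0; i < data.size(); ++i)
        out[i] = static_cast<char>(data[i] ^ key[i % key.size()]);
    return out;
}
```

Because XOR is its own inverse, running the ciphertext through the same filter with the same key stream restores the plaintext, so one module serves as both encryptor and decryptor.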
Quote:
SOP seems to me like a neat idea. However, reality is a bit different. I can't really weigh the advantages and disadvantages, so I'm asking for opinions. I kill time brainstorming, and I would like to see how far I can go with these ideas and what I can learn from them. Thanks for reading, sorry for the long post (as apparently I have the habit of writing), and forgive my bad English,
You can try my code at
http://atlantisos.svn.sourceforge.net/v ... clude/aos/, in particular input, output, module and filter.
My personal implementation at the moment only does Unicode (two levels up, lib/libaos/modules/unicode), which is tested (on Unix) and works, and should cleanly handle Vernam ciphers and so forth. At my workplace I have an implementation that was built to process network traffic and to interpret numerous levels of data; it does that in parallel (not implicitly yet) with buffers, and it supports cycles in data chains, physical interfacing, and distributed processing. I wrote 14k lines for that program, including 33 filters.
You also need an item called settings: variables within the modules that you can adjust at runtime for fine-tuning or for changing the processing. For that idea I invented dynamic variables (a short while back), and I've published them in the same directory as the "dynamic" header file.
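As a rough sketch of such a setting (my example; the "dynamic" header linked above will differ): a tunable variable inside a module, made atomic so another thread can re-tune the module while it is processing.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical module with a runtime-adjustable "setting": the scale
// factor can be changed while the module is processing, e.g. from a
// control thread, without restarting the pipeline.
class Scaler {
public:
    std::atomic<int> factor{1};  // the tunable setting

    int step(int input) { return input * factor.load(); }
};
```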
I'd love to work on it again but I'm pretty cramped for time.