OSDev.org

The Place to Start for Operating System Developers
It is currently Mon Nov 23, 2020 7:11 pm

All times are UTC - 6 hours




Post new topic Reply to topic  [ 31 posts ]  Go to page 1, 2, 3  Next
Author Message
 Post subject: Standardized IPC protocol
PostPosted: Mon Apr 13, 2020 12:02 pm 

Joined: Mon Jun 05, 2006 11:00 pm
Posts: 2077
Location: USA (and Australia)
I was thinking about using a low-overhead serialization format, such as FlatBuffers, for interprocess communication. I am envisioning a microkernel where processes register RPCs (a name, request type, response type, entry point) with the kernel, and a task manager would let you view all running processes, see what RPCs they expose, let you visualize and record calls between processes, and even issue RPCs (from a text format such as JSON) to any running process and see the response.

It would be language agnostic (for any language supported by FlatBuffers), and there could be a kernel API for synchronous and asynchronous calls. You could even generate a calling stub from the service defined in the .fbs to make it look like a function call.

Something similar to gRPC but at the kernel level for all IPC.
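As a rough illustration of the registration model above, here is a minimal user-space mock; `RpcRegistry`, `RegisterRpc`, and `CallRpc` are hypothetical names, and plain strings stand in for FlatBuffers-serialized messages:

```cpp
#include <functional>
#include <map>
#include <string>

// Stand-in for a serialized FlatBuffer; the real system would carry
// encoded request/response buffers here.
using Payload = std::string;
using Handler = std::function<Payload(const Payload&)>;

// Hypothetical user-space model of the kernel's RPC registry. A real
// kernel would key handlers by (process, RPC name, entry point).
struct RpcRegistry {
    std::map<std::string, Handler> handlers;  // name -> entry point

    // A process registers an RPC: a name plus an entry point.
    void RegisterRpc(const std::string& name, Handler h) {
        handlers[name] = std::move(h);
    }

    // Synchronous call: look up the handler and invoke it directly,
    // the way a same-thread "fat function call" would.
    Payload CallRpc(const std::string& name, const Payload& request) {
        return handlers.at(name)(request);
    }
};
```

A task manager could walk such a registry to list exposed RPCs and issue test calls from JSON, as described above.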

_________________
My OS is Perception.


 Post subject: Re: Standardized IPC protocol
PostPosted: Mon Apr 13, 2020 2:19 pm 

Joined: Mon Jul 05, 2010 4:15 pm
Posts: 573
What is the question you are asking?


 Post subject: Re: Standardized IPC protocol
PostPosted: Mon Apr 13, 2020 3:46 pm 

Joined: Mon Jun 05, 2006 11:00 pm
Posts: 2077
Location: USA (and Australia)
No question, just sharing my thoughts to see what other people are doing for message passing/RPCs or if someone has tried something similar.

_________________
My OS is Perception.


 Post subject: Re: Standardized IPC protocol
PostPosted: Mon Apr 13, 2020 11:47 pm 

Joined: Wed Oct 26, 2011 12:00 pm
Posts: 192
I use XML to describe my IPC protocols, and then use a code generator to generate both client and server code stubs. My IPC library (libgracht) supports both my native kernel IPC API and my IPC protocols over sockets for userspace (I use this to define the window-manager protocol). My IPC implementation supports both asynchronous and synchronous calling.

My protocols look like this; however, the support is something I'm still finishing up, as I only recently integrated support for my native kernel IPC API.
https://github.com/Meulengracht/MollenO ... /protocols

The IPC calls use a binary format whose headers specify the protocol/action to be called; no IDs/names need to be known, since the code stubs are auto-generated from the XML.

_________________
Github
Website


 Post subject: Re: Standardized IPC protocol
PostPosted: Tue Apr 14, 2020 4:29 pm 

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1014
I basically have two types of messages: scalar and buffered. The first is passed in GPRs only (up to 56 bytes), and that covers 99% of my messages. The second type uses 2 registers, one for the offset and one for the buffer's size. Then I simply use the same typedef struct cast on that buffer on both the sender and the receiver side. I prefer K.I.S.S. My IPC mechanism does not know, and does not need to know, what's in the message buffer; it only cares about offset and length.

All message sending is considered low-level and covered up by libc or other libraries. For example, when your application calls "read()", it doesn't know that under the hood libc actually does a GPR-only IPC to the FS server. (The read buffer here is data, not to be confused with sending the message itself in a buffer.) On the other hand, "open()" is a buffered message, as it also passes the file name, which probably does not fit into 48 bytes (56 minus one register for mode), hence the need for a message buffer. But again, these details are completely hidden from the caller, which only sees a high-level classic C-style API and is completely unaware that the call is actually processed in another process.
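For illustration, the two message flavours might look like this (the struct layouts and field names are hypothetical; only the sizes follow the description above):

```cpp
#include <cstdint>

// Scalar message: fits entirely in GPRs, up to 56 bytes total.
struct ScalarMsg {
    uint64_t func;     // which call, e.g. a hypothetical SYS_READ
    uint64_t args[6];  // remaining registers: 48 bytes of payload
};
static_assert(sizeof(ScalarMsg) == 56, "scalar messages fit in GPRs");

// Buffered message: two registers carry (offset, size) of a buffer,
// and both sides cast the same struct over it. open() needs this
// because the path won't fit in 48 bytes.
struct OpenMsg {
    uint32_t mode;
    char     path[256];
};

// Receiver side: reinterpret the raw buffer as the agreed struct.
// The IPC layer itself only ever sees (offset, length).
inline const OpenMsg* DecodeOpen(const void* buf) {
    return static_cast<const OpenMsg*>(buf);
}
```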

Cheers,
bzt


 Post subject: Re: Standardized IPC protocol
PostPosted: Thu Apr 16, 2020 3:40 pm 

Joined: Mon Jun 05, 2006 11:00 pm
Posts: 2077
Location: USA (and Australia)
Thanks for sharing your approaches.

I was thinking about how to make IPC RPCs fast. The L4 microkernels take the approach of making IPC synchronous. I'm intrigued by the idea of implementing RPCs as fat function calls - when a process registers an RPC, it registers the address of the entry point, and when you issue an RPC you stay in the same thread, but the function call barrier changes the address space (and preserves registers). The disadvantage of this method is that all RPC handlers must be thread-safe (although, worst case, you lock the same mutex at the start of all of your handlers and your program is effectively single-threaded).

But it becomes apparent that there are times we don't want to block the caller; e.g. the mouse driver notifying the program that's in focus that the mouse has moved shouldn't be synchronous, otherwise we risk a userland program blocking the mouse driver. It might be useful to have a mechanism that's send-and-forget. So, I think it would be useful to have two IPC mechanisms:

  • Synchronous RPCs where the request and response types are flatbuffers, functioning as fat function calls because they change the address space and set up a new stack, but for all intents and purposes execute the handler in the same thread as the caller.
  • Events (somewhat similar to signals) - you don't care about a response (or the response could be in the form of another event at a later date), so events are send-and-forget.

Because events are asynchronous, we can't execute the handler in the caller's thread, so I'm wondering whether:
  • We start a new thread to execute the handler in. This would be consistent with how we handle RPCs (incoming calls just start executing), but creating a thread seems CPU heavy/scheduler heavy, and I'm also expecting many implementations would just want to put incoming events (e.g. imagine a game receiving events such as key down, mouse moved, etc.) into some queue and then process them from the process's main thread.
  • We introduce syscalls such as 'sleep until message' and 'process all messages' that can be called from any thread inside the process; the event handlers then run on that thread. This would avoid the overhead of creating a ton of threads, and an event loop would feel natural for many applications.
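To make the second option concrete, here is a small user-space mock of the proposed event queue; 'process all messages' is modeled as a method that drains the queue on whichever thread calls it, and all names here are hypothetical:

```cpp
#include <deque>
#include <functional>
#include <string>

// A one-way, send-and-forget message, e.g. "key down" or "mouse moved".
struct Event {
    std::string name;
};

// Mock of the per-process queue the kernel would maintain.
class EventQueue {
public:
    // Kernel side: enqueue without ever blocking the sender.
    void Deliver(Event e) { pending_.push_back(std::move(e)); }

    // 'process all messages': handlers run on the calling thread,
    // which lets an application drive a natural event loop.
    int ProcessAllMessages(const std::function<void(const Event&)>& handler) {
        int handled = 0;
        while (!pending_.empty()) {
            handler(pending_.front());
            pending_.pop_front();
            ++handled;
        }
        return handled;
    }

private:
    std::deque<Event> pending_;
};
```

A 'sleep until message' syscall would simply block when the queue is empty instead of returning 0.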

_________________
My OS is Perception.


 Post subject: Re: Standardized IPC protocol
PostPosted: Thu Apr 16, 2020 4:45 pm 

Joined: Tue Jan 02, 2018 12:53 am
Posts: 46
Location: Australia
RPCs can also be implemented as signals (aka software interrupts) which interrupt the regular control flow of the callee. The caller makes a send_signal() system call, and if the intended recipient has registered a signal handler, the kernel will perform a task switch to the callee, but it will leave the register file as the caller left it (allowing it to pass arbitrary arguments to the callee) and jump to the callee's signal handler entry point. When the handler returns, the kernel task switches back to the caller, leaving the register file as the callee left it.


 Post subject: Re: Standardized IPC protocol
PostPosted: Fri Apr 17, 2020 1:59 am 

Joined: Thu May 17, 2007 1:27 pm
Posts: 765
Signals (synchronous interrupts of userspace) are a very bad concept with numerous issues (see Google). Signal code doesn't only need to be thread-safe, it also needs to be re-entrant which is much harder to achieve (the only general solution is to disable signals). A better approach is having some enter_rpc() call that processes RPC at a controllable point in the control flow.

I can elaborate more if necessary.

_________________
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].


 Post subject: Re: Standardized IPC protocol
PostPosted: Fri Apr 17, 2020 11:41 am 

Joined: Mon Jul 05, 2010 4:15 pm
Posts: 573
Korona wrote:
Signals (synchronous interrupts of userspace) are a very bad concept with numerous issues (see Google). Signal code doesn't only need to be thread-safe, it also needs to be re-entrant which is much harder to achieve (the only general solution is to disable signals). A better approach is having some enter_rpc() call that processes RPC at a controllable point in the control flow.

I can elaborate more if necessary.


Yes, please elaborate on that a little bit more, especially what you mean by "synchronous interrupts of userspace". Just about all microkernels use synchronous message passing (QNX, L4, Mach), which is more or less the heart of those designs. Is that what you mean? Not sure what type Fuchsia is using, but when I look at it, it looks like they are taking a lot of influence from QNX. Does anyone know more about this?


 Post subject: Re: Standardized IPC protocol
PostPosted: Fri Apr 17, 2020 3:34 pm 

Joined: Mon Jun 05, 2006 11:00 pm
Posts: 2077
Location: USA (and Australia)
Regarding "synchronous interrupts of userspace": instead of a POSIX-style signal that pauses all of userspace until it finishes executing, I was thinking my RPCs would be like "fat functions", in that the same thread in which the call was made changes address space into the callee, and this thread doesn't pause any other threads running in the callee (so it can get preemptively interrupted).

Being an RPC, there's no guarantee the child process will return (it could terminate, for example). Imagine the following call stack:

Process A -> Process B -> Process C

(e.g. a process calls the VFS which calls the device driver.)

At any point, A, B, or C could terminate. So I'm thinking we should return a status code along with the FlatBuffers response message - the message only being populated if the status == OK. Then we can handle any of these processes terminating:

  • Process A -> Process B -> X

    The thread returns to the last caller in Process B, with the RPC returning the status CALLER_TERMINATED. Process B can choose whether to propagate the error back up to Process A, return a different status, or gracefully handle it and still return OK along with some response back to Process A. (e.g. if Process A asks the VFS to get the contents of a directory, and the device driver fails, the VFS could gracefully return an empty directory.)
  • Process A -> X -> Process C

    It wouldn't be wise to stop the thread mid-execution inside Process C, since Process C could have locked resources or be in the middle of mutating a data structure. Process C should finish executing its handler (even if this effort is ultimately wasted), and when Process C returns, the kernel sees that Process B was terminated and jumps back to the caller inside Process A, returning status CALLER_TERMINATED.
  • X -> Process B -> Process C

    The thread finishes executing the handlers inside Process C and Process B, and upon returning to Process A, the kernel sees that Process A (where the thread originated) was terminated, discards the response, and kills the thread.

So, if you did want to call an RPC asynchronously, it would be up to the caller to wrap it in its own thread. The C++ code would look something like:

Code:
// Launch the RPC on another thread so the caller isn't blocked.
// (The lambda must capture its arguments, e.g. by reference.)
std::future<status_or<ResponseType>> future_status_or_response =
    std::async(std::launch::async,
               [&] { return CallRpc(process_id, message_id, request_message); });

// Do other processing, make other RPCs, etc.

// Get the future; blocks until the RPC finishes running:
status_or<ResponseType> status_or_response = future_status_or_response.get();

if (!status_or_response.ok()) {
   // Some error happened, e.g. the process died.
   return;
}

ResponseType response = status_or_response.value();
// Have the response.

_________________
My OS is Perception.


 Post subject: Re: Standardized IPC protocol
PostPosted: Fri Apr 17, 2020 4:14 pm 

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1014
I overcome this by introducing a FIFO message queue. When Process A sends a message to Process B, then that message is queued in B's message queue, and Process A goes on with its tasks. All my messages are async. There's only one condition when Process A blocks, and that's if the destination queue is full. Then B can run multiple threads if it wishes, and B can consume the messages whenever it wishes.

For sync calls, I simply do a send+recv pair. Process A sends a message about calling a function in B, then blocks receiving a response message from B. When B has processed the function message, it sends back a message with the return value(s) to Process A. Simple, but works remarkably well. (Each message has a serial, and each response message contains the requesting message's serial. This way Process A does not need to block for the response; responses could be processed in parallel by multiple threads too. However, I haven't implemented that yet; my libc just blocks for now.)

I take great advantage of this async messaging in the FS service. It can receive a read() message from a process (libc in that process blocks waiting for a response from FS, so from a POSIX point of view read() is a sync call) and could reply with data from the cache right away, but it could also send another message to a disk driver. Then the main thread in FS consumes the next message, which could be the response from the disk driver (probably for a different disk read) but could also be another request from another process. Since FS never blocks, it is always responsive.

Conclusion: you should always aim for async communication. You can always implement sync calls with a send+recv pair, but you can't implement async on top of sync.
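To illustrate the send+recv pair with serial matching, here is a small mock; the `Mailbox` class and its methods are invented for this sketch, and the blocking recv is replaced by a non-blocking probe:

```cpp
#include <cstdint>
#include <map>
#include <string>

// Mock of the scheme above: every message carries a serial, and every
// response echoes the requester's serial, so replies can be matched
// even when they arrive out of order.
class Mailbox {
public:
    // Async send: allocate a serial and return immediately. (A real
    // kernel would also enqueue the message in the destination's FIFO;
    // that part is elided here.)
    uint64_t Send(const std::string& /*body*/) { return next_serial_++; }

    // Responses arrive tagged with the requester's serial.
    void DeliverResponse(uint64_t serial, const std::string& body) {
        inbox_[serial] = body;
    }

    // A sync call is Send + blocking until the matching serial shows up;
    // this probe stands in for the blocking recv.
    bool TryRecv(uint64_t serial, std::string* out) {
        auto it = inbox_.find(serial);
        if (it == inbox_.end()) return false;  // would block here instead
        *out = it->second;
        inbox_.erase(it);
        return true;
    }

private:
    uint64_t next_serial_ = 1;
    std::map<uint64_t, std::string> inbox_;
};
```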

Cheers,
bzt


 Post subject: Re: Standardized IPC protocol
PostPosted: Thu Apr 23, 2020 2:58 pm 

Joined: Mon Jun 05, 2006 11:00 pm
Posts: 2077
Location: USA (and Australia)
bzt wrote:
Conclusion: you should always aim for async communication. You can always implement sync calls with a send+recv pair, but you can't implement async on top of sync.


I was thinking the opposite :) Aim for efficient synchronous IPC. (This is what L4 does.) You can always wrap it in a thread/fiber.

I was thinking - for sending large or variable-sized messages while avoiding copying, you could send pages - unmap them from the sender and map them into the receiver. Is this an approach anybody else is taking? (It would be up to the receiver to then free the page, or recycle it to send another message.)

_________________
My OS is Perception.


 Post subject: Re: Standardized IPC protocol
PostPosted: Fri Apr 24, 2020 3:11 am 

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1014
MessiahAndrw wrote:
I was thinking the opposite :) Aim for efficient synchronous IPC. (This is what L4 does.) You can always wrap it in a thread/fiber.
Interesting. And how are you planning to implement async messages on top of sync if needed? A dedicated send-async-message call that returns right away?

MessiahAndrw wrote:
I was thinking - for sending large or variable-sized messages while avoiding copying, you could send pages - unmap them from the sender and map them into the receiver. Is this an approach anybody else is taking? (It would be up to the receiver to then free the page, or recycle it to send another message.)
Yes, that's exactly what I'm doing. As mentioned before, I have two types: message in registers and message in buffer. For the latter, I map the message into a circular buffer in the destination address space (then freeing can be done transparently to the receiver, when the circular buffer is full). It's not a real circular buffer per se, just a part of the address space I use for mapping messages. I do not unmap it from the sender's address space; I just mark it CoW in the destination address space. There's a little trick to get it working, because the messages are not necessarily page aligned, and they might span several pages. I have a maximum limit for these messages (1M); larger ones must be passed in shared memory (every address space has that mapped in). So for really large messages there's no need for the mapping either.
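For example, the page-span computation for a non-aligned message might look like this (a sketch assuming 4 KiB pages; the names are made up):

```cpp
#include <cstdint>

constexpr uint64_t kPageSize = 4096;  // assumed page size

struct PageSpan {
    uint64_t first_page;  // page-aligned start address
    uint64_t page_count;  // pages the message touches
};

// Compute the page-aligned region a mapping must cover for a message
// at `addr` of `length` bytes (length must be > 0). A message that is
// not page aligned may spill onto an extra page at either end.
constexpr PageSpan SpanFor(uint64_t addr, uint64_t length) {
    uint64_t first = addr & ~(kPageSize - 1);
    uint64_t last  = (addr + length - 1) & ~(kPageSize - 1);
    return {first, (last - first) / kPageSize + 1};
}
```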

Cheers,
bzt


 Post subject: Re: Standardized IPC protocol
PostPosted: Fri Apr 24, 2020 7:50 am 

Joined: Mon Jun 05, 2006 11:00 pm
Posts: 2077
Location: USA (and Australia)
bzt wrote:
MessiahAndrw wrote:
I was thinking the opposite :) Aim for efficient synchronous IPC. (This is what L4 does.) You can always wrap it in a thread/fiber.
Interesting. And how are you planning to implement async messages on top of sync if needed? A dedicated send-async-message call that returns right away?


I was thinking of having an event system in addition to the RPC system - an event being a one-way message: 'The mouse moved', 'You lost focus', etc. These would be queued, not blocking the caller, and I'd provide poll_and_handle_pending_events/sleep_and_handle_events that the callee could call to handle the next event.

For the RPCs, the system I'm thinking of treats them as function calls, except the callee happens to live in a different address space from the caller. There is no queue - the callee's code starts running immediately. If you wanted to issue an async RPC, you'd have to create a thread:

Code:
Future<Cat> future_cat = Thread([&] () { return GetCat(); });
Future<Dog> future_dog = Thread([&] () { return GetDog(); });
// Do other work.
// ...
// Now I care about cat and dog.
FunctionThatTakesCatAndDog(future_cat.get(), future_dog.get());

_________________
My OS is Perception.


 Post subject: Re: Standardized IPC protocol
PostPosted: Fri Apr 24, 2020 9:40 am 

Joined: Thu May 17, 2007 1:27 pm
Posts: 765
One trade-off between synchronous / asynchronous is the number and size of allocations. In a synchronous system, you will need more threads = more stacks (and more memory) but fewer allocations. Asynchronous systems need to allocate often (typically at least one control block per operation; often more because operations are nested) but less memory in total.

This also needs to be considered for real-time applications: it can be hard to make asynchronous system real-time capable since starting an asynchronous operation needs to allocate - and that can fail. Synchronous systems can just allocate from the stack. (This can of course be worked around by various techniques but it requires some effort and trade-offs.)

_________________
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].


Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group