rod wrote:
Thank you very much! Now I know how to do it, without pushing preserved registers. The key was the exception tables, which I did not know about.
Before the exception tables, Linux used to check every address against its own data structures. But it was a lot of work and something the CPU does, anyway, so they changed it. And I copied the idea.
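To make the idea concrete, here is a minimal sketch of what the fault handler does with such a table. The names `ex_entry`, `ex_lookup` and `fault_fixup` are made up for illustration; they come neither from Linux nor from my earlier post. The table simply maps "instruction that may fault on a user pointer" to "where to continue if it does".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: address of an instruction that is allowed to
 * fault on a user pointer, and the address to resume at if it does. */
struct ex_entry {
    uintptr_t insn;
    uintptr_t fixup;
};

/* Linear search. Returns the fixup address, or 0 if the faulting
 * instruction is not listed (which would be a genuine kernel bug). */
static uintptr_t ex_lookup(const struct ex_entry *tab, size_t n, uintptr_t pc)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].insn == pc)
            return tab[i].fixup;
    return 0;
}

/* Hypothetical hook called from the page-fault handler. *pc is the saved
 * instruction pointer of the faulting context. Returns 1 if execution was
 * redirected to a fixup, 0 if the fault is unhandled. */
static int fault_fixup(const struct ex_entry *tab, size_t n, uintptr_t *pc)
{
    uintptr_t fix = ex_lookup(tab, n, *pc);
    if (!fix)
        return 0;
    *pc = fix;
    return 1;
}
```

The copy routine itself then needs no up-front address checks: it just runs, and only a faulting access costs anything extra.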
rod wrote:
I guess that the same procedure would work also for copying _to_ user.
That particular one, yes, but this was just a lazy first attempt. For instance, the pointers may all be misaligned (or at least, alignment is unchecked), which might be negligible or might be horrible, depending on the particular processor (benchmarks of misaligned memory access, done over a 10-year interval, return cyclical results).
rod wrote:
If it were a read() instead of a write(), I guess that in case of EFAULT, it is allowed to leave the user memory in an undefined state (half copied or something).
Interesting point, actually. read() and write() are only supposed to fail if the transfer did not work at all and not a single byte was transmitted. Otherwise they are supposed to return short. But on the other hand, we could argue that the transfer to/from userspace is treated atomically in this instance. And yes, they can trash the user memory. As long as no other memory is trashed, anyway.
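As a sketch of the "return short" rule, assuming the copy routine can report how many bytes it managed before the fault (the helper name `rw_result` and its parameters are hypothetical):

```c
#include <errno.h>
#include <stddef.h>

/* done:    bytes transferred before the copy stopped
 * faulted: nonzero if the copy stopped because of a bad user address
 * Returns the syscall result: a byte count, or -EFAULT if nothing at all
 * was transferred. */
static long rw_result(size_t done, int faulted)
{
    if (faulted && done == 0)
        return -EFAULT;
    return (long)done;   /* short count if faulted, full count otherwise */
}
```

Treating the user copy as atomic instead would mean returning -EFAULT whenever `faulted` is set, regardless of `done`.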
rod wrote:
I prefer to save them, but maybe one could also not save them and just set them to zero.
Save them, clear them, it's similar effort. You probably already have routines/macros in place to save registers, since that is needed for interrupts to work, so just using the same macros for syscalls is less effort, and less code. And I tell you, the code you didn't write is guaranteed to be bug-free.
rod wrote:
Edit: I thought that, if there are many entries in the exception table, it might be sorted at boot, and use a binary search instead of a linear search, in order to be more efficient
Actually that's the point of making the table writable. But a linear search was easier to write in that post. Though, I'm thinking of redoing the startup. I used to dislike self-modifying code, but all the function pointers I'm using are becoming cumbersome. I think I might go for the "alternatives" mechanism after all. In which case I will need to write the code section during startup. So placing the exception table in the code segment, then sorting it after applying all the binary patches, and then applying kernel-mode write-protection might be the better option (the more "write-once" data in that segment, the better for overall safety).
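For what it's worth, the sorted variant could look roughly like this. It is a sketch in hosted C, using `qsort` and `bsearch` from the standard library purely for illustration; a freestanding kernel would need its own sort and search routines. The names `ex_entry`, `ex_sort` and `ex_find` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry: faulting instruction address -> fixup address. */
struct ex_entry {
    uintptr_t insn;
    uintptr_t fixup;
};

/* Order entries by instruction address, without overflow-prone subtraction. */
static int ex_cmp(const void *a, const void *b)
{
    const struct ex_entry *x = a, *y = b;
    return (x->insn > y->insn) - (x->insn < y->insn);
}

/* Run once at startup, after all patching and before the table is
 * write-protected. */
static void ex_sort(struct ex_entry *tab, size_t n)
{
    qsort(tab, n, sizeof *tab, ex_cmp);
}

/* Binary search on the sorted table. Returns the fixup address, or 0 if
 * pc is not listed. */
static uintptr_t ex_find(const struct ex_entry *tab, size_t n, uintptr_t pc)
{
    struct ex_entry key = { pc, 0 };
    const struct ex_entry *hit = bsearch(&key, tab, n, sizeof *tab, ex_cmp);
    return hit ? hit->fixup : 0;
}
```

Since the sort happens exactly once and the table is read-only afterwards, the lookup in the fault path goes from O(n) to O(log n) with no locking needed.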