Hi,
bluemoon wrote:
It seems the current workaround is to implement kernel page table isolation, but that incurs a significant performance penalty.
Since everyone is slower, does this mean the performance gap between monolithic and micro-kernels is smaller, and micro-kernels will become a more viable design?
I've been trying to think of effective work-arounds, and there really aren't many. Apart from making kernel pages inaccessible ("not present" or using PCID):
- Disabling caches for CPL=3 pages would work but would have an extreme performance cost
- For a 32-bit OS, segmentation might work, but to be honest I very much doubt it (I'd assume segment limit checks are done at the same time as page permission checks and would therefore have the same problem)
- In theory, managed languages could work (by making it impossible for programmers to generate code that tries to access kernel space); but quite frankly, every "managed language" attempt that has ever hit production machines has had so many security problems that it's much safer to assume a managed language would only make security worse (far more code needs to be trusted than the kernel alone), and the performance is likely to be worse than a PTI approach (especially for anything where performance matters).
This leaves "make kernel pages inaccessible" as the least worst option.
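For a concrete picture of "make kernel pages inaccessible", here's a minimal 64-bit sketch (mine, not anyone's actual implementation), assuming the usual higher-half kernel layout: each process gets a second PML4 for user mode where the kernel half is "not present", except for a small trampoline (entry stubs, IDT, per-CPU entry stack) that must contain no secrets. All the names here (build_user_pml4, TRAMPOLINE_PML4_SLOT, etc.) are hypothetical:

```c
#include <stdint.h>
#include <string.h>

#define PML4_ENTRIES         512
#define KERNEL_HALF_START    256   /* entries 256..511 cover the higher half */
#define TRAMPOLINE_PML4_SLOT 511   /* one entry kept for the entry/exit stubs */

/* Build the user-mode PML4 from the process's full (kernel-mode) PML4. */
void build_user_pml4(uint64_t *pml4_user, const uint64_t *pml4_kernel)
{
    /* User half: identical to the kernel-mode view of this process. */
    memcpy(pml4_user, pml4_kernel, KERNEL_HALF_START * sizeof(uint64_t));

    /* Kernel half: "not present", so loads from CPL=3 (speculative or
     * otherwise) have nothing to hit. */
    memset(&pml4_user[KERNEL_HALF_START], 0,
           (PML4_ENTRIES - KERNEL_HALF_START) * sizeof(uint64_t));

    /* Keep only the trampoline mapped; it holds no sensitive data. */
    pml4_user[TRAMPOLINE_PML4_SLOT] = pml4_kernel[TRAMPOLINE_PML4_SLOT];
}
```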
For "make kernel pages inaccessible" it doesn't necessarily need to be all kernel pages. Pages that contain sensitive information (e.g. encryption keys) would need to be made inaccessible, but pages that don't contain sensitive information don't need to be made inaccessible. This gives 2 cases.
If PCID can't be used, then you could separate everything into "sensitive kernel data" and "not sensitive kernel data" and leave all of the "not sensitive kernel data" mapped in all address spaces all the time to minimise the overhead. For a monolithic kernel (especially a pre-existing monolithic kernel) it'd be almost impossible to separate "sensitive" and "not sensitive" (because there's all kinds of drivers, etc. to worry about) and it'd be easy to overlook something; so you'd mostly want a tiny stub where almost everything is treated as "sensitive" to avoid the headaches. For a micro-kernel it wouldn't be too hard to distinguish between "sensitive" and "not sensitive", and it'd be possible to create a micro-kernel where everything is "not sensitive", simply because there's very little in the kernel to begin with. The performance of a micro-kernel would be much less affected, or not affected at all; closing the performance gap between micro-kernels and monolithic kernels, and potentially making micro-kernels faster than monolithic kernels.
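As a rough illustration of that split, here's a sketch of tagging non-sensitive kernel data with a dedicated linker section so it can stay mapped in every address space. The section name, the __not_sensitive_* symbols (which a custom linker script would have to provide), and map_in_user_tables() are all hypothetical:

```c
#include <stdint.h>

#define NOT_SENSITIVE __attribute__((section(".kdata.not_sensitive")))

/* Safe to leave mapped in every address space; leaks nothing useful. */
NOT_SENSITIVE volatile uint64_t timer_ticks;

/* Sensitive: stays in the normal kernel sections, never mapped at CPL=3. */
static uint8_t disk_encryption_key[32];

/* Symbols a (hypothetical) linker script places around the section. */
extern char __not_sensitive_start[], __not_sensitive_end[];

/* Hypothetical helper: maps one kernel page into a user-visible PML4. */
void map_in_user_tables(uint64_t *pml4_user, uintptr_t vaddr);

/* Map only the "not sensitive" pages into the user-mode page tables. */
void map_not_sensitive_pages(uint64_t *pml4_user)
{
    for (uintptr_t page = (uintptr_t)__not_sensitive_start;
         page < (uintptr_t)__not_sensitive_end;
         page += 4096)
        map_in_user_tables(pml4_user, page);
}
```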
Note: For this case, especially for monolithic kernels, if you're paying for the TLB thrashing anyway then it wouldn't take much more to have fully separated virtual address spaces, so that both user-space and kernel-space can be larger (e.g. on a 32-bit CPU, let user-space have almost 4 GiB of space and let the kernel have a separate 4 GiB of space).

If PCID can be used (which excludes 32-bit OSs), then the overhead of making kernel pages inaccessible is significantly less. In this case, if nothing in the kernel is "sensitive" you can do nothing; and if anything in the kernel is "sensitive" you'd probably just use PCID to protect everything (including the "not sensitive" data). In practice this probably means that monolithic kernels and some micro-kernels are affected, but a "100% not sensitive" micro-kernel wouldn't be affected.
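To show why PCID helps, here's a minimal context-switch sketch, assuming a 64-bit CPU with CR4.PCIDE set: bits 11:0 of the value written to CR3 select the PCID, and setting bit 63 tells the CPU not to flush TLB entries tagged with that PCID. The struct and field names are my own invention:

```c
#include <stdint.h>

#define CR3_NOFLUSH (1ULL << 63)   /* don't flush entries for this PCID */

struct address_space {
    uint64_t pml4_phys;   /* physical address of the PML4, 4 KiB aligned */
    uint16_t asid;        /* 12-bit PCID assigned to this address space */
};

static inline void write_cr3(uint64_t value)
{
    __asm__ volatile("mov %0, %%cr3" :: "r"(value) : "memory");
}

void switch_address_space(const struct address_space *as)
{
    /* TLB entries tagged with this PCID survive the switch, so returning
     * to a recently-run address space doesn't re-walk every page table. */
    write_cr3(as->pml4_phys | (as->asid & 0xFFF) | CR3_NOFLUSH);
}
```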
In other words, it reduces the performance gap between monolithic kernels and some micro-kernels (but not all micro-kernels), and probably not by enough to make those micro-kernels faster than monolithic kernels.
The other thing I'd want to mention is that for all approaches and all kernel types (but excluding "kernel given a completely separate virtual address space so that both user-space and kernel-space can be larger"), the kernel could distinguish between "more trusted" processes and "less trusted" processes and leave the kernel mapped (avoiding the PTI overhead) while "more trusted" processes are running. In practice this means that if the OS supports (e.g.) digitally signed executables (and is therefore able to assign different amounts of trust depending on the existence of a signature and on who the signer was) then it may perform far better than an OS that doesn't. This makes me think that various open source groups that shun things like signatures (e.g. GNU) may end up penalised on a lot of OSs (possibly including future versions of Linux).
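A sketch of that "more trusted vs. less trusted" idea, assuming a per-process trust flag set when a recognised signature was verified at exec time (the struct and field names are hypothetical):

```c
#include <stdint.h>
#include <stdbool.h>

struct process {
    uint64_t pml4_full;     /* kernel fully mapped: no PTI overhead */
    uint64_t pml4_isolated; /* kernel unmapped except for the trampoline */
    bool     trusted;       /* executable signed by a trusted key? */
};

/* Called when returning to user space: trusted processes run on the
 * cheap page tables, untrusted processes pay the isolation cost. */
uint64_t user_cr3_for(const struct process *p)
{
    return p->trusted ? p->pml4_full : p->pml4_isolated;
}
```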
Cheers,
Brendan