Hi All,
Just putting this out there as a quick thought from an amateur.
The slowdown from fixing Meltdown is, by my limited understanding, caused by the fact that the kernel now has to reside in a separate address space. This means a call to the kernel involves, at the very least, two task switches (TS):
Code:
Process --TS--> Kernel --TS--> Process
Of course, this will be worse where IPC is involved.
How about mapping a stub into, say, the first large page of kernel space? This stub contains no sensitive data and purely implements a SYSCALL handler. For a 64-bit kernel, the stub then writes a single PML4 entry which points to the rest of the kernel. The kernel handles the call, then the stub clears the present bit in that PML4 entry and jumps back to the calling process:
Code:
Process--SYSCALL-->Stub--CALL-->Kernel--CALL-->Stub--SYSRET-->Process
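To make the two page-table operations concrete, here is a rough sketch in plain C that models the PML4 as an ordinary 512-entry array, so it can be compiled and poked at in user space. The function names (`stub_map_kernel`, `stub_unmap_kernel`, `kernel_is_mapped`) and the physical address passed in are made up for illustration; only the bit layout (present bit 0, writable bit 1, index taken from virtual address bits 47..39) follows the standard x86-64 4-level paging format. It does not touch CR3 or the TLB, which is exactly where the real cost would be.

```c
#include <assert.h>
#include <stdint.h>

#define PML4_ENTRIES  512u
#define PTE_PRESENT   (1ull << 0)            /* bit 0: entry is valid */
#define PTE_WRITABLE  (1ull << 1)            /* bit 1: writes allowed */
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ull  /* bits 51..12: physical address */

/* Index of the PML4 entry covering a virtual address (bits 47..39). */
static unsigned pml4_index(uint64_t vaddr)
{
    return (unsigned)((vaddr >> 39) & 0x1FF);
}

/* Hypothetical stub entry path: point one PML4 slot at the kernel's
 * next-level table (its physical address is an assumed input). */
static void stub_map_kernel(uint64_t *pml4, unsigned slot, uint64_t kernel_pdpt_phys)
{
    assert(slot < PML4_ENTRIES);
    pml4[slot] = (kernel_pdpt_phys & PTE_ADDR_MASK) | PTE_WRITABLE | PTE_PRESENT;
}

/* Hypothetical stub exit path: clear only the present bit, leaving the
 * address in place so the next entry is a single bit flip. On real
 * hardware, stale TLB entries for the range would also have to be
 * invalidated here; this model does not do that. */
static void stub_unmap_kernel(uint64_t *pml4, unsigned slot)
{
    assert(slot < PML4_ENTRIES);
    pml4[slot] &= ~PTE_PRESENT;
}

/* Returns nonzero if the slot is currently marked present. */
static int kernel_is_mapped(const uint64_t *pml4, unsigned slot)
{
    assert(slot < PML4_ENTRIES);
    return (pml4[slot] & PTE_PRESENT) != 0;
}
```

In this model the stub would call `stub_map_kernel` right after SYSCALL and `stub_unmap_kernel` just before SYSRET. A kernel linked at 0xFFFFFFFF80000000 would sit under slot `pml4_index(0xFFFFFFFF80000000)`, i.e. the last entry.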
I realise that mapping a PML4 entry in and out has associated TLB costs, but I just wondered, firstly, whether those costs are lower than those of task switching and, secondly, whether this would actually rectify the cache issues.
I'm waiting to be put right!
Cheers,
Adam