IanSeyler wrote:
Each one of the exception handlers is 16-byte aligned so this will save me some bytes.
How will that save you some bytes?
As for aligning for performance, it depends on what you mean by ISR. If your ISR is just a stub that calls the actual ISR in some driver, then the alignment doesn't matter that much, and you might want to make the stubs tiny so they have a greater chance of staying cached.
If you are talking about the actual ISRs (a hundred bytes or more), then I might even align on pages, or at least on cache lines. The memory wasted due to cache line alignment is minimal. Assuming 64B cache lines, half a line (32B) wasted per handler on average, and all 256 interrupts in use, you're wasting 256 × 32B = 8KiB of memory on the whole system, which shouldn't matter at all. Without alignment, it's possible the CPU will jump to your ISR and find it starts at the very end of a cache line. In that case it can only fetch an instruction or two immediately and has to wait for the next cache line to be read from memory.
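To make the arithmetic above concrete, here is a rough sketch in C. The names `align_up`, `padding_for`, `estimated_waste`, `CACHE_LINE` and `NUM_VECTORS` are made up for illustration, and the 64B line size is the same assumption as above, not something queried from the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE  64u   /* assumed cache line size in bytes */
#define NUM_VECTORS 256u  /* number of IDT entries on x86 */

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Padding bytes needed in front of a handler that would otherwise start at addr. */
static size_t padding_for(uintptr_t addr, uintptr_t align)
{
    return (size_t)(align_up(addr, align) - addr);
}

/* Estimated total waste, assuming each handler wastes half a line on average. */
static size_t estimated_waste(size_t handlers, size_t line)
{
    return handlers * (line / 2);
}
```

With these assumptions, `estimated_waste(NUM_VECTORS, CACHE_LINE)` gives 8192 bytes, i.e. the 8KiB mentioned above, and `padding_for(0x1020, CACHE_LINE)` gives 32 bytes for a handler that would otherwise start in the middle of a line. In practice you would let the assembler or linker do the alignment for you rather than compute it by hand.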
Note, I don't know how the CPU handles interrupts internally, or whether it fetches only the single cache line that the IDT entry points to or several of them. Either do some testing or assume the worst and align at least to the cache line size.
On x86 the cache line is commonly (these days always?) 64B.
Google Agner Fog. He has done great research into x86 optimization: not just the latency/throughput tables but also the C++ and ASM optimization guides. I find them very well written and quite easy to understand. They explain pretty much everything from the basics to advanced topics.