Brendan wrote:
Think of it as a 3 part sequence, where:
OS installs its own drivers (for disk, network, ...) and therefore must break any previous "boot code" support for these devices (if there was any)
OS examines the devices and does some auto-detection (e.g. looks for any "RAID superblock" metadata on the disk); and starts any "optional middleware", like RAID layers and encryption; to create logical volumes (after any "boot code" support for the underlying device has been broken)
OS mounts the logical volume/s with file system/s (after any "boot code" support for the underlying device has been broken)
The filesystem can be recognized by reading the volume using only firmware I/O services, at least for the boot filesystem. That may not apply to something like network-attached storage, but you wouldn't be booting from storage the bootloader can't easily support anyway. Hence there should be enough information to decide which modules to pre-load before the storage stack begins initializing. There will be some architectural complexity in separating the identification and mounting steps, but that separation is already desirable for on-demand fs module loading. A sketch of the identification step is below.
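A minimal sketch of that probe, assuming a hypothetical firmware_read_sectors() wrapper around the platform's boot-time read service (BIOS int 13h, UEFI Block I/O, or similar); the magic values are the real ext and FAT32 signatures, but the surrounding structure is illustrative:

Code:
#include <stdint.h>
#include <string.h>

enum fs_type { FS_UNKNOWN, FS_EXT, FS_FAT32 };

/* Hypothetical wrapper around the firmware's boot-time read service
 * (BIOS int 13h, UEFI Block I/O, ...). */
extern int firmware_read_sectors(uint64_t lba, uint32_t count, void *buf);

enum fs_type identify_boot_fs(void)
{
    uint8_t buf[4096];
    uint16_t ext_magic;

    /* Read the first 4 KiB of the boot volume. */
    if (firmware_read_sectors(0, 8, buf) != 0)
        return FS_UNKNOWN;

    /* ext2/3/4: superblock at byte 1024, magic 0xEF53 at offset 56. */
    memcpy(&ext_magic, buf + 1024 + 56, sizeof ext_magic);
    if (ext_magic == 0xEF53)
        return FS_EXT;

    /* FAT32: the BPB carries "FAT32   " at offset 82 of sector 0
     * (a common heuristic; a robust probe would validate the BPB). */
    if (memcmp(buf + 82, "FAT32   ", 8) == 0)
        return FS_FAT32;

    return FS_UNKNOWN;
}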
There is actually a major issue that undermines this proposition. Depending on how the modules are organized and what integrity guarantees the filesystem operations provide, crashing in the middle of an update may leave the partition unbootable. A single ramdisk image sort of works around the problem, since everything the kernel needs travels in one file that can be swapped in a single step. On the other hand, this leads me to a different thought: booting from arbitrary filesystems is probably not such a good idea. They may not support atomicity at all, and bootloaders usually lack the logic to properly replay journals. Using a specific filesystem designed for booting and atomic updates is, I think, much better, in which case the above approach could still be feasible.
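For illustration, here is the kind of two-copy "anchor" scheme such a boot-oriented filesystem could use; all names and the toy checksum are made up for the sketch, but the sequence-number trick itself is the standard way to make the final pointer switch atomic:

Code:
#include <stddef.h>
#include <stdint.h>

struct anchor {
    uint64_t sequence;   /* monotonically increasing generation   */
    uint64_t root_lba;   /* where the current root metadata lives */
    uint32_t checksum;   /* covers the fields above               */
};

/* Toy checksum for the sketch; a real design would use CRC32c or similar. */
static uint32_t checksum_of(const struct anchor *a)
{
    return (uint32_t)(a->sequence ^ a->root_lba) ^ 0xB007B10Cu;
}

static int anchor_valid(const struct anchor *a)
{
    return a->checksum == checksum_of(a);
}

/* Reader (bootloader) side: pick the newest copy that checksums correctly.
 * The writer (not shown) writes the new tree to free space first, then
 * overwrites only the *older* anchor copy with an incremented sequence
 * number, so an interrupted update corrupts at most one copy. */
const struct anchor *pick_anchor(const struct anchor *a, const struct anchor *b)
{
    int av = anchor_valid(a), bv = anchor_valid(b);
    if (av && bv)
        return a->sequence > b->sequence ? a : b;
    if (av)
        return a;
    if (bv)
        return b;
    return NULL; /* both copies corrupt: genuinely unrecoverable */
}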
Brendan wrote:
Note that this works fine for various cases where there is no file system, like "Linux in ROM" (the original idea behind Coreboot) and "Linux boot protocol in ROM" (the way Xeon Phi accelerator cards and probably lots of embedded systems boot), and also works for various other scenarios (e.g. "boot from network, but mount local disk/s after boot").
Of course, if they had wanted to, they could have introduced in-memory block devices and mounted them as usual. That would obviously complicate things for this particular use case, but not every solution is best for every situation. Something along the lines of the sketch below.
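For example, a trivial in-memory block device wrapping a loader-provided image; the vtable shape here is illustrative, not any particular kernel's driver interface:

Code:
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512

struct blockdev {
    int (*read)(struct blockdev *dev, uint64_t lba, uint32_t count, void *buf);
    void *priv;
    uint64_t sectors;
};

struct ramdisk { uint8_t *image; };

static int ramdisk_read(struct blockdev *dev, uint64_t lba,
                        uint32_t count, void *buf)
{
    struct ramdisk *rd = dev->priv;
    if (lba + count > dev->sectors)
        return -1;
    memcpy(buf, rd->image + lba * SECTOR_SIZE, (size_t)count * SECTOR_SIZE);
    return 0;
}

/* Wrap a memory image handed over by the loader; the resulting device
 * then goes through the same identify-and-mount path as a physical disk. */
struct blockdev *ramdisk_create(void *image, uint64_t bytes)
{
    struct blockdev *dev = malloc(sizeof *dev);
    struct ramdisk *rd = malloc(sizeof *rd);
    if (!dev || !rd) { free(dev); free(rd); return NULL; }
    rd->image = image;
    dev->read = ramdisk_read;
    dev->priv = rd;
    dev->sectors = bytes / SECTOR_SIZE;
    return dev;
}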
Brendan wrote:
Also, (as far as I can tell) Linux developers themselves agree with you about it being over-complicated - I even remember seeing a conversation about GRUB 2 (likely on the Linux kernel developer's mailing list) where someone suggested using a Linux kernel as a boot loader instead because a whole Linux kernel is simpler than GRUB 2.
They do actually support loading the kernel directly from the EFI boot manager, although "directly" is probably a little overstated here. In Linux this means the firmware loads an EFI stub that jumps into a generic stub, which decompresses the kernel code and jumps to it; but all of this happens from code and data in one image. That is the theory. In practice, since Secure Boot won't normally allow loading the kernel itself, a shim signed by Microsoft is loaded first (Microsoft uses its private key to sign other people's generic loaders). And because they prefer not to overwrite the firmware's boot entry on every update, yet another boot proxy is introduced: systemd-boot (formerly gummiboot). So they have jumped through a lot of hoops by the end, but the idea is in principle there. The chainloading step all of these proxies share looks roughly like the sketch below.
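Stripped of signature checking and menu handling, the chainloading boils down to two boot services; this is an EDK II-style sketch, and the file path is just a placeholder, not any distribution's actual layout:

Code:
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Library/DevicePathLib.h>

EFI_STATUS ChainloadNextStage(EFI_HANDLE ParentImage, EFI_HANDLE BootDevice)
{
    EFI_DEVICE_PATH_PROTOCOL *Path;
    EFI_HANDLE Child;
    EFI_STATUS Status;

    /* Build a device path to the next-stage binary on the boot device. */
    Path = FileDevicePath(BootDevice, L"\\EFI\\myos\\next-stage.efi");
    if (Path == NULL)
        return EFI_OUT_OF_RESOURCES;

    /* LoadImage is where the firmware's Secure Boot policy is enforced,
     * which is exactly why an MS-signed shim has to sit at the front. */
    Status = gBS->LoadImage(FALSE, ParentImage, Path, NULL, 0, &Child);
    if (!EFI_ERROR(Status))
        Status = gBS->StartImage(Child, NULL, NULL);

    gBS->FreePool(Path);
    return Status;
}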