OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Tue Feb 20, 2018 5:34 pm 

Joined: Tue Jul 04, 2017 12:45 am
Posts: 15
This is more related to the OS bootloader, but it seemed more fitting here than in other forums.

Before I attempt to load my OS via EFI, I wanted to give the "legacy" mode a shot by implementing a real-mode loader. I have gotten to the point where I can get execution and read blocks from disk via int13, so the rest is a design chore. From what I understand, the MBR code is responsible for reading the partition table from the first sector of the disk, determining which partition is "active", and then loading the VBR from the first sector of that partition. The VBR then loads the next stage from disk, which can be of arbitrary size. That loader may in turn have logic to read filesystems and parse its own configuration file (e.g. GRUB).

I think I understand the execution flow, but I am less clear on the details of how to locate these various things on disk. The MBR is easy: it gets loaded from a fixed location. The VBR, again, is loaded from the location specified in the active table entry. However, is this a "known" location? Namely, is there a standard for how much disk space to reserve after the MBR, so that we know where the VBR will be located? Of course, the VBR is not necessarily located in the first partition. I have seen older tools like `fdisk` reserve the first 2048 sectors (1 MiB) before the first partition begins. However, I have also read that this is a historical artifact of old alignment requirements.


In essence, my questions are:
- What regions/how much memory are available at this point? The MBR is loaded at 0x7c00 (as is the VBR), but what about for a stack and other memory needs (like blocks from disk)?

- How does the VBR know where to find the next/final bootloader stage? It is also limited to 512 bytes in size, so it cannot contain logic to parse filesystems.

To test my code I have been creating an MBR with a FAT partition and just dd'ing my realmode loader code into the first sector (making sure to not overwrite the table and signature). I have not quite gotten to the VBR stage yet, but I imagine it would be something similar. However, I am unsure of the format of a partition that also includes a VBR.
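
As a rough illustration of the "dd the code in without clobbering the table and signature" step, here is a minimal Python sketch. It assumes the classic MBR layout (446 bytes of code, a 64-byte partition table, then the 0x55 0xAA signature); the function name `install_boot_code` is made up for this example. Some MBRs also keep a disk signature in bytes 440 to 445, which this sketch does not try to preserve.

```python
SECTOR_SIZE = 512
CODE_AREA = 446   # bytes 0..445 hold code; 446..509 the partition table; 510..511 the signature

def install_boot_code(sector: bytes, code: bytes) -> bytes:
    """Return a new 512-byte sector with `code` in the code area and the
    original partition table and boot signature left untouched."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one 512-byte sector")
    if len(code) > CODE_AREA:
        raise ValueError("boot code does not fit in the 446-byte code area")
    padded = code.ljust(CODE_AREA, b"\x00")   # zero-fill the unused code bytes
    return padded + sector[CODE_AREA:]        # keep table + signature as they were
```

Reading sector 0 of a disk image, passing it through this function, and writing the result back has the same effect as a `dd` limited to the first 446 bytes.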


 
 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Tue Feb 20, 2018 8:56 pm 

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

pragmatic wrote:
This is more related to the OS bootloader, but it seemed more fitting here than in other forums.

Before I attempt to load my OS via EFI, I wanted to give the "legacy" mode a shot by implementing a realmode loader. I have gotten to the point where I can get execution and read blocks from memory via int13, so the rest is a design chore. From what I understand, the MBR code has the responsibility of reading the MBR table from the first sector of disk, determining which partition is "active" and then loading the VBR from the first sector. Then the VBR loads the next stage from disk, which can be of arbitrary size. This loader may then have logic to support reading from filesystems and parsing its own configuration file (e.g. GRUB).

I think I understand the execution flow, but I am less keen on the details of how to locate these various things on disk. The MBR is easy: it gets loaded from a fixed location. The VBR again, is loaded from the location specified in the active table entry. However, is this a "known" location? Namely, is there are standard for how much memory to reserve after the MBR such that we know where the VBR will be located? Of course, the VBR is not necessarily located in the first partition. I have seen older tools like `fdisk` reserve the first 2048 sectors (1MiB) before the first partition begins. However, I have also read that this is a historical artifact lending itself to old alignment requirements.


Traditionally the MBR copies itself (including the partition table) somewhere else, finds the active partition, then loads the VBR from the first sector of the active partition to 0x00007C00 (and checks for the "boot signature" at offset 0x01FE in the VBR?), then sets DS:SI to the address of the active partition's partition table entry (which will be near the end of wherever the MBR copied itself) and jumps to the VBR's code.

The VBR's code uses the value left in DS:SI to find the partition table entry that describes its partition; and should probably copy that information somewhere safe before setting up its stack or doing anything else with memory (to avoid any risk of trashing the data or making assumptions about where MBR copied itself).
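
The "find the active partition" step described above can be sketched in Python for experimenting with disk images. This is only an illustration of the table layout (four 16-byte entries at offset 446, status byte 0x80 meaning active, start LBA and sector count as little-endian 32-bit values); the function name is hypothetical, and a real MBR does this in real-mode assembly.

```python
import struct

def find_active_partition(mbr: bytes):
    """Return (index, start_lba, sector_count) of the first active entry, or None."""
    if len(mbr) != 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    for index in range(4):
        entry = mbr[446 + 16 * index : 446 + 16 * (index + 1)]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped), start LBA, count
        status, ptype, start_lba, count = struct.unpack("<B3xB3xII", entry)
        if status == 0x80 and ptype != 0:
            return index, start_lba, count
    return None
```

The VBR is then the single sector at `start_lba`, which is the value the MBR would read to 0x00007C00.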

pragmatic wrote:
In essence, my questions are:
- What regions/how much memory are available at this point? The MBR is loaded at 0x7c00 (as is the VBR), but what about for a stack and other memory needs (like blocks from disk)?


After the MBR is finished it's no longer important and (excluding the information describing the active partition) can be overwritten. The VBR can assume that the MBR's stack is still good enough to handle things like IRQs until the VBR decides to use any additional memory (at which point it has to worry about the stack possibly trashing the memory it decided to use, and should probably just set up its own stack to avoid that).

pragmatic wrote:
How does the VBR know where to find the next/final bootloader stage? It is also limited to 512 bytes in size, so it cannot contain logic to parse filesystems.


That depends on the OS and how it's designed. A more important question is, for your OS, how do you want the VBR to find the next stage?

For my OS, I have a reserved area at the start of the partition containing various things (e.g. the boot loader, the second stage, a "boot image" that contains lots of files) plus a few fields in the 1st sector of the partition that say where these other things are. Utilities that create/update files used during boot (in the reserved area) also update the fields in the 1st sector; and the file system (if there is one) doesn't touch anything in the reserved area.

Note that while the VBR is limited to 512 bytes, that doesn't mean that the boot loader is limited to 512 bytes - e.g. the first 512 bytes of the boot loader (containing the VBR) can load the remainder of the boot loader; and the whole boot loader can be "100 KiB of stuff that loads stage 2".
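
To make the "fields in the 1st sector say where things are" idea concrete, here is a small Python sketch of what an install utility and the loader might agree on. The layout is entirely hypothetical (it is not Brendan's actual format): three little-endian fields at offset 8 giving the boot loader's total size in sectors, the stage 2 start sector relative to the partition, and the stage 2 size in sectors.

```python
import struct

FIELDS_OFFSET = 8        # hypothetical: just after an initial jump instruction and padding
FIELDS_FORMAT = "<HIH"   # loader_sectors, stage2_lba (partition-relative), stage2_sectors

def read_boot_fields(vbr: bytes, partition_lba: int) -> dict:
    """Translate the hypothetical fields into absolute (start_lba, sector_count) pairs."""
    loader_sectors, stage2_rel, stage2_sectors = struct.unpack_from(
        FIELDS_FORMAT, vbr, FIELDS_OFFSET)
    return {
        # sector 0 of the loader is the VBR itself and is already in memory
        "loader_rest": (partition_lba + 1, loader_sectors - 1),
        "stage2": (partition_lba + stage2_rel, stage2_sectors),
    }
```

The partition's start LBA would come from the partition table entry that the MBR passes in DS:SI.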

pragmatic wrote:
To test my code I have been creating an MBR with a FAT partition and just dd'ing my realmode loader code into the first sector (making sure to not overwrite the table and signature). I have not quite gotten to the VBR stage yet, but I imagine it would be something similar. However, I am unsure of the format of a partition that also includes a VBR.


I wouldn't consider the MBR part of any OS - it's just something (a third party helper that could be replaced by anything whenever the user feels like it) that's executed before any OS's code is started.

How you format a partition is up to you - the only restriction is that if you want a VBR then the VBR has to be in the 1st sector of the partition. However; if you decide you want to use an existing file system (rather than using none, or designing your own file system) then you also have to follow the rules of that file system.

Note that FAT allows reserved sectors at the start of the partition (the count is normally set to "one reserved sector" but may be set to anything when you create/format the partition), so you could have many reserved sectors and put most of your boot code there (e.g. including "stage 2") if you want to. The other alternative is to use the fewest reserved sectors (e.g. one sector for the VBR and nothing else) and end up having code to do all the sanity checks, code to find/parse directories, code to find/load files (plus all of the "nice and descriptive human readable" error messages, plus the BPB that FAT needs) all squeezed into a pitiful little 512-byte area (which, to be perfectly honest, is not possible without turning it into a steaming pile of puke - e.g. inadequate sanity checks, bad error handling, extremely poor error messages, etc).
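
For reference, the reserved sector count lives in the BPB at offset 0x0E of the boot sector (a 16-bit little-endian value), next to the bytes-per-sector field at offset 0x0B. The sketch below reads both from a boot sector image and reports how much room follows the VBR; the function name is made up for this example. On FAT32 some of the reserved sectors are normally already taken (FSInfo sector, backup boot sector), so treat the result as an upper bound.

```python
import struct

def fat_reserved_area(boot_sector: bytes):
    """Return (reserved_sectors, bytes_after_vbr) from a FAT boot sector's BPB."""
    (bytes_per_sector,) = struct.unpack_from("<H", boot_sector, 0x0B)
    (reserved_sectors,) = struct.unpack_from("<H", boot_sector, 0x0E)
    if bytes_per_sector not in (512, 1024, 2048, 4096) or reserved_sectors == 0:
        raise ValueError("BPB looks invalid")
    # the first reserved sector is the VBR itself
    return reserved_sectors, (reserved_sectors - 1) * bytes_per_sector
```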


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 
 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Wed Feb 21, 2018 6:19 am 

Joined: Tue Jul 04, 2017 12:45 am
Posts: 15
Brendan wrote:
After the MBR is finished it's no longer important and (excluding the information describing the active partition) can overwritten. The VBR can assume that the MBR's stack is still good enough to handle things like IRQs until the VBR decides to use any additional memory (and has to worry about the stack possibly trashing the memory you decided to use, and should probably just setup the stack itself to avoid that).

Note that while the VBR is limited to 512 bytes, that doesn't mean that the boot loader is limited to 512 bytes - e.g. the first 512 bytes of the boot loader (containing the VBR) can load the remainder of the boot loader; and the whole boot loader can be "100 KiB of stuff that loads stage 2".


Is any of this early memory "off-limits", so to speak? I know regions roughly above A000:0000 (linear 0xA0000) are not safe to use, but what can be said of the lower regions?

Brendan wrote:
That depends on the OS and how it's designed. A more important question is, for your OS, how do you want the VBR to find the next stage?

For my OS, I have a reserved area at the start of the partition containing various things (e.g. the boot loader, the second stage, a "boot image" that contains lots of files) plus a few fields in the 1st sector of the partition that says where these other things are. Utilities that create/update files used during boot (in the reserved area) also update the fields in the 1st sector; and the file system (if there is one) doesn't touch anything in the reserved area.


This is true, of course. However, I do not plan to roll my own filesystem for now. That said, I need to play nicely with other filesystems and partitioning schemes. Obviously I will have an MBR at the beginning, and the partition table will point somewhere on disk, allowing me to find the VBR for the active partition. My confusion is with respect to how all of this works together in an orderly way, or at least what the "norms" are. I ask because one could install GRUB (via a package containing the grub-install command) into an already-existing partitioning scheme and filesystem. In this case GRUB could just overwrite the MBR and VBR (since on an MBR system these would already be in place), but I wonder how it finds and loads its final stage? It is open source, so perhaps I can find answers there.

Brendan wrote:
How you format a partition is up to you - the only restriction is that if you want a VBR then the VBR has to be in the 1st sector of the partition. However; if you decide you want to use an existing file system (rather than using none, or designing your own file system) then you also have to follow the rules of that file system.


Indeed. My first thought was similar to what you describe above: some reservation of disk space at the beginning of the partition for these various things. Of course, these various loaders would have knowledge of each other and know where to find the next stage and how to load it. However, none of that applies when working with existing filesystems. I could imagine a case where the MBR points to a real partition, but the bootloaders are sitting in reserved LBAs preceding it. There are obviously countless ways to implement this, but there must also be some agreed-upon methodology. Otherwise various bootloaders could not be drop-in replacements for each other, right? If there is no "standard" way then it would seem that each partitioning scheme would necessarily be bootloader-specific, yet we know this not to be true (e.g. enabling dual boot after installing Windows).

Are filesystem drivers "MBR-aware"? If the MBR points to a partition, drivers must know to first skip the VBR to get to the filesystem data structures.


 
 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Wed Feb 21, 2018 6:48 am 

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7612
Location: Germany
An MBR (or some other VBR...) loading a VBR is called "chain loading", because the mechanism and environment are much the same for every *BR: copy yourself somewhere safe, load a sector from disk to 0x00007C00, jump there.

pragmatic wrote:
I need to play nicely with other filesytems and partitioning schemes.


From a practical angle, you should probably assume that the VBR (on your OS' partition) is yours, while the MBR is the user's. If you fiddle with the MBR, and want to be accepted by a user, you'd have to offer a boot selection menu, and be capable of chainloading e.g. Windows and Linux, i.e. whatever the user will be using when he's not giving your OS a try.

Hence, personally, I would basically forget about the MBR itself, and focus on your VBR being chain-loaded by GRUB or whatever other bootloader / boot manager the user might have installed.

pragmatic wrote:
In this case GRUB could just overwrite the MBR and VBR (since on an MBR system these would already be in place), but I wonder how it finds and loads its final stage?


GRUB is a two-stage bootloader. AFAIR (back from GRUB "legacy", pre-v2.0), the MBR has the location of the second stage hardcoded into it at installation. (Which is why you need to re-run the GRUB "install" after you changed the second-stage files, e.g. during a GRUB update.) So the MBR (GRUB stage 1) loads the menu / command line interface (GRUB stage 2). That stage 2 is usually residing in /boot/grub/ of the Linux that installed GRUB on the machine.

GRUB (again, pre-v2.0 knowledge here) could either "multiboot" an OS (using the capabilities and interfaces GRUB offered), or chainload a VBR off some partition (which is how GRUB boots a Windows partition, and the "standard" pre-EFI way of doing things).

It's even possible to install GRUB, not in the MBR, but as a Linux installation's VBR -- and have your MBR select between booting your OS in whatever way you like, or chainloading GRUB.

I hope this paints a clearer picture for you.

_________________
Every good solution is obvious once you've found it.


 
 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Wed Feb 21, 2018 7:43 pm 

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

pragmatic wrote:
Is any of this early memory "off-limits", so to speak? I know regions roughly above 000a:0000 are not safe to use, but what can be said of the lower regions?


For BIOS boot loaders; typically you begin with an assumption like "RAM from 0x00001000 to 0x00080000 is always safe to use, and the CPU might be an ancient 8086", then check if the CPU is ancient (and display a "your CPU is too old" message and refuse to boot if it's a crusty old 8086 or something) before attempting to get a memory map from the BIOS (which should start with trying "int 0x15, eax = 0xE820", which uses 32-bit registers and is the reason why you make sure the CPU is at least 80386 before trying). You'd also make sure the A20 gate is enabled before using memory that needs it enabled.

Of course where and when these things happen is up to you - you can do as much as possible or as little as possible while still using the "RAM from 0x00001000 to 0x00080000 is always safe to use, and the CPU might be an ancient 8086" assumption.
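
One way to keep yourself honest about that assumption is to write the planned memory layout down and check it mechanically before coding it in assembly. The Python sketch below does this for the "0x00001000 to 0x00080000 is safe" window quoted above; the region names and addresses in the usage example are illustrative choices, not a recommended layout.

```python
SAFE_START, SAFE_END = 0x1000, 0x80000   # the assumed-safe window from the post

def check_layout(regions):
    """regions maps a name to (start, length); raise if any region leaves the
    assumed-safe window or if two regions overlap."""
    spans = sorted((start, start + length, name)
                   for name, (start, length) in regions.items())
    for start, end, name in spans:
        if start < SAFE_START or end > SAFE_END:
            raise ValueError(f"{name} leaves the assumed-safe window")
    # after sorting by start, any overlap shows up between neighbours
    for (_, end_a, a), (start_b, _, b) in zip(spans, spans[1:]):
        if start_b < end_a:
            raise ValueError(f"{a} overlaps {b}")
    return True

# Example plan: stack just below the boot sector, disk buffer above it.
check_layout({
    "stack":    (0x7000, 0x0C00),
    "vbr":      (0x7C00, 0x0200),
    "disk_buf": (0x8000, 0x8000),
})
```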

pragmatic wrote:
Brendan wrote:
That depends on the OS and how it's designed. A more important question is, for your OS, how do you want the VBR to find the next stage?

For my OS, I have a reserved area at the start of the partition containing various things (e.g. the boot loader, the second stage, a "boot image" that contains lots of files) plus a few fields in the 1st sector of the partition that says where these other things are. Utilities that create/update files used during boot (in the reserved area) also update the fields in the 1st sector; and the file system (if there is one) doesn't touch anything in the reserved area.


This is true, of course. However, I do not plan to roll my own filesystem for now. That said, I need to play nicely with other filesytems and partitioning schemes.


I don't want to design my own file system yet either (I want to eventually, but not yet). That's the nice thing about "boot code uses reserved area at start of the partition" - you can design the file system much much later. For "boot code uses files from the file systems" you have to design the file system before you can finish writing the boot code.

Your boot code doesn't need to play nicely with other file systems. More specifically; you can put partitions into 3 categories:
  • Native partitions that belong to your OS, where your boot code must support at least one of them
  • Partitions that belong to other OSs, which you have no reason to care about or touch
  • Partitions that exist for sharing data between different operating systems (e.g. a FAT data partition that can be used by all operating systems that happen to be installed), which no OS boots from and no boot loader needs to support

You do need to play nicely with the partitioning scheme; but on 80x86 there's mostly only 2 partitioning schemes ("MBR partitions" and GPT) and as long as your boot code knows where your partition begins and ends it doesn't need to care which partitioning scheme is being used. Note: The BSD group of operating systems tend to use "slices", but on 80x86 systems these are implemented on top of other partitioning schemes (e.g. so that you end up with a partition that is split into pieces/slices and don't have a whole disk that is split into slices).

pragmatic wrote:
Obviously I will have an MBR at the beginning, and the MBR table will point somewhere in disk allowing me to find the VBR for the active partition. My confusion is with respect to how all of this works together in an orderly way, or at least what the "norms" are. I ask because one could use install GRUB (via a package containing the grub-install command) into an already-existing partitioning scheme and filesystem. In this case GRUB could just overwrite the MBR and VBR (since on an MBR system these would already be in place), but I wonder how it finds and loads its final stage? It is open source .. so perhaps I can find answers there.


GRUB can (and very likely will) overwrite the MBR, but shouldn't overwrite your VBR (unless the user is deliberately removing your OS); and GRUB does support chainloading and can be used to start your VBR the same as any other MBR or boot manager would (e.g. loading your VBR at 0x00007C00 and jumping to it). That's part of what I was saying earlier - the MBR isn't part of the OS.

More generally; the entire first track of the disk and/or the area at the start of the disk before the beginning of the first partition belongs to a "third party whatever the user feels like boot manager" and no OS should touch it ever; the end user can install many different OSs, and after installing many OSs the end user can change their "boot manager" at any time.

Of course (for convenience and no other reason) it's normal for an OS installer to ask the user if they want a (supplied alongside the OS) "minimal MBR with no features" to be installed, because a lot of people only install one OS and for that case it saves the end user from having to find/install their own MBR or boot manager.

Unfortunately, everything I've said here uses the word "should" because sometimes different OSs don't do what they should. For example, one well known family of operating systems is created by a-holes (Microsoft) and their OS installer/s will overwrite the MBR without asking (not because it's hard to do the right thing, but because they don't want to care about competing operating systems).

pragmatic wrote:
Brendan wrote:
How you format a partition is up to you - the only restriction is that if you want a VBR then the VBR has to be in the 1st sector of the partition. However; if you decide you want to use an existing file system (rather than using none, or designing your own file system) then you also have to follow the rules of that file system.


Indeed. My first thought was similar to what you describe above: some reservation of memory at the beginning of the partition for these various things. Of course, these various loaders would have knowledge of each other and know how to where to find the next stage and how to load them. However, none of that applies when working with existing filesystems.


That depends - it can apply when working with some file systems (e.g. FAT, which does support "arbitrary number of reserved sectors at start of partition") but can't apply when working with some other file systems.

pragmatic wrote:
There are obviously countless ways to implement this, but there must also be some agreed upon methodology. Otherwise various bootloaders could not be drop-in replacements for each other, right? If there is no "standard" way then it would seem that each partitioning scheme be necessarily bootloader-specific, yet we know this to not be true (i.e. enabling dual boot after installing Windows).


Traditionally; a boot loader is custom designed specifically for one specific OS (and may also be designed for the type of firmware and/or type of boot device - e.g. one boot loader to boot from CD-ROM, another to boot from network, etc); and each OS has its own "boot protocol" and/or standard/s that describes exactly what the OS expects from boot loaders designed for that specific OS; and there is no standard boot loader for all OSs; and the only standards shared by all OSs are standards that describe the environment that the boot loader begins with (e.g. BIOS, El Torito, PXE, UEFI, OpenFirmware, ...).

The only exception to this (that I know of) is multi-boot; which mostly only exists because GRUB developers wanted to break traditions and ruin everything. They succeeded in ruining a lot, but mostly failed to get multi-boot adopted and ended up implementing multiple different boot protocols/standards for multiple different OSs.

pragmatic wrote:
Are filesystems drivers "MBR-aware"? If the MBR points to a partition, drivers must know to first skip the VBR to get to the filesystem data structures.


There's no reason for file system code to be "MBR-aware". File system code should rely on a lower level "volume" abstraction; where the underlying volume might be a partition, or a whole disk, or many disks combined with RAID, or a file, or an area of RAM, or anything else.
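
A minimal Python sketch of such a "volume" abstraction is shown below, assuming a seekable file-like backing store. The class and method names are invented for this example. File system code written against `read_sectors` only ever sees volume-relative sector numbers, so it never needs to know where (or whether) a partition table exists.

```python
import io

class Volume:
    """A run of sectors inside some backing store (disk image, file, RAM)."""

    def __init__(self, backing, first_sector, sector_count, sector_size=512):
        self.backing = backing
        self.first_sector = first_sector
        self.sector_count = sector_count
        self.sector_size = sector_size

    def read_sectors(self, lba, count=1):
        """Read `count` sectors starting at volume-relative sector `lba`."""
        if lba < 0 or count < 1 or lba + count > self.sector_count:
            raise ValueError("read outside the volume")
        self.backing.seek((self.first_sector + lba) * self.sector_size)
        data = self.backing.read(count * self.sector_size)
        if len(data) != count * self.sector_size:
            raise IOError("short read from backing store")
        return data

# An 8-sector "disk" where every byte of sector i has the value i,
# with a partition covering sectors 4..7.
disk = io.BytesIO(b"".join(bytes([i]) * 512 for i in range(8)))
partition = Volume(disk, first_sector=4, sector_count=4)
boot_sector = partition.read_sectors(0)   # sector 0 of the volume is disk sector 4
```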

If the file system is designed to support VBR (e.g. and has a "reserved for boot code" area that's at least 512 bytes at the start of the volume) then the file system code has to comply with the design/specification of the file system; and if the file system is not designed to support VBR (and doesn't have any "reserved for boot code" space at the start of the volume) then the file system code has to comply with the design/specification of the file system.


Cheers,

Brendan

 
 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Thu Feb 22, 2018 7:18 am 

Joined: Tue Jul 04, 2017 12:45 am
Posts: 15
Thanks everyone. This discussion has been very interesting. I took some time yesterday to investigate GRUB by stepping through the MBR/VBR in QEMU and browsing the disassembly via objdump and IDA. Interestingly, I found that GRUB does not adhere to the common model of moving and reloading code at 0x7c00. I have attached the extracted code sections, but what I observed was (using int13 with ah=0x42):


- GRUB MBR gets execution at 0000:7c00 (LBA0)
    Load 1 sector from LBA1 to 7000:0000
    Copy 0x200 (i.e. 1 sector) bytes from 7000:0000 to 0000:8000
    Jump to 0000:8000

- GRUB stage 1 gets execution at 0000:8000
    Load 67 sectors from LBA2 to 7000:0000
    Copy 0xce00 bytes (i.e. 67 sectors) from 7000:0000 to 0820:0000
    Jump to 0000:8200

- Grub stage 2 gets execution at 0000:8200
    At this point it very quickly moves into protected mode. The source-code can be found here
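
The addresses in this trace are real-mode segment:offset pairs, so it can help to convert them to linear addresses (segment * 16 + offset). The short Python check below does that for the values listed above. It also shows that the two size figures given for the second copy do not match each other: 67 sectors is 0x8600 bytes, while 0xce00 bytes is 103 sectors. From the post alone there is no way to tell which figure is the intended one.

```python
SECTOR = 512

def linear(segment, offset):
    """Real-mode segment:offset to linear address."""
    return (segment << 4) + offset

print(hex(linear(0x7000, 0x0000)))   # 0x70000, the temporary load buffer
print(hex(linear(0x0820, 0x0000)))   # 0x8200, same byte as 0000:8200
print(hex(67 * SECTOR))              # 0x8600
print(0xCE00 // SECTOR)              # 103
```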

I was only able to find the source for the stage 2 bootloader in the source tree. The partitioning scheme I used was an MBR with the first 2048 sectors reserved (one for the MBR, the other 2047 unused). This is the default and cannot be changed when partitioning with the fdisk utility. It seems GRUB noted this extra space and installed itself in these reserved sectors. Interestingly, other utilities, like cfdisk, do not require reserving these 2048 sectors before the first partition.

I wonder how GRUB would install itself if there were not sufficient space in this "reserved" area. The default for cfdisk appears to be to begin the first partition at sector 63 rather than 2048. In this case, given that GRUB uses the first 69 sectors, it must resort to a different installation scheme; perhaps GRUB uses a VBR in this case. I checked sector 2048 on my drive and there is no VBR. Once I formatted the partition with FAT32 I was able to find the VBR, but it is a default one installed with the filesystem. If you attempt to execute it, the string "This is not a bootable disk. Please insert a bootable ..." is printed.


Attachments:
grub_stages.tar.gz [52.03 KiB]
 
 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Thu Feb 22, 2018 10:08 am 

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

pragmatic wrote:
Thanks everyone. This discussion has been very interesting. I took some time yesterday to investigate GRUB by stepping through the MBR/VBR in QEMU and browsing the disassembly via objdump and IDA. Interestingly, I found that GRUB does not adhere to the common model of moving and reloading code at 0x7c00.


Yes, GRUB is far more complex/bloated than a simple (512-byte, "load VBR and little else") MBR. Despite this, (after doing a huge amount of unrelated stuff) it still makes sure it's not using the 512 bytes at 0x00007C00 itself before loading the VBR at 0x00007C00 and then jumping to the VBR.

pragmatic wrote:
I wonder how GRUB would install itself if there were not sufficient memory in this "reserved" area. The default for cfdisk appears to be to begin the filesystem at sector 63 rather than 2048. In this case, given that GRUB has done in the first 69 sectors, it muse resolve to a different installation scheme; perhaps GRUB uses a VBR in this case. I checked sector 2048 on my drive and there is no VBR. Once I formatted the partition with FAT32 I was able to find the VBR, but it is a default one installed with the filesystem. If you attempt to execute it the string "This is not a bootable disk. Please insert a bootable ..." is printed.


When I said "GRUB succeeded in ruining a lot" earlier; one of the things I was referring to is that GRUB ruins the separation between "boot manager" (the thing that's used to choose which boot loader to start, that belongs at the start of the disk and includes the MBR) and "boot loader" (the thing designed to start the chosen OS, which should be part of the OS itself). Because of this; when it's using multi-boot (or "Linux boot protocol" or whatever the BSD's use, or ...) it loads file/s from a file system without touching the VBR at all.

However; if GRUB is configured to chain-load an operating system's own boot loader then it does act as a pure boot manager and does load the chosen operating system's VBR.


Cheers,

Brendan

 
 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Thu Feb 22, 2018 11:52 am 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5103
pragmatic wrote:
- What regions/how much memory are available at this point? The MBR is loaded at 0x7c00 (as is the VBR), but what about for a stack and other memory needs (like blocks from disk)?

You can use INT 0x12 to see the amount of memory available below the 1MB mark. This measure includes RAM used by the IVT, BDA, and your boot code, so you'll need to take those into account when deciding where to put your stack and other data (like disk buffers). All PC-compatibles will report at least 64kB through INT 0x12. I believe 32-bit PCs will always report at least 256kB, but I don't know for sure if that's correct. You're much more likely to see values like 512kB or slightly-less-than-640kB.
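
As a worked example of that accounting: INT 0x12 returns the size in KiB, so a reported value of 639 puts the top of usable conventional memory at 639 * 1024 = 0x9FC00. The Python sketch below subtracts the IVT (0x000 to 0x3FF), the BDA (0x400 to 0x4FF), and the boot sector at 0x7C00 to list what is left. The function name is made up for this example, and it takes the 0x500 lower bound at face value rather than adopting the more conservative 0x1000 used elsewhere in this thread.

```python
IVT_BDA_END = 0x500                     # IVT and BDA occupy everything below this
BOOT_START, BOOT_END = 0x7C00, 0x7E00   # the 512-byte boot sector itself

def free_low_memory(int12_kib):
    """Return the free (start, end) ranges below the INT 0x12 limit."""
    top = int12_kib * 1024
    if top <= BOOT_END:
        raise ValueError("INT 0x12 value too small to hold the boot sector")
    return [(IVT_BDA_END, BOOT_START), (BOOT_END, top)]

for start, end in free_low_memory(639):
    print(hex(start), hex(end))
```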

pragmatic wrote:
- How does the VBR know where to find the next/final bootloader stage? It is also limited to 512 bytes in size, so it cannot contain logic to parse filesystems.

It most certainly can, although Brendan will argue there's not enough room to do it well.


 
 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Thu Feb 22, 2018 8:15 pm 

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Octocontrabass wrote:
pragmatic wrote:
- How does the VBR know where to find the next/final bootloader stage? It is also limited to 512 bytes in size, so it cannot contain logic to parse filesystems.

It most certainly can, although Brendan will argue there's not enough room to do it well.


Yes; it's possible to write a "single sector FAT boot loader", but it's impossible to write a "single sector FAT boot loader" that is good enough to be acceptable.

For things like boot code, where failure can often mean that an entire computer becomes unusable, which (for "single computer" people) can mean things like documentation and the internet can't be used to help figure out how to fix the problem; I expect clear plain English error messages that help the user figure out how to correct the problem (if possible), which includes sending a bug report to developers with useful information (if necessary).

For a FAT boot loader there's two layers of errors to check and report. For the low level disk IO, you'd want things like:
  • Unsupported function (indicating software bug or incompatibility)
  • Device not responding (maybe the user removed/unplugged the device, maybe MBR is buggy and gave the wrong "device number")
  • CRC error/bad sector (device is likely faulty)
For FAT initialisation you'd want things like:
  • BPB not present/too many fields look dodgy (partition not formatted properly, OS not installed properly, maybe MBR gave wrong partition table entry)
  • Unsupported "bytes per sector" or "media type" or ... (maybe the tool used to create the file system is too new and does something the boot loader doesn't support)
Then for finding and loading the file you'd want:
  • Cluster number > max. (file system is corrupt and needs to be reformatted)
  • Directory entry failed sanity checks (file system is corrupt)
  • File not found (OS not installed properly)
  • File size is too large to be a valid "stage 2" (OS not installed properly)
Then after the file is loaded you'd want to check the file's header to make sure it actually does look like a valid "stage 2" and isn't some random text file or something else.
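To illustrate the FAT initialisation checks listed above, here is a minimal Python sketch. It is only a model of the logic: a real boot sector would do this in 16-bit assembly. The function name `check_bpb`, the message wording, and the accepted sector sizes are assumptions made for the example; the field offsets are the standard FAT BPB offsets.

```python
import struct

# Sector sizes this hypothetical loader supports (an assumption for the sketch).
VALID_SECTOR_SIZES = (512, 1024, 2048, 4096)


def check_bpb(sector: bytes) -> list:
    """Return a list of plain-English problems found in a FAT boot sector."""
    if len(sector) < 512:
        return ["Boot sector is shorter than 512 bytes (disk read failed?)"]

    errors = []
    if sector[510:512] != b"\x55\xAA":
        errors.append("Boot signature missing (partition not formatted properly?)")

    # BPB fields are little-endian and start at byte 11 of the sector.
    (bytes_per_sector, sectors_per_cluster, reserved_sectors, fat_count,
     _root_entries, _total_sectors16, media, _fat_size16) = struct.unpack_from(
        "<HBHBHHBH", sector, 11)

    if bytes_per_sector not in VALID_SECTOR_SIZES:
        errors.append(f"Unsupported bytes per sector: {bytes_per_sector}")
    # Sectors per cluster must be a non-zero power of two.
    if sectors_per_cluster == 0 or sectors_per_cluster & (sectors_per_cluster - 1):
        errors.append(f"Invalid sectors per cluster: {sectors_per_cluster}")
    if reserved_sectors == 0:
        errors.append("Reserved sector count is zero (BPB looks corrupt)")
    if fat_count == 0:
        errors.append("Number of FATs is zero (BPB looks corrupt)")
    # Valid media descriptors are 0xF0 and 0xF8 through 0xFF.
    if media != 0xF0 and media < 0xF8:
        errors.append(f"Unsupported media type: {media:#04x}")
    return errors
```

Even this short list needs several distinct message strings, which is the space problem in a 512-byte sector.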

This probably adds up to more than 512 bytes of text strings alone; and the code to do the checking is going to cost at least that much again. Of course that's ignoring things like fault tolerance completely (e.g. no ability to use the secondary/backup "Cluster Allocation Table" if there's a problem with the first; no ability to try to load an alternative copy of "stage2.bin" if there's a problem with the first, etc).

I have never seen a "single sector FAT boot loader" that correctly handles any of the "something isn't how it should be" cases. Almost all of them are pure trash that simply crash in unexpected ways (without providing any information for the end user or the developer) as soon as any of many things is a little wrong. For one of many very simple examples, for your code (as far as I can tell) if someone accidentally replaced the "stage2.bin" with a 1 MiB file your boot loader would happily fail to notice, then overwrite the EBDA, then crash.


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Fri Feb 23, 2018 5:44 am 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5103
Brendan wrote:
Yes; it's possible to write a "single sector FAT boot loader", but it's impossible to write a "single sector FAT boot loader" that is good enough to be acceptable.

That depends on what you consider "acceptable". Not everyone wants the same things you do.

Brendan wrote:
For things like boot code, where failure can often mean that an entire computer becomes unusable, which (for "single computer" people) can mean things like documentation and the internet can't be used to help figure out how to fix the problem; I expect clear plain English error messages that help the user figure out how to correct the problem (if possible), which includes sending a bug report to developers with useful information (if necessary).

How will they fix it themselves with no way to boot the computer?

Once they do have a way to boot the computer, most problems can be detected by an automatic "help my computer stopped working" diagnostic tool, which will have plenty of opportunity to display detailed instructions for the user to follow, assuming it doesn't fix the problem by itself.

Brendan wrote:
For a FAT boot loader there's two layers of errors to check and report. For the low level disk IO, you'd want things like:
  • Unsupported function (indicating software bug or incompatibility)
  • Device not responding (maybe the user removed/unplugged the device, maybe MBR is buggy and gave the wrong "device number")
  • CRC error/bad sector (device is likely faulty)

Most of these issues can be detected by automatic diagnostics. The only one that can't is software bugs, and I'm pretty confident there are no situations where my bootloader might try to call the wrong BIOS function.

Brendan wrote:
For FAT initialisation you'd want things like:
  • BPB not present/too many fields look dodgy (partition not formatted properly, OS not installed properly, maybe MBR gave wrong partition table entry)
  • Unsupported "bytes per sector" or "media type" or ... (maybe the tool used to create the file system is too new and does something the boot loader doesn't support)
Then for finding and loading the file you'd want:
  • Cluster number > max. (file system is corrupt and needs to be reformatted)
  • Directory entry failed sanity checks (file system is corrupt)
  • File not found (OS not installed properly)
  • File size is too large to be a valid "stage 2" (OS not installed properly)
Then after the file is loaded you'd want to check the file's header to make sure it actually does look like a valid "stage 2" and isn't some random text file or something else.

All of these issues can be detected by automatic diagnostics, and most situations where they might occur can be prevented with simple sanity checks in the tool(s) that install the boot loader and manage the boot partition.

Brendan wrote:
For one of many very simple examples, for your code (as far as I can tell) if someone accidentally replaced the "stage2.bin" with a 1 MiB file your boot loader would happily fail to notice, then overwrite the EBDA, then crash.

My boot loader will report an error, without overwriting the EBDA or crashing. I know there are some situations where it's not so well-behaved, but at least this is not one of them.


 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Fri Feb 23, 2018 5:54 am 

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7612
Location: Germany
Octocontrabass wrote:
How will they fix it themselves with no way to boot the computer?


Not being able to boot from hard drive doesn't mean you can't boot the computer at all. Things like bootable USB sticks or "Live CDs" exist, and I would assume anyone giving YourOS a try will have some of those lying around, and the know-how to use them to recover a trashed MBR / boot manager.

But as I said earlier, touching the MBR isn't something I would recommend YourOS to do; not until you have gained a lot of traction in the community. Until then, I'd assume your responsibility starts at the VBR, with the MBR left to one of the "big" solutions (GRUB...). That also means that you won't render the whole system unbootable, just YourOS.

That being said, and with regard to the rest of your post, you seem to work under the assumption that your software will always work as intended. That is a very dangerous assumption...

_________________
Every good solution is obvious once you've found it.


 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Fri Feb 23, 2018 10:21 am 

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

Octocontrabass wrote:
Brendan wrote:
Yes; it's possible to write a "single sector FAT boot loader", but it's impossible to write a "single sector FAT boot loader" that is good enough to be acceptable.

That depends on what you consider "acceptable". Not everyone wants the same things you do.


Really? I'd assume that at least 75% of people who have used/maintained a computer for a few years are sick of dodgy pieces of crap that frequently fail and often "fail badly" (without any useful information that leads to a viable fix or work-around) because of some incompetent moron who didn't care about the quality of their software.

Today I decided it'd be fun to update my games machine's video card driver. I downloaded the latest driver from AMD's web site, tried to install it, and Windows refused because the driver wasn't digitally signed(!). I tried the other 2 options on AMD's web site (a slightly older version, and a "minimal" thing) which both had the same problem. I tried re-installing the old version I already had (which was working before I started messing around with drivers), and that timed out because it was trying to fetch something from the internet that no longer exists. I wasted around 30 minutes waiting for worthless bloat to download (AMD's drivers are packaged with a huge amount of nonsense and are over 400 MiB each), plus another hour trying to find "solutions" that don't work (like disabling digital signature checks only to find that it's temporary and has to be done every time you boot); and after all this hassle I'm left with an older driver than when I started because it's the only thing I could actually install. This is the sort of thing you get when developers define "acceptable" as sacrificing quality for the sake of the developers' own convenience.

Octocontrabass wrote:
Brendan wrote:
For things like boot code, where failure can often mean that an entire computer becomes unusable, which (for "single computer" people) can mean things like documentation and the internet can't be used to help figure out how to fix the problem; I expect clear plain English error messages that help the user figure out how to correct the problem (if possible), which includes sending a bug report to developers with useful information (if necessary).

How will they fix it themselves with no way to boot the computer?


One major step in fixing a problem is figuring out the cause. If the problem is faulty hardware then replacing software won't help. If the problem is a corrupt file system then replacing the boot loader won't help. If the problem is the boot loader then reformatting the file system won't help... If the error message is "Error: 123" (or worse - the software is so bad that it crashes without any error message at all) then a user (who is already annoyed that something failed) becomes frustrated very quickly because they have no idea what happened (and have to waste their time searching for clues to make up for the developer's stupidity).

Octocontrabass wrote:
Once they do have a way to boot the computer, most problems can be detected by an automatic "help my computer stopped working" diagnostic tool, which will have plenty of opportunity to display detailed instructions for the user to follow, assuming it doesn't fix the problem by itself.


Have you ever used one of those diagnostic tools??? My "favourite" is the one Windows uses when an application crashes - it's a dialog box that says something like "searching for a solution" for about 5 whole minutes that has probably never found a single solution to anyone's problem in the entire 10+ years that it's existed. These tools are the equivalent of salt to rub in the victim's wounds.

Octocontrabass wrote:
Brendan wrote:
For a FAT boot loader there's two layers of errors to check and report. For the low level disk IO, you'd want things like:
  • Unsupported function (indicating software bug or incompatibility)
  • Device not responding (maybe the user removed/unplugged the device, maybe MBR is buggy and gave the wrong "device number")
  • CRC error/bad sector (device is likely faulty)

Most of these issues can be detected by automatic diagnostics. The only one that can't is software bugs, and I'm pretty confident there are no situations where my bootloader might try to call the wrong BIOS function.


So the user needs to use their computer for something and tries to boot, but something goes wrong. The user gets annoyed very quickly because whatever they needed to do has to be postponed or cancelled. Next they read the error message (if there is one) and the error message is pure trash. Now the user is extremely annoyed. Somehow (with no way to use their computer to find help or download a recovery tool) after about 30 minutes of pure frustration they find some other mystical way to obtain and start some bloated (and typically patronising) "diagnostic tool". Now the user is enraged. They start the diagnostic tool, but it's written by the same incompetent developer that couldn't even get basic error messages right so you can expect that it's more likely that the diagnostic tool will crash and/or make things worse. If the developer was in the same room as the end user, they'd be dead by now - bludgeoned to death by a nice old lady who only wanted to look up a recipe for banana cake in the 10 minutes she had before going to church.


Cheers,

Brendan



 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Sat Feb 24, 2018 9:02 pm 

Joined: Sat Feb 27, 2010 8:55 pm
Posts: 147
Brendan wrote:
Hi,
If the error message is "Error: 123" (or worse - the software is so bad that it crashes without any error message at all) ...



I've been trying to write a better bootloader and this is a problem I've had: One of the first things the 512-byte VBR does is perform a checksum check; if it fails, it simply halts. It, as you say, "crashes without any error message at all," but I've kinda been stuck here. What else can I do? If it's corrupted I can't trust it to output any sane message (I can't trust it to do anything). The "best" solution I've come up with is that if the test succeeds it prints "Checksum valid"; that way at least the user knows whether or not it made it that far (if that message never shows up, the VBR has been corrupted). Is there a better way to do this?
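To make the approach concrete, here is a minimal Python sketch of a self-check over a 512-byte sector. The scheme is an assumption for illustration only, since the post does not say which checksum the VBR uses: all 256 little-endian 16-bit words must sum to zero modulo 65536, with a hypothetical correction word stored at offset 508, just before the 0x55AA signature. The names `word_sum`, `seal` and `checksum_ok` are likewise made up for the example.

```python
import struct

CHECKSUM_OFFSET = 508  # hypothetical location of the correction word


def word_sum(sector: bytes) -> int:
    """Sum of all 256 little-endian words in the sector, modulo 2**16."""
    return sum(struct.unpack("<256H", sector)) & 0xFFFF


def seal(sector: bytes) -> bytes:
    """Install-time step: store the word that makes the sector sum to zero."""
    buf = bytearray(sector)
    struct.pack_into("<H", buf, CHECKSUM_OFFSET, 0)
    correction = (-word_sum(bytes(buf))) & 0xFFFF
    struct.pack_into("<H", buf, CHECKSUM_OFFSET, correction)
    return bytes(buf)


def checksum_ok(sector: bytes) -> bool:
    """Boot-time step: the VBR would print "Checksum valid" only if this holds."""
    return word_sum(sector) == 0
```

The install tool would call `seal` once when writing the sector; at boot, the equivalent of `checksum_ok` decides between printing the message and halting.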


 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Sat Feb 24, 2018 9:11 pm 

Joined: Sat Feb 27, 2010 8:55 pm
Posts: 147
pragmatic wrote:
In essence, my questions are:
- What regions/how much memory are available at this point? The MBR is loaded at 0x7c00 (as is the VBR), but what about for a stack and other memory needs (like blocks from disk)?


0x0000-0x0400 = IVT
0x0400-0x0500 = BDA
0x0500-0x7c00 = available
0x7c00-0x7e00 = your VBR
0x7e00-0x8000 = available
0x8000-0x80000 = available on any computer that isn't so ridiculously ancient that it doesn't even have 512 KiB of RAM. Check int 12h (which returns the amount of contiguous low memory in KiB) to know for sure.
0x80000-0x9E000 = may be EBDA, but probably available on non-ancient computers. Again, check int 12h to know for sure.
0x9E000-0xA0000 = probably EBDA (but it may start as low as 0x80000)
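
The map above can be turned into a bounds check before loading anything. The following Python sketch models the arithmetic a loader would do with the int 12h result; the function names and the idea of passing the int 12h value as a parameter are assumptions for illustration (real boot code would do this in assembly with the value returned in AX).

```python
LOW_FREE_START = 0x0500              # first byte after the BDA
VBR_START, VBR_END = 0x7C00, 0x7E00  # the running boot sector


def low_memory_top(int12_kib: int) -> int:
    """int 12h reports contiguous low memory in KiB; convert to a byte address."""
    return int12_kib * 1024


def can_load(load_addr: int, size: int, int12_kib: int) -> bool:
    """True if [load_addr, load_addr + size) is safe to overwrite."""
    end = load_addr + size
    if size <= 0 or load_addr < LOW_FREE_START:
        return False  # would clobber the IVT or BDA
    if end > low_memory_top(int12_kib):
        return False  # would run into the EBDA or beyond
    # Must not overlap the boot sector that is currently executing.
    return end <= VBR_START or load_addr >= VBR_END
```

For example, with int 12h reporting 639 KiB, a 32 KiB stage 2 at 0x8000 passes the check, while a 1 MiB file at the same address is rejected instead of overwriting the EBDA.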


 Post subject: Re: Questions on the legacy loading process (MBR -> VBR ->)
PostPosted: Mon Feb 26, 2018 4:46 am 

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7612
Location: Germany
azblue wrote:
What else can I do? If it's corrupted I can't trust it to output any sane message (I can't trust it to do anything).


Undefined behaviour is undefined; if your code is corrupted, it is corrupted. You can't do jack about that.

Try to cover your bases as best you can, but don't fret about those parts you can't do "safely".

(Something I had to realize when I wrote PDCLib: I wanted to cover all the bases with the code I wrote, but there is only so much you can do -- which is why C leaves some areas "undefined", as they cannot be handled reliably, or efficiently. In the end, I abandoned any intention to cover for conditions the standard declared UB...)

_________________
Every good solution is obvious once you've found it.

