OSDev.org
https://forum.osdev.org/

Why relocate MBR?
https://forum.osdev.org/viewtopic.php?f=1&t=34274
Page 2 of 2

Author:  MichaelPetch [ Fri Sep 06, 2019 9:16 am ]
Post subject:  Re: Why relocate MBR?

When hobby programmers come here (or to Stack Overflow) with an OS question, wondering why their bootloader is exhibiting odd behaviour, and the cause turns out to be what the BIOS is doing, I think it is a concern. These BIOS peculiarities are not as rare as you think.

The first time I heard (from someone else) about a BIOS that read the MBR on partitioned media in USB HDD mode, looking for an active partition, I assumed they were doing something stupid like using Windows to create a USB stick and then placing their boot sector in the partition rather than in the first sector of the drive (something I had seen people do countless times in the past). Often people use the wrong tools to write the first sector of the drive and end up writing to the beginning of the partition instead. Windows puts a bootloader on the USB drive that loads the VBR of the one and only partition it generates by default. Then, when I investigated, I discovered to my amazement that such BIOSes exist and that they are doing a lot more than reading the boot sector and checking for 0xaa55.

As for the BIOSes that blindly overwrite the BPB area of a boot sector (after it has been loaded into memory) when using FDD emulation: that case has become so common that I wrote a canonical answer on SO that has become a dupe target for the issue. The issue seems to apply to most new machines these days. Almost every time someone now says "when I boot on my real hardware, the code runs, but doesn't work", my first question is "Is that USB using FDD emulation?" After they go to that answer, make the change and add a dummy BPB, and their problem is solved, I close the question and mark it as a dupe.

On a side note, a few years back someone asked me about a bootloader that wouldn't boot on one particular machine (a laptop), although he had access to many other machines that had no problem with it. A friend of mine had such a device (I believe it was an Acer), so I borrowed their machine and discovered that a simple bootloader that didn't have a BPB (but had a proper 0xaa55 signature) refused to boot as USB using FDD emulation.

To my amazement, the BIOS was found to be not only checking for the boot signature but also checking for what appeared to be an instruction it recognized in the first few bytes. In particular, I found that there had to be a JMP or an XOR as the first instruction; it refused to boot because the first instruction was a CLI. LOL. If you put a JMP or an XOR before the CLI, it was identified as bootable media. At first I thought they were looking for just a JMP to determine whether there was a BPB, but then discovered that wasn't the case with this BIOS. The BIOS seemed convinced that it had to identify bootable media by also checking for an instruction that is common at the beginning of a bootloader. Stack Overflow fielded a question with nearly the same problem. I didn't answer it, because someone else beat me to it after a flurry of comments with the OP to narrow down the issue.
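
A rough sketch of the kind of check described above, written in Python purely for illustration. The actual BIOS logic is unknown; the opcode list (0xEB/0xE9 for JMP, 0x31/0x33 for XOR) and all function names are assumptions made for this example.

```python
import struct

SECTOR_SIZE = 512

# Opcodes assumed for illustration; the real BIOS's list is unknown.
JMP_OPCODES = {0xEB, 0xE9}   # JMP rel8, JMP rel16
XOR_OPCODES = {0x31, 0x33}   # XOR r/m16, r16 and XOR r16, r/m16


def has_boot_signature(sector: bytes) -> bool:
    """True if the last two bytes are 0x55, 0xAA (the word 0xaa55)."""
    return (len(sector) == SECTOR_SIZE
            and struct.unpack_from("<H", sector, 510)[0] == 0xAA55)


def looks_bootable(sector: bytes) -> bool:
    """Hypothetical stricter check: signature plus a 'familiar' first opcode."""
    return has_boot_signature(sector) and sector[0] in (JMP_OPCODES | XOR_OPCODES)


def make_sector(code: bytes) -> bytes:
    """Pad code to 510 bytes and append the boot signature."""
    return code.ljust(510, b"\x00") + b"\x55\xaa"


cli_first = make_sector(bytes([0xFA, 0x31, 0xC0]))  # cli; xor ax, ax
xor_first = make_sector(bytes([0x31, 0xC0, 0xFA]))  # xor ax, ax; cli

print(looks_bootable(cli_first))  # False
print(looks_bootable(xor_first))  # True
```

Both sectors carry a valid signature; only the order of the first two instructions differs, which matches the behaviour observed on that laptop.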

Author:  MichaelPetch [ Fri Sep 06, 2019 9:29 am ]
Post subject:  Re: Why relocate MBR?

mikegonta wrote:
This merely suggests that the BIOS examines the boot sector for the purpose of (automatically) "emulating" the USB device.
That may be the case on some BIOSes for some people with regards to their bootloaders, but what is being observed by people is real in a lot of cases now. I should know, I have a Lenovo Laptop that does the exact same thing.

The way this usually plays out now is that someone says their bootloader starts but it prints something out incorrectly or doesn't work as expected (like a disk read fails). In most of these cases now, it isn't that their BIOS failed to recognize it as bootable media; it recognized it as bootable and started running it, and basic code failed. What happens is that on an increasing number of machines the BIOS is BLINDLY writing the drive geometry into memory when booting as USB FDD, where it thinks there is a BPB present. The resulting behaviour is undefined, but it often overwrites instructions and makes the code do something unexpected. That SO answer you pointed to is mine, and if you'll note, there is code to dump the bootloader memory out to see if any bytes have changed. In the cases where this issue is a problem, the bootloader I wrote will display different values in the bytes that get overwritten.

So please don't tell me this isn't a real issue. As I said in my comment just before this, I now have a canned set of questions whenever I see USB boot on real media where a bootloader starts running but doesn't do what is expected. This situation is now the single biggest headache I have when helping people on SO with bootloader problems. I scan the bootloader for the traditional mistakes (not setting up the segment registers, writing DL to memory before updating DS, etc.). If the code itself looks fine, I ask questions about USB booting. If they are booting from USB devices with FDD emulation, I tell them to read the canonical answer. Most times they add a BPB and the code now works as expected. (I provide an actual BPB, but it appears you just need a SHORT JMP, a NOP, and a minimal-sized BPB filled with zeros if you aren't actually using a file system like FAT.)

So I can say this unequivocally: there are BIOSes that read the VBR from unpartitioned media (when using USB FDD emulation) into memory at physical address 0x07c00, blindly write the actual drive geometry to the places where the BPB would be expected to be in memory, and then transfer control to the bootloader. The VBR isn't being overwritten on disk; it is being overwritten in memory.

I should point out my SO answer starts with:

Quote:
If you are attempting to use USB to boot on real hardware then you may encounter another issue even if you get it working in BOCHS and QEMU.


The keyword is "MAY". But they can run the test code I provided to determine if their BIOS is modifying the geometry in memory after the boot sector has been loaded and before the bootloader starts executing. If the output from that test bootloader doesn't show changes (from the default output in the answer), then the problem is likely something else; if the output is different, then the BIOS has overwritten part of the BPB. The changes appear to be confined to the few drive-geometry-related bytes.
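
The comparison that test bootloader makes possible can be sketched offline as follows. This is not the code from the SO answer; it is a minimal Python illustration that assumes you have the on-disk sector and a dump of the same 512 bytes as seen in memory. The field table is the standard FAT12/16 BPB layout; the function name and the sample values (63 sectors per track, 255 heads) are made up for the example.

```python
# Standard FAT12/16 BPB and EBPB fields: (offset, size in bytes, name).
BPB_FIELDS = [
    (0x0B, 2, "bytes per sector"),
    (0x0D, 1, "sectors per cluster"),
    (0x0E, 2, "reserved sectors"),
    (0x10, 1, "number of FATs"),
    (0x11, 2, "root directory entries"),
    (0x13, 2, "total sectors (16-bit)"),
    (0x15, 1, "media descriptor"),
    (0x16, 2, "sectors per FAT"),
    (0x18, 2, "sectors per track"),
    (0x1A, 2, "number of heads"),
    (0x1C, 4, "hidden sectors"),
    (0x20, 4, "total sectors (32-bit)"),
    (0x24, 1, "drive number"),
]


def changed_bpb_fields(on_disk: bytes, in_memory: bytes):
    """Return (name, disk_value, memory_value) for each BPB field that differs."""
    changes = []
    for offset, size, name in BPB_FIELDS:
        before = int.from_bytes(on_disk[offset:offset + size], "little")
        after = int.from_bytes(in_memory[offset:offset + size], "little")
        if before != after:
            changes.append((name, before, after))
    return changes


# Made-up data: an all-zero stand-in for the sector on disk, and a "dump"
# in which two geometry fields were written after loading.
on_disk = bytes(512)
dump = bytearray(on_disk)
dump[0x18:0x1A] = (63).to_bytes(2, "little")
dump[0x1A:0x1C] = (255).to_bytes(2, "little")

for name, before, after in changed_bpb_fields(on_disk, bytes(dump)):
    print(f"{name}: {before} -> {after}")
```

If a bootloader without a BPB has instructions at those offsets, the same writes land in the middle of its code, which is why the symptoms look so arbitrary.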

Author:  mikegonta [ Fri Sep 06, 2019 10:32 am ]
Post subject:  Re: Why relocate MBR?

MichaelPetch wrote:
mikegonta wrote:
This merely suggests that the BIOS examines the boot sector for the purpose of (automatically) "emulating" the USB device.
That may be the case on some BIOSes for some people with regards to their bootloaders, but what is being observed by people is real in a lot of cases now. I should know, I have a Lenovo Laptop that does the exact same thing.
The way this usually plays out now is that someone says their bootloader starts but it prints something out incorrectly or doesn't work as expected (like a disk read fails). In most of these cases now, it isn't that their BIOS failed to recognize it as bootable media; it recognized it as bootable and started running it, and basic code failed. What happens is that on an increasing number of machines the BIOS is BLINDLY writing the drive geometry into memory when booting as USB FDD, where it thinks there is a BPB present. The resulting behaviour is undefined, but it often overwrites instructions and makes the code do something unexpected. That SO answer you pointed to is mine, and if you'll note, there is code to dump the bootloader memory out to see if any bytes have changed. In the cases where this issue is a problem, the bootloader I wrote will display different values in the bytes that get overwritten.
I'm not saying that the BIOS doesn't sometimes overwrite the in-memory BPB; I only said that your references didn't prove it and could be interpreted differently. Now you are providing proof, which I can accept. (I didn't read your answer; I was referring to the SO user's remarks.)
I do, however, have a suitable explanation which should satisfy all concerned.
I agree with you: the BIOS is, in some cases, overwriting the BPB. But this only goes to prove that you cannot use the FAT12 geometry in the BPB as parameters to INT 0x13 calls when using a USB device. I guess those BIOSes assume that everyone is a dummy and will blindly use FAT12 geometry on a non-floppy disk. So, to help us poor souls (boot sector people) out, they quietly update the parameters to the "correct" values.
The correct way to access FAT12 formatted media that is not a real floppy disk (and yes, real floppy disks can still be booted on newer equipment with a CSM UEFI BIOS using a USB floppy disk drive) is to use INT 0x13, AH=0x48 Get Drive Parameters (which in most cases will return the same values as the BPB or, as in these cases, whatever the BIOS wants to use). And yes, you can use the INT 0x13 Extensions with real floppy disks booted from a USB floppy disk drive.
MichaelPetch wrote:
So please don't tell me this isn't a real issue. As I said in my comment just before this, I now have a canned set of questions whenever I see USB boot on real media where a bootloader starts running but doesn't do what is expected. This situation is now the single biggest headache I have when helping people on SO with bootloader problems. I don't even write answers: I ask these questions in the comments, tell them to read that answer, they add their BPB and the code now works as expected. (I provide an actual BPB, but it appears you just need a SHORT JMP, a NOP, and a minimal-sized BPB filled with zeros if you aren't actually using a file system like FAT.)
A near jump will also work. This requirement was first documented in the original IBM PC BIOS source code.
To be on the safe side, a BPB filled with zeros should not be used. I have observed that on an older PC (that supported USB booting) the BIOS used the BPB values to calculate LBA info (to be used later), which resulted in the PC quietly hanging after an unhandled divide-by-zero CPU exception.
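
What exactly that BIOS computed is not known. As an assumed illustration only, the usual LBA-to-CHS conversion already shows why zeroed geometry fields are dangerous: both divisions use values taken straight from the BPB. The function name is hypothetical.

```python
def lba_to_chs(lba: int, heads: int, sectors_per_track: int):
    """Convert an LBA to (cylinder, head, sector); sector numbers are 1-based."""
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


# 1.44 MB floppy geometry: 2 heads, 18 sectors per track.
print(lba_to_chs(0, 2, 18))     # (0, 0, 1)
print(lba_to_chs(2879, 2, 18))  # (79, 1, 18)

# Geometry read from an all-zero BPB: the first division fails.
try:
    lba_to_chs(1, 0, 0)
except ZeroDivisionError:
    print("divide by zero with zeroed geometry")
```

In Python this raises an exception that can be caught; in real-mode firmware the equivalent DIV raises a CPU exception, which would fit the quiet hang described above if nothing handles it.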

Author:  MichaelPetch [ Fri Sep 06, 2019 10:44 am ]
Post subject:  Re: Why relocate MBR?

mikegonta wrote:
To be on the safe side, a BPB filled with zeros should not be used. I have observed that on an older PC (that supported USB booting) the BIOS used the BPB values to calculate LBA info (to be used later), which resulted in the PC quietly hanging after an unhandled divide-by-zero CPU exception.
I commented about the zero bytes in this discussion only to suggest that these particular BIOSes don't *seem* to care about the existing BPB; however, my answer provides a real BPB (I use a 1.44MiB floppy as a basis, but it can be changed by the bootloader developer). My own bootloaders may actually use the BPB data. We are in definite agreement on this point: it is safer (and definitely preferred) to provide a real BPB that isn't entirely zeroes.
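
For readers who want to see the layout, here is a minimal Python sketch that assembles a 512-byte sector with a SHORT JMP, a NOP and a non-zero BPB using the standard 1.44 MB FAT12 values. It is not the BPB from the SO answer; the OEM name, volume ID and label are placeholders, and the function name is hypothetical.

```python
import struct


def boot_sector_with_bpb(code: bytes) -> bytes:
    """Build a sector: JMP SHORT + NOP, 1.44 MB FAT12 BPB, code, signature."""
    jump = bytes([0xEB, 0x3C, 0x90])  # jmp short to offset 0x3E; nop
    bpb = struct.pack(
        "<8sHBHBHHBHHHIIBBBI11s8s",
        b"MYBOOT  ",             # OEM name (placeholder)
        512,                     # bytes per sector
        1,                       # sectors per cluster
        1,                       # reserved sectors
        2,                       # number of FATs
        224,                     # root directory entries
        2880,                    # total sectors (16-bit)
        0xF0,                    # media descriptor
        9,                       # sectors per FAT
        18,                      # sectors per track
        2,                       # number of heads
        0,                       # hidden sectors
        0,                       # total sectors (32-bit), unused here
        0,                       # drive number
        0,                       # reserved
        0x29,                    # extended boot signature
        0x12345678,              # volume ID (placeholder)
        b"NO NAME".ljust(11),    # volume label
        b"FAT12".ljust(8),       # file system type string
    )
    body = jump + bpb + code
    if len(body) > 510:
        raise ValueError("code does not fit in the boot sector")
    return body.ljust(510, b"\x00") + b"\x55\xaa"


sector = boot_sector_with_bpb(bytes([0xFA, 0xF4]))  # cli; hlt
print(len(sector), hex(sector.index(bytes([0xFA, 0xF4]))))  # 512 0x3e
```

The jump lands at offset 0x3E, directly after the 59-byte BPB, so any geometry a BIOS writes into the BPB area no longer overlaps executable code.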

Some of my SO bootloader answers now point to a question/answer pair that has a note about using a BPB (and I provide one if they wish to use it) when booting with USB as FDD media.

Author:  mikegonta [ Fri Sep 06, 2019 3:07 pm ]
Post subject:  Re: Why relocate MBR?

MichaelPetch wrote:
If you are attempting to use USB to boot on real hardware then you may encounter another issue even if you get it working in BOCHS and QEMU.
My suggestion would be:
If you are attempting to use USB to boot on real hardware and are using a FAT12 formatted image:
  • use INT 0x13, AH=0x48 Get Drive Parameters if you absolutely want to use the original CHS addressing, or better still
  • use the INT 0x13 Extensions BIOS functions and LBA addressing
  • have a valid FAT12 BPB as part of the image (which can be easily produced in assembly, not to mention the entire 1.44 MB formatted floppy disk image*)




* Not to mention an entire FAT32 or exFAT floppy disk image.
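
As an illustration of the LBA route, the INT 0x13 Extensions read call (AH=0x42) takes a 16-byte disk address packet pointed to by DS:SI. The real call is made from 16-bit real-mode code; the Python below only shows the byte layout, and the function name and the sample buffer address 0000:7E00 are assumptions for this example.

```python
import struct


def disk_address_packet(lba: int, count: int, segment: int, offset: int) -> bytes:
    """Pack the 16-byte packet used by INT 0x13, AH=0x42 (Extended Read)."""
    return struct.pack(
        "<BBHHHQ",
        0x10,     # packet size in bytes
        0,        # reserved, must be zero
        count,    # number of sectors to transfer
        offset,   # transfer buffer offset
        segment,  # transfer buffer segment
        lba,      # 64-bit starting LBA
    )


# Read one sector starting at LBA 1 into 0000:7E00.
dap = disk_address_packet(lba=1, count=1, segment=0x0000, offset=0x7E00)
print(dap.hex())  # 10000100007e00000100000000000000
```

No geometry appears anywhere in the packet, which is why LBA reads are unaffected by whatever values end up in the BPB.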

Author:  bzt [ Sat Sep 07, 2019 8:00 am ]
Post subject:  Re: Why relocate MBR?

MichaelPetch wrote:
As for the BIOSes that blindly overwrite the BPB area of a boot sector (after being loaded into memory) when using FDD emulation
MichaelPetch wrote:
refused to boot as USB using FDD emulation.
And? Don't use FDD emulation then! Change it to USB-HDD and all your problems will go away! Who wants to emulate a floppy on a storage device several gigabytes in size anyway?
You are basically saying here that even the 1 buggy BIOS in 10000 can work, and fails only with a certain backward-compatibility option turned on, which should not be used in the first place!

Come on guys, this is the XXI century; it is high time to leave those legacy floppies behind! I haven't used CHS in the last 30 years, only LBA, and it has worked all the time. I have never found any BIOS that couldn't boot a USB stick in HDD mode using LBA reads. And I wasn't using FAT at all: my Linux installers have ext2/ext3 etc. partitions, and my OS has its own fs. I have just recently started to use FAT as a GPT boot partition because of the ESP, and only for UEFI images. (Oh, and I haven't found any UEFI firmware yet that would understand FAT12 except TianoCore. FAT16 is the minimum, but most firmware only handles FAT32, neither of which is available for floppies.)

Cheers,
bzt

Author:  mikegonta [ Sun Sep 08, 2019 6:49 am ]
Post subject:  Re: Why relocate MBR?

bzt wrote:
I have just recently started to use FAT
Shame on you.
bzt wrote:
as a GPT boot partition because of the ESP, and only for UEFI images. (Oh, and I haven't found any UEFI firmware yet that would understand FAT12 except TianoCore. FAT16 is the minimum, but most firmware only handles FAT32, neither of which is available for floppies.)
Obviously, you didn't RTFFP (read the fine print).

Author:  bzt [ Sun Sep 08, 2019 7:46 am ]
Post subject:  Re: Why relocate MBR?

mikegonta wrote:
bzt wrote:
I have just recently started to use FAT
Shame on you.
What's your issue with ext2/3? It's much, much better than any FAT.
Shame on you that you're stuck with Windoze! :-D There are developers here with a wider perspective.

mikegonta wrote:
bzt wrote:
as a GPT boot partition because of the ESP, and only for UEFI images. (Oh, and I haven't found any UEFI firmware yet that would understand FAT12 except TianoCore. FAT16 is the minimum, but most firmware only handles FAT32, neither of which is available for floppies.)
Obviously, you didn't RTFFP (read the fine print).
What do you mean? For both FAT16 and FAT32, Microsoft clearly states that there's a minimum required storage capacity. Here's a comparison table, and also read this.
Since 4 MB (or 32 MB) is the minimum, and 2.88 MB is the maximum capacity floppies ever had, there's no standard way to format them with those filesystems.
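
For reference, Microsoft's FAT specification decides between FAT12, FAT16 and FAT32 from the number of data clusters on the volume. A minimal sketch of that arithmetic, using the standard 1.44 MB floppy layout as input; the function and parameter names are chosen for this example.

```python
def fat_type(total_sectors, reserved, num_fats, sectors_per_fat,
             root_entries, sectors_per_cluster, bytes_per_sector=512):
    """Classify a FAT volume by its count of data clusters."""
    # Each root directory entry is 32 bytes; round up to whole sectors.
    root_dir_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    data_sectors = total_sectors - (reserved + num_fats * sectors_per_fat
                                    + root_dir_sectors)
    clusters = data_sectors // sectors_per_cluster
    if clusters < 4085:
        return "FAT12", clusters
    if clusters < 65525:
        return "FAT16", clusters
    return "FAT32", clusters


# Standard 1.44 MB floppy: 2880 sectors, 1 reserved, 2 FATs of 9 sectors,
# 224 root entries, 1 sector per cluster.
print(fat_type(2880, 1, 2, 9, 224, 1))  # ('FAT12', 2847)
```

How small a volume a given formatting tool will accept for FAT16 or FAT32 is a separate matter from this classification.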

On the other hand, I don't see what my reading the fine print has to do with manufacturers not implementing FAT12 (or FAT16) in their UEFI firmware...

Cheers,
bzt

All times are UTC - 6 hours