BASICFreak wrote:
Turns out it was recovering orphaned files, at 9000 something of 15000ish.
So needless to say the boot was slow this time around, and thankfully all the files were recovered correctly
Aren't orphaned inodes things like files that were unlinked but still open when the crash occurred? In that case the "recovery" is just freeing the space they occupy. Though 15000 temporary files does sound like a lot...
Quote:
So, shy of directly reading and writing to the drive (as in no cache), how would one prevent this type of thing from happening (at least to the scale it did this time)? And if not possible what would be the steps to recover from this sort of issue?
I think first of all you need to understand the exact scenario we're talking about. If it's indeed files that were unlinked but still open, you don't prevent it from happening. You just need to make sure that after a crash, you still know which inodes are orphaned, so you can clean them up. You can do that with an fsck-like scan of the whole filesystem, but that's obviously slow. As an optimisation, you might instead want to write the information to the journal.
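To make the idea a bit more concrete, here is a rough sketch of keeping a persistent record of orphaned inodes so that recovery only has to walk that record instead of scanning everything. It is an in-memory simulation, not how any particular filesystem does it: `disk_inodes`, `orphan_log`, `fs_crash` and all the other names are made up for illustration, and the "disk" is just a global array that survives the simulated crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_INODES  8
#define MAX_ORPHANS 8

/* Hypothetical on-disk state: survives the simulated crash. */
struct disk_inode { bool in_use; unsigned links; };
struct disk_inode disk_inodes[MAX_INODES];
uint32_t orphan_log[MAX_ORPHANS];   /* unlinked inodes that are still open */
unsigned orphan_count;

/* Hypothetical in-memory state: lost on a crash. */
unsigned open_count[MAX_INODES];

void inode_free(uint32_t ino) {
    /* A real filesystem would also release the data blocks here. */
    disk_inodes[ino].in_use = false;
}

void orphan_log_add(uint32_t ino) {
    assert(orphan_count < MAX_ORPHANS);
    orphan_log[orphan_count++] = ino;
}

void orphan_log_remove(uint32_t ino) {
    for (unsigned i = 0; i < orphan_count; i++) {
        if (orphan_log[i] == ino) {
            orphan_log[i] = orphan_log[--orphan_count]; /* swap with last */
            return;
        }
    }
}

void fs_create_and_open(uint32_t ino) {
    disk_inodes[ino].in_use = true;
    disk_inodes[ino].links = 1;
    open_count[ino] = 1;
}

void fs_unlink(uint32_t ino) {
    assert(disk_inodes[ino].links > 0);
    /* Record the orphan BEFORE dropping the last link, so a crash in
     * between never leaves an unrecorded orphan behind. */
    if (disk_inodes[ino].links == 1 && open_count[ino] > 0)
        orphan_log_add(ino);
    disk_inodes[ino].links--;
    if (disk_inodes[ino].links == 0 && open_count[ino] == 0)
        inode_free(ino);
}

void fs_close(uint32_t ino) {
    assert(open_count[ino] > 0);
    if (--open_count[ino] == 0 && disk_inodes[ino].links == 0) {
        inode_free(ino);          /* normal path: last close frees it */
        orphan_log_remove(ino);
    }
}

/* Simulated crash: everything that lived only in memory is gone. */
void fs_crash(void) { memset(open_count, 0, sizeof open_count); }

/* Mount-time recovery: free every recorded orphan, return how many. */
unsigned fs_recover(void) {
    unsigned freed = 0;
    while (orphan_count > 0) {
        inode_free(orphan_log[--orphan_count]);
        freed++;
    }
    return freed;
}
```

With this, the cost of recovery depends on the number of orphans rather than on the size of the filesystem. In a real implementation the log entry would of course have to reach the disk (or the journal) before the unlink itself does.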
Quote:
Currently (on version 0.0.3) I have only cached FAT and Root Directory (yea... I only have FAT support thus far...) and after every change (or a chain of changes) this is flushed back to the HDD.
The "only" thing you really want for FAT is that you order your FAT updates correctly if they touch more than one sector.
Specifically, you would want to avoid marking clusters as allocated, but not actually hooking them up anywhere, or you would leak those clusters. You definitely also want to avoid hooking up a cluster which isn't allocated yet, otherwise it could be allocated again and then you end up with cross-linked files (that is, filesystem corruption).
Unfortunately, FAT isn't designed to avoid both problems at the same time when two consecutive clusters are described by different sectors of the FAT (within the same sector you can update them atomically, so ordering is not a problem there). You get to choose which of the two problems to keep. Obviously, you should keep the leaked clusters rather than the corruption. This means that you need to flush between marking a cluster as allocated and actually hooking it up to the cluster chain of a file: the sector holding the allocation has to be on disk before the sector holding the link is written.
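As a rough sketch of that ordering, the following simulates a tiny FAT16-style table with a cached copy and an "on disk" copy, plus a way to pretend the power fails after a given number of sector writes. Everything here is hypothetical (`fat_cache`, `fat_disk`, `fat_flush_sector`, `writes_before_crash`, the 4-entries-per-sector size); it is only meant to show the allocate, flush, link order, not a real driver.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES_PER_SECTOR 4      /* tiny on purpose; a real FAT16 sector holds 256 */
#define FAT_SECTORS        2
#define FAT_ENTRIES        (ENTRIES_PER_SECTOR * FAT_SECTORS)
#define FAT_FREE           0x0000u
#define FAT_EOC            0xFFFFu  /* end-of-chain marker */

uint16_t fat_cache[FAT_ENTRIES];    /* in-memory copy of the FAT */
uint16_t fat_disk[FAT_ENTRIES];     /* what would survive a crash */
int writes_done;
int writes_before_crash = INT_MAX;  /* test hook: simulated power loss */

unsigned sector_of(unsigned cluster) { return cluster / ENTRIES_PER_SECTOR; }

/* Write one cached FAT sector to "disk" (a whole sector at a time). */
void fat_flush_sector(unsigned sector) {
    if (writes_done >= writes_before_crash)
        return;                     /* power is gone, write never lands */
    memcpy(&fat_disk[sector * ENTRIES_PER_SECTOR],
           &fat_cache[sector * ENTRIES_PER_SECTOR],
           ENTRIES_PER_SECTOR * sizeof(uint16_t));
    writes_done++;
}

/* Append new_cluster to the chain whose current last cluster is tail. */
void fat_append_cluster(unsigned tail, unsigned new_cluster) {
    assert(fat_cache[new_cluster] == FAT_FREE);
    assert(fat_cache[tail] == FAT_EOC);

    /* Step 1: mark the new cluster as allocated (it ends the chain). */
    fat_cache[new_cluster] = FAT_EOC;

    /* If the two entries live in different sectors, the allocation must
     * be on disk before the link is. In the same sector, the single
     * write below covers both changes. */
    if (sector_of(new_cluster) != sector_of(tail))
        fat_flush_sector(sector_of(new_cluster));

    /* Step 2: hook the new cluster up to the chain and write that out. */
    fat_cache[tail] = (uint16_t)new_cluster;
    fat_flush_sector(sector_of(tail));
}

/* Reset the simulation with a one-cluster file ending at tail. */
void fat_reset(unsigned tail) {
    memset(fat_cache, 0, sizeof fat_cache);
    memset(fat_disk, 0, sizeof fat_disk);
    fat_cache[tail] = FAT_EOC;
    fat_disk[tail] = FAT_EOC;
    writes_done = 0;
    writes_before_crash = INT_MAX;
}
```

For example, after `fat_reset(2)` with `writes_before_crash = 1`, calling `fat_append_cluster(2, 5)` leaves cluster 5 allocated on disk but not linked from cluster 2. That is a leaked cluster, which a later check can reclaim, and not a cross-link.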