I'll reply to the parts I have answers for right now, and to the rest when I find the answers...
Brendan wrote:
There's no real need to have much/any information in the first sector; and the first sector is typically much more important for boot code. Also, if you ever plan to put the file system on a floppy (or a device emulating a floppy) then you're going to get conflicts with the BPB.
I'm fairly sure only Microsoft file systems use a BPB (and now EFI - which I have no support for anyway)
http://homepage.ntlworld.com/jonathan.deboynepollard/FGA/determining-filesystem-type.html wrote:
Linux mis-uses the Microsoft Data partition, by creating volumes in it that do not have valid BPBs. This is a fault in Linux. (EXT2/EXT3/EXT4 partitions could easily be given BPBs. In actual fact, the mke2fs tool has possessed the code for creating valid BPBs since 1996, when it was added by Matthieu Willm. It has simply been conditionally compiled out and relegated to an OS/2 porting section, of all things.).
Brendan wrote:
A hash needs to be large so that you don't get too many collisions when there's millions of files; but it also needs to be small enough that you can have a hash table in memory (e.g. "myHashTable[hash]"). Computers won't have enough RAM to support 96-bit hashes for at least several thousand years. In practice the file system driver is only going to be able to use a fraction of the full hash (e.g. "myHashTable[hash & 0x000FFFFF]") and the remaining bits of the hash only waste disk space.
Well, seeing how my hash table uses 30.5 MB for 1 million entries, any computer since 1999 should be able to cache at least 3 million entries (91.5 MB) without filling RAM more than about 1/6th of the way (assuming 512 MB in 1999 [I had 768 MB in '99]).
And seeing how all 5 of my PCs' internal HDDs hold a total of 1.5 million files - about 3.2 TB, including 2 Windows installs, Lubuntu, and many, many games, backups, and code... And seeing how even my test system has 4 GB of RAM...
But my design choices here were a 32-bit hash, a 96-bit hash, or a 224-bit hash (the 96-bit one seems safest).
Also, could you explain why in practice the driver can only use a fraction of the hash? I don't get that one.
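My best guess at what the "myHashTable[hash & 0x000FFFFF]" example means, written out (the names and the 2^20-bucket table size here are mine, not part of my FS design):

Code:
#include <stdint.h>

#define INDEX_BITS 20u                        /* 2^20 buckets fit in RAM */
#define INDEX_MASK ((1u << INDEX_BITS) - 1u)

struct bucket {
    uint32_t full_hash;    /* kept only to reject false matches */
    uint64_t file_block;   /* what the entry actually points at */
};

static struct bucket table[1u << INDEX_BITS];

static struct bucket *lookup(uint32_t hash)
{
    /* Only INDEX_BITS of the hash pick the bucket; the remaining bits
     * just filter out collisions once the bucket is found. */
    struct bucket *b = &table[hash & INDEX_MASK];
    return (b->full_hash == hash) ? b : 0;
}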
Quote:
"Parrent" should be "Parent".
Possibly, but it's just a variable name...
Quote:
I don't know what your timestamps are; but 32-bits is too small. Either the precision isn't enough (e.g. "seconds since the epoch"), or the range isn't enough (e.g. "microseconds since the epoch").
Well I was going to post my timestamp, but I realized it didn't work when I placed it in the reply box...
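For what it's worth, the rough numbers behind the range-vs-precision point (my arithmetic, not the format I'm actually using):

Code:
#include <stdint.h>

/* 32-bit seconds      : 2^32 s  ~= 136 years of range, but only 1 s precision
 * 32-bit microseconds : 2^32 us ~= 71.6 minutes of range
 * 64-bit microseconds : 2^64 us ~= 584,000 years of range
 * so a single 64-bit field covers both range and precision. */
typedef uint64_t fs_time_t;   /* e.g. microseconds since some epoch (name is hypothetical) */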
Quote:
When you're allocating a new cluster (for whatever reason) and the disk is almost full, is there any way to avoid doing a linear search through all (4 billion?) Binary Allocation Table entries? What if there are no free clusters at all?
Currently there is no way to avoid searching the entire table, but a "first free cluster" variable is planned soon. Upon finding no free clusters you will get a response of cluster number 0, which can never be a valid allocation since it is the boot cluster - and then you are just SOL, just like with any other FS.
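Roughly what I have in mind for that variable (the names are made up, and I'm assuming the Binary Allocation Table is cached in memory as a plain bitmap with 1 = used):

Code:
#include <stdint.h>

extern uint8_t  bat[];              /* Binary Allocation Table cached in RAM */
extern uint64_t total_clusters;

static uint64_t first_free_hint = 1;    /* cluster 0 is the boot cluster, never handed out */

/* Returns a free cluster number, or 0 when the volume is full. */
uint64_t alloc_cluster(void)
{
    for (uint64_t c = first_free_hint; c < total_clusters; c++) {
        if (!(bat[c / 8] & (1u << (c % 8)))) {      /* bit clear -> cluster is free */
            bat[c / 8] |= (uint8_t)(1u << (c % 8));
            first_free_hint = c + 1;                /* next allocation starts here */
            return c;
        }
    }
    return 0;   /* cluster 0 doubles as the "disk full" answer */
}

/* free_cluster() would clear the bit and, if c < first_free_hint, pull the hint back down. */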
Skipping a few... (I'll get back to them when I have the answers; I'm tired and cannot think clearly...)
Quote:
For "rotating disk" storage devices; if you create a lot of small files will the disk heads be constantly seeking between binary allocation table, hash table, string table, directories and file data spread everywhere throughout the disk (and not near each other to make seeks less expensive)?
If you are using the directories and string table then there may be a decent bit of seeking, but if you are looking for a file, 90% of the time you know the full path - in which case the hash table points directly to the file block (and if the hash table is in RAM you only have one read before the file data). The only time I expect you to actually have to search the directories is for directory listings, and only for those (though if you are not doing a recursive listing you can get the directory's file block from the hash table - and even if you are, you should get the "root" of the listing from the hash table).
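A rough sketch of that lookup path (the function names are placeholders, not the real driver interface):

Code:
#include <stdint.h>

uint32_t hash_path(const char *path);    /* whatever hash ends up being used; path keeps its leading '/' */
uint64_t hash_lookup(uint32_t hash);     /* in-RAM hash table: 0 = not cached/not found, else file block */
int      read_block(uint64_t block, void *buf);

/* Full path known up front: no directory is read at all. */
int open_by_path(const char *path, void *file_block_buf)
{
    uint64_t block = hash_lookup(hash_path(path));
    if (block == 0)
        return -1;                                   /* fall back to a directory walk, or fail */
    return read_block(block, file_block_buf);        /* one read and we're at the file block */
}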
Quote:
How do you plan to minimise fragmentation (of files, hash table, string table, etc), and/or efficiently defragment the file system?
For files, I (will) attempt to allocate clusters in a row where possible to keep each file together. As for the tables, they will likely need to go through a defragmenter every now and then to keep their clusters in order (or close to it) - though you could allocate more than one cluster to start with, which should help at least some.
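The "in a row" part would look something like this (made-up names, building on the alloc_cluster() sketch above):

Code:
#include <stdint.h>

extern uint64_t total_clusters;
int      cluster_is_free(uint64_t c);    /* test a Binary Allocation Table bit */
void     cluster_mark_used(uint64_t c);  /* set one                            */
uint64_t alloc_cluster(void);            /* general search from the free hint  */

/* Growing a file: try the cluster right after its current last one first. */
uint64_t alloc_cluster_after(uint64_t last_cluster_of_file)
{
    uint64_t next = last_cluster_of_file + 1;
    if (next < total_clusters && cluster_is_free(next)) {
        cluster_mark_used(next);          /* file stays contiguous */
        return next;
    }
    return alloc_cluster();               /* otherwise the file picks up a new fragment */
}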
Quote:
If all names (including the root directory) begin with a "/" character, what is the point of storing this character or including it in hashes?
I felt like it. Plus my VFS (or my old one - I haven't made one for this kernel yet...) passes the path with the leading '/'.
Quote:
Assuming strings/names are 100 bytes each on average, does this mean you can't store more than 43 million files (regardless of how huge the partition is) because the string table would be larger than 4 GiB and the "StringBlockOffset" field is only 32 bits?
Good point... Especially since the max size for the string table itself is 64-bit...
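The obvious fix, if the record layout can spare the extra 4 bytes (the struct and field layout here is just a guess at the on-disk record, not the real one), would be to widen the offset to match the 64-bit string table size, which removes the 4 GiB ceiling:

Code:
#include <stdint.h>

struct file_record {
    /* ...other fields... */
    uint64_t StringBlockOffset;   /* was 32-bit, which capped the usable string table at 4 GiB */
};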
And yes, by "insert" I meant "append".
Anyway, Brendan, you make some good points, and I will definitely reflect on them all. These are the exact reasons I post before fully implementing an entire system.