eekee wrote:
Extensions are the only metadata which don't get removed on download, not to mention bad operating systems (Mac OS <= 9) which make it hard to include metadata when you copy the file.
This is more of an implementation flaw than an inherent flaw with the idea of file types as metadata. I mean, in the end the difference is basically where this information is stored, either in a header within the file itself, or within a file table entry. As usual, both schemes have their pros and cons.
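To make the "where is it stored" point concrete, here is a rough sketch of the two schemes side by side. Everything in it is made up for illustration: the `FileTableEntry` layout, the 4-byte `AUDI` tag and the `naive_download` helper are assumptions, not any real file system or format.

```python
from dataclasses import dataclass

TAG_LEN = 4  # assumed fixed-size type tag at the start of the file


@dataclass
class FileTableEntry:
    """Hypothetical file table entry that carries the type itself."""
    name: str
    file_type: str
    data: bytes


def type_from_table(entry: FileTableEntry) -> str:
    # Scheme A: the type is file system metadata; the data is not inspected.
    return entry.file_type


def type_from_header(data: bytes) -> str:
    # Scheme B: the type is the first TAG_LEN bytes of the data itself.
    return data[:TAG_LEN].decode("ascii")


def naive_download(entry: FileTableEntry) -> bytes:
    # A transfer that moves only the byte stream and drops the table entry.
    return entry.data


tagged = FileTableEntry("song", "AUDI", b"AUDI" + b"\x00\x01\x02")
received = naive_download(tagged)

print(type_from_table(tagged))     # type read from the table entry
print(type_from_header(received))  # type read back from the bytes alone
```

After the transfer only `received` exists, so the table-entry type is gone unless the transfer tool copies it too, while the in-file tag comes along for free. That is the implementation issue, not a flaw in the idea itself.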
Quote:
Then there's the question of what to do when a file gets the wrong metadata. Filename extensions aren't too bad because any OS will include ways to rename files, but again bad OSs make it hard to change the extension. I say "bad", but they have reasons for making it hard to change extensions and other metadata. There are reasons not to copy various pieces of metadata too.
If a file somehow has the wrong metadata such as an incorrect type, it should be as easy to change as a file permission.
Quote:
It's better to look into the file's data for its type, and indeed most executable formats include a magic number for that purpose, in the header of course. An OS could then cache the type in the filesystem, but is it worth it? Properly invalidating the cache could get quite complex. If one of your formats is flat binary, properly invalidating the cache could be downright impossible!* That would be nasty for sysadmins and users. (I get caught out enough by bash caching program locations.) Also, reading the metadata may introduce unnecessary overhead, depending on filesystem.
I think separating data and metadata is generally a good thing. For example, if an executable is defined as a file that begins with a certain magic number, then every other file type has to conform to that standard as well; otherwise, a file that happens to start with the same bit pattern as the magic number will be interpreted incorrectly. If the type is stored as metadata in the file system (either as an explicit field or as a filename extension), the file doesn't need to conform to any standard and can be stored as a flat binary. Files that carry their metadata in a header can also be slightly less efficient to work with and require more code to manipulate, because the variable-length header has to be stripped first.
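A small sketch of the collision problem, under some assumptions: the ELF magic bytes are real, but the `"raw"`/`"executable"` type strings and the length-prefixed header layout in `strip_header` are hypothetical and only there to illustrate the point.

```python
ELF_MAGIC = b"\x7fELF"  # the real ELF magic number


def sniff_is_executable(data: bytes) -> bool:
    # Content-based typing: trust whatever the first bytes look like.
    return data.startswith(ELF_MAGIC)


def metadata_is_executable(file_type: str) -> bool:
    # Metadata-based typing: the data is never inspected.
    return file_type == "executable"


def strip_header(data: bytes) -> bytes:
    # Hypothetical layout: byte 0 holds the header length, itself included.
    header_len = data[0]
    return data[header_len:]


# A flat binary (say, a raw sensor dump) that happens to start with
# the same four bytes as the ELF magic number.
flat_dump = bytes([0x7F, 0x45, 0x4C, 0x46, 0x10, 0x20])

print(sniff_is_executable(flat_dump))   # misclassified by sniffing
print(metadata_is_executable("raw"))    # the stored type is unaffected

# Extra work needed before in-file metadata can be ignored.
with_header = bytes([3, 0xAA, 0xBB]) + b"payload"
print(strip_header(with_header))
```

With sniffing, the flat dump is taken for an executable purely by accident. With the type kept in the file system, the same bytes are just data, and nothing has to be stripped before the file can be used.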
Quote:
Thinking all this through only strengthens my opinion that keeping type information as filesystem metadata is always more trouble than it's worth. Only file extensions have some worth, and there are some issues with them too.
I'm still not convinced it's an inherently bad design. I suppose which scheme is better for your OS comes down to how it ties in with other design issues.