OSDev.org
https://forum.osdev.org/

Questions on Long file names FAT32
https://forum.osdev.org/viewtopic.php?f=1&t=33699

Author:  Klakap [ Mon May 20, 2019 12:18 pm ]
Post subject:  Questions on Long file names FAT32

Good day!

I have some questions about LFNs (long file names).

If a directory entry has an LFN, is it 64 bytes long (short file name + LFN)?
Can a file with an LFN have a 21-character name plus a 3-character extension?
Are short names still used on any machines today?

I would be thankful for any answers.

Author:  Pebblerubble [ Mon May 20, 2019 1:12 pm ]
Post subject:  Re: Questions on Long file names FAT32

This link probably answers all of your questions (correct me if I misunderstood you):

https://wiki.osdev.org/FAT#Long_File_Names

Please note that in FAT a long file name is actually stored as an 8+3 entry plus one or more long-file-name entries.

The suffix is NOT limited to three characters (only the overall name length limit applies). So .html, for example, is a valid suffix in an LFN.

I think only DOS machines still use 8+3 filenames nowadays. In my humble opinion, it's really outdated and obsolete.

[EDIT: I could imagine a bootloader in 16-bit mode or a FAT12 floppy driver might have some use for 8+3. But I still think it's outdated.]

Greetings
Peter

Author:  Octocontrabass [ Mon May 20, 2019 1:25 pm ]
Post subject:  Re: Questions on Long file names FAT32

Klakap wrote:
I have some questions about LFNs (long file names).

Make sure you read Microsoft's FAT filesystem specification. It should have answers to many of your questions.

Klakap wrote:
If a directory entry has an LFN, is it 64 bytes long (short file name + LFN)?

Long file names are variable-length. If you include the short file name, they can take between 64 and 672 bytes.

Klakap wrote:
Can a file with an LFN have a 21-character name plus a 3-character extension?

Long file names can have up to 255 UTF-16 code units. That includes the extension, if the file has one, and the period (.) separating the name and the extension. The extension can be more than 3 characters long.

Klakap wrote:
Are short names still used on any machines today?

If the file name can be accurately stored as a short name, Windows will not write a long file name. Linux can be configured to behave this way too.
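As a rough illustration of what "can be accurately stored as a short name" might mean, here is a minimal sketch in C. It is an assumption-laden simplification, not what Windows or Linux actually run: it only accepts ASCII names made of uppercase letters, digits, and the special characters the FAT specification allows in short names, and it ignores OEM code pages, spaces, and the lowercase flags some systems use. The function names `sfn_char_ok` and `fits_short_name` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Sketch only: ASCII characters accepted in an 8.3 name.
 * Assumption: code-page characters above 127 and spaces are rejected. */
static bool sfn_char_ok(unsigned char c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'))
        return true;
    /* strchr() would match the terminator, so exclude '\0' explicitly. */
    return c != '\0' && strchr("$%'-_@~`!(){}^#&", c) != NULL;
}

/* Returns true if `name` could be stored as an 8.3 entry without
 * losing information, so no long-file-name entries would be needed. */
bool fits_short_name(const char *name)
{
    assert(name != NULL);

    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    size_t ext_len = dot ? strlen(dot + 1) : 0;

    /* Base must be 1..8 characters, extension at most 3. */
    if (base_len < 1 || base_len > 8 || ext_len > 3)
        return false;
    /* A trailing dot with no extension is not representable as typed. */
    if (dot != NULL && ext_len == 0)
        return false;

    for (size_t i = 0; name[i] != '\0'; i++) {
        if (name + i == dot)
            continue; /* the single separator between base and extension */
        if (!sfn_char_ok((unsigned char)name[i]))
            return false;
    }
    return true;
}
```

Under these assumptions a name like `IMG00000.JPG` passes, while `INDEX.HTML` (four-character extension) and `Readme.txt` (lowercase letters) would need a long file name.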

Author:  Klakap [ Tue Jun 04, 2019 11:34 am ]
Post subject:  Re: Questions on Long file names FAT32

Thank you for the answers.

Author:  bzt [ Wed Jun 05, 2019 10:25 am ]
Post subject:  Re: Questions on Long file names FAT32

Hi,
Pebblerubble wrote:
I think only DOS machines still use 8+3 filenames nowadays. In my humble opinion, it's really outdated and obsolete.
This is not exactly the case. 8+3 is quite widespread in the embedded world. Many cameras, for example, record images with names like 'IMG00000.JPG' or 'DSC00000.RAW', which are 8+3. This is partly for simplicity; the other reason is that M$ has patented LFN.

I'm not sure about the state of the LFN patent fee these days; does it still apply? If so, there's a simple workaround: the patent only applies to entries where the 8+3 entry is generated in a particular way from the LFN entry. Linux circumvents this by not generating two entries (see lkml):
Quote:
The claims of both of the VFAT patents involve the creation (or storing) of both a long filename and a short filename for a file. The 2nd patch only creates/stores either a short filename or a long filename for a file, but never both.

Cheers,
bzt

Author:  nullplan [ Wed Jun 05, 2019 12:33 pm ]
Post subject:  Re: Questions on Long file names FAT32

Octocontrabass wrote:
Long file names are variable-length. If you include the short file name, they can take between 64 and 672 bytes.

Where did those numbers come from? There are 12 UCS-2 characters in an LFN entry. The shortest name is one ASCII character, taking up two directory entries. Since I am a proponent of UTF-8 everywhere, I suggest re-encoding an LFN as UTF-8, so that means the shortest LFN is 1 byte. Adding the SFN into this doesn't really make sense, as the file can be named by either the LFN or the SFN, but not a combination of them.

To my knowledge, MS imposes a limit of 128 codepoints, which weighs in at something between 64 and 192 bytes of UTF-8. But the format will in principle allow for 63 LFN entries, each with 12 codepoints, which is 1512 bytes of UCS-2, or between 756 and 2268 bytes of UTF-8. (Opening this up to UTF-16 does not worsen the prospects of UTF-8, since a non-BMP-codepoint will be encoded in 4 bytes of UTF-16 as well as 4 bytes of UTF-8, so nothing is gained. The worst case for UTF-16 --> UTF-8 conversion is a string of high BMP characters, which are 2 bytes of UTF-16, but 3 bytes of UTF-8, leading to a size increase of 3/2)

bzt wrote:
I'm not sure about the state of the LFN patent fee these days; does it still apply?
Disclaimer: IANAL. None of this is legal advice. Consult a professional if you require aid in this matter.

To my knowledge, all patents regarding LFN are expired, except for the one for generating LFNs and short file names in the same namespace (which is what the Linux quote was about), which is due to expire in 2020 or 2021 (don't quote me on that). Thankfully, patents aren't copyright, and you are allowed to ship an infringing algorithm so long as you ensure it isn't actually used until a licence is procured or the patent has expired. Also, as we saw, it is really easy to get around this one by just always generating LFNs or SFNs, and never a mixture of the two. Also also, it is exceedingly unlikely Microsoft would ever know or care about an infringement from the likes of us before the patent expires. None of us is in this for money (well, except for rdos, and good on them), so M$ are gladly invited to any share of my profits they like. :D

Author:  Octocontrabass [ Thu Jun 06, 2019 2:27 am ]
Post subject:  Re: Questions on Long file names FAT32

nullplan wrote:
Octocontrabass wrote:
Long file names are variable-length. If you include the short file name, they can take between 64 and 672 bytes.

Where did those numbers come from?

It's the total size, in bytes, of the directory entries that store a long file name. (The question, not quoted here, was about the size of the directory entry, not the size of the file name itself.)

nullplan wrote:
There are 12 UCS-2 characters in an LFN entry.

There are 13 UTF-16 code units in each LFN entry.
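To make the 64 and 672 byte figures concrete, here is a small sketch in C. It assumes the usual on-disk layout from the FAT specification, where every directory entry is 32 bytes and the 13 code units of a long entry are split into groups of 5, 6, and 2. The struct field names and the helper `lfn_dir_bytes` are illustrative, not taken from any particular driver, and `#pragma pack` is a compiler extension (supported by GCC, Clang, and MSVC).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIR_ENTRY_SIZE      32u  /* every FAT directory entry is 32 bytes */
#define LFN_UNITS_PER_ENTRY 13u  /* 5 + 6 + 2 UTF-16 code units per entry */

/* One long-file-name directory entry (field names are illustrative). */
#pragma pack(push, 1)
struct lfn_entry {
    uint8_t  order;          /* sequence number; bit 0x40 marks the last one */
    uint16_t name1[5];       /* code units 1-5 */
    uint8_t  attr;           /* 0x0F identifies a long-name entry */
    uint8_t  type;           /* 0 for name entries */
    uint8_t  checksum;       /* checksum of the associated short name */
    uint16_t name2[6];       /* code units 6-11 */
    uint16_t first_cluster;  /* always 0 in long-name entries */
    uint16_t name3[2];       /* code units 12-13 */
};
#pragma pack(pop)

_Static_assert(sizeof(struct lfn_entry) == DIR_ENTRY_SIZE,
               "an LFN entry must be exactly 32 bytes");

/* Total directory bytes used by a file whose long name has `units`
 * UTF-16 code units: the long entries plus the one short entry. */
size_t lfn_dir_bytes(size_t units)
{
    assert(units >= 1 && units <= 255);
    size_t lfn_entries =
        (units + LFN_UNITS_PER_ENTRY - 1) / LFN_UNITS_PER_ENTRY;
    return (lfn_entries + 1) * DIR_ENTRY_SIZE;
}
```

With this helper, a 1-unit name gives 2 entries (64 bytes), and a 255-unit name gives 20 long entries plus the short entry, i.e. 21 * 32 = 672 bytes.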

nullplan wrote:
To my knowledge, MS imposes a limit of 128 codepoints, which weighs in at something between 64 and 192 bytes of UTF-8.

It's 255 UTF-16 code units, or up to 765 bytes when taken as UTF-8. If you ignore that limit and use all 63 possible long entries, it's 819 UTF-16 code units, or 2457 bytes as UTF-8.
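The worst-case numbers can be checked with a short length calculation. The sketch below, in C, counts how many UTF-8 bytes a run of UTF-16 code units would need; the function name `utf16_to_utf8_len` is hypothetical, and counting an unpaired surrogate as 3 bytes (as if it were replaced by U+FFFD) is an assumption of this sketch rather than something the thread specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-8 bytes needed to re-encode `n` UTF-16 code units.
 * Assumption: an unpaired surrogate is counted as 3 bytes. */
size_t utf16_to_utf8_len(const uint16_t *s, size_t n)
{
    assert(s != NULL || n == 0);

    size_t bytes = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t u = s[i];
        if (u < 0x80) {
            bytes += 1;                      /* ASCII */
        } else if (u < 0x800) {
            bytes += 2;
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < n &&
                   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            bytes += 4;                      /* surrogate pair: 2 units, 4 bytes */
            i++;
        } else {
            bytes += 3;                      /* high BMP: 1 unit, 3 bytes */
        }
    }
    return bytes;
}
```

A name made entirely of high BMP characters such as U+20AC is the worst case: 255 code units need 255 * 3 = 765 bytes, and 63 * 13 = 819 code units need 819 * 3 = 2457 bytes. Surrogate pairs do not make it worse, since two code units become four bytes.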

nullplan wrote:
Also, as we saw, it is really easy to get around this one by just always generating LFNs or SFNs, and never a mixture of the two.

This is accomplished by generating an invalid SFN when you have an LFN, since the directory entry format won't allow you to omit the SFN.

All times are UTC - 6 hours