OSDev.org

The Place to Start for Operating System Developers

Post subject: Nora's Reliable FileSystem (NRFS) v0.1
Posted: Sun Sep 25, 2022 6:44 am

Joined: Fri Jun 11, 2021 6:02 am
Posts: 79
Location: Belgium
Hello,

I finished the first version of my custom filesystem. You can find it here. Current features include:

  • Error detection (not correction) with a fast 64-bit hash.
  • Transparent compression with LZ4 (can be extended with other compression algorithms).
  • Transactional updates, i.e. the filesystem will never be in an inconsistent state on disk even if power is cut in the middle of a transaction.
  • Sparse files.
  • Up to 2^32 entries per directory, indexed with a hashmap and a cryptographic hash.
  • File names up to 255 bytes long.
  • Extensions that can be enabled per directory. Current extensions include "unix", which adds UID, GID and permissions to each entry, and "mtime", which adds a 64-bit modification time in milliseconds to each entry.
  • Small files (up to 2^16 bytes) can be embedded directly inside directories to reduce space usage and improve locality.

It includes a FUSE driver which has been tested on Linux.

Why yet another filesystem?

Personally, I'm a fan of filesystems that support transparent compression, since it can save a huge amount of space. On my Linux desktop I use ZFS, which currently reports a 1.96x compressratio, i.e. the total size of all data is nearly halved, which roughly doubles the effective capacity of my main disk.

However, ZFS is very complex and porting it to my own OS is likely very difficult. I've looked at other filesystems with compression, such as BTRFS, but I'm unable to find any that isn't either very complex or a read-only FS. Hence I decided to roll my own.

Design (briefly)

I ensured that all updates can be performed atomically so as to reduce the odds that the filesystem ever becomes corrupt. I went for a transaction-based rather than a journal-based system since a) it makes it easier to ensure consistency (IMO) and b) there is constant copying between host and device anyway, but a transaction-based system should need fewer copies since nothing has to be written twice.
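
To make the difference concrete, here is a toy sketch of a copy-on-write, transaction-based commit (all names and the layout are made up for illustration, not NRFS's actual code): new data goes to free blocks, and a single atomic header update makes it visible, so nothing is written twice and a power cut simply leaves the old state in place.

Code:
// Toy sketch of a copy-on-write, transaction-based commit.
// All names (Disk, Header, ...) are hypothetical, not NRFS's real API.

struct Header {
    root: u64,       // block address of the current root
    generation: u64, // monotonically increasing transaction counter
}

struct Disk {
    header: Header,
    blocks: Vec<Vec<u8>>, // pretend block device
}

impl Disk {
    /// Write new/modified data to *free* space; the old tree stays intact.
    fn write_new_tree(&mut self, data: Vec<u8>) -> u64 {
        self.blocks.push(data);
        (self.blocks.len() - 1) as u64
    }

    /// Commit = one atomic header write. If power is cut before this point,
    /// the old root is still valid; after it, the new root is valid.
    fn commit(&mut self, new_root: u64) {
        self.header = Header {
            root: new_root,
            generation: self.header.generation + 1,
        };
    }
}

fn main() {
    let mut disk = Disk {
        header: Header { root: 0, generation: 0 },
        blocks: vec![b"old root".to_vec()],
    };
    let new_root = disk.write_new_tree(b"new root".to_vec());
    disk.commit(new_root); // single atomic switch, no journal replay needed
    assert_eq!(disk.header.root, new_root);
}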

All data is grouped into power-of-two sized "records". Each record points to a variable number of blocks. By splitting data into records it is possible to compress and decompress only parts of large files instead of the entire file.
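
For illustration (made-up names, not the actual on-disk format), the power-of-two size makes locating the record for a given byte offset a simple shift and mask, so only that record has to be decompressed:

Code:
// Sketch: power-of-two records make offset -> record lookup a shift/mask.
// `record_size_p2` is the log2 of the record size (e.g. 12 for 4 KiB records).

fn record_index(offset: u64, record_size_p2: u8) -> u64 {
    offset >> record_size_p2
}

fn offset_in_record(offset: u64, record_size_p2: u8) -> u64 {
    offset & ((1u64 << record_size_p2) - 1)
}

fn main() {
    // With 4 KiB records, byte 10_000 lives in record 2, at offset 1808.
    assert_eq!(record_index(10_000, 12), 2);
    assert_eq!(offset_in_record(10_000, 12), 1808);
}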

While all modern disks have ECC to protect against bitrot, a hash has also been added to each record. This should help catch errors that occur during transmission, e.g. over a poor link. While the hash only provides detection, it at least lets the user become aware of issues immediately instead of letting corruption happen silently.
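
The idea is simply to store a 64-bit hash next to each record and compare it on read. A minimal sketch, using FNV-1a as a stand-in for whatever fast 64-bit hash the spec actually uses:

Code:
// Sketch: store a 64-bit hash with each record and verify it on read.
// FNV-1a is only a stand-in here, not necessarily the hash NRFS uses.

fn hash64(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a offset basis
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    h
}

struct Record {
    hash: u64,
    data: Vec<u8>,
}

fn write_record(data: Vec<u8>) -> Record {
    Record { hash: hash64(&data), data }
}

/// Returns the data only if the stored hash matches, so corruption
/// (bad cable, bitrot the disk's ECC missed) is reported immediately.
fn read_record(r: &Record) -> Result<&[u8], &'static str> {
    if hash64(&r.data) == r.hash {
        Ok(&r.data[..])
    } else {
        Err("hash mismatch: record is corrupt")
    }
}

fn main() {
    let mut rec = write_record(b"hello".to_vec());
    assert!(read_record(&rec).is_ok());
    rec.data[0] ^= 0xff; // simulate corruption in transit
    assert!(read_record(&rec).is_err());
}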

I use hashmaps for directories instead of e.g. B-trees since the object layer of NRFS acts as a sort of MMU, i.e. it exposes each object as a contiguous space. Since hashmaps are much simpler to implement and the object layer makes them practical, I went with that.
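
Roughly, because the object layer exposes a directory as one contiguous space, the directory can be a flat open-addressing table indexed by the name's hash. A simplified sketch (made-up entry layout, fixed capacity, no deletion or resizing, and the standard-library hasher standing in for the cryptographic hash):

Code:
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const SLOTS: usize = 1 << 8; // table capacity (power of two)

#[derive(Clone, Default)]
struct Entry {
    name: Vec<u8>, // empty = free slot
    object_id: u64,
}

struct Directory {
    slots: Vec<Entry>, // in NRFS this would live in the object's flat space
}

impl Directory {
    fn new() -> Self {
        Self { slots: vec![Entry::default(); SLOTS] }
    }

    fn slot_of(name: &[u8]) -> usize {
        let mut h = DefaultHasher::new();
        name.hash(&mut h);
        (h.finish() as usize) & (SLOTS - 1)
    }

    fn insert(&mut self, name: &[u8], object_id: u64) {
        let mut i = Self::slot_of(name);
        while !self.slots[i].name.is_empty() {
            i = (i + 1) & (SLOTS - 1); // linear probing
        }
        self.slots[i] = Entry { name: name.to_vec(), object_id };
    }

    fn lookup(&self, name: &[u8]) -> Option<u64> {
        let mut i = Self::slot_of(name);
        while !self.slots[i].name.is_empty() {
            if self.slots[i].name == name {
                return Some(self.slots[i].object_id);
            }
            i = (i + 1) & (SLOTS - 1);
        }
        None
    }
}

fn main() {
    let mut dir = Directory::new();
    dir.insert(b"hello.txt", 42);
    assert_eq!(dir.lookup(b"hello.txt"), Some(42));
    assert_eq!(dir.lookup(b"missing"), None);
}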

Specification & Future work

I wrote a specification of NRFS, which is 430 lines long. The specification came first and the implementation is based on it, so it should be complete as of writing.

While the storage format is mostly finished, the implementation is admittedly not great: it doesn't support asynchronous requests, writes out records redundantly and doesn't handle errors such as a hash mismatch gracefully. I will address this once I figure out a good design for other planned features (notably, mirroring and error correction).

Feedback is welcome!

_________________
My OS is Norost B (website, Github, sourcehut)


 
Post subject: Re: Nora's Reliable FileSystem (NRFS) v0.1
Posted: Sun Sep 25, 2022 6:56 am

Joined: Fri Feb 11, 2022 4:55 am
Posts: 330
Location: behind the keyboard
Seems good.


 
Post subject: Re: Nora's Reliable FileSystem (NRFS) v0.1
Posted: Sun Sep 25, 2022 11:25 am

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1384
Demindiro wrote:
Transparent compression with LZ4 (can be extended with other compression algorithms)
I do hope this one can be turned off. It is just futile effort for encrypted or already compressed files (e.g. JPEG or MP3, or most video formats), because their entropy is too high for LZ compression to make a meaningful difference. And for small files you have the problem that compression doesn't even save any disk space, since a file cannot occupy less than one unit of storage. In that case it also doesn't save any transfer time, since you cannot transfer less than one block into memory. So it would just be a total waste of CPU time to compress there.
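
(To illustrate the small-file point with made-up numbers: with 512-byte blocks, compressing a 300-byte file to 180 bytes still costs exactly one block.)

Code:
// Sketch of the small-file point: below one block, compression saves nothing.
fn blocks_needed(len: u64, block_size: u64) -> u64 {
    (len + block_size - 1) / block_size // round up to whole blocks
}

fn main() {
    assert_eq!(blocks_needed(300, 512), 1); // uncompressed small file
    assert_eq!(blocks_needed(180, 512), 1); // compressed: still one block
}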

Demindiro wrote:
While all modern disks have ECC to protect against bitrot a hash has also been added to records anyways.
That is a good idea in general. You do not always read back what you wrote to the disk.
Demindiro wrote:
I use hashmaps for directories instead of e.g. B-trees
The only reason ext3/4 went with B-trees for indexed directories is that they wanted to remain at least read-only compatible with ext2 drivers. In general, a directory is a map from string (filename) to file/inode, and a hashmap is the best known implementation of such a thing (giving average constant lookup time, rather than logarithmic in the directory size, as with most trees, or linear in the directory size, as with most lists).
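
(Putting rough, purely illustrative numbers on that for a directory with 2^20 entries:)

Code:
// Rough lookup costs for a directory with 2^20 entries (illustrative only).
fn main() {
    let n = (1u64 << 20) as f64;
    println!("hashmap:       ~1 probe (amortized constant)");
    println!("balanced tree: ~{:.0} comparisons (log2 n)", n.log2());
    println!("linear list:   ~{:.0} comparisons on average (n / 2)", n / 2.0);
}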

_________________
Carpe diem!


 
Post subject: Re: Nora's Reliable FileSystem (NRFS) v0.1
Posted: Mon Sep 26, 2022 2:06 pm

Joined: Fri Jun 11, 2021 6:02 am
Posts: 79
Location: Belgium
nullplan wrote:
I do hope this one can be turned off.

The compression is optional and can be turned off entirely. It is also set per record so that the driver can pick the optimal choice.
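
A sketch of what a per-record choice could look like (the types and layout are made up for illustration, not NRFS's actual format): try LZ4 and keep the record uncompressed when that isn't smaller, so incompressible data takes the cheap path.

Code:
// Sketch of a per-record compression choice. The types and layout are made
// up for illustration; this is not NRFS's actual on-disk format or code.

enum Compression {
    None,
    Lz4,
}

struct StoredRecord {
    compression: Compression,
    data: Vec<u8>,
}

// Placeholder: substitute a real LZ4 encoder here.
fn lz4_compress(data: &[u8]) -> Vec<u8> {
    data.to_vec()
}

/// Try LZ4 and keep the record uncompressed when that is not smaller,
/// so incompressible data (JPEG, video, encrypted blobs) skips the overhead.
fn store_record(data: &[u8]) -> StoredRecord {
    let compressed = lz4_compress(data);
    if compressed.len() < data.len() {
        StoredRecord { compression: Compression::Lz4, data: compressed }
    } else {
        StoredRecord { compression: Compression::None, data: data.to_vec() }
    }
}

fn main() {
    let rec = store_record(b"some record bytes");
    match rec.compression {
        Compression::None => println!("stored uncompressed ({} bytes)", rec.data.len()),
        Compression::Lz4 => println!("stored LZ4-compressed ({} bytes)", rec.data.len()),
    }
}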

nullplan wrote:
So it would just be a total waste of CPU time to compress there.

LZ4 compression and decompression are very cheap, especially compared to the transfer speeds of e.g. SATA disks. The library I'm using claims it achieves 1897 MiB/s compression and 7123 MiB/s decompression (running the benchmarks myself after disabling the "safe" features gives ~1000 MiB/s and ~4400 MiB/s respectively). Unless you have a very fast disk (e.g. an NVMe SSD) it usually makes more sense to keep it enabled.
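
As a back-of-the-envelope illustration (made-up model and numbers: ~550 MiB/s for a SATA-class disk, the ~4400 MiB/s decompression figure above and a 2x compression ratio), reading compressed records can beat the raw disk even without any pipelining:

Code:
// Back-of-the-envelope effective read throughput with compressed records.
// Numbers are illustrative and the model is pessimistic (serial transfer
// then decompression, no pipelining across records).

fn effective_read_mib_s(disk_mib_s: f64, decomp_mib_s: f64, ratio: f64) -> f64 {
    // Per 1 MiB of logical data: fetch 1/ratio MiB from disk, then spend
    // 1/decomp_mib_s seconds decompressing it back into 1 MiB.
    let transfer = (1.0 / ratio) / disk_mib_s;
    let decompress = 1.0 / decomp_mib_s;
    1.0 / (transfer + decompress)
}

fn main() {
    // SATA-class disk ~550 MiB/s, LZ4 decompression ~4400 MiB/s, 2x ratio:
    let eff = effective_read_mib_s(550.0, 4400.0, 2.0);
    println!("~{eff:.0} MiB/s effective"); // ~880 MiB/s, above the raw disk speed
}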

EDIT: I ran a benchmark with a 1.8G MKV file and measured ~6.0 GiB/s for both compression and decompression. Even the fastest SSDs can only barely keep up with this, let alone if you use multiple cores for compression. Keeping LZ4 on by default is a sane choice IMO.

_________________
My OS is Norost B (website, Github, sourcehut)

