Proper buffer design for file reading

deleted8917 · Post by **deleted8917** » Sun Jul 07, 2019 5:42 pm

Let's say that we implemented an X filesystem (in this case FAT32) into our OS.
We have a function that reads a file, which as parameters needs the filename and a buffer.
Now we have a problem, the buffer size. The file can be larger or smaller than the buffer, we can't just create a buffer with the size of the file because can happen that the file is a big one (2GB for example). It will use almost all the RAM. Obiously, that is a very very terrible design.
So what I have to do? How I implement a proper buffer for file reading, independently of the file size?
This function should read a file (not working), with a buffer as parameter.

Code: Select all

int fat32_read_file(uint8_t* buffer, struct file fp)
{
	if (!hd_exists())
		return 1; 
	hd_read(start_of_root, FAT32_FILES_PER_DIRECTORY * sizeof(struct DirectoryEntry), (uint8_t*)&drce[0]);
	uint8_t* buff = 0;
	uint8_t* fatbuff = 0;
	if (strcmp((char*)drce[fp.file_no].file_name, (char*)fp.file_name) == 0) {
		uint8_t fcluster = ((uint32_t)drce[fp.file_no].cluster_number_hi) << 16 | ((uint32_t)drce[fp.file_no].cluster_number_lo);   
		int32_t ncluster = fcluster;
		int32_t file_size = fp.file_size;
		/* 1 sector file (less than 512 bytes) */
		if (file_size < 512) {
			hd_read(fcluster, 512, buff);
			buff[file_size] = '\0';
			//kputs("%s", (char*)buff);
		}

		while (file_size > 0) {
			uint32_t fsect = start_of_data + bpb.sectors_per_cluster * (ncluster - 2);
			uint32_t sector_offset = 0;
			for (; file_size > 0; file_size -= 512) {
				hd_read(fsect + sector_offset, 512, buff);
				buff[file_size > 512 ? 512 : file_size] = '\0';
				//kputs("%s", (char*)buff);
				memcpy(buffer, buff, 512);
				if (++sector_offset > bpb.sectors_per_cluster) 
					break;
			}
			uint32_t fsectcurrentcl = ncluster / (512 / sizeof(uint32_t));
			hd_read(fat_start + fsectcurrentcl, 512, fatbuff);
			uint32_t foffsectcurrentcl = ncluster % (512 / sizeof (uint32_t));
			ncluster = ((uint32_t*)&fatbuff)[foffsectcurrentcl] & 0x0FFFFFFF;
		}
		return 0;
	}
	return 1;
}

We "open" the file with this:

Code: Select all

struct file fat32_open_file(uint8_t* filename)
{
	if (!hd_exists() && !filename)
		return; 
	struct file fp; 
	hd_read(start_of_root, FAT32_FILES_PER_DIRECTORY * sizeof(struct DirectoryEntry), (uint8_t*)&drce[0]);

	for (int i = 0; i < FAT32_FILES_PER_DIRECTORY; ++i) {
		if (drce[i].file_name[0] == FAT32_NO_FILES) 
			break;
		/* LFN */
		if (drce[i].attributes & FAT32_LFN)
			continue;
		/* folder */
		if (drce[i].attributes & FAT32_DIRECTORY)
			continue;

		/* If the first byte of file_name is 0xE5, means that the file is deleted */
		if (drce[i].file_name[0] == FAT32_DELETED_FILE)
			continue;
		
		uint8_t fil[12];
		fat2human(drce[i].file_name, fil);
		trimName(fil, 11);
		if (strcmp((char*)fil, (char*)filename) == 0) {
			strcpy((char*)fp.file_name, (char*)filename);
			fp.file_size = drce[i].file_size;
			fp.file_no = i;
			return fp;
		}
	}
	kputs("\nFile %s not found\n", filename);
	strcpy((char*)fp.file_name, (char*)filename);
	fp.file_size = 0;
	fp.file_no = 0;
	return fp;
}

"fp" is a file structure:

Code: Select all

struct file
{
	uint8_t file_name[11]; /* let's assume fat32 SFN for now */
	uint32_t file_size;
	uint32_t file_no; /* Number of the file in the directory (starting from 0) */
	uint32_t file_pos;
};

Now, opening the file:

Code: Select all

uint8_t a[2048]; /* Main problem here, the buffer is limited to 2048 bytes! */
fat32_read_file(a, fp);
kputs("%s", (char*)a);

nullplan · Post by **nullplan** » Sun Jul 07, 2019 10:27 pm

Learning from UNIX means learning to win. In UNIX there is a read function defined as:

Code: Select all

ssize_t read(int, char*, size_t);

The first parameter is the FD, the OS token used to identify the file (returned from open()). After that you specify a buffer as pointer and size. The function tries to fill the buffer as far as possible, but returns how much it read. Or -1 on error. Importantly, it can return short for any reason at all. For example, on FAT32, it might just decide to only finish up the current cluster of the file and return data from the next cluster only on the next call.

End of file is marked by this function returning zero.

bzt · Post by **bzt** » Mon Jul 08, 2019 5:44 am

Hi,

hextakatt wrote:Now we have a problem, the buffer size. The file can be larger or smaller than the buffer

nullplan is right, you should not worry about this. Most OS lets the userspace application define the buffer and its size. Your concern is more how to translate the file offset into series of block buffers. You see, the storage has 512 bytes blocks. One or more blocks (which may or may not be contiguous) make up the data that you have to copy to the caller's buffer. Copying is important, because those nasty userland applications can read from any arbitrary offsets, and any arbitrary lengths, not just 512 byte aligned offsets and multiples of 512 bytes. This means there's a 1 to 511 chance that the data you'll need to return starts in the middle of a sector. Also because data sectors may not be contiguous on the storage, you'll have to copy arbitrary sequence of sectors to get the correct data.

This whole situation is very similar to virtual address mappings. Only this time "virtual address space" is the user buffer, "physical address space" are the LBA sectors, page frame is 512 bytes, and there's no hardware to do the translation for you, you have to do it from software. The actual translation scheme depends on what FS you're using. The trick here is, how to implement the (offset) .. (offset + buffer size) mapping (in "virtual address space") to a sequence of sectors (in "physical address space") efficiently and FS independently (if you want to support more FSes with VFS).

I'd suggest to download the Minix2 source and take a look at minix/fs/read.c. Most of the code deals with splitting up the "virtual address space" into a sequence of LBA addresses. I would not recommend Minix3 or Linux as they are much more bloated and complicated. Minix2 is simple.

Another possible source would be Xv6, although I must admit the function names are a bit cryptic, and you must read the doc to understand how file system buffers work in this little OS.

More precisely you have three layers to deal with:
1. layer: LBA sectors
2. layer: file content, each with it's own "address space" up to file size (depends on FS how this is mapped to sectors)
3. layer: user space buffers, specified by (file offset) .. (file offset + buffer size)
You'll need a code that gets layer 3 as its input, and walks down to layer 1 to get a list of sectors. Typically layer 3 is implemented in the VFS code. Layer 2 is implemented in the FS driver, finally layer 1 is implemented in the block device drivers.

Cheers,
bzt

deleted8917 · Post by **deleted8917** » Mon Jul 08, 2019 4:58 pm

bzt wrote:
hextakatt wrote:Now we have a problem, the buffer size. The file can be larger or smaller than the buffer
nullplan is right, you should not worry about this. Most OS lets the userspace application define the buffer and its size.

But for internal OS usage and If I need to read an entire file? (e.g: binary files)

eekee · Post by **eekee** » Mon Jul 08, 2019 5:01 pm

hextakatt wrote:But for internal OS usage and If I need to read an entire file? (e.g: binary files)

You mean in the kernel? You need kmalloc as a kernel-space equivalent of malloc, AFAIK. Get the size, kmalloc a big enough buffer, ...

deleted8917 · Post by **deleted8917** » Mon Jul 08, 2019 5:10 pm

eekee wrote:
hextakatt wrote:But for internal OS usage and If I need to read an entire file? (e.g: binary files)
You mean in the kernel? You need kmalloc as a kernel-space equivalent of malloc, AFAIK. Get the size, kmalloc a big enough buffer, ...

Well, that was my initial solution, but that doesn't works for big files (magnitude of gigabytes), it will use all the RAM. It works?

eekee · Post by **eekee** » Mon Jul 08, 2019 6:21 pm

Ooh... I didn't understand what you're asking until I looked at the code. Not that I fully understand it, but I think I get what part of the job you're asking now. I suppose you'd have to read into a cluster-size or sector-size buffer, then copy from there into the buffer the user supplies. If the user's buffer is bigger, read another cluster or sector into your buffer, and copy again. If the user's buffer is smaller, well that's the reason you have a buffer the size of a full sector or cluster. The file size barely matters here, you only need it so you can calculate the right amount to copy from the last cluster.

As a simple disk cache, you could have a whole array of these cluster-buffers together with data on which buffer corresponds to which cluster on the disk. Just an idea I probably haven't given enough thought to.

This is all about read(); mmap() is a whole other planet.

Edit: This would probably benefit from being separated into layers as bzt suggests. I'm not sure which layers, though.

deleted8917 · Post by **deleted8917** » Mon Jul 08, 2019 7:43 pm

Well, I tried to fix my code, but now I get the goddamn invalid opcode exception. Nice.

Code: Select all

int fat32_read_file(uint8_t* filename, uint8_t* buffer, struct file fp)
{
	/* Check HDD precense, I don't want to shoot my foot */
	if (!hd_exists() && !filename)
		return 1; 
	hd_read(start_of_root, FAT32_FILES_PER_DIRECTORY * sizeof(struct DirectoryEntry), (uint8_t*)&drce[0]);

	uint64_t buffer_size = sizeof(buffer) / sizeof(uint8_t);
	uint8_t* buff = 0;
	uint8_t* fatbuff = 0;
	uint8_t fil[12];
	fat2human(drce[fp.file_no].file_name, fil);
	trimName(fil, 11);
	if (strcmp((char*)fil, (char*)filename) == 0) {
		uint8_t fcluster = ((uint32_t)drce[fp.file_no].cluster_number_hi) << 16 | ((uint32_t)drce[fp.file_no].cluster_number_lo);   
		int32_t ncluster = fcluster;
		int32_t file_size = fp.file_size;
		int32_t bytesr = 0;

		kputs("\nFile content: \n");

		/* 1 sector file (less than 512 bytes) */
		if (file_size < 512 && buffer_size > 512) {
			hd_read(fcluster, 512, buff);
			memcpy(buffer, buff, 512);
			//buff[file_size] = '\0';
			//kputs("%s", (char*)buff);
		}

		/* File bigger than a sector, cluster */
		while (file_size > 0) {
			uint32_t fsect = start_of_data + bpb.sectors_per_cluster * (ncluster - 2);
			uint32_t sector_offset = 0;
			for (; file_size > 0; file_size -= 512) {
				hd_read(fsect + sector_offset, 512, buff);
				//buff[file_size > 512 ? 512 : file_size] = '\0';
				//kputs("%s", (char*)buff);
				memcpy(buffer, buff, 512);
				bytesr += (file_size > 512 ? 512 : file_size);
				if (++sector_offset > bpb.sectors_per_cluster) 
					break;
			}
			uint32_t fsectcurrentcl = ncluster / (512 / sizeof(uint32_t));

			hd_read(fat_start + fsectcurrentcl, 512, fatbuff);
			uint32_t foffsectcurrentcl = ncluster % (512 / sizeof (uint32_t));
			ncluster = ((uint32_t*)&fatbuff)[foffsectcurrentcl] & 0x0FFFFFFF;
		}
		return 0;
	}
	kputs("\nFile %s not found\n", filename);
	return 1;
}

nullplan · Post by **nullplan** » Mon Jul 08, 2019 9:44 pm

A pointer without size is like ice cream without a tub: Sort of what you wanted, but your hands are gross afterwards.

I haven't read much of your code, but this line can't work:

Code: Select all

uint64_t buffer_size = sizeof(buffer) / sizeof(uint8_t);

Leaving aside that, if uint8_t exists, it must be an unsigned char, the size of which is defined as 1, buffer in this case is a pointer. sizeof therefore returns 4 or 8, depending on architecture. You should always pass the buffer size as argument. The called function has no other means of identifying the size.

That holds for most kinds of code, BTW. Pointer without size is usually an error. Sometimes the size can be inferred, but not often.

deleted8917 · Post by **deleted8917** » Tue Jul 09, 2019 8:05 pm

nullplan wrote:A pointer without size is like ice cream without a tub: Sort of what you wanted, but your hands are gross afterwards.

I haven't read much of your code, but this line can't work:
Code: Select all
uint64_t buffer_size = sizeof(buffer) / sizeof(uint8_t);
Leaving aside that, if uint8_t exists, it must be an unsigned char, the size of which is defined as 1, buffer in this case is a pointer. sizeof therefore returns 4 or 8, depending on architecture. You should always pass the buffer size as argument. The called function has no other means of identifying the size.

That holds for most kinds of code, BTW. Pointer without size is usually an error. Sometimes the size can be inferred, but not often.

I deleted that faulty line, and now as a paremeter I specify the buffer size (uint32_t). But, still having the same error.

Code: Select all

int fat32_read_file(uint8_t* filename, uint8_t* buffer, uint32_t buffsiz, struct file fp)
{
	/* Check HDD precense, I don't want to shoot my foot */
	if (!hd_exists() && !filename)
		return 1; 
	hd_read(start_of_root, FAT32_FILES_PER_DIRECTORY * sizeof(struct DirectoryEntry), (uint8_t*)&drce[0]);

	uint8_t* buff = 0;
	uint8_t* fatbuff = 0;
	uint8_t fil[12];
	fat2human(drce[fp.file_no].file_name, fil);
	trimName(fil, 11);
	if (strcmp((char*)fil, (char*)filename) == 0) {
		uint8_t fcluster = ((uint32_t)drce[fp.file_no].cluster_number_hi) << 16 | ((uint32_t)drce[fp.file_no].cluster_number_lo);   
		int32_t ncluster = fcluster;
		int32_t file_size = fp.file_size;

		kputs("\nFile content: \n");

		/* 1 sector file (less than 512 bytes) */
		if (file_size < 512) {
			hd_read(fcluster, 512, buff);
			memcpy(buffer, buff, 512);
		}

		/* File bigger than a sector, cluster */
		while (file_size > 0) {
			uint32_t fsect = start_of_data + bpb.sectors_per_cluster * (ncluster - 2);
			uint32_t sector_offset = 0;
			for (; file_size > 0; file_size -= 512) {
				hd_read(fsect + sector_offset, 512, buff);
				memcpy(buffer, buff, 512);
	
				if (++sector_offset > bpb.sectors_per_cluster) 
					break;
			}
			uint32_t fsectcurrentcl = ncluster / (512 / sizeof(uint32_t));

			hd_read(fat_start + fsectcurrentcl, 512, fatbuff);
			uint32_t foffsectcurrentcl = ncluster % (512 / sizeof (uint32_t));
			ncluster = ((uint32_t*)&fatbuff)[foffsectcurrentcl] & 0x0FFFFFFF;
		}
		return 0;
	}
	kputs("\nFile %s not found\n", filename);
	return 1;
}

Code: Select all

uint8_t a[512];
fat32_read_file(filename, a, 512, fp);
kputs("%s", (char*)a);

Even if this code worked, I still cannot read the entire file at the same time!

Ethin · Post by **Ethin** » Tue Jul 09, 2019 8:19 pm

If I may ask... why can't you read the file in chunks and treat the buffer as a ring buffer?

Code: Select all

uint8_t a[512];
fat32_read_file(filename, a, 512, fp);
kputs("%s", (char*)a);
// Zero out buffer
for (int i = 0; i < 512; ++i) {
a[i] = 0;
}
// Keep reading

Would this not suffice for your use-case?

deleted8917 · Post by **deleted8917** » Tue Jul 09, 2019 9:21 pm

Ethin wrote:If I may ask... why can't you read the file in chunks and treat the buffer as a ring buffer?
Code: Select all
uint8_t a[512];
fat32_read_file(filename, a, 512, fp);
kputs("%s", (char*)a);
// Zero out buffer
for (int i = 0; i < 512; ++i) {
a[i] = 0;
}
// Keep reading
Would this not suffice for your use-case?

Because I need to read the entire file at once.

Korona · Post by **Korona** » Wed Jul 10, 2019 1:28 am

The usual approach (if you cannot predict the file size from the inode / directory entry) is to reallocate a buffer that is c times as large as the current one once you run out of space, for some constant c. That way, you're only wasting a factor of c space, with only constant amortized performance overhead.

That being said, allocating unlimited buffers in the kernel (i.e., from non-swapable memory) is a clear code smell and a potential DOS vector. Why are you trying to read an entire file at once?

loonie · Post by **loonie** » Wed Jul 10, 2019 9:51 am

First you alloc linear memory without any physical RAM mapped to it. You need to know max file size for min complexity.
Then user(or your kernel) accesses 1st 2MB chunk, #PF triggers, and you read 2MB from disk and map appropriate ram into 1st 2MB.
Repeat for the remaining 2MB chunks.
This way you can read all file at once or only a portion. Covers all cases.
You can choose 64MB chunks or larger - up to you.

hextakatt wrote: we can't just create a buffer with the size of the file because can happen that the file is a big one (2GB for example).

You have no choice but to alloc linear memory of the required size if you want entire file at once. But you don't have to map physical RAM into that right away.

hextakatt wrote: But for internal OS usage and If I need to read an entire file? (e.g: binary files)

Set reasonable limits and your life will get easier

eekee · Post by **eekee** » Wed Jul 10, 2019 11:19 am

loonie's post basically describes a simple mmap, doesn't it? The other day I was thinking about mmap versus read/write, and realised mmap is really awesome for certain tasks, it really simplifies the code and especially how you think about the task.

Korona's right if you haven't got the size ahead of time, (or don't want to get it; stat() can be slow,) but I want to add a little caveat.

Korona wrote:The usual approach (if you cannot predict the file size from the inode / directory entry) is to reallocate a buffer that is c times as large as the current one once you run out of space, for some constant c. That way, you're only wasting a factor of c space, with only constant amortized performance overhead.

Yes, but you want to cap the multiplication at some point. Make it grow linearly after a certain size. Python uses c=2, but they found they had to cap it when someone presented them with a real-life case of allocations big enough to reach some huge size but not much more than that. Hundreds of megabytes or maybe gigabytes of address space were wasted.

OSDev.org

Proper buffer design for file reading

Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading

Re: Proper buffer design for file reading