OSDev.org

The Place to Start for Operating System Developers
All times are UTC - 6 hours




 Post subject: Re: Proper buffer design for file reading
PostPosted: Wed Jul 10, 2019 1:44 pm 
Joined: Sun Sep 19, 2010 10:05 pm
Posts: 1074
It's rare that you actually want an entire file loaded into memory at once. For very large files, this risks crashing the system.

Storage devices rarely give you access to a single byte of data, so you are forced to deal with blocks. However, blocks are rarely useful to an application, so it may be desirable to hide all of the block details from it, which is what I've done.

For an application, what you normally want is to be able to read X bytes from file offset address Y, so that is where I've focused most of my attention. All of the block swapping is handled behind the scenes. As an aside, all of my other "data" access is handled through the same basic interface, including low level storage, system RAM, audio input/output, network connections and even things like the keyboard and mouse. The design is heavily influenced by the Reader/Writer concept in .NET. A reader or writer object has a current position, and a ReadByte, ReadInt16, ReadInt32, ReadInt64, ReadString, etc. set of functions. This makes it trivial to find information even in very large files without having to worry about having the entire file in memory.
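A minimal sketch of that reader idea in C, assuming a 512-byte block size and a one-block cache. The names (`struct reader`, `reader_read`, `ram_read_block`) are hypothetical and are not taken from OZone; the RAM-backed device only stands in for a real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512u

/* Hypothetical driver callback: fills `out` with block `lba`, returns 0 on success. */
typedef int (*read_block_fn)(void *dev, uint32_t lba, uint8_t *out);

struct reader {
    read_block_fn read_block;
    void *dev;
    uint64_t position;          /* current byte offset into the device/file */
    uint32_t cached_lba;        /* block currently held in `cache` */
    int cache_valid;
    uint8_t cache[BLOCK_SIZE];
};

void reader_init(struct reader *r, read_block_fn fn, void *dev)
{
    assert(r != NULL && fn != NULL);
    memset(r, 0, sizeof *r);
    r->read_block = fn;
    r->dev = dev;
}

/* Copy `len` bytes from the current position and advance it. Returns 0 on success. */
int reader_read(struct reader *r, void *dst, size_t len)
{
    uint8_t *out = dst;
    while (len > 0) {
        uint32_t lba = (uint32_t)(r->position / BLOCK_SIZE);
        size_t off = (size_t)(r->position % BLOCK_SIZE);
        size_t chunk = BLOCK_SIZE - off;
        if (chunk > len)
            chunk = len;
        if (!r->cache_valid || r->cached_lba != lba) {
            r->cache_valid = 0;
            if (r->read_block(r->dev, lba, r->cache) != 0)
                return -1;
            r->cached_lba = lba;
            r->cache_valid = 1;
        }
        memcpy(out, r->cache + off, chunk);
        out += chunk;
        len -= chunk;
        r->position += chunk;
    }
    return 0;
}

int reader_read_byte(struct reader *r, uint8_t *v)
{
    return reader_read(r, v, 1);
}

/* Decodes a little-endian 32-bit value regardless of host byte order. */
int reader_read_int32_le(struct reader *r, int32_t *v)
{
    uint8_t b[4];
    if (reader_read(r, b, sizeof b) != 0)
        return -1;
    *v = (int32_t)(((uint32_t)b[0]) | ((uint32_t)b[1] << 8) |
                   ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24));
    return 0;
}

/* RAM-backed stand-in for a block device, for experimenting on a host. */
struct ram_disk {
    const uint8_t *data;
    size_t size;
};

int ram_read_block(void *dev, uint32_t lba, uint8_t *out)
{
    struct ram_disk *d = dev;
    size_t start = (size_t)lba * BLOCK_SIZE;
    if (start >= d->size)
        return -1;
    size_t n = d->size - start;
    if (n > BLOCK_SIZE)
        n = BLOCK_SIZE;
    memset(out, 0, BLOCK_SIZE);
    memcpy(out, d->data + start, n);
    return 0;
}
```

The caller only sets `position` and asks for bytes; a read that straddles two blocks is split inside `reader_read`, so the block boundary never leaks out.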

There are several other approaches that you can use, but loading the entire file into memory is probably not what you want, long term.

_________________
Project: OZone
Source: GitHub
Current Task: LIB/OBJ file support
"The more they overthink the plumbing, the easier it is to stop up the drain." - Montgomery Scott


 
 Post subject: Re: Proper buffer design for file reading
PostPosted: Thu Jul 11, 2019 12:16 pm 
Joined: Mon May 22, 2017 5:56 am
Posts: 812
Location: Hyperspace
SpyderTL wrote:
As an aside, all of my other "data" access is handled through the same basic interface, including low level storage, system RAM, audio input/output, network connections and even things like the keyboard and mouse. The design is heavily influenced by the Reader/Writer concept in .NET. A reader or writer object has a current position, and a ReadByte, ReadInt16, ReadInt32, ReadInt64, ReadString, etc. set of functions.

Apart from RAM and the convenience functions, this sounds exactly like Plan 9. Take out network and optionally audio too, and you have Unix. (Remember /dev/dsp for audio?) (I think RAM is available as a file now too, but I'm not sure if the addresses correspond.) The basic functions, read, write and seek, are part of POSIX, and standard C has the stdio equivalents fread, fwrite and fseek. It's a powerful design!

ReadString would be a nice addition to Unix/Plan 9. I think in early Unix, they looped getc or fgetc to get the same effect. Get a null byte -> exit loop. I wouldn't be surprised if there's still some code doing that in Plan 9, or there's a buffering library which can do it more efficiently than getc. Oh... this buffering library already has it. :) Brdstr in bio.
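The getc-style loop described above could look roughly like this in standard C. `read_string` is a hypothetical helper name, not an existing Unix or Plan 9 function, and the capacity handling is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reads bytes up to and including a NUL terminator into dst (capacity cap).
 * Returns the string length (without the NUL), or -1 if EOF is hit first
 * or the string does not fit. dst is always left NUL-terminated. */
long read_string(FILE *f, char *dst, size_t cap)
{
    assert(f != NULL && dst != NULL && cap > 0);
    size_t n = 0;
    for (;;) {
        int c = fgetc(f);
        if (c == EOF || (n == cap - 1 && c != '\0')) {
            dst[n] = '\0';
            return -1;
        }
        dst[n] = (char)c;
        if (c == '\0')
            return (long)n;
        n++;
    }
}
```

Since stdio buffers behind fgetc, this does not cost one syscall per byte, which is the same job the bio library does for Brdstr.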

As for the other convenience functions, I think ReadInt32 is roughly if(read(fd, &some_int32_var, sizeof(int32)) != sizeof(int32)){error();}. I don't know if the .NET convenience functions have a byteswapping feature, but read obviously doesn't.
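Spelled out as a sketch on top of POSIX read: `read_full` and `read_int32_be` are hypothetical names. The loop is there because read may legitimately return fewer bytes than requested, and the value is assembled byte by byte so the result does not depend on host byte order (big-endian input is assumed here purely for illustration). EINTR handling is left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Keep calling read() until `len` bytes have arrived. Returns 0 on success,
 * -1 on error or if EOF arrives early. */
int read_full(int fd, void *buf, size_t len)
{
    uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Reads a 32-bit big-endian integer and converts it to host order. */
int read_int32_be(int fd, int32_t *out)
{
    uint8_t b[4];
    assert(out != NULL);
    if (read_full(fd, b, sizeof b) != 0)
        return -1;
    *out = (int32_t)(((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                     ((uint32_t)b[2] << 8) | ((uint32_t)b[3]));
    return 0;
}
```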

Anyway, this basic scheme was used almost everywhere until mmap got popular. Its biggest shortfall is a lack of atomicity (seek+read is 2 syscalls), which is easily solved by providing syscalls that combine both, as pread and pwrite do.
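To illustrate the difference on a POSIX host: the first function below is the two-syscall version, where another thread sharing the descriptor could move the offset between the calls; the second uses pread, which takes the offset as an argument and leaves the descriptor's position alone. The wrapper and demo names are made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Two syscalls: the shared file offset can change between lseek and read. */
ssize_t read_at_two_calls(int fd, void *buf, size_t len, off_t offset)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, len);
}

/* One syscall: the offset travels with the request and the file position
 * of fd is not modified. */
ssize_t read_at(int fd, void *buf, size_t len, off_t offset)
{
    assert(buf != NULL);
    return pread(fd, buf, len, offset);
}

/* Small usage check against a temporary file. Returns 0 if both behave as described. */
int demo_read_at(void)
{
    char a[4] = { 0 }, b[4] = { 0 };
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    fputs("0123456789", f);
    fflush(f);
    int fd = fileno(f);

    ssize_t n1 = read_at_two_calls(fd, a, 4, 2);   /* moves the offset to 6 */
    off_t pos1 = lseek(fd, 0, SEEK_CUR);
    ssize_t n2 = read_at(fd, b, 4, 5);             /* offset stays at 6 */
    off_t pos2 = lseek(fd, 0, SEEK_CUR);
    fclose(f);

    return (n1 == 4 && a[0] == '2' && a[3] == '5' && pos1 == 6 &&
            n2 == 4 && b[0] == '5' && b[3] == '8' && pos2 == 6) ? 0 : -1;
}
```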

_________________
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie


 
 Post subject: Re: Proper buffer design for file reading
PostPosted: Thu Jul 11, 2019 4:44 pm 
Joined: Wed Dec 12, 2018 12:16 pm
Posts: 119
Well, it works now, but I still need to specify the buffer size (there seems to be no choice). If I try to read a file bigger than 512 bytes, the HDD gets stuck reading with a software reset.
Code:
int fat32_read_file(uint8_t* filename, uint8_t* buffer, uint32_t buffsiz, struct file fp)
{
   /* Check HDD presence, I don't want to shoot my foot */
   if (!hd_exists() && !filename)
      return 1;
   hd_read(start_of_root, FAT32_FILES_PER_DIRECTORY * sizeof(struct DirectoryEntry), (uint8_t*)&drce[0]);

   static uint32_t sector_offset = 0;
   static uint32_t fsect = 0;
   uint8_t buff[buffsiz];
   uint8_t* fatbuff = 0;
   uint8_t fil[12];
   for (int i = 0; i < FAT32_FILES_PER_DIRECTORY; ++i) {
      fat2human(drce[i].file_name, fil);
      trimName(fil, 11);
      if (strcmp((char*)fil, (char*)filename) == 0) {
         uint8_t fcluster = ((uint32_t)drce[i].cluster_number_hi) << 16 | ((uint32_t)drce[i].cluster_number_lo);   
         int32_t ncluster = fcluster;
         int32_t file_size = fp.file_size;

         kputs("\nFile content: \n");

         /* 1 sector file (less than 512 bytes) */
         if (file_size < 512) {
            hd_read(fcluster, 512, buff);
            memcpy(buffer, buff, buffsiz);
            //buff[file_size] = '\0';
            //kputs("%s", (char*)buff);
         }

         /* File bigger than a sector, cluster */
         while (file_size > 0) {
            fsect = start_of_data + bpb.sectors_per_cluster * (ncluster - 2);
            for (; file_size > 0; file_size -= 512) {
               hd_read(fsect + sector_offset, 512, buff);
               //buff[file_size > 512 ? 512 : file_size] = '\0';
               //kputs("%s", (char*)buff);
               memcpy(buffer, buff, buffsiz);
   
               if (++sector_offset > bpb.sectors_per_cluster)
                  break;
            }
            uint32_t fsectcurrentcl = ncluster / (512 / sizeof(uint32_t));

            hd_read(fat_start + fsectcurrentcl, 512, fatbuff);
            uint32_t foffsectcurrentcl = ncluster % (512 / sizeof (uint32_t));
            ncluster = ((uint32_t*)&fatbuff)[foffsectcurrentcl] & 0x0FFFFFFF;
         }
         return 0;
      }
   }
   kputs("\nFile %s not found\n", filename);
   return 1;
}
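
A few things in the listing above look like plausible causes of the hang, though this is a guess from reading the code only: fatbuff is a null pointer when it is passed to hd_read, the next cluster is then read from the address of the pointer variable rather than from a FAT sector, fcluster is declared uint8_t so it truncates the cluster number, sector_offset is static and never reset, and every memcpy writes to the start of buffer. The following is a host-side sketch of the cluster-chain walk only, run against a RAM image instead of a disk. `struct fat32_vol`, `read_sector` and `fat32_read_chain` are hypothetical names, and a little-endian on-disk FAT with 512-byte sectors is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u
#define FAT32_BAD   0x0FFFFFF7u  /* bad-cluster marker; larger values end the chain */

/* Hypothetical volume description; `image` stands in for the disk. */
struct fat32_vol {
    const uint8_t *image;
    uint32_t fat_start;            /* first sector of the FAT */
    uint32_t data_start;           /* first sector of cluster 2 */
    uint32_t sectors_per_cluster;
};

static void read_sector(const struct fat32_vol *v, uint32_t lba, uint8_t *out)
{
    memcpy(out, v->image + (size_t)lba * SECTOR_SIZE, SECTOR_SIZE);
}

/* Look up the FAT entry for `cluster` in a real sector-sized buffer. */
static uint32_t next_cluster(const struct fat32_vol *v, uint32_t cluster)
{
    uint8_t sec[SECTOR_SIZE];
    uint32_t per_sector = SECTOR_SIZE / 4;
    read_sector(v, v->fat_start + cluster / per_sector, sec);
    const uint8_t *e = sec + (cluster % per_sector) * 4;
    uint32_t val = ((uint32_t)e[0]) | ((uint32_t)e[1] << 8) |
                   ((uint32_t)e[2] << 16) | ((uint32_t)e[3] << 24);
    return val & 0x0FFFFFFFu;      /* top 4 bits are reserved */
}

/* Copies min(file_size, cap) bytes of the chain starting at first_cluster
 * into dst. Returns the number of bytes copied. */
size_t fat32_read_chain(const struct fat32_vol *v, uint32_t first_cluster,
                        uint32_t file_size, uint8_t *dst, size_t cap)
{
    assert(v != NULL && dst != NULL);
    size_t remaining = file_size < cap ? file_size : cap;
    size_t done = 0;
    uint32_t cluster = first_cluster;
    uint8_t sec[SECTOR_SIZE];

    while (remaining > 0 && cluster >= 2 && cluster < FAT32_BAD) {
        uint32_t lba = v->data_start + (cluster - 2) * v->sectors_per_cluster;
        /* The sector index restarts at 0 for every cluster. */
        for (uint32_t s = 0; s < v->sectors_per_cluster && remaining > 0; s++) {
            size_t chunk = remaining < SECTOR_SIZE ? remaining : SECTOR_SIZE;
            read_sector(v, lba + s, sec);
            memcpy(dst + done, sec, chunk);   /* append, do not overwrite */
            done += chunk;
            remaining -= chunk;
        }
        cluster = next_cluster(v, cluster);
    }
    return done;
}
```

Bounding the copy by both the file size and the caller's capacity means the caller still passes a buffer size, but a short buffer truncates the read instead of overflowing.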


 
 Post subject: Re: Proper buffer design for file reading
PostPosted: Sat Jul 20, 2019 3:47 pm 
Joined: Sun Jul 14, 2019 4:27 pm
Posts: 22
I'd think about this from the perspective of the user's pages. If the file has a fixed size, this is 100% an mmap: you map the file as a special swap file behind the page tables, and the memory manager handles the working set for you. You have a good algorithm for this already.
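
For comparison, this is roughly what the mapping approach looks like from user space on a POSIX host. In your own kernel the page-fault handler would be doing the equivalent work, so treat this only as a sketch of the interface the application sees. `map_file` and `demo_map_file` are made-up names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the whole file read-only. Pages are faulted in on first touch, so
 * nothing is read until the caller actually looks at the data.
 * Returns NULL on failure or for an empty file. */
const unsigned char *map_file(int fd, size_t *len_out)
{
    struct stat st;
    assert(len_out != NULL);
    if (fstat(fd, &st) != 0 || st.st_size <= 0)
        return NULL;
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return p;
}

/* Usage check against a temporary file. Returns 0 on success. */
int demo_map_file(void)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    fputs("hello, mapped world", f);
    fflush(f);

    size_t len = 0;
    const unsigned char *p = map_file(fileno(f), &len);
    int ok = (p != NULL && len == 19 && p[0] == 'h' && p[7] == 'm');
    if (p != NULL)
        munmap((void *)p, len);
    fclose(f);
    return ok ? 0 : -1;
}
```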

For a stream, you just pick a buffer that is an appropriate whole number of blocks, behind the scenes.

If it is random access and potentially variable in size, make the caller give you a pointer and a length.

