OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Re: Most reused/versatile code
Posted: Tue Feb 24, 2015 4:33 pm

Joined: Sat Mar 31, 2012 3:07 am
Posts: 4594
Location: Chichester, UK
alexfru wrote:
fseek() to positions 0, 1, 2, 4, 8, etc, fgetc(), check for errors. You get the top bit of the size. Similarly you get the rest.

What happens if you have write-only access to the file?

And how many system calls is that going to take on a multi-gigabyte file? All in the name of some illusory "portability" when any operating system will provide a single API call to determine the size of a file.


 
 Post subject: Re: Most reused/versatile code
Posted: Wed Feb 25, 2015 1:52 am

Joined: Wed Oct 18, 2006 3:45 am
Posts: 9301
Location: On the balcony, where I can actually keep 1½m distance
Because portable is exactly the opposite of doing the same thing a hundred times differently.

Quote:
write-only access to the file
That's a troll, right?

_________________
"Certainly avoid yourself. He is a newbie and might not realize it. You'll hate his code deeply a few years down the road." - Sortie


 
 Post subject: Re: Most reused/versatile code
Posted: Wed Feb 25, 2015 2:05 am

Joined: Tue Mar 04, 2014 5:27 am
Posts: 1108
iansjack wrote:
alexfru wrote:
fseek() to positions 0, 1, 2, 4, 8, etc, fgetc(), check for errors. You get the top bit of the size. Similarly you get the rest.

What happens if you have write-only access to the file?


If you can't read the file, what can you use its size for? OTOH, if you're only writing to a file, either you don't care about its size (e.g. it's a log that only grows) or you know exactly how much you're supposed to write, since standard C has no function to truncate a file to a specified size. I think other cases (e.g. calculating the size of the files in a directory, taking proper care of links and other peculiarities) are rare and/or need special handling anyway (e.g. POSIX functions, path manipulation / hierarchy navigation, etc). But basic file sizing for basic purposes (i.e. finding out the size in order to read or copy the file) can be implemented nearly portably.

iansjack wrote:
And how many system calls is that going to take on a multi-gigabyte file? All in the name of some illusory "portability" when any operating system will provide a single API call to determine the size of a file.


For a file of 4 GiB - 1 byte you'd need something like:
31 fseek() calls for the most significant bit of size
1 fseek() call for bit 30
1 fseek() call for bit 29
...
1 fseek() call for bit 1
1 fseek() call for the least significant bit
Double that because of fgetc().

Less than 100 calls. Not too bad, IMO.
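
For what it's worth, here is a minimal sketch of the probing scheme described above, assuming the stream is readable and was opened in binary mode. The names byte_exists and probe_file_size are made up for this illustration and do not come from any library. It doubles the probe offset until a read fails, which locates the top bit of the last readable offset, and then tests each lower bit in turn.

```c
#include <limits.h>
#include <stdio.h>

/* Returns 1 if a byte can be read at offset pos, 0 otherwise. */
static int byte_exists(FILE *f, long pos)
{
    if (fseek(f, pos, SEEK_SET) != 0)
        return 0;
    int c = fgetc(f);
    clearerr(f);                /* drop the EOF/error flag left by a failed read */
    return c != EOF;
}

/* Hypothetical helper: returns the size in bytes, or -1 if the size does
   not fit in a long. Leaves the file position in an unspecified place. */
long probe_file_size(FILE *f)
{
    if (!byte_exists(f, 0))
        return 0;               /* empty (or unreadable) stream */

    /* Phase 1: probe offsets 1, 2, 4, 8, ... to find the top bit
       of the last readable offset. */
    long last = 0;
    long bit = 1;
    while (byte_exists(f, bit)) {
        last = bit;
        if (bit > LONG_MAX / 2)
            break;              /* doubling again would overflow */
        bit *= 2;
    }

    /* Phase 2: try to set each lower bit, highest first. */
    for (long b = last / 2; b > 0; b /= 2) {
        if (byte_exists(f, last | b))
            last |= b;
    }

    return last == LONG_MAX ? -1 : last + 1;
}
```

Like any seek-based approach it trashes the current file position, and the number of fseek()/fgetc() pairs grows with the logarithm of the file size rather than with the size itself.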


 
 Post subject: Re: Most reused/versatile code
Posted: Wed Feb 25, 2015 2:07 am

Joined: Tue Mar 04, 2014 5:27 am
Posts: 1108
Combuster wrote:
Because portable is exactly the opposite of doing the same thing a hundred times differently.


The discouraged use of fseek(f, 0, SEEK_END) is one of those things that people do thousands of times in pretty much exactly the same way and situation. :)


 
 Post subject: Re: Most reused/versatile code
Posted: Wed Feb 25, 2015 2:46 am

Joined: Sat Mar 31, 2012 3:07 am
Posts: 4594
Location: Chichester, UK
alexfru wrote:
Less than 100 calls. Not too bad, IMO.
100 system calls (which are a relatively expensive procedure) to do something that could be done with a single call. Your definition of "not bad" is different to mine. It seems to me that you are using a sledgehammer to crack a nut. We have information that is stored in the meta-description (the directory entry) of the file and rather than using this simple information you want to walk the file counting the bytes (albeit using a binary search rather than a linear one). Crazy (IMO)!

And what about that case of a write-only file?


 
 Post subject: Re: Most reused/versatile code
Posted: Wed Feb 25, 2015 2:59 am

Joined: Tue Mar 04, 2014 5:27 am
Posts: 1108
iansjack wrote:
alexfru wrote:
Less than 100 calls. Not too bad, IMO.
100 system calls (which are a relatively expensive procedure) to do something that could be done with a single call. Your definition of "not bad" is different to mine. It seems to me that you are using a sledgehammer to crack a nut. We have information that is stored in the meta-description (the directory entry) of the file and rather than using this simple information you want to walk the file counting the bytes (albeit using a binary search rather than a linear one). Crazy (IMO)!


That's the price you pay for portability. You could argue that a bunch of #ifdef's also counts as a portable solution. :)

iansjack wrote:
And what about that case of a write-only file?


I wrote about that. Did you not read that part or did you find it unsatisfactory?


 
 Post subject: Re: Most reused/versatile code
Posted: Wed Feb 25, 2015 4:47 am

Joined: Sat Mar 31, 2012 3:07 am
Posts: 4594
Location: Chichester, UK
alexfru wrote:
I wrote about that. Did you not read that part or did you find it unsatisfactory?
Ah, sorry - I missed that part of your reply. I've now read it and, yes, I do find it unsatisfactory. I think that saying "I can't see why you would want to do that" is a poor excuse. A simple example would be a program that monitors the size of files and takes some action (such as logging an alert) if a file grows above a certain size. These are confidential files that no-one but the owner should be able to access.

As for #ifdefs - yes, that's a better way of achieving portability than making a huge number of unnecessary system calls (IMO), especially as the probing approach doesn't work in all the circumstances that I can envisage.
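
As a rough illustration of the single-call approach on a POSIX system, the monitoring example might be sketched like this. file_exceeds is a hypothetical helper name, and the "alert" is simply printed to stderr. stat() takes the size from the file's metadata, so it needs search permission on the directories in the path but no read permission on the file itself.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (POSIX only): returns 1 if the file at path is
   larger than limit bytes, 0 if it is not, and -1 if stat() fails. */
int file_exceeds(const char *path, off_t limit)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        perror(path);           /* e.g. the file does not exist */
        return -1;
    }
    if (st.st_size > limit) {
        fprintf(stderr, "alert: %s is %lld bytes (limit %lld)\n",
                path, (long long)st.st_size, (long long)limit);
        return 1;
    }
    return 0;
}
```

On a non-POSIX platform the stat() call would be the part hidden behind an #ifdef.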


 
 Post subject: Re: Most reused/versatile code
Posted: Wed Feb 25, 2015 4:51 am

Joined: Sat Mar 31, 2012 3:07 am
Posts: 4594
Location: Chichester, UK
Combuster wrote:
Because portable is exactly the opposite of doing the same thing a hundred times differently.

Quote:
write-only access to the file
That's a troll, right?

What??? You seem very keen to shout "troll" when something is beyond your ken. A sure sign of a troll (IMO) :). Is the concept of a write-only file really so foreign to you? I can see that you have never worked in an environment where security is paramount. Or somewhere where an incorruptible audit trail needs to be maintained.

Don't conclude that the concept of a write-only file makes no sense just because of your lack of imagination.


 
 Post subject: Re: Most reused/versatile code
Posted: Wed Feb 25, 2015 10:34 am

Joined: Wed Mar 21, 2012 3:01 pm
Posts: 930
Bender wrote:
This:
Code:
#include <stdio.h>

/** NOTE: This function trashes the current position in file **/
/** Does no error checking either **/
long int GetFileSize(FILE* FilePointer) {
    fseek(FilePointer, 0L, SEEK_END);
    long int ReturnVal = ftell(FilePointer);
    fseek(FilePointer, 0L, SEEK_SET);
    return ReturnVal;
}


(Everywhere?)


Obvious improvements you should adopt in this code:
  • No error handling, you should do that.
  • You could ftell ahead of time and restore it afterwards.
  • Use off_t, ftello and fseeko for large file support. They're POSIX and not ISO C, but any platform that doesn't supply them is braindead, as is evident from the state of large file support on Windows.
  • Use flockfile and funlockfile for thread safety, so the operation is atomic. This is also less portable; I forget whether they made it into C11, but any sane platform will have them. (At this point, you should just stop supporting native Windows C altogether, it's too horrible; check out cygwin, or soon midipix, if you care.)
  • You might as well use fileno and fstat on real files as a much more efficient alternative. Not all FILE objects are backed by file descriptors, so there's still some value in this approach.
  • Really consider whether you need to know the file size at all, or whether you can just treat the file as a stream and process a char, element or line at a time. If it's because you want to read the whole file into memory, there may still be a race condition where the file grows while you read it, so an exponentially doubling realloc pattern that runs until EOF is hit is better (and such a pattern can be optimized by making the first buffer the size of the file, so it's almost always right on the first try).
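
A minimal sketch of the first four points in the list above, assuming a POSIX system (fseeko, ftello, flockfile and funlockfile are POSIX, not ISO C). The function name get_file_size is only for illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: returns the size of the stream in bytes, or -1 on
   error. The current file position is saved and restored. */
off_t get_file_size(FILE *fp)
{
    off_t size = -1;

    flockfile(fp);                        /* keep other threads off the stream */
    off_t saved = ftello(fp);             /* remember where we were */
    if (saved != -1 && fseeko(fp, 0, SEEK_END) == 0) {
        size = ftello(fp);                /* -1 here means ftello failed */
        if (fseeko(fp, saved, SEEK_SET) != 0)
            size = -1;                    /* could not restore the position */
    }
    funlockfile(fp);

    return size;
}
```

On 32-bit glibc targets off_t is only 64 bits wide when building with -D_FILE_OFFSET_BITS=64, so large file support may also depend on build flags.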

Sorry. Someone passed me -Wpedantic and -Wsuperior-interfaces and -Wstyle-suggestions -Wimperial-opinions. I gotta teach better C coding.
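
The read-until-EOF pattern from the last bullet could look roughly like this. read_all and its hint parameter are hypothetical names for this sketch; hint is the initial capacity guess (for example a size obtained some other way), or 0 if unknown.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the rest of fp into a malloc'd buffer.
   Returns the buffer (caller frees it) and stores the byte count in *len,
   or returns NULL on allocation or read failure. */
char *read_all(FILE *fp, size_t hint, size_t *len)
{
    /* One byte more than the hint, so a correct hint produces a short
       read (EOF) on the first try and no realloc is needed. */
    size_t cap = (hint > 0 && hint < SIZE_MAX) ? hint + 1 : 4096;
    size_t used = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        used += fread(buf + used, 1, cap - used, fp);
        if (used < cap)
            break;                        /* short read: EOF or error */

        /* Buffer is full, so the file was bigger than expected: double it. */
        if (cap > SIZE_MAX / 2) {
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

A wrong or stale hint only costs extra realloc calls; the result is still whatever was actually read up to EOF.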

