将 size 和 count 作为参数的 fread/fwrite 的基本原理是什么？

小开

The difference in fread(buf, 1000, 1, stream) and fread(buf, 1, 1000, stream) is, that in the first case you get only one chunk of 1000 bytes or nothing, if the file is smaller and in the second case you get everything in the file less than and up to 1000 bytes.

小开

最佳答案

It's based on how fread is implemented.

The Single UNIX Specification says

For each object, size calls shall be made to the fgetc() function and the results stored, in the order read, in an array of unsigned char exactly overlaying the object.

fgetc also has this note:

Since fgetc() operates on bytes, reading a character consisting of multiple bytes (or "a multi-byte character") may require multiple calls to fgetc().

Of course, this predates fancy variable-byte character encodings like UTF-8.

The SUS notes that this is actually taken from the ISO C documents.

小开

Likely it goes back to the way that file I/O was implemented. (back in the day) It might have been faster to write / read to files in blocks then to write everything at once.

小开

Here, let me fix those functions:

size_t fread_buf( void* ptr, size_t size, FILE* stream)
{
return fread( ptr, 1, size, stream);
}




size_t fwrite_buf( void const* ptr, size_t size, FILE* stream)
{
return fwrite( ptr, 1, size, stream);
}

As for a rationale for the parameters to fread()/fwrite(), I've lost my copy of K&R long ago so I can only guess. I think that a likely answer is that Kernighan and Ritchie may have simply thought that performing binary I/O would be most naturally done on arrays of objects. Also, they may have thought that block I/O would be faster/easier to implement or whatever on some architectures.

Even though the C standard specifies that fread() and fwrite() be implemented in terms of fgetc() and fputc(), remember that the standard came into existence long after C was defined by K&R and that things specified in the standard might not have been in the original designers ideas. It's even possible that things said in K&R's "The C Programming Language" might not be the same as when the language was first being designed.

Finally, here's what P.J. Plauger has to say about fread() in "The Standard C Library":

If the size (second) argument is greater than one, you cannot determine whether the function also read up to size - 1 additional characters beyond what it reports. As a rule, you are better off calling the function as fread(buf, 1, size * n, stream); instead of fread(buf, size, n, stream);

Bascially, he's saying that fread()'s interface is broken. For fwrite() he notes that, "Write errors are generally rare, so this is not a major shortcoming" - a statement I wouldn't agree with.

小开

This is pure speculations, however back in the days(Some are still around) many filesystems were not simple byte streams on a hard drive.

Many file systems were record based, thus to satisfy such filesystems in an efficient manner, you'll have to specify the number of items ("records"), allowing fwrite/fread to operate on the storage as records, not just byte streams.

小开

I think it is because C lacks function overloading. If there was some, size would be redundant. But in C you can't determine a size of an array element, you have to specify one.

Consider this:

int intArray[10];
fwrite(intArray, sizeof(int), 10, fd);

If fwrite accepted number of bytes, you could write the following:

int intArray[10];
fwrite(intArray, sizeof(int)*10, fd);

But it is just inefficient. You will have sizeof(int) times more system calls.

Another point that should be taked into consideration is that you usually don't want a part of an array element be written to a file. You want the whole integer or nothing. fwrite returns a number of elements succesfully written. So if you discover that only 2 low bytes of an element is written what would you do?

On some systems (due to alignment) you can't access one byte of an integer without creating a copy and shifting.

小开

Having separate arguments for size and count could be advantageous on an implementation that can avoid reading any partial records. If one were to use single-byte reads from something like a pipe, even if one was using fixed-format data, one would have to allow for the possibility of a record getting split over two reads. If could instead requests e.g. a non-blocking read of up to 40 records of 10 bytes each when there are 293 bytes available, and have the system return 290 bytes (29 whole records) while leaving 3 bytes ready for the next read, that would be much more convenient.

I don't know to what extent implementations of fread can handle such semantics, but they could certainly be handy on implementations that could promise to support them.