Using fflush(stdin)

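A quick Google search for fflush(stdin) as a way to clear the input buffer turns up many websites warning against using it. And yet that is exactly how my computer science professor taught the class to do it.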

How bad is using fflush(stdin)? Should I really stop using it, even though my professor uses it and it seems to work flawlessly?


Simple: this is undefined behavior, since fflush is meant to be called on an output stream. This is an excerpt from the C standard:

int fflush(FILE *ostream);

If ostream points to an output stream or an update stream in which the most recent operation was not input, the fflush function causes any unwritten data for that stream to be delivered to the host environment to be written to the file; otherwise, the behavior is undefined.

So it's not a question of "how bad" this is. fflush(stdin) is simply not portable, so you should not use it if you want your code to be portable between compilers.

According to the standard, fflush can only be used with output buffers, and obviously stdin isn't one. However, some standard C libraries provide the use of fflush(stdin) as an extension. In that case you can use it, but it will affect portability, so you will no longer be able to use any standards-compliant standard C library on earth and expect the same results.

Converting comments into an answer.

TL;DR — Portable code doesn't use fflush(stdin)

The rest of this answer explains why portable code does not use fflush(stdin). It is tempting to add "reliable code doesn't use fflush(stdin)", which is also generally true.

Standard C and POSIX leave fflush(stdin) as undefined behaviour

The POSIX, C and C++ standards for fflush() explicitly state that the behaviour is undefined (because stdin is an input stream), but none of them prevent a system from defining it.

ISO/IEC 9899:2011 — the C11 Standard — says:

§7.21.5.2 The fflush function

¶2 If stream points to an output stream or an update stream in which the most recent operation was not input, the fflush function causes any unwritten data for that stream to be delivered to the host environment to be written to the file; otherwise, the behavior is undefined.

POSIX mostly defers to the C standard but it does mark this text as a C extension.

[CX] ⌦ For a stream open for reading, if the file is not already at EOF, and the file is one capable of seeking, the file offset of the underlying open file description shall be set to the file position of the stream, and any characters pushed back onto the stream by ungetc() or ungetwc() that have not subsequently been read from the stream shall be discarded (without further changing the file offset). ⌫

Note that terminals are not capable of seeking; neither are pipes or sockets.

Microsoft defines the behaviour of fflush(stdin)

In 2015, Microsoft and the Visual Studio runtime used to define the behaviour of fflush() on an input stream like this (but the link leads to different text in 2021):

If the stream is open for input, fflush clears the contents of the buffer.

M.M notes:

Cygwin is an example of a fairly common platform on which fflush(stdin) does not clear the input.

This is why this answer version of my comment notes 'Microsoft and the Visual Studio runtime' — if you use a non-Microsoft C runtime library, the behaviour you see depends on that library.

Weather Vane pointed out to me in a comment to another question that, at some time before June 2021, Microsoft changed its description of fflush() compared with what was originally specified when this answer was written in 2015. It now says:

If the stream was opened in read mode, or if the stream has no buffer, the call to fflush has no effect, and any buffer is retained. A call to fflush negates the effect of any prior call to ungetc for the stream.

Caveat Lector: it is probably best not to rely on fflush(stdin) on any platform.

Linux documentation and practice seem to contradict each other

Surprisingly, Linux nominally documents the behaviour of fflush(stdin) too, and even defines it the same way (miracle of miracles). This quote is from 2015.

For input streams, fflush() discards any buffered data that has been fetched from the underlying file, but has not been consumed by the application.

In 2021, the quote changes to:

For input streams, fflush() discards any buffered data that has been fetched from the underlying file, but has not been consumed by the application. The open status of the stream is unaffected.

And another source for fflush(3) on Linux agrees (give or take paragraph breaks):

For input streams associated with seekable files (e.g., disk files, but not pipes or terminals), fflush() discards any buffered data that has been fetched from the underlying file, but has not been consumed by the application.

Neither of these explicitly addresses the points made by the POSIX specification about ungetc().

In 2021, zwol commented that the Linux documentation has been improved. It seems to me that there is still room for improvement.

In 2015, I was a bit puzzled and surprised at the Linux documentation saying that fflush(stdin) will work. Despite that suggestion, it most usually does not work on Linux. I just checked the documentation on Ubuntu 14.04 LTS; it says what is quoted above, but empirically, it does not work — at least when the input stream is a non-seekable device such as a terminal.

demo-fflush.c

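#include <stdio.h>

int main(void)
{
    int c;

    /* Read one character, then try to "flush" the rest of the input */
    if ((c = getchar()) != EOF)
    {
        printf("Got %c; enter some new data\n", c);
        fflush(stdin);
    }
    /* If the flush had worked, this would wait for newly typed input */
    if ((c = getchar()) != EOF)
        printf("Got %c\n", c);

    return 0;
}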

Example output

$ ./demo-fflush
Alliteration
Got A; enter some new data
Got l
$

This output was obtained on both Ubuntu 14.04 LTS and Mac OS X 10.11.2. To my understanding, it contradicts what the Linux manual says. If the fflush(stdin) operation worked, I would have to type a new line of text to get information for the second getchar() to read.

Given what the POSIX standard says, maybe a better demonstration is needed, and the Linux documentation should be clarified.

demo-fflush2.c

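#include <stdio.h>

int main(void)
{
    int c;

    if ((c = getchar()) != EOF)
    {
        printf("Got %c\n", c);
        /* Push back two characters; only one pushback is guaranteed by the C standard */
        ungetc('B', stdin);
        ungetc('Z', stdin);
        if ((c = getchar()) == EOF)
        {
            fprintf(stderr, "Huh?!\n");
            return 1;
        }
        printf("Got %c after ungetc()\n", c);
        /* POSIX says this discards the remaining pushed-back 'B' on a seekable file */
        fflush(stdin);
    }
    if ((c = getchar()) != EOF)
        printf("Got %c\n", c);

    return 0;
}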

Example output

Note that /etc/passwd is a seekable file. On Ubuntu, the first line looks like:

root:x:0:0:root:/root:/bin/bash

On Mac OS X, the first 4 lines look like:

##
# User Database
#
# Note that this file is consulted directly only when the system is running

In other words, there is commentary at the top of the Mac OS X /etc/passwd file. The non-comment lines conform to the normal layout, so the root entry is:

root:*:0:0:System Administrator:/var/root:/bin/sh

Ubuntu 14.04 LTS:

$ ./demo-fflush2 < /etc/passwd
Got r
Got Z after ungetc()
Got o
$ ./demo-fflush2
Allotrope
Got A
Got Z after ungetc()
Got B
$

Mac OS X 10.11.2:

$ ./demo-fflush2 < /etc/passwd
Got #
Got Z after ungetc()
Got B
$

The Mac OS X behaviour ignores (or at least seems to ignore) the fflush(stdin) (thus not following POSIX on this issue). The Linux behaviour corresponds to the documented POSIX behaviour, but the POSIX specification is far more careful in what it says — it specifies a file capable of seeking, but terminals, of course, do not support seeking. It is also much less useful than the Microsoft specification.

Summary

Microsoft documents the behaviour of fflush(stdin), but that behaviour has changed between 2015 and 2021. Apparently, it works as documented on the Windows platform, using the native Windows compiler and C runtime support libraries.

Despite documentation to the contrary, it does not work on Linux when the standard input is a terminal, but it seems to follow the POSIX specification which is far more carefully worded. According to the C standard, the behaviour of fflush(stdin) is undefined. POSIX adds the qualifier 'unless the input file is seekable', which a terminal is not. The behaviour is not the same as Microsoft's.

Consequently, portable code does not use fflush(stdin). Code that is tied to Microsoft's platform may use it and it may work as expected, but beware of the portability issues.

POSIX way to discard unread terminal input from a file descriptor

The POSIX standard way to discard unread information from a terminal file descriptor (as opposed to a file stream like stdin) is illustrated at How can I flush unread data from a tty input queue on a Unix system. However, that is operating below the standard I/O library level.
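The linked answer is not reproduced here, but the POSIX function usually used for this job is tcflush() with the TCIFLUSH selector, so a minimal sketch might look like the following. The helper name discard_tty_input is made up for illustration. Note the assumption that nothing is sitting in the stdio buffer: tcflush() only discards input the terminal driver has received and the process has not yet read, so characters already pulled into stdin's buffer are unaffected.

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: discard terminal input that has been received
 * but not yet read from the file descriptor.
 * Returns 0 on success, -1 if fd is not a terminal or tcflush() fails. */
int discard_tty_input(int fd)
{
    if (!isatty(fd))
        return -1;                  /* pipes and disk files have no tty input queue */
    return tcflush(fd, TCIFLUSH);   /* TCIFLUSH: flush data received but not read */
}
```

A typical call would be discard_tty_input(STDIN_FILENO). When standard input is redirected from a file or a pipe, the helper simply reports failure instead of discarding anything.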

Quote from POSIX:

For a stream open for reading, if the file is not already at EOF, and the file is one capable of seeking, the file offset of the underlying open file description shall be set to the file position of the stream, and any characters pushed back onto the stream by ungetc() or ungetwc() that have not subsequently been read from the stream shall be discarded (without further changing the file offset).

Note that a terminal is not capable of seeking.

I believe that you should never call fflush(stdin), and for the simple reason that you should never even find it necessary to try to flush input in the first place. Realistically, there is only one reason you might think you had to flush input, and that is: to get past some bad input that scanf is stuck on.

For example, you might have a program that is sitting in a loop reading integers using scanf("%d", &n). Soon enough you'll discover that the first time the user types a non-digit character like 'x', the program goes into an infinite loop.

When faced with this situation, I believe you basically have three choices:

  1. Flush the input somehow (if not by using fflush(stdin), then by calling getchar in a loop to read characters until \n, as is often recommended).
  2. Tell the user not to type non-digit characters when digits are expected.
  3. Use something other than scanf to read input.
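For reference, the getchar-style loop mentioned in choice 1 usually looks something like this sketch. The function name discard_rest_of_line is made up for illustration, and it takes a FILE * (an assumption, so that it can be tried on any stream, not just stdin).

```c
#include <stdio.h>

/* Hypothetical helper: read and discard characters up to and including
 * the next newline. Returns '\n' if a newline was consumed, or EOF if
 * end of input (or a read error) came first. */
int discard_rest_of_line(FILE *fp)
{
    int c;

    while ((c = getc(fp)) != EOF && c != '\n')
        ;   /* nothing to do: just consume the character */
    return c;
}
```

Calling discard_rest_of_line(stdin) after a failed scanf gets the program past the offending line. Unlike fflush(stdin), it is fully defined by the C standard, but it is still a form of choice 1.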

Now, if you're a beginner, scanf seems like the easiest way to read input, and so choice #3 looks scary and difficult. But #2 seems like a real cop-out, because everyone knows that user-unfriendly computer programs are a problem, so it'd be nice to do better. So all too many beginning programmers get painted into a corner, feeling that they have no choice but to do #1. They more or less have to do input using scanf, meaning that it will get stuck on bad input, meaning that they have to figure out a way to flush the bad input, meaning that they're sorely tempted to use fflush(stdin).

I would like to encourage all beginning C programmers out there to make a different set of tradeoffs:

  1. During the earliest stages of your C programming career, before you're comfortable using anything other than scanf, just don't worry about bad input. Really. Go ahead and use cop-out #2 above. Think about it like this: You're a beginner, there are lots of things you don't know how to do yet, and one of the things you don't know how to do yet is: deal gracefully with unexpected input.

  2. As soon as you can, learn how to do input using functions other than scanf. At that point, you can start dealing gracefully with bad input, and you'll have many more, much better techniques available to you, that won't require trying to "flush the bad input" at all.

Or, in other words, beginners who are still stuck using scanf should feel free to use cop-out #2, and when they're ready they should graduate from there to technique #3, and nobody should be using technique #1 to try to flush input at all -- and certainly not with fflush(stdin).

Using fflush(stdin) to flush input is kind of like dowsing for water using a stick shaped like the letter "S".

And helping people to flush input in some "better" way is kind of like rushing up to an S-stick dowser and saying "No, no, you're doing it wrong, you need to use a Y-shaped stick!".

In other words, the real problem isn't that fflush(stdin) doesn't work. Calling fflush(stdin) is a symptom of an underlying problem. Why are you having to "flush" input at all? That's your problem.

And, usually, that underlying problem is that you're using scanf, in one of its many unhelpful modes that unexpectedly leaves newlines or other "unwanted" text on the input. The best long-term solution, therefore, is to learn how to do input using better techniques than scanf, so that you don't have to deal with its unhandled input and other idiosyncrasies at all.
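As one possible illustration of "better techniques than scanf", here is a rough sketch that reads a whole line with fgets and converts it with strtol. The function name read_int_line, the 128-byte buffer, and the return convention are all assumptions made for this example; a line longer than the buffer would be seen as several chunks.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line from fp and parse it as a decimal int.
 * Returns 1 and stores the value in *out on success, 0 if the line is not
 * a valid int, and EOF at end of input. The bad line is always consumed. */
int read_int_line(FILE *fp, int *out)
{
    char line[128];
    char *end;
    long value;

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;

    errno = 0;
    value = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;   /* no digits at all, or value out of range for int */

    /* Accept trailing whitespace (including the newline) and nothing else */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    if (*end != '\0')
        return 0;   /* trailing junk such as "12abc" */

    *out = (int)value;
    return 1;
}
```

Because the whole line has already been read before any parsing happens, a stray 'x' cannot get stuck in the input the way it does with scanf("%d", &n), so there is nothing left over to flush.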

None of the existing answers point out a key aspect of the issue.

If you find yourself wanting to "clear the input buffer", you're probably writing a command-line interactive program, and it would be more accurate to say that what you want is to discard characters from the current line of input that you haven't already read.

This is not what fflush(stdin) does. The C libraries that support fflush on an input stream document it either as doing nothing or as discarding buffered data that has been read from the underlying file but not yet passed to the application. That can easily be either more or less input than the rest of the current line.

It probably does work by accident in a lot of cases, because the terminal driver (in its default mode) supplies input to a command-line interactive program one line at a time. However, the moment you feed input to your program from an actual file on disk (perhaps for automated testing), the kernel and C library switch over to buffering data in large "blocks" (often 4 to 8 kB) with no relationship to line boundaries, and you'll be wondering why your program processes the first line of the file, then skips several dozen lines and picks up in the middle of some apparently random line below. Or, if you test your program on a very long line typed by hand, the terminal driver won't be able to give the program the whole line at once, and fflush(stdin) won't skip all of it.

So what should you do instead? The approach that I prefer is, if you're processing input one line at a time, then read an entire line all at once. The C library has functions specifically for this: fgets (in C90, so fully portable, but does still make you process very long lines in chunks) and getline (POSIX-specific, but will manage a malloced buffer for you so you can process long lines all at once no matter how long they get). There's usually a direct translation from code that processes "the current line" directly from stdin to code that processes a string containing "the current line".
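To make the getline variant concrete, here is a small sketch that processes input strictly one line at a time. The function name sum_first_ints and the task (adding up the first integer on each line) are invented for the example, and the _POSIX_C_SOURCE definition assumes a POSIX.1-2008 system.

```c
#define _POSIX_C_SOURCE 200809L   /* request the getline() declaration */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example: add up the first integer on each line of fp.
 * A line that does not start with an integer is skipped as a whole,
 * so no leftover text can interfere with the following line. */
long sum_first_ints(FILE *fp)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    long total = 0;
    int n;

    while (getline(&line, &cap, fp) != -1) {
        if (sscanf(line, "%d", &n) == 1)
            total += n;
    }
    free(line);
    return total;
}
```

The parsing is done with sscanf on the string rather than scanf on stdin, which is the "direct translation" described above: whatever sscanf leaves unread is simply dropped along with the rest of the line.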