What is wrong with this 1988 C code?

I'm trying to compile this piece of code from the book "The C Programming Language" (K&R). It is a bare-bones version of the UNIX program wc:

#include <stdio.h>

#define IN   1;     /* inside a word */
#define OUT  0;     /* outside a word */

/* count lines, words and characters in input */
main()
{
    int c, nl, nw, nc, state;

    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n')
            ++nl;
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
}

I'm getting the following error:

$ gcc wc.c
wc.c: In function ‘main’:
wc.c:18: error: ‘else’ without a previous ‘if’
wc.c:18: error: expected ‘)’ before ‘;’ token

The second edition of this book is from 1988, and I'm pretty new to C. Maybe it has to do with the compiler version, or maybe I'm just talking nonsense.

I've seen a different use of the main function in modern C code:

int main()
{
    /* code */
    return 0;
}

Is this a new standard, or can I still use a type-less main?


Your problem is with your preprocessor definitions of IN and OUT:

#define IN   1;     /* inside a word */
#define OUT  0;     /* outside a word */

Notice how you have a trailing semicolon in each of these. When the preprocessor expands them, your code will look roughly like:

    if (c == ' ' || c == '\n' || c == '\t')
        state = 0;; /* <--PROBLEM #1 */
    else if (state == 0;) { /* <--PROBLEM #2 */
        state = 1;;

That second semicolon causes the else to have no previous if as a match, because you are not using braces. So, remove the semicolons from the preprocessor definitions of IN and OUT.

The lesson learned here is that preprocessor directives do not end with a semicolon: a semicolon written at the end of a #define becomes part of the macro's replacement text.

Also, you should always use braces!

    if (c == ' ' || c == '\n' || c == '\t') {
        state = OUT;
    } else if (state == OUT) {
        state = IN;
        ++nw;
    }

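There is no dangling-else ambiguity in the above code. With braces, the stray semicolon after state = OUT is just a harmless empty statement; the one inside the if condition remains an error until the macros are fixed.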
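To illustrate why braces soften the first problem, here is a minimal sketch. The macro OUT_BAD and the function next_state are names made up for this example; the trailing semicolon is deliberately left in to show that the extra empty statement is legal inside a braced block. This only masks the mistake, so the macro should still be fixed.

```c
/* Deliberately wrong: the semicolon becomes part of the replacement text. */
#define OUT_BAD 0;

/* Hypothetical helper: returns the new state after reading character c. */
int next_state(int c, int state)
{
    if (c == ' ') {
        state = OUT_BAD;   /* expands to "state = 0;;", i.e. an extra empty statement */
    } else {
        state = 1;         /* any other character puts us inside a word */
    }
    return state;
}
```

Without the braces, the same expansion would put an empty statement between the if body and the else, which is what triggers the "else without a previous if" error.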

Try adding explicit braces around code blocks. The K&R style can be ambiguous.

Look at line 18. The compiler is telling you where the issue is.

    if (c == '\n') {
        ++nl;
    }
    if (c == ' ' || c == '\n' || c == '\t') { // You're missing an "=" here; should be "=="
        state = OUT;
    }
    else if (state == OUT) {
        state = IN;
        ++nw;
    }

There should not be any semicolons after the macros,

#define IN   1     /* inside a word */
#define OUT  0     /* outside a word */

and it should probably be

if (c == ' ' || c == '\n' || c == '\t')

The definitions of IN and OUT should look like this:

#define IN   1     /* inside a word  */
#define OUT  0     /* outside a word */

The semicolons were causing the problem! The explanation is simple: IN and OUT are preprocessor macros, so the preprocessor replaces every occurrence of IN in the source code with 1 and every occurrence of OUT with 0 before the compiler proper sees the code.

Since the original code had a semicolon after the 1 and the 0, the extra semicolon was carried along whenever IN and OUT were replaced, which produced invalid code. For instance, this line:

else if (state == OUT)

ended up looking like this:

else if (state == 0;)

But what you wanted was this:

else if (state == 0)

Solution: remove the semicolon after the numbers in the original definition.
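Putting the fix together, here is a minimal sketch of the corrected counter. To make it easy to check without feeding standard input, it is restructured as a function over a string; the names struct counts and count_text are assumptions for this illustration and do not come from K&R. The counting logic and the corrected macros follow the listing in the question.

```c
#define IN   1     /* inside a word */
#define OUT  0     /* outside a word */

/* Hypothetical result type, not part of the K&R listing. */
struct counts {
    long lines;
    long words;
    long chars;
};

/* Count lines, words and characters in a NUL-terminated string. */
struct counts count_text(const char *s)
{
    struct counts n = {0, 0, 0};
    int state = OUT;

    for (; *s != '\0'; ++s) {
        ++n.chars;
        if (*s == '\n')
            ++n.lines;
        if (*s == ' ' || *s == '\n' || *s == '\t')
            state = OUT;
        else if (state == OUT) {   /* now expands to: state == 0 */
            state = IN;
            ++n.words;
        }
    }
    return n;
}
```

For example, count_text("hello world\nfoo\tbar\n") should report 2 lines, 4 words and 20 characters.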

Not exactly a problem, but the declaration of main() is also dated. It should be something like this:

int main(int argc, char** argv) {
    ...
    return 0;
}

A C89 compiler will assume an int return type for a function declared without one (C99 removed this implicit-int rule), and I'm sure the compiler/linker will cope with the missing argc/argv declarations and the missing return statement, but they should be there.
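As a rough sketch of how the explicit signature is typically used, the function below takes the same parameters as main and returns an exit status. The name run is hypothetical and chosen only for this example; the comment shows how a standard main would forward to it.

```c
#include <stdio.h>

/* Hypothetical helper: does the program's work and returns an exit status. */
int run(int argc, char **argv)
{
    /* argc counts the program name too, so the real arguments are argc - 1. */
    printf("%s received %d argument(s)\n",
           argc > 0 ? argv[0] : "?",
           argc > 0 ? argc - 1 : 0);
    return 0;   /* 0 reports success to the calling environment */
}

/*
 * The entry point would then be written with an explicit return type:
 *
 *     int main(int argc, char **argv) { return run(argc, argv); }
 *
 * or as int main(void) when the arguments are not needed.
 */
```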

The main problem with this code is that it is not the code from K&R. It includes semicolons after the macro definitions, which are not present in the book and which, as others have pointed out, change the meaning.

Except when making a change in an attempt to understand the code, you should leave it alone until you do understand it. You can only safely modify code you understand.

This was probably just a typo on your part, but it does illustrate the need for understanding and attention to details when programming.

As you can see, the problem was in the macros.

GCC has an option, -E, that stops after preprocessing. It is useful for seeing the result of preprocessing. In fact, the technique is an important one if you are working with a large code base in C/C++. Typically, makefiles will have a target that stops after preprocessing.

For quick reference, the SO question "How do I see a C/C++ source file after preprocessing in Visual Studio?" covers the options. It starts with VC++, but gcc options are also mentioned further down.
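Besides reading the output of gcc -E wc.c, the effect of the stray semicolon can also be made visible from inside a program with the preprocessor's stringification operator. This is only a small sketch; the names BAD_OUT, GOOD_OUT, STR, XSTR and the two functions are invented for the illustration.

```c
#define BAD_OUT  0;   /* deliberately wrong: trailing semicolon */
#define GOOD_OUT 0    /* as printed in K&R */

#define STR(x)  #x        /* turn the argument's tokens into a string literal */
#define XSTR(x) STR(x)    /* expand the argument first, then stringify it */

/* Each function returns the text its macro expands to. */
const char *bad_expansion(void)  { return XSTR(BAD_OUT); }
const char *good_expansion(void) { return XSTR(GOOD_OUT); }
```

Printing the two strings shows that BAD_OUT carries the semicolon along wherever it is used, while GOOD_OUT expands to the bare number.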

A simple way is to use braces {} for each if and else:

if (c == '\n') {
    ++nl;
}
if (c == ' ' || c == '\n' || c == '\t')
{
    state = OUT;
}
else if (state == OUT) {
    state = IN;
    ++nw;
}

As other answers pointed out, the problem is in #define and semicolons. To minimize these problems I always prefer defining number constants as a const int:

const int IN = 1;
const int OUT = 0;

This way you avoid many actual and potential problems. It is limited by just two things:

  1. Your compiler has to support const, which in 1988 wasn't generally true, but it is now supported by all commonly used compilers. (AFAIK const was "borrowed" from C++.)

  2. You can't use these constants in places where the preprocessor needs the literal text (for example in #if conditions or stringification), and in C, unlike C++, a const int is not a constant expression, so it can't serve as a case label. But I think your program isn't that case.
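A minimal sketch of the const approach applied to the word-counting logic from the question follows. The function name count_words and the string-based interface are assumptions made for this example, and static is added so the constants stay private to the file.

```c
/* Named constants as typed objects instead of macros. */
static const int IN  = 1;   /* inside a word */
static const int OUT = 0;   /* outside a word */

/* Hypothetical helper: count the words in a NUL-terminated string. */
int count_words(const char *s)
{
    int state = OUT;
    int nw = 0;

    for (; *s != '\0'; ++s) {
        if (*s == ' ' || *s == '\n' || *s == '\t') {
            state = OUT;
        } else if (state == OUT) {
            state = IN;     /* first character of a new word */
            ++nw;
        }
    }
    return nw;
}
```

Here the semicolon ends the declaration instead of becoming part of a replacement text, so state == OUT is compiled as an ordinary comparison against the object OUT.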