Where can I and can't I declare new variables in C?

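I have heard (probably from a teacher) that you should declare all variables at the top of the program/function, and that declaring new variables in among the statements can cause problems.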

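But then I was reading K&R and came across this sentence: "Declarations of variables (including initializations) may follow the left brace that introduces any compound statement, not just the one that begins a function". The book follows it with an example: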

if (n > 0) {
    int i;
    for (i = 0; i < n; i++)
        ...
}

I experimented with this a bit, and it even works with arrays. For example:

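#include <stdio.h>

int main(){
    int x = 0;

    while (x < 10) {
        if (x > 5) {
            int y[x];   /* variable-length array, sized at run time */
            y[0] = 10;
            /* y[4] is never assigned, so the second value printed is indeterminate */
            printf("%d %d\n", y[0], y[4]);
        }
        x++;
    }
}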

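So when exactly am I not allowed to declare variables? For example, what if my variable declaration does not come right after an opening brace? Like here: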

int main(){
    int x = 10;

    x++;
    printf("%d\n", x);

    int z = 6;
    printf("%d\n", z);
}

Could this cause trouble depending on the program or the machine?


I also often hear that putting variables at the top of the function is the best way to do things, but I strongly disagree. I prefer to confine variables to the smallest scope possible, so they have less chance of being misused and so I have less to keep in my head on each line of the program.

While all versions of C allow lexical block scope, where you can declare the variables depends on the version of the C standard that you are targeting:

C99 onwards or C++

Modern C compilers such as gcc and clang support the C99 and C11 standards, which allow you to declare a variable anywhere a statement could go. The variable's scope runs from the point of the declaration to the end of the block (the next closing brace).

if( x < 10 ){
    printf("%d", 17);  // z is not in scope in this line
    int z = 42;
    printf("%d", z);   // z is in scope in this line
}

You can also declare variables inside for loop initializers. The variable will exist only inside the loop.

for(int i=0; i<10; i++){
    printf("%d", i);
}

ANSI C (C90)

If you are targeting the older ANSI C standard, then you are limited to declaring variables immediately after an opening brace [1].

This doesn't mean you have to declare all your variables at the top of your functions though. In C you can put a brace-delimited block anywhere a statement could go (not just after things like if or for) and you can use this to introduce new variable scopes. The following is the ANSI C version of the previous C99 examples:

if( x < 10 ){
    printf("%d", 17);  /* z is not in scope in this line */

    {
        int z = 42;
        printf("%d", z);   /* z is in scope in this line */
    }
}

{
    int i;
    for(i=0; i<10; i++){
        printf("%d", i);
    }
}

[1] Note that if you are using gcc you need to pass the --pedantic flag to make it actually enforce the C90 standard and complain that the variables are declared in the wrong place. If you just use -std=c90, gcc accepts a superset of C90 that also allows the more flexible C99 variable declarations.
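As a rough illustration of the footnote, the sketch below puts a declaration after a statement. The file name mixed.c and the function name mixed_declaration are made up for this example, and the behaviour described in the comment is what gcc is expected to do with these flags; other compilers may differ.

```c
/* Hypothetical file name: mixed.c
 *
 *   gcc -std=c90 -c mixed.c             accepted (gcc extension to C90)
 *   gcc -std=c90 -pedantic -c mixed.c   diagnoses the declaration of z
 *   gcc -std=c99 -c mixed.c             accepted, this is valid C99
 */
int mixed_declaration(int x)
{
    x++;            /* a statement ...                                    */
    int z = x * 2;  /* ... followed by a declaration: fine in C99, not C90 */
    return z;
}
```

The comments use the /* */ form on purpose, since // comments are themselves not part of C90.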

If your compiler allows it, then it's fine to declare variables anywhere you want. In fact the code is more readable (IMHO) when you declare a variable where you use it instead of at the top of a function, because that makes it easier to spot errors, e.g. forgetting to initialize the variable or accidentally hiding another variable.
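A minimal sketch of that style, assuming a C99 compiler. The function sum_positive is invented for illustration; the point is that every variable is declared at its first use and initialized on the same line, so there is no gap in which it could be read uninitialized.

```c
#include <stddef.h>

/* Sums the positive entries of values[0..n-1]. */
long sum_positive(const int *values, size_t n)
{
    long total = 0;                 /* declared and initialized together */
    for (size_t i = 0; i < n; i++) {
        const int v = values[i];    /* exists only inside the loop body  */
        if (v > 0) {
            total += v;
        }
    }
    return total;
}
```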

missingno covers what ANSI C allows, but he doesn't address why your teachers told you to declare your variables at the top of your functions. Declaring variables in odd places can make your code harder to read, and that can cause bugs.

Take the following code as an example.

#include <stdio.h>

int main() {
    int i, j;
    i = 20;
    j = 30;

    printf("(1) i: %d, j: %d\n", i, j);

    {
        int i;
        i = 88;
        j = 99;
        printf("(2) i: %d, j: %d\n", i, j);
    }

    printf("(3) i: %d, j: %d\n", i, j);

    return 0;
}

As you can see, I've declared i twice. Well, to be more precise, I've declared two variables, both with the name i. You might think this would cause an error, but it doesn't, because the two i variables are in different scopes. You can see this more clearly when you look at the output of this program.

(1) i: 20, j: 30
(2) i: 88, j: 99
(3) i: 20, j: 99

First, we assign 20 and 30 to i and j respectively. Then, inside the curly braces, we assign 88 and 99. So, why then does the j keep its value, but i goes back to being 20 again? It's because of the two different i variables.

Between the inner set of curly braces the i variable with the value 20 is hidden and inaccessible, but since we have not declared a new j, we are still using the j from the outer scope. When we leave the inner set of curly braces, the i holding the value 88 goes away, and we again have access to the i with the value 20.

Sometimes this behavior is a good thing, other times, maybe not, but it should be clear that if you use this feature of C indiscriminately, you can really make your code confusing and hard to understand.
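To show how the same mechanism turns into a bug, here is a hedged sketch with two invented functions, count_even_buggy and count_even_fixed. In the buggy version an inner declaration accidentally hides the counter. gcc and clang can report this kind of hiding when given the -Wshadow option.

```c
/* Buggy: the inner 'count' hides the outer one, so the outer stays 0. */
int count_even_buggy(const int *v, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            int count = 0;   /* new variable, hides the outer count */
            count++;         /* increments the inner one only       */
        }
    }
    return count;            /* the outer count was never changed */
}

/* Fixed: no second declaration, so the outer count is updated. */
int count_even_fixed(const int *v, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            count++;
        }
    }
    return count;
}
```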

A post shows the following code:

//C99
printf("%d", 17);
int z=42;
printf("%d", z);

//ANSI C
printf("%d", 17);
{
    int z=42;
    printf("%d", z);
}

and I think the implication is that these are equivalent. They are not. If int z is placed at the bottom of this code snippet, it causes a redefinition error against the first z definition but not against the second.

However, repeating this line several times:

//C99
for(int i=0; i<10; i++){}

does work, which shows the subtlety of this C99 rule.
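The scoping differences described in this post can be put side by side in one function. This is only a sketch, assuming a C99 compiler; the name scope_demo is made up for the example.

```c
int scope_demo(void)
{
    int total = 0;

    for (int i = 0; i < 3; i++) { total += i; }   /* this i ends with the loop */
    for (int i = 0; i < 3; i++) { total += i; }   /* a fresh i, no conflict    */

    {
        int z = 42;          /* this z lives only until the closing brace */
        total += z;
    }

    int z = 6;               /* fine: the earlier z is already out of scope */
    /* A second "int z = 7;" here would redefine z in the same block,
       which the compiler rejects. */
    return total + z;
}
```

Each loop adds 0 + 1 + 2, the braced block adds 42, and the final z adds 6.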

Personally, I passionately shun this C99 feature.

The argument that it narrows the scope of a variable is false, as shown by these examples. Under the new rule, you cannot safely declare a variable until you have scanned the entire block, whereas formerly you only needed to understand what was going on at the head of each block.

As per The C Programming Language by K&R:

In C, all variables must be declared before they are used, usually at the beginning of the function before any executable statements.

Note the word "usually" here: it does not say "must".

With clang and gcc, I encountered major issues with the following (gcc version 8.2.1 20181011, clang version 6.0.1):

{
    char f1[]="This_is_part1 This_is_part2";
    char f2[64]; char f3[64];
    sscanf(f1,"%s %s",f2,f3);      //split part1 to f2, part2 to f3
}

Neither compiler accepted f1, f2 or f3 inside the block. I had to move f1, f2 and f3 to the declarations at the top of the function. The compilers did not mind an integer being defined within the block.
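For comparison, here is a self-contained sketch of the same block placed inside a function. The function name split_in_block and the %63s width limits are additions for this example, not part of the original snippet. Standard C allows arrays to be declared at the start of any block, so whatever triggered the errors described above is not visible in the snippet itself.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if both parts were split as expected, 0 otherwise. */
int split_in_block(void)
{
    int ok = 0;
    {
        char f1[] = "This_is_part1 This_is_part2";
        char f2[64];
        char f3[64];
        /* %63s leaves room for the terminating '\0' in the 64-byte buffers */
        if (sscanf(f1, "%63s %63s", f2, f3) == 2) {
            ok = strcmp(f2, "This_is_part1") == 0 &&
                 strcmp(f3, "This_is_part2") == 0;
        }
    }
    return ok;
}
```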

Internally, all variables local to a function are allocated on the stack or in CPU registers. The generated machine code then moves values between the registers and the stack (called register spilling) if the compiler is poor or if the CPU doesn't have enough registers to keep all the balls in the air.

To allocate space on the stack, the CPU has two special registers: the Stack Pointer (SP) and the Base Pointer (BP), also called the frame pointer (it marks the stack frame local to the current function). SP points to the current top of the stack, while BP sits between the function's working data (above it) and the function arguments (below it). When a function is invoked, it pushes the BP of the caller onto the stack (at the location SP points to), sets the current SP as the new BP, then moves SP by the number of bytes spilled from registers onto the stack, does its computation, and on return restores its caller's BP by popping it from the stack.

Generally, keeping your variables inside their own {}-scope could speed up compilation and improve the generated code by reducing the size of the graph the compiler has to walk to determine which variables are used where and how. In some cases (especially when goto is involved) the compiler can miss the fact that a variable won't be used anymore unless you explicitly tell it the variable's scope. Compilers may also have a time or depth limit when searching the program graph.

The compiler could place variables declared near each other in the same stack area, which means loading one will also bring the others into cache. In the same way, declaring a variable with the register keyword could give the compiler a hint that you want to avoid that variable being spilled to the stack.

Strict C90 requires declarations to come right after an opening {, while C99 (following C++ and GCC extensions) allows declaring variables further into the body, which complicates goto and case statements. C99 and C++ also allow declaring variables inside the for loop initialization, limited to the scope of the loop.
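The complication with case statements can be sketched as follows. In C99 and C11 a case label has to be followed by a statement, and a declaration is not a statement, so writing "case 1: int doubled = ...;" directly is rejected. Giving each case its own braces avoids the problem. The function describe and its return values are invented for this example.

```c
int describe(int kind)
{
    int result = 0;
    switch (kind) {
    case 1: {                       /* braces give this case its own scope */
        int doubled = kind * 2;
        result = doubled;
        break;
    }
    case 2: {
        int squared = kind * kind;  /* independent of 'doubled' above */
        result = squared + 1;
        break;
    }
    default:
        result = -1;
        break;
    }
    return result;
}
```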

Last but not least, for another human being reading your code, it is overwhelming to see the top of a function littered with half a hundred variable declarations instead of having them localized at the places where they are used. Localizing them also makes it easier to comment out their use.

TLDR: using {} to explicitly state a variable's scope can help both the compiler and the human reader.