Conflict between a Stanford tutorial and GCC

According to this video (around minute 38), if two functions have the same local variables, they will use the same space. So the following program should print 5. Compiling it with gcc and running it prints -1218960859 instead. Why?

The program:

#include <stdio.h>


void A()
{
    int a;
    printf("%i",a);
}


void B()
{
    int a;
    a = 5;
}


int main()
{
    B();
    A();
    return 0;
}

As requested, here is the output from the disassembler:

0804840c <A>:
804840c:   55                      push   ebp
804840d:   89 e5                   mov    ebp,esp
804840f:   83 ec 28                sub    esp,0x28
8048412:   8b 45 f4                mov    eax,DWORD PTR [ebp-0xc]
8048415:   89 44 24 04             mov    DWORD PTR [esp+0x4],eax
8048419:   c7 04 24 e8 84 04 08    mov    DWORD PTR [esp],0x80484e8
8048420:   e8 cb fe ff ff          call   80482f0 <printf@plt>
8048425:   c9                      leave
8048426:   c3                      ret


08048427 <B>:
8048427:   55                      push   ebp
8048428:   89 e5                   mov    ebp,esp
804842a:   83 ec 10                sub    esp,0x10
804842d:   c7 45 fc 05 00 00 00    mov    DWORD PTR [ebp-0x4],0x5
8048434:   c9                      leave
8048435:   c3                      ret


08048436 <main>:
8048436:   55                      push   ebp
8048437:   89 e5                   mov    ebp,esp
8048439:   83 e4 f0                and    esp,0xfffffff0
804843c:   e8 e6 ff ff ff          call   8048427 <B>
8048441:   e8 c6 ff ff ff          call   804840c <A>
8048446:   b8 00 00 00 00          mov    eax,0x0
804844b:   c9                      leave
804844c:   c3                      ret
804844d:   66 90                   xchg   ax,ax
804844f:   90                      nop

In the function A, the variable a is not initialized, so printing its value leads to undefined behavior.

With some compilers, the variable a in A and the a in B end up at the same address, so the program may print 5. But again, you can't rely on undefined behavior.

It's undefined behavior. An uninitialized local variable has an indeterminate value, and using it will lead to undefined behavior.

Yes, yes, this is undefined behavior, because you're using the variable uninitialized [1].

However, on the x86 architecture [2], this experiment should work. The value isn't "erased" from the stack when B() returns, and since the variable is not initialized in A(), that same value should still be there, provided the stack frames are identical.

I'd venture to guess that, since int a is not used inside of void B(), the compiler optimized that code out, and a 5 was never written to that location on the stack. Try adding a printf in B() as well - it just may work.

Also, compiler flags - namely optimization level - will likely affect this experiment as well. Try disabling optimizations by passing -O0 to gcc.

Edit: I just compiled your code with gcc -O0 (64-bit), and indeed, the program prints 5, as one familiar with the call stack would expect. In fact, it worked even without -O0. A 32-bit build may behave differently.

Disclaimer: Don't ever, ever use something like this in "real" code!

1 - There's a debate going on below about whether or not this is officially "UB", or just unpredictable.

2 - Also x64, and probably every other architecture that uses a call stack (at least ones with an MMU)


Let's take a look at a reason why it didn't work. This is best seen in a 32-bit build, so I will compile with -m32.

$ gcc --version
gcc (GCC) 4.7.2 20120921 (Red Hat 4.7.2-2)

I compiled with $ gcc -m32 -O0 test.c (Optimizations disabled). When I run this, it prints garbage.

Looking at $ objdump -Mintel -d ./a.out:

080483ec <A>:
80483ec:   55                      push   ebp
80483ed:   89 e5                   mov    ebp,esp
80483ef:   83 ec 28                sub    esp,0x28
80483f2:   8b 45 f4                mov    eax,DWORD PTR [ebp-0xc]
80483f5:   89 44 24 04             mov    DWORD PTR [esp+0x4],eax
80483f9:   c7 04 24 c4 84 04 08    mov    DWORD PTR [esp],0x80484c4
8048400:   e8 cb fe ff ff          call   80482d0 <printf@plt>
8048405:   c9                      leave
8048406:   c3                      ret


08048407 <B>:
8048407:   55                      push   ebp
8048408:   89 e5                   mov    ebp,esp
804840a:   83 ec 10                sub    esp,0x10
804840d:   c7 45 fc 05 00 00 00    mov    DWORD PTR [ebp-0x4],0x5
8048414:   c9                      leave
8048415:   c3                      ret

We see that in B, the compiler reserved 0x10 bytes of stack space, and initialized our int a variable at [ebp-0x4] to 5.

In A, however, the compiler placed int a at [ebp-0xc]. So in this case our local variables did not end up at the same place! Adding a printf() call in B as well makes the stack frames for A and B identical with this compiler, and the program then prints 55 (one 5 from B, one from A).

Compile your code with gcc -Wall filename.c and you will see these warnings:

In function 'B':
11:9: warning: variable 'a' set but not used [-Wunused-but-set-variable]


In function 'A':
6:11: warning: 'a' is used uninitialized in this function [-Wuninitialized]

In C, printing an uninitialized variable leads to undefined behavior.

Section 6.7.8 (Initialization) of the C99 standard says:

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:

— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules;
— if it is a union, the first named member is initialized (recursively) according to these rules.

Edit1

As @Jonathon Reinhart points out, if you disable optimization by passing -O0 to gcc, you might get 5 as output.

But this is not a good idea at all; never use this in production code.

-Wuninitialized is one of the valuable warnings. Take it seriously: do not disable or ignore it, because reading uninitialized variables can do serious damage in production, such as crashing long-running daemons.


Edit2

The Deep C slides explain why the result is 5 or garbage. I am adding this information from those slides, with minor modifications, to make this answer a little more complete.

Case 1: without optimization

$ gcc -O0 file.c && ./a.out
5

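Perhaps this compiler has a pool of named variables that it reuses: variable a was used and released in B(), so when A() needs an integer named a, it gets the same memory location. If that were the mechanism, renaming the variable in B() to, say, b would stop the program from printing 5. (Variable names do not survive compilation, though. What matters is whether both locals land at the same offset in stack frames that occupy the same memory, so a rename alone should not change the result.)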

Case 2: with optimization

A lot of things might happen when the optimizer kicks in. In this case I would guess that the call to B() can be skipped, as it does not have any side effects. Also, I would not be surprised if A() is inlined into main(), i.e. there is no function call. (But since A() has external linkage, the object code for the function must still be emitted in case another object file wants to link with it.) Anyway, I suspect the value printed will be something else if you optimize the code.

$ gcc -O file.c && ./a.out
1606415608

Garbage!

One important thing to remember: don't ever rely on something like this, and never use it in real code! It's just an interesting curiosity (which isn't even always true), not a feature. Imagine trying to find a bug produced by that kind of "feature": a nightmare.

By the way, C and C++ are full of this kind of "feature". Here is a GREAT slideshow about it: http://www.slideshare.net/olvemaudal/deep-c So if you want to see more similar "features" and understand what's under the hood and how it works, just go through this slideshow. You won't regret it, and I'm sure that even most experienced C/C++ programmers can learn a lot from it.