How does the main() method work in C?

I know there are two different signatures for writing the main method:

int main()
{
//Code
}

or, to handle command-line arguments, we write it as:

int main(int argc, char * argv[])
{
//code
}

In C++ I know we can overload a method, but in C how does the compiler handle these two different signatures of the main function?


There is no need for overloading. Yes, there are two versions, but only one can be used at a time.

There is NO overloading of main, even in C++. The main function is the entry point of a program, and only a single definition may exist.

For Standard C

For a hosted environment (that's the normal one), the C99 standard says:

5.1.2.2.1 Program startup

The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

or equivalent; [9] or in some other implementation-defined manner.

[9] Thus, int can be replaced by a typedef name defined as int, or the type of argv can be written as char **argv, and so on.
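The footnote's two equivalences can be checked with an ordinary function. This is only a sketch: count_nonempty_args and exit_status are made-up names, and the function merely borrows main's parameter list so that the same spelling rules can be shown on something that is safe to call.

```c
#include <assert.h>
#include <stddef.h>

/* A typedef name defined as int is still int (hypothetical name). */
typedef int exit_status;

/* Declared with the array spelling of the second parameter ... */
exit_status count_nonempty_args(int argc, char *argv[]);

/* ... and defined with the pointer spelling. The compiler accepts the pair
   because a parameter written char *argv[] is adjusted to char **argv,
   so both lines describe the same function type. */
exit_status count_nonempty_args(int argc, char **argv)
{
    exit_status count = 0;

    assert(argc == 0 || argv != NULL);
    for (int i = 0; i < argc; i++) {
        if (argv[i][0] != '\0') {
            count++;   /* count only arguments that are not empty strings */
        }
    }
    return count;
}
```

If the two spellings named different types, the definition would conflict with the earlier declaration and the file would not compile.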

For standard C++:

3.6.1 Main function [basic.start.main]

1 A program shall contain a global function called main, which is the designated start of the program. [...]

2 An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a return type of type int, but otherwise its type is implementation defined. All implementations shall allow both of the following definitions of main:

int main() { /* ... */ }

and

int main(int argc, char* argv[]) { /* ... */ }

The C++ standard explicitly says "It [the main function] shall have a return type of type int, but otherwise its type is implementation defined", and requires the same two signatures as the C standard.

In a hosted environment (a C environment that also supports the C standard library), the operating system starts the program and main is called on its behalf, typically by the runtime's startup code.

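In a non-hosted (freestanding) environment, such as one intended for embedded applications, the entry point is up to the implementation. Some compilers also let you run functions before or after main through compiler-specific preprocessor directives (for example in Borland/Turbo C; these are not part of standard C) such as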

#pragma startup <function_name> [priority]
#pragma exit <function_name> [priority]

Where priority is an optional integral number.

#pragma startup runs the named function before main, and #pragma exit runs the named function after main has finished. If there is more than one startup directive, the priority decides which one runs first.
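Those pragmas are not available everywhere. As a rough analogue (a different mechanism, not the pragmas themselves), GCC and Clang offer a constructor function attribute that also runs functions before main and also takes a priority. The sketch below assumes one of those compilers; the names hook_order, hook_count, early_hook and late_hook are made up for the illustration.

```c
#include <assert.h>

/* Records the order in which the hooks ran (hypothetical bookkeeping). */
int hook_order[2];
int hook_count = 0;

/* GCC/Clang extension, not standard C: the function runs before main.
   Priorities up to 100 are reserved for the implementation, and a smaller
   number runs earlier. */
__attribute__((constructor(101))) void early_hook(void)
{
    assert(hook_count < 2);
    hook_order[hook_count++] = 101;
}

__attribute__((constructor(102))) void late_hook(void)
{
    assert(hook_count < 2);
    hook_order[hook_count++] = 102;
}
```

By the time main starts, both hooks have already run, in priority order. The matching destructor attribute plays the role of the exit pragma, and standard C's atexit is the portable way to run code after main returns.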

main is just a name for a start address resolved by the linker; it is the default name that the startup code looks for. Like every function name in a program, it stands for the address where the function's code begins.

With a stack-based calling convention, function arguments are pushed onto and popped off the stack, so a function that declares no parameters simply never reads any of them. That is how main can work both with and without arguments.

This is one of the strange asymmetries and special rules of the C and C++ language.

In my opinion it exists only for historical reasons and there's no real serious logic behind it. Note that main is special for other reasons too: in C++ main cannot be called recursively and you cannot take its address, and in C99 and C++ you are allowed to omit the final return statement.

Note also that even in C++ it's not an overload... either a program has the first form or it has the second form; it cannot have both.

Some of the features of the C language started out as hacks which just happened to work.

Multiple signatures for main, as well as variable-length argument lists, are among those features.

Programmers noticed that they can pass extra arguments to a function, and nothing bad happens with their given compiler.

This is the case if the calling conventions are such that:

  1. The calling function cleans up the arguments.
  2. The leftmost arguments are closer to the top of the stack, or to the base of the stack frame, so that spurious arguments do not invalidate the addressing.

One set of calling conventions which obeys these rules is stack-based parameter passing whereby the caller pops the arguments, and they are pushed right to left:

;; pseudo-assembly-language
;; main(argc, argv, envp); call

push envp  ;; rightmost argument
push argv  ;;
push argc  ;; leftmost argument ends up on top of stack

call main

pop        ;; caller cleans up
pop
pop

In compilers where this type of calling convention is the case, nothing special needs to be done to support the two kinds of main, or even additional kinds. main can be a function of no arguments, in which case it is oblivious to the items that were pushed onto the stack. If it's a function of two arguments, then it finds argc and argv as the two topmost stack items. If it's a platform-specific three-argument variant with an environment pointer (a common extension), that will work too: it will find that third argument as the third element from the top of the stack.

And so a fixed call works for all cases, allowing a single, fixed start-up module to be linked to the program. That module could be written in C, as a function resembling this:

/* I'm adding envp to show that even a popular platform-specific variant
   can be handled. */
extern int main(int argc, char **argv, char **envp);

void __start(void)
{
    /* This is the real startup function for the executable.
       It performs a bunch of library initialization. */

    /* ... */

    /* And then: */
    exit(main(argc_from_somewhere, argv_from_somewhere, envp_from_somewhere));
}

In other words, this start module just calls a three-argument main, always. If main takes no arguments, or only int and char **, it still happens to work fine, due to the calling conventions.

If you were to do this kind of thing in your program, it would be nonportable and considered undefined behavior by ISO C: declaring and calling a function in one manner, and defining it in another. But a compiler's startup trick does not have to be portable; it is not guided by the rules for portable programs.
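Since the real trick is off limits to portable code, the stack-based convention and the fixed three-argument call described above can only be imitated. The following is a toy model, not how any actual compiler or runtime is written: an array stands in for the machine stack, plain long values stand in for argc, argv and envp, and every name in it (ToyStack, toy_push, toy_peek, toy_start and the three toy_main_* variants) is invented for the illustration.

```c
#include <assert.h>

/* Toy model only: one small array stands in for the machine stack. */
enum { TOY_STACK_SIZE = 8 };

typedef struct {
    long slots[TOY_STACK_SIZE];
    int  count;                 /* number of items currently on the stack */
} ToyStack;

static void toy_push(ToyStack *s, long value)
{
    assert(s->count < TOY_STACK_SIZE);
    s->slots[s->count++] = value;
}

/* depth 0 is the top of the stack, depth 1 the item below it, and so on. */
static long toy_peek(const ToyStack *s, int depth)
{
    assert(depth >= 0 && depth < s->count);
    return s->slots[s->count - 1 - depth];
}

/* Three stand-ins for main; each one reads only the arguments it "declares". */
long toy_main_void(const ToyStack *s) { (void)s; return 0; }
long toy_main_args(const ToyStack *s) { return toy_peek(s, 0) + toy_peek(s, 1); }
long toy_main_envp(const ToyStack *s)
{
    return toy_peek(s, 0) + toy_peek(s, 1) + toy_peek(s, 2);
}

/* The single, fixed start-up routine: it always pushes three arguments
   right to left, calls the entry point, and then cleans up itself. */
long toy_start(long (*entry)(const ToyStack *), long argc, long argv, long envp)
{
    ToyStack stack = { {0}, 0 };

    toy_push(&stack, envp);     /* rightmost argument first */
    toy_push(&stack, argv);
    toy_push(&stack, argc);     /* leftmost argument ends up on top */

    long status = entry(&stack);

    stack.count -= 3;           /* the caller pops all three */
    assert(stack.count == 0);
    return status;
}
```

The same toy_start serves all three variants because the leftmost arguments sit on top and the caller, not the callee, removes them; the variant that takes nothing just never looks at the stack.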

But suppose that the calling conventions are such that it cannot work this way. In that case, the compiler has to treat main specially. When it notices that it's compiling the main function, it can generate code which is compatible with, say, a three argument call.

That is to say, you write this:

int main(void)
{
/* ... */
}

But when the compiler sees it, it essentially performs a code transformation so that the function which it compiles looks more like this:

int main(int __argc_ignore, char **__argv_ignore, char **__envp_ignore)
{
/* ... */
}

except that names such as __argc_ignore don't literally exist. No such names are introduced into your scope, and there won't be any warning about unused arguments. The code transformation causes the compiler to emit code with the correct linkage which knows that it has to clean up three arguments.

Another implementation strategy is for the compiler or perhaps linker to custom-generate the __start function (or whatever it is called), or at least select one from several pre-compiled alternatives. Information could be stored in the object file about which of the supported forms of main is being used. The linker can look at this info, and select the correct version of the start-up module which contains a call to main which is compatible with the program's definition. C implementations usually have only a small number of supported forms of main so this approach is feasible.
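The "record which form is used, then pick a matching call" idea can be sketched in portable C with a tag and a union of function pointers. This is an assumption-laden illustration, not a description of any real linker or object-file format; MainKind, MainInfo, start_program and the demo_main_* functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Which supported form of main the program defines (hypothetical tag). */
typedef enum { MAIN_VOID, MAIN_ARGS, MAIN_ARGS_ENV } MainKind;

/* Stand-in for the information a toolchain might record about main. */
typedef struct {
    MainKind kind;
    union {
        int (*no_args)(void);
        int (*args)(int, char **);
        int (*args_env)(int, char **, char **);
    } entry;
} MainInfo;

/* Start-up routine that issues a call matching the recorded form, so
   every call goes through a pointer of exactly the right type. */
int start_program(const MainInfo *info, int argc, char **argv, char **envp)
{
    assert(info != NULL);
    switch (info->kind) {
    case MAIN_VOID:     return info->entry.no_args();
    case MAIN_ARGS:     return info->entry.args(argc, argv);
    case MAIN_ARGS_ENV: return info->entry.args_env(argc, argv, envp);
    }
    return -1;  /* unknown form */
}

/* Two sample entry points standing in for a user's main. */
int demo_main_void(void) { return 7; }
int demo_main_args(int argc, char **argv) { (void)argv; return argc; }
```

Unlike the fixed three-argument call, nothing here relies on the calling convention tolerating extra arguments, which is why this strategy also works on platforms where that trick fails.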

Compilers for the C99 language always have to treat main specially, to some extent, to support the hack that if the function terminates without a return statement, the behavior is as if return 0 were executed. This, again, can be treated by a code transformation. The compiler notices that a function called main is being compiled. Then it checks whether the end of the body is potentially reachable. If so, it inserts a return 0; at the end of the body.

Well, the two different signatures of main() only come into the picture when you want them to. I mean, if your program needs data before it does any actual processing, you can pass it in by using:

int main(int argc, char * argv[])
{
//code
}

where argc stores the number of arguments passed and argv is an array of pointers to char that point to the argument strings passed from the console. Otherwise it's always fine to go with

int main()
{
//Code
}

However, in any case there can be one and only one main() in a program, because that is the single point from which the program starts executing, so there cannot be more than one.
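To make the roles of argc and argv concrete, here is a small sketch of the kind of loop a main with arguments typically runs. The helper name sum_integer_args is made up; it takes the same two parameters as main so that it can be exercised without launching a program.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper with main's parameter list: adds up the arguments
   that parse completely as decimal integers and ignores the rest. */
long sum_integer_args(int argc, char *argv[])
{
    long total = 0;

    assert(argc == 0 || argv != NULL);
    for (int i = 1; i < argc; i++) {        /* argv[0] is the program name */
        char *end;
        long value = strtol(argv[i], &end, 10);

        /* Accept the argument only if at least one digit was consumed
           and nothing was left over after the number. */
        if (end != argv[i] && *end == '\0') {
            total += value;
        }
    }
    return total;
}
```

A program whose main simply forwards its own argc and argv to this helper, when started with the arguments 10, abc and 32, would get 42 back, since abc is skipped.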

What's unusual about main isn't that it can be defined in more than one way, it's that it can only be defined in one of two different ways.

main is a user-defined function; the implementation doesn't declare a prototype for it.

The same thing is true for foo or bar, but you can define functions with those names any way you like.

The difference is that main is invoked by the implementation (the runtime environment), not just by your own code. The implementation isn't limited to ordinary C function call semantics, so it can (and must) deal with a few variations -- but it's not required to handle infinitely many possibilities. The int main(int argc, char *argv[]) form allows for command-line arguments, and int main(void) in C or int main() in C++ is just a convenience for simple programs that don't need to process command-line arguments.

As for how the compiler handles this, it depends on the implementation. Most systems probably have calling conventions that make the two forms effectively compatible, and any arguments passed to a main defined with no parameters are quietly ignored. If not, it wouldn't be difficult for a compiler or linker to treat main specially. If you're curious how it works on your system, you might look at some assembly listings.

And like many things in C and C++, the details are largely a result of history and arbitrary decisions made by the designers of the languages and their predecessors.

Note that C and C++ both permit other implementation-defined definitions for main -- but there's rarely any good reason to use them. And for freestanding implementations (such as embedded systems with no OS), the program entry point is implementation-defined, and isn't necessarily even called main.

A similar question was asked before: Why does a function with no parameters (compared to the actual function definition) compile?

One of the top-ranked answers was:

In C func() means that you can pass any number of arguments. If you want no arguments then you have to declare as func(void)

So, I guess it's how main is declared (if you can apply the term "declared" to main). In fact you can write something like this:

int main(int only_one_argument) {
// code
}

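and with many compilers it will still compile and run, although this is not one of the forms the standard requires an implementation to accept.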

You do not need to overload this, because only one version is used at a time. Yes, there are two different versions of the main function.