为什么一个函数只有一个出口点?

我总是听说单个退出点函数是一种糟糕的编码方式,因为它会损失可读性和效率。我从没听过有人反对。

我以为这和 CS 有关,但是这个问题在 cstheory 堆栈交换中被否决了。

56613 次浏览

There are different schools of thought, and it largely comes down to personal preference.

One is that it is less confusing if there is only a single exit point - you have a single path through the method and you know where to look for the exit. On the minus side if you use indentation to represent nesting, your code ends up massively indented to the right, and it becomes very difficult to follow all the nested scopes.

Another is that you can check preconditions and exit early at the start of a method, so that you know in the body of the method that certain conditions are true, without the entire body of the method being indented 5 miles off to the right. This usually minimises the number of scopes you have to worry about, which makes code much easier to follow.

A third is that you can exit anywhere you please. This used to be more confusing in the old days, but now that we have syntax-colouring editors and compilers that detect unreachable code, it's a lot easier to deal with.

I'm squarely in the middle camp. Enforcing a single exit point is a pointless or even counterproductive restriction IMHO, while exiting at random all over a method can sometimes lead to messy difficult to follow logic, where it becomes difficult to see if a given bit of code will or won't be executed. But "gating" your method makes it possible to significantly simplify the body of the method.

Single entry and exit point was original concept of structured programming vs step by step Spaghetti Coding. There is a belief that multiple exit-point functions require more code since you have to do proper clean up of memory spaces allocated for variables. Consider a scenario where function allocates variables (resources) and getting out of the function early and without proper clean up would result in resource leaks. In addition, constructing clean-up before every exit would create a lot of redundant code.

My general recommendation is that return statements should, when practical, either be located before the first code that has any side-effects, or after the last code that has any side-effects. I would consider something like:

if (!argument)  // Check if non-null
return ERR_NULL_ARGUMENT;
... process non-null argument
if (ok)
return 0;
else
return ERR_NOT_OK;

clearer than:

int return_value;
if (argument) // Non-null
{
.. process non-null argument
.. set result appropriately
}
else
result = ERR_NULL_ARGUMENT;
return result;

If a certain condition should prevent a function from doing anything, I prefer to early-return out of the function at a spot above the point where the function would do anything. Once the function has undertaken actions with side-effects, though, I prefer to return from the bottom, to make clear that all side-effects must be dealt with.

With most anything, it comes down to the needs of the deliverable. In "the old days", spaghetti code with multiple return points invited memory leaks, since coders that preferred that method typically did not clean up well. There were also issues with some compilers "losing" the reference to the return variable as the stack was popped during the return, in the case of returning from a nested scope. The more general problem was one of re-entrant code, which attempts to have the calling state of a function be exactly the same as its return state. Mutators of oop violated this and the concept was shelved.

There are deliverables, most notably kernels, which need the speed that multiple exit points provide. These environments normally have their own memory and process management, so the risk of a leak is minimized.

Personally, I like to have a single point of exit, since I often use it to insert a breakpoint on the return statement and perform a code inspect of how the code determined that solution. I could just go to the entrance and step through, which I do with extensively nested and recursive solutions. As a code reviewer, multiple returns in a function requires a much deeper analysis - so if you're doing it to speed up the implementation, you're robbing Peter to save Paul. More time will be required in code reviews, invalidating the presumption of efficient implementation.

-- 2 cents

Please see this doc for more details: NISTIR 5459

In my view, the advice to exit a function (or other control structure) at only one point often is oversold. Two reasons typically are given to exit at only one point:

  1. Single-exit code is supposedly easier to read and debug. (I admit that I don't think much of this reason, but it is given. What is substantially easier to read and debug is single-entry code.)
  2. Single-exit code links and returns more cleanly.

The second reason is subtle and has some merit, especially if the function returns a large data structure. However, I wouldn't worry about it too much, except ...

If a student, you want to earn top marks in your class. Do what the instructor prefers. He probably has a good reason from his perspective; so, at the very least, you'll learn his perspective. This has value in itself.

Good luck.

If you feel like you need multiple exit points in a function, the function is too large and is doing too much.

I would recommend reading the chapter about functions in Robert C. Martin's book, Clean Code.

Essentially, you should try to write functions with 4 lines of code or less.

Some notes from Mike Long’s Blog:

  • The first rule of functions: they should be small
  • The second rule of functions: they should be smaller than that
  • Blocks within if statements, while statements, for loops, etc should be one line long
  • …and that line of code will usually be a function call
  • There should be no more than one or maybe two levels of indentation
  • Functions should do one thing
  • Function statements should all be at the same level of abstraction
  • A function should have no more than 3 arguments
  • Output arguments are a code smell
  • Passing a boolean flag into a function is truly awful. You are by definition doing two --things in the function.
  • Side effects are lies.

The answer is very context dependent. If you are making a GUI and have a function which initialises API's and opens windows at the start of your main it will be full of calls which may throw errors, each of which would cause the instance of the program to close. If you used nested IF statements and indent your code could quickly become very skewed to the right. Returning on an error at each stage might be better and actually more readable while being just as easy to debug with a few flags in the code.

If, however, you are testing different conditions and returning different values depending on the results in your method it may be much better practice to have a single exit point. I used to work on image processing scripts in MATLAB which could get very large. Multiple exit points could make the code extremely hard to follow. Switch statements were much more appropriate.

The best thing to do would be to learn as you go. If you are writing code for something try finding other people's code and seeing how they implement it. Decide which bits you like and which bits you don't.

I used to be an advocate of single-exit style. My reasoning came mostly from pain...

Single-exit is easier to debug.

Given the techniques and tools we have today, this is a far less reasonable position to take as unit tests and logging can make single-exit unnecessary. That said, when you need to watch code execute in a debugger, it was much harder to understand and work with code containing multiple exit points.

This became especially true when you needed to interject assignments in order to examine state (replaced with watch expressions in modern debuggers). It was also too easy to alter the control flow in ways that hid the problem or broke the execution altogether.

Single-exit methods were easier to step through in the debugger, and easier to tease apart without breaking the logic.