Pass by value vs. pass by rvalue reference

When should I declare my function as:

void foo(Widget w);

rather than:

void foo(Widget&& w);

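Assume this is the only overload (as in, I pick one or the other, not both, and there are no other overloads). No templates are involved. Assume that the function foo requires ownership of the Widget (e.g. const Widget& is not part of this discussion). I'm not interested in any answer outside the scope of these circumstances. (See the addendum at the end of the post for why these constraints are part of the question.)

The primary difference that my colleagues and I can come up with is that the rvalue reference parameter forces you to be explicit about copies. The caller is responsible for making an explicit copy and then passing it in with std::move when a copy is wanted. In the pass by value case, the cost of the copy is hidden: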

//If foo is a pass by value function, calling + making a copy:
Widget x{};
foo(x); //Implicit copy
//Not shown: continues to use x locally

//If foo is a pass by rvalue reference function, calling + making a copy:
Widget x{};
//foo(x); //This would be a compiler error
auto copy = x; //Explicit copy
foo(std::move(copy));
//Not shown: continues to use x locally

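Other than forcing people to be explicit about copying and changing how much syntactic sugar you get when calling the function, how else are these different? What do they say differently about the interface? Are they more or less efficient than one another?

Other things that my colleagues and I have already thought of:

  • The rvalue reference parameter means that you may move the argument, but does not mandate it. It is possible that the argument you passed in at the call site will be in its original state afterwards. It's also possible that the function would eat/change the argument without ever calling a move constructor, but assume that because it was an rvalue reference, the caller relinquished control. With pass by value, if you move into it, you must assume that a move happened; there's no choice.
  • Assuming no elisions, a single move constructor call is eliminated with pass by rvalue.
  • The compiler has a better chance of eliding copies/moves with pass by value. Can anyone substantiate this claim? Preferably with a link to gcc.godbolt.org showing optimized generated code from gcc/clang rather than a line in the standard. My attempt at showing this was probably not able to successfully isolate the behavior: https://godbolt.org/g/4yomtt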

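Addendum: why am I constraining this problem so much?

  • No overloads: if there were other overloads, this would devolve into a discussion of pass by value vs. a set of overloads that include both const reference and rvalue reference, at which point the set of overloads is obviously more efficient and wins. This is well known, and therefore not interesting.
  • No templates: I'm not interested in how forwarding references fit into the picture. If you have a forwarding reference, you call std::forward anyway. The goal with a forwarding reference is to pass things along as you received them. Copies aren't relevant because you just pass an lvalue instead. This is well known, and not interesting.
  • foo requires ownership of Widget (i.e. no const Widget&): we're not talking about read-only functions. If the function were read-only, or didn't need to own or extend the lifetime of the Widget, then the answer trivially becomes const Widget&, which again is well known and not interesting. I also refer you to why we don't want to talk about overloads.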

Choosing between by-value and by-rvalue-ref, with no other overloads, is not meaningful.

With pass by value the actual argument can be an lvalue expression.

With pass by rvalue-ref the actual argument must be an rvalue.


If the function is storing a copy of the argument, then a sensible choice is between pass-by-value, and a set of overloads with pass-by-ref-to-const and pass-by-rvalue-ref. For an rvalue expression as actual argument the set of overloads can avoid one move. It's an engineering gut-feeling decision whether the micro-optimization is worth the added complexity and typing.
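
To make the "can avoid one move" point concrete, here is a minimal sketch that counts copy and move operations for both designs. The counting Widget, the two holder classes, and the count_* helpers are hypothetical names made up for this illustration; none of them come from the question. The rvalue case passes std::move of a named object, so copy elision cannot change the counts.

```cpp
#include <utility>

struct Counts { int copies; int moves; };
Counts g_counts = {0, 0};  // global tally, reset before each measurement

// Hypothetical Widget that records every copy and move (assumption for this sketch).
struct Widget {
    Widget() {}
    Widget(const Widget&) { ++g_counts.copies; }
    Widget(Widget&&) noexcept { ++g_counts.moves; }
    Widget& operator=(const Widget&) { ++g_counts.copies; return *this; }
    Widget& operator=(Widget&&) noexcept { ++g_counts.moves; return *this; }
};

// Design 1: a single function that takes its argument by value.
struct HolderByValue {
    Widget stored;
    void set(Widget w) { stored = std::move(w); }
};

// Design 2: an overload set with reference-to-const and rvalue reference.
struct HolderOverloads {
    Widget stored;
    void set(const Widget& w) { stored = w; }
    void set(Widget&& w) { stored = std::move(w); }
};

// Counts the operations caused by passing an lvalue to Holder::set.
template <class Holder>
Counts count_lvalue_call() {
    Holder h;
    Widget w;
    g_counts = Counts{0, 0};
    h.set(w);
    return g_counts;
}

// Counts the operations caused by passing std::move(w) to Holder::set.
template <class Holder>
Counts count_rvalue_call() {
    Holder h;
    Widget w;
    g_counts = Counts{0, 0};
    h.set(std::move(w));
    return g_counts;
}
```

With these definitions, an lvalue argument costs one copy plus one move through the by-value function and one copy through the overload set; a std::move'd argument costs two moves by value and one move through the overload set.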

Unless the type is move-only, you normally have the option to pass by reference-to-const, and it seems arbitrary to make it "not part of the discussion", but I will try.

I think the choice partly depends on what foo is going to do with the parameter.

The function needs a local copy

Let's say Widget is an iterator and you want to implement your own std::next function. next needs its own copy to advance and then return. In this case your choice is something like:

Widget next(Widget it, int n = 1){
    std::advance(it, n);
    return it;
}

vs

Widget next(Widget&& it, int n = 1){
    std::advance(it, n);
    return std::move(it);
}

I think by-value is better here. The signature shows that the function takes a copy. A caller who wants to avoid the copy can pass std::move(x), which guarantees that x is moved from, while callers who want to keep their object can still pass an lvalue. With pass-by-rvalue-reference the caller cannot guarantee that the variable has been moved from.

Move-assignment to a copy

Let's say you have a class WidgetHolder:

class WidgetHolder {
    Widget widget;
    //...
};

and you need to implement a setWidget member function. I'm going to assume you already have an overload that takes a reference-to-const:

void WidgetHolder::setWidget(const Widget& w) {
    widget = w;
}

but after measuring performance you decide you need to optimize for r-values. You have a choice between replacing it with:

void WidgetHolder::setWidget(Widget w) {
    widget = std::move(w);
}

Or overloading with:

void WidgetHolder::setWidget(Widget&& w) {
    widget = std::move(w);
}

This one is a little trickier. It is tempting to choose pass-by-value because it accepts both rvalues and lvalues, so you don't need two overloads. However, it unconditionally takes a copy, so you can't take advantage of any existing capacity in the member variable. The pass-by-reference-to-const and pass-by-rvalue-reference overloads assign without taking a copy, which might be faster.
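
The capacity argument can be sketched with a hand-written buffer type. Everything here is an assumption made for illustration: the Widget below is a hypothetical stand-in that behaves roughly like a tiny vector of int, its copy assignment reuses the existing buffer when it is large enough, and g_allocations and allocations_for_lvalue_set are made-up helpers that count heap allocations during a single setWidget call with an lvalue.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

int g_allocations = 0;  // counts heap allocations made by Widget

// Hypothetical Widget owning a heap buffer (assumption for this sketch).
class Widget {
public:
    Widget() = default;
    explicit Widget(std::size_t n) : data_(allocate(n)), size_(n), capacity_(n) {}
    Widget(const Widget& o) : data_(allocate(o.size_)), size_(o.size_), capacity_(o.size_) {
        std::copy(o.data_, o.data_ + o.size_, data_);
    }
    Widget(Widget&& o) noexcept : data_(o.data_), size_(o.size_), capacity_(o.capacity_) {
        o.data_ = nullptr;
        o.size_ = o.capacity_ = 0;
    }
    Widget& operator=(const Widget& o) {
        if (this == &o) return *this;
        if (o.size_ > capacity_) {  // reallocate only when the buffer is too small
            int* fresh = allocate(o.size_);
            delete[] data_;
            data_ = fresh;
            capacity_ = o.size_;
        }
        std::copy(o.data_, o.data_ + o.size_, data_);
        size_ = o.size_;
        return *this;
    }
    Widget& operator=(Widget&& o) noexcept {
        if (this != &o) {
            delete[] data_;  // the old buffer and its capacity are thrown away
            data_ = o.data_; size_ = o.size_; capacity_ = o.capacity_;
            o.data_ = nullptr;
            o.size_ = o.capacity_ = 0;
        }
        return *this;
    }
    ~Widget() { delete[] data_; }

private:
    static int* allocate(std::size_t n) { ++g_allocations; return new int[n](); }
    int* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
};

struct HolderByValue {
    Widget widget;
    void setWidget(Widget w) { widget = std::move(w); }
};

struct HolderByConstRef {
    Widget widget;
    void setWidget(const Widget& w) { widget = w; }
};

// Allocations caused by one setWidget(lvalue) call when the member
// already has room for the new contents.
template <class Holder>
int allocations_for_lvalue_set() {
    Holder h;
    h.setWidget(Widget(100));  // give the member capacity for 100 ints
    Widget small(10);
    g_allocations = 0;
    h.setWidget(small);
    return g_allocations;
}
```

In this sketch the by-value setter allocates once for the lvalue call (the copy into the parameter), while the reference-to-const setter allocates nothing because copy assignment reuses the member's buffer. Whether a real Widget behaves this way depends on how its assignment operator is written.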

Move-construct a copy

Now let's say you are writing the constructor for WidgetHolder and, as before, you have already implemented a constructor that takes a reference-to-const:

WidgetHolder::WidgetHolder(const Widget& w) : widget(w) {
}

and as before you have measured performance and decided you need to optimize for rvalues. You have a choice between replacing it with:

WidgetHolder::WidgetHolder(Widget w) : widget(std::move(w)) {
}

Or overloading with:

WidgetHolder::WidgetHolder(Widget&& w) : widget(std::move(w)) {
}

In this case, the member variable cannot have any existing capacity since this is the constructor. You are move-constructing a copy. Also, constructors often take many parameters, so it can be quite a pain to write all the permutations of overloads needed to optimize for rvalue references. So in this case it is a good idea to use pass-by-value, especially if the constructor takes many such parameters.
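
As a rough illustration of the many-parameters point, here is a sketch of a constructor with two sink parameters taken by value. The Label class and the lvalue_argument_is_preserved helper are hypothetical and use std::string members in place of Widget. Covering the same four lvalue/rvalue combinations with reference overloads would need four constructors, and the number doubles with every extra parameter.

```cpp
#include <string>
#include <utility>

// Hypothetical class with two owned members (assumption for this sketch).
class Label {
public:
    // One constructor accepts any mix of lvalue and rvalue arguments.
    Label(std::string title, std::string body)
        : title_(std::move(title)), body_(std::move(body)) {}

    const std::string& title() const { return title_; }
    const std::string& body() const { return body_; }

private:
    std::string title_;
    std::string body_;
};

// Builds a Label from one lvalue and one temporary. The lvalue is copied
// into the parameter, so the caller's string stays usable afterwards.
bool lvalue_argument_is_preserved() {
    std::string kept = "kept by caller";
    Label label(kept, std::string("temporary"));
    return kept == "kept by caller" &&
           label.title() == "kept by caller" &&
           label.body() == "temporary";
}
```

A caller who no longer needs an argument can still write std::move(kept) at the call site to turn the copy into a move.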

Passing unique_ptr

With unique_ptr the efficiency concerns are less important given that a move is so cheap and it doesn't have any capacity. More important is expressiveness and correctness. There is a good discussion of how to pass unique_ptr here.

The rvalue reference parameter forces you to be explicit about copies.

Yes, pass-by-rvalue-reference gets a point.

The rvalue reference parameter means that you may move the argument, but does not mandate it.

Yes, pass-by-value gets a point.

But it also gives pass-by-rvalue-reference the opportunity to offer an exception guarantee: if foo throws, the Widget value is not necessarily consumed.

For move-only types (such as std::unique_ptr), pass-by-value seems to be the norm (mostly because of your second point; the first point does not apply anyway).

EDIT: the standard library contradicts my previous sentence: one of shared_ptr's constructors takes std::unique_ptr<T, D>&&.

For types that are both copyable and movable (such as std::shared_ptr), we can choose between coherency with the previous types and forcing copies to be explicit.

Unless you want to guarantee there is no unwanted copy, I would use pass-by-value for coherency.

Unless you want guaranteed and/or immediate sink, I would use pass-by-rvalue.

For an existing code base, I would stay consistent with what is already there.

What do rvalue usages say about an interface versus copying? rvalue suggests to the caller that the function both wants to own the value and has no intention of letting the caller know of any changes it has made. Consider the following (I know you said no lvalue references in your example, but bear with me):

//Hello. I want my own local copy of your Widget that I will manipulate,
//but I don't want my changes to affect the one you have. I may or may not
//hold onto it for later, but that's none of your business.
void foo(Widget w);


//Hello. I want to take your Widget and play with it. It may be in a
//different state than when you gave it to me, but it'll still be yours
//when I'm finished. Trust me!
void foo(Widget& w);


//Hello. Can I see that Widget of yours? I don't want to mess with it;
//I just want to check something out on it. Read that one value from it,
//or observe what state it's in. I won't touch it and I won't keep it.
void foo(const Widget& w);


//Hello. Ooh, I like that Widget you have. You're not going to use it
//anymore, are you? Please just give it to me. Thank you! It's my
//responsibility now, so don't worry about it anymore, m'kay?
void foo(Widget&& w);

For another way of looking at it:

//Here, let me buy you a new car just like mine. I don't care if you wreck
//it or give it a new paint job; you have yours and I have mine.
void foo(Car c);


//Here are the keys to my car. I understand that it may come back...
//not quite the same... as I lent it to you, but I'm okay with that.
void foo(Car& c);


//Here are the keys to my car as long as you promise to not give it a
//paint job or anything like that
void foo(const Car& c);


//I don't need my car anymore, so I'm signing the title over to you now.
//Happy birthday!
void foo(Car&& c);

Now, if Widgets have to remain unique (as actual widgets in, say, GTK do) then the first option cannot work. The second, third and fourth options make sense, because there's still only one real representation of the data. Anyway, that's what those semantics say to me when I see them in code.

Now, as for efficiency: it depends. rvalue references can save a lot of time if Widget has a pointer data member whose pointed-to contents can be rather large (think an array). Since the caller used an rvalue, they're saying they don't care about what they're giving you anymore. So, if you want to move the caller's Widget's contents into your Widget, just take their pointer. No need to meticulously copy each element in the data structure their pointer points to. This can lead to pretty good improvements in speed (again, think arrays). But if the Widget class doesn't have any such thing, this benefit is nowhere to be seen.
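
"Just take their pointer" might look like the following sketch. The Widget here is a hypothetical class that owns a heap array, written only to contrast an element-by-element copy constructor with a pointer-stealing move constructor; move_steals_buffer is a made-up helper that checks where the buffer ends up.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical Widget that owns a heap array (assumption for this sketch).
class Widget {
public:
    explicit Widget(std::size_t n) : data_(new int[n]()), size_(n) {}

    // Copy: allocate a new array and copy every element, O(n).
    Widget(const Widget& other) : data_(new int[other.size_]), size_(other.size_) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }

    // Move: take the source's pointer and leave the source empty, O(1).
    Widget(Widget&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    Widget& operator=(const Widget&) = delete;  // assignment left out of the sketch
    ~Widget() { delete[] data_; }

    const int* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};

// Checks that a move hands over the original buffer instead of copying it.
bool move_steals_buffer() {
    Widget source(1000);
    const int* original = source.data();
    Widget sink(std::move(source));
    return sink.data() == original && sink.size() == 1000 &&
           source.data() == nullptr && source.size() == 0;
}
```

For a Widget without such an owning pointer, the move constructor does the same work as the copy constructor, which is the "benefit is nowhere to be seen" case.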

Hopefully that gets at what you were asking; if not, I can perhaps expand/clarify things.

When you pass by rvalue reference, object lifetimes get complicated. If the callee does not move out of the argument, the destruction of the argument is delayed. I think this is interesting in two cases.

First, you have an RAII class

void fn(RAII &&);


RAII x{underlying_resource};
fn(std::move(x));
// later in the code
RAII y{underlying_resource};

When initializing y, the resource could still be held by x if fn doesn't move out of the rvalue reference. In the pass by value code, we know that x gets moved out of, and fn releases x. This is probably a case where you would want to pass by value, and the copy constructor would likely be deleted, so you wouldn't have to worry about accidental copies.
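
A small sketch of this lifetime difference, under stated assumptions: the RAII class below is a hypothetical move-only wrapper that tracks held resources in a global counter (it has a default constructor instead of taking underlying_resource), and both functions have empty bodies, so the rvalue-reference version never moves from its argument.

```cpp
#include <utility>

int g_open_handles = 0;  // number of resources currently held

// Hypothetical move-only RAII wrapper (assumption for this sketch).
class RAII {
public:
    RAII() : owns_(true) { ++g_open_handles; }
    RAII(RAII&& other) noexcept : owns_(other.owns_) { other.owns_ = false; }
    RAII(const RAII&) = delete;
    RAII& operator=(const RAII&) = delete;
    ~RAII() { if (owns_) --g_open_handles; }

private:
    bool owns_;
};

void fn_by_value(RAII) {}         // the parameter takes over the resource and releases it
void fn_by_rvalue_ref(RAII&&) {}  // never moves from its argument

// Number of resources still held right after the call, while x is in scope.
int open_after_by_value() {
    RAII x;
    fn_by_value(std::move(x));
    return g_open_handles;
}

int open_after_by_rvalue_ref() {
    RAII x;
    fn_by_rvalue_ref(std::move(x));
    return g_open_handles;
}
```

After the by-value call the resource has already been released, so a later RAII y could acquire it. After the rvalue-reference call x still holds the resource until x itself goes out of scope.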

Second, if the argument is a large object and the function doesn't move out of it, the vector's data stays alive longer than it would with pass by value.

vector<B> fn1(vector<A> &&x);
vector<C> fn2(vector<B> &&x);


vector<A> va;  // large vector
vector<B> vb = fn1(std::move(va));
vector<C> vc = fn2(std::move(vb));

In the example above, if fn1 and fn2 don't move out of x, you end up with the data of all three vectors still alive. If you instead pass by value, only the last vector's data will still be alive (assuming the vector's move constructor clears the source vector).

One issue not mentioned in the other answers is the idea of exception-safety.

In general, if the function throws an exception, we would ideally like to have the strong exception guarantee, meaning that the call has no effect other than raising the exception. If pass-by-value uses the move constructor, then such an effect is essentially unavoidable. So an rvalue-reference argument may be superior in some cases. (Of course, there are various cases where the strong exception guarantee isn't achievable either way, as well as various cases where the no-throw guarantee is available either way. So this is not relevant in 100% of cases. But it's relevant sometimes.)
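
Here is a minimal sketch of that difference, using std::unique_ptr because its moved-from state is guaranteed to be null. The Widget struct, the two sink functions, and the widget_survives_* helpers are hypothetical names made up for illustration; the fail flag simply forces the exception.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

struct Widget { int id = 0; };

// By-value sink: the caller's pointer has already been moved into `w`
// before the body runs, so a throw destroys the Widget.
void sink_by_value(std::unique_ptr<Widget> w, bool fail) {
    if (fail) throw std::runtime_error("sink failed");
    // ... a real function would store w somewhere ...
}

// Rvalue-reference sink: nothing is taken until the explicit move,
// so a throw before that point leaves the caller's pointer intact.
void sink_by_rvalue_ref(std::unique_ptr<Widget>&& w, bool fail) {
    if (fail) throw std::runtime_error("sink failed");
    std::unique_ptr<Widget> owned = std::move(w);
    // ... a real function would store owned somewhere ...
}

// Does the caller still own its Widget after the sink throws?
bool widget_survives_by_value() {
    std::unique_ptr<Widget> w(new Widget());
    try {
        sink_by_value(std::move(w), true);
    } catch (const std::runtime_error&) {
    }
    return w != nullptr;
}

bool widget_survives_by_rvalue_ref() {
    std::unique_ptr<Widget> w(new Widget());
    try {
        sink_by_rvalue_ref(std::move(w), true);
    } catch (const std::runtime_error&) {
    }
    return w != nullptr;
}
```

In this sketch only the rvalue-reference version lets the caller retry or recover with the original object. That holds only because the function throws before it moves; a function that moves first and throws afterwards gives no such guarantee.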

One notable difference is that if you move to a pass-by-value function:

void foo(Widget w);
foo(std::move(copy));

the compiler must generate a move-constructor call Widget(Widget&&) to create the value object. With pass-by-rvalue-reference no such call is needed, as the rvalue reference is passed directly to the function. Usually this does not matter, as move constructors are trivial (or defaulted) and are inlined most of the time. (You can check it on gcc.godbolt.org: in your example, declare the move constructor Widget(Widget&&); and it will show up in the assembly.)

So my rule of thumb is this:

  • if the object represents a unique resource (without copy semantics) I prefer to use pass-by-rvalue-reference,
  • otherwise if it logically makes sense to either move or copy the object, I use pass-by-value.