C++11 lambda implementation and memory model

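I would like some information on how to correctly think about C++11 closures and std::function, in terms of how they are implemented and how memory is handled.

Although I don't believe in premature optimisation, I do have a habit of carefully considering the performance impact of my choices while writing new code. I also do a fair amount of real-time programming, e.g. on microcontrollers and for audio systems, where non-deterministic memory allocation/deallocation pauses are to be avoided.

Therefore I'd like to develop a better understanding of when to use or not use C++ lambdas.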

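My current understanding is that a lambda with no captured closure is exactly like a C callback. However, when the environment is captured either by value or by reference, an anonymous object is created on the stack. When a value-closure must be returned from a function, one wraps it in std::function. What happens to the closure memory in this case? Is it copied from the stack to the heap? Is it freed whenever the std::function is freed, i.e., is it reference-counted like a std::shared_ptr?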

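I imagine that in a real-time system I could set up a chain of lambda functions, passing B as a continuation argument to A, so that a processing pipeline A->B is created. In this case, the A and B closures would be allocated once, although I'm not sure whether they would be allocated on the stack or on the heap. In general, however, this seems safe to use in a real-time system. On the other hand, if B constructs some lambda function C, which it returns, then the memory for C would be allocated and deallocated repeatedly, which would not be acceptable for real-time usage.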

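In pseudo-code, here is a DSP loop which I think is going to be real-time safe. I want to perform processing block A and then B, where A calls its argument. Both of these functions return std::function objects, so f will be a std::function object whose environment is stored on the heap: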

auto f = A(B);  // A returns a function which calls B
// Is the memory for the function returned by A on the heap?
// Note that A and B may maintain state
// via mutable value-closures!
for (int t = 0; t < 1000; t++) {
    y = f(t);
}

And one which I think might be bad to use in real-time code:

for (int t = 0; t < 1000; t++) {
    y = A(B)(t);  // a new function object is built on every iteration
}

And one where I think stack memory is likely used for the closure:

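double freq = 220;
double A = 2;
for (int t = 0; t < 1000; t++) {
    // The closure is constructed and immediately invoked on every iteration.
    y = [=](int t) { return sin(t * freq) * A; }(t);
}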

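In the latter case the closure is constructed at each iteration of the loop, but unlike the previous example it is cheap, because it is just like a function call and no heap allocations are made. Moreover, I wonder whether a compiler could "lift" the closure and make inlining optimisations.

Is this correct? Thank you.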


My current understanding is that a lambda with no captured closure is exactly like a C callback. However, when the environment is captured either by value or by reference, an anonymous object is created on the stack.

No; it is always a C++ object with an unknown type, created on the stack. A capture-less lambda can be converted into a function pointer (though whether it is suitable for C calling conventions is implementation dependent), but that doesn't mean it is a function pointer.
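A minimal sketch of that distinction may help. The names callback_t, apply_callback and call_with_captureless_lambda are made up for illustration and do not belong to any real API. The closure is an object of an unnamed class type; it only becomes a function pointer through an implicit conversion at the call site, and a capturing closure has no such conversion:

```cpp
#include <type_traits>

// Hypothetical C-style callback signature and consumer, for illustration only.
using callback_t = int (*)(int);

int apply_callback(callback_t cb, int x) { return cb(x); }

int call_with_captureless_lambda(int x) {
    auto twice = [](int v) { return 2 * v; };
    int k = 3;
    auto times_k = [k](int v) { return k * v; };

    static_assert(!std::is_pointer<decltype(twice)>::value,
                  "the closure itself is a class object, not a pointer");
    static_assert(std::is_convertible<decltype(twice), callback_t>::value,
                  "a capture-less closure converts to a function pointer");
    static_assert(!std::is_convertible<decltype(times_k), callback_t>::value,
                  "a capturing closure does not");

    (void)times_k;  // only inspected at compile time
    return apply_callback(twice, x);  // implicit conversion happens here
}
```

All three checks are evaluated at compile time, so the sketch fails to build if any of them does not hold.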

When a value-closure must be returned from a function, one wraps it in std::function. What happens to the closure memory in this case?

A lambda isn't anything special in C++11. It's an object like any other object. A lambda expression results in a temporary, which can be used to initialize a variable on the stack:

auto lamb = []() {return 5;};

lamb is a stack object. It has a constructor and destructor. And it will follow all of the C++ rules for that. The type of lamb will contain the values/references that are captured; they will be members of that object, just like any other object members of any other type.
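To make "members of that object" concrete, here is a rough hand-written equivalent of a capturing lambda. SineClosure, eval_with_lambda and eval_with_functor are illustrative names, and the real closure type generated by a compiler is unnamed and may be laid out differently; this is only a sketch of the idea:

```cpp
#include <cmath>

// Hand-written stand-in for the unnamed class a compiler generates for
//   [freq, amp](int n) { return std::sin(n * freq) * amp; }
struct SineClosure {
    double freq;  // each by-value capture becomes a data member
    double amp;
    double operator()(int n) const { return std::sin(n * freq) * amp; }
};

// Uses a real lambda; the closure object lives on this function's stack.
double eval_with_lambda(double freq, double amp, int t) {
    auto lam = [freq, amp](int n) { return std::sin(n * freq) * amp; };
    return lam(t);
}

// Uses the hand-written functor; it lives on the stack in the same way.
double eval_with_functor(double freq, double amp, int t) {
    SineClosure fn{freq, amp};
    return fn(t);
}
```

Both functions construct a small object holding two doubles and call its operator(); neither needs the heap.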

You can give it to a std::function:

auto func_lamb = std::function<int()>(lamb);

In this case, it will get a copy of the value of lamb. If lamb had captured anything by value, there would be two copies of those values; one in lamb, and one in func_lamb.
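The independence of the two copies can be observed with a mutable by-value capture. This is a small sketch; independent_copies is an illustrative name, and its return value just packs the two counters into one number for easy checking:

```cpp
#include <functional>

int independent_copies() {
    int n = 0;
    auto counter = [n]() mutable { return ++n; };  // owns its own copy of n

    // The std::function stores a copy of the closure, including its n (still 0).
    std::function<int()> func_counter = counter;

    counter();
    counter();  // advances only the original closure's n to 2

    int from_func = func_counter();  // the copy starts counting from its own n
    int from_orig = counter();       // the original continues from 2

    return from_orig * 10 + from_func;
}
```

Each object advances its own captured n, so calling one never affects the other.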

When the current scope ends, func_lamb will be destroyed, followed by lamb, as per the rules of cleaning up stack variables.

You could just as easily allocate one on the heap:

auto func_lamb_ptr = new std::function<int()>(lamb);

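Exactly where the memory for the contents of a std::function goes is implementation-dependent, but the type erasure employed by std::function may require a memory allocation. Implementations commonly avoid it for small callables such as function pointers by storing them inside the std::function object itself, while larger closures usually end up on the heap. This is why std::function's constructor can take an allocator in C++11 and C++14 (those overloads were removed in C++17).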
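Since the behaviour is implementation-dependent, one way to find out what a particular standard library does is to count calls to the global operator new. The sketch below assumes that replacing the global allocation functions is acceptable in a test program; g_allocs and the three function names are illustrative. No expected counts are given for construction because they differ between implementations:

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Count every call to the global operator new made by this program.
static std::size_t g_allocs = 0;

void* operator new(std::size_t n) {
    ++g_allocs;
    if (void* p = std::malloc(n ? n : 1)) return p;
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Allocations needed to wrap a closure with a single int capture.
std::size_t allocs_to_wrap_small() {
    int x = 1;
    const std::size_t before = g_allocs;
    std::function<int()> f = [x] { return x; };
    return g_allocs - before;
}

// Allocations needed to wrap a closure carrying 256 bytes of captured state.
std::size_t allocs_to_wrap_large() {
    std::array<char, 256> big{};
    const std::size_t before = g_allocs;
    std::function<int()> f = [big] { return static_cast<int>(big[0]); };
    return g_allocs - before;
}

// Allocations caused by calling an already constructed std::function.
std::size_t allocs_during_calls() {
    int gain = 2;
    std::function<int(int)> f = [gain](int t) { return gain * t; };
    const std::size_t before = g_allocs;
    long sum = 0;
    for (int t = 0; t < 1000; ++t) sum += f(t);
    (void)sum;
    return g_allocs - before;
}
```

Printing the three results on the target toolchain shows whether small closures are stored in place and confirms that invoking an existing std::function does not allocate; any cost is paid when the object is constructed or copied.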

Is it freed whenever the std::function is freed, i.e., is it reference-counted like a std::shared_ptr?

std::function stores a copy of its contents. Like virtually every standard library C++ type, function uses value semantics. Thus, it is copyable; when it is copied, the new function object is completely separate. It is also moveable, so any internal allocations can be transferred appropriately without needing more allocating and copying.

Thus there is no need for reference counting.
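The ownership rules can be checked with a probe. In the sketch below a std::shared_ptr is captured purely as a measuring device (its use count reveals how many closure copies exist); std::function itself does no reference counting. make_reader and capture_dies_with_function are illustrative names:

```cpp
#include <functional>
#include <memory>
#include <utility>

// Returns a closure wrapped in std::function; the closure owns its capture.
std::function<int()> make_reader(std::shared_ptr<int> value) {
    return [value] { return *value; };
}

bool capture_dies_with_function() {
    std::weak_ptr<int> watcher;
    {
        auto value = std::make_shared<int>(7);
        watcher = value;

        std::function<int()> reader = make_reader(std::move(value));
        std::function<int()> copy = reader;  // deep copy: a second closure

        // Two std::function objects, each holding its own copy of the capture.
        if (watcher.use_count() != 2) return false;
        if (reader() != 7 || copy() != 7) return false;
    }
    // Both std::function objects are gone, and so are their closures.
    return watcher.expired();
}
```

Copying the std::function duplicates the closure and its captures, and destroying each std::function destroys exactly the closure it holds.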

Everything else you state is correct, assuming that "memory allocation" equates to "bad to use in real-time code".
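For the A->B pipeline in the question, type erasure can be avoided altogether by keeping the closures as plain objects and letting templates carry their types. The following is only a sketch under assumptions: make_A and run_pipeline are hypothetical names, the arithmetic stands in for real DSP work, and the deduced return type needs C++14:

```cpp
// A builds a stage that first runs `next` (B) and then applies its own gain.
template <class Next>
auto make_A(Next next) {  // C++14 return type deduction
    return [next](int t) { return 2 * next(t); };
}

int run_pipeline() {
    int offset = 1;
    auto B = [offset](int t) { return t + offset; };

    // Built once, outside the loop. f is an ordinary stack object that
    // contains a copy of B; no std::function is involved.
    auto f = make_A(B);

    int y = 0;
    for (int t = 0; t < 1000; ++t) {
        y += f(t);
    }
    return y;
}
```

Because the compiler sees the concrete closure types, it is also free to inline the calls, which it generally cannot do through a std::function.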

A C++ lambda is just syntactic sugar for an (anonymous) functor class with an overloaded operator(), and std::function is just a wrapper around callables (i.e. functors, lambdas, C functions, ...) that copies the concrete lambda object by value from the current stack scope, typically to the heap.

To count the actual constructor calls and relocations, I wrote a test (it adds another level of wrapping in a shared_ptr, but that is beside the point). See for yourself:

#include <functional>  // std::function (missing in the first version)
#include <iostream>
#include <memory>
#include <string>

// Functor that logs every special member function call.
class Functor {
    std::string greeting;

public:
    Functor(std::string _greeting = "Hello!") : greeting{_greeting} {
        std::cout << "Ctor \n";
    }

    Functor(const Functor &rhs) : greeting{rhs.greeting} {
        std::cout << "Copy-Ctor \n";
    }

    Functor &operator=(const Functor &rhs) {
        greeting = rhs.greeting;
        std::cout << "Copy-assigned\n";
        return *this;
    }

    virtual ~Functor() { std::cout << "Dtor\n"; }

    void operator()() { std::cout << "hey" << "\n"; }
};

// Wraps a temporary Functor in a heap-allocated std::function.
// The deduced return type requires C++14.
auto getFpp() {
    std::shared_ptr<std::function<void()>> fp =
        std::make_shared<std::function<void()>>(Functor{});
    (*fp)();
    return fp;
}

int main() {
    auto f = getFpp();
    (*f)();
}

It produces this output:

Ctor
Copy-Ctor
Copy-Ctor
Dtor
Dtor
hey
hey
Dtor

Exactly the same set of ctors/dtors would be called for a stack-allocated lambda object. (Here it calls Ctor for the stack allocation, then Copy-Ctor (plus a heap allocation) to construct the copy inside std::function, and another copy while make_shared heap-allocates and constructs the function.)
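The same kind of measurement can be made for a real lambda by capturing an object that counts its own copies and moves. This is a sketch with illustrative names (Tracer, copies_when_wrapping_lvalue_lambda); the exact numbers depend on the standard library and language version, so only the lower bound of one copy is something to rely on:

```cpp
#include <functional>

// Counts how often it is copy- or move-constructed.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

int copies_when_wrapping_lvalue_lambda() {
    Tracer tracer;
    auto lam = [tracer] { (void)tracer; };  // by-value capture copies tracer once

    // Reset so that only the work done by std::function is counted.
    Tracer::copies = 0;
    Tracer::moves = 0;

    std::function<void()> f(lam);  // must copy the closure out of the lvalue
    f();
    return Tracer::copies;
}
```

After the call, Tracer::moves shows how many extra relocations were done by moving instead of copying; the Functor above declares a copy constructor and therefore has no move constructor, which is why its log shows two copies.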