Proper stack and heap usage in C++?

I've been programming for a while but It's been mostly Java and C#. I've never actually had to manage memory on my own. I recently began programming in C++ and I'm a little confused as to when I should store things on the stack and when to store them on the heap.

My understanding is that variables which are accessed very frequently should be stored on the stack and objects, rarely used variables, and large data structures should all be stored on the heap. Is this correct or am I incorrect?

57976 次浏览

No, the difference between stack and heap isn't performance. It's lifespan: any local variable inside a function (anything you do not malloc() or new) lives on the stack. It goes away when you return from the function. If you want something to live longer than the function that declared it, you must allocate it on the heap.

class Thingy;


Thingy* foo( )
{
int a; // this int lives on the stack
Thingy B; // this thingy lives on the stack and will be deleted when we return from foo
Thingy *pointerToB = &B; // this points to an address on the stack
Thingy *pointerToC = new Thingy(); // this makes a Thingy on the heap.
// pointerToC contains its address.


// this is safe: C lives on the heap and outlives foo().
// Whoever you pass this to must remember to delete it!
return pointerToC;


// this is NOT SAFE: B lives on the stack and will be deleted when foo() returns.
// whoever uses this returned pointer will probably cause a crash!
return pointerToB;
}

For a clearer understanding of what the stack is, come at it from the other end -- rather than try to understand what the stack does in terms of a high level language, look up "call stack" and "calling convention" and see what the machine really does when you call a function. Computer memory is just a series of addresses; "heap" and "stack" are inventions of the compiler.

The choice of whether to allocate on the heap or on the stack is one that is made for you, depending on how your variable is allocated. If you allocate something dynamically, using a "new" call, you are allocating from the heap. If you allocate something as a global variable, or as a parameter in a function it is allocated on the stack.

You also would store an item on the heap if it needs to be used outside the scope of the function in which it is created. One idiom used with stack objects is called RAII - this involves using the stack based object as a wrapper for a resource, when the object is destroyed, the resource would be cleaned up. Stack based objects are easier to keep track of when you might be throwing exceptions - you don't need to concern yourself with deleting a heap based object in an exception handler. This is why raw pointers are not normally used in modern C++, you would use a smart pointer which can be a stack based wrapper for a raw pointer to a heap based object.

To add to the other answers, it can also be about performance, at least a little bit. Not that you should worry about this unless it's relevant for you, but:

Allocating in the heap requires finding a tracking a block of memory, which is not a constant-time operation (and takes some cycles and overhead). This can get slower as memory becomes fragmented, and/or you're getting close to using 100% of your address space. On the other hand, stack allocations are constant-time, basically "free" operations.

Another thing to consider (again, really only important if it becomes an issue) is that typically the stack size is fixed, and can be much lower than the heap size. So if you're allocating large objects or many small objects, you probably want to use the heap; if you run out of stack space, the runtime will throw the site titular exception. Not usually a big deal, but another thing to consider.

I would say:

Store it on the stack, if you CAN.

Store it on the heap, if you NEED TO.

Therefore, prefer the stack to the heap. Some possible reasons that you can't store something on the stack are:

  • It's too big - on multithreaded programs on 32-bit OS, the stack has a small and fixed (at thread-creation time at least) size (typically just a few megs. This is so that you can create lots of threads without exhausting address space. For 64-bit programs, or single threaded (Linux anyway) programs, this is not a major issue. Under 32-bit Linux, single threaded programs usually use dynamic stacks which can keep growing until they reach the top of the heap.
  • You need to access it outside the scope of the original stack frame - this is really the main reason.

It is possible, with sensible compilers, to allocate non-fixed size objects on the heap (usually arrays whose size is not known at compile time).

In my opinion there are two deciding factors

1) Scope of variable
2) Performance.

I would prefer to use stack in most cases but if you need access to variable outside scope you can use heap.

To enhance performance while using heaps you can also use the functionality to create heap block and that can help in gaining performance rather than allocating each variable in different memory location.

It's more subtle than the other answers suggest. There is no absolute divide between data on the stack and data on the heap based on how you declare it. For example:

std::vector<int> v(10);

In the body of a function, that declares a vector (dynamic array) of ten integers on the stack. But the storage managed by the vector is not on the stack.

Ah, but (the other answers suggest) the lifetime of that storage is bounded by the lifetime of the vector itself, which here is stack-based, so it makes no difference how it's implemented - we can only treat it as a stack-based object with value semantics.

Not so. Suppose the function was:

void GetSomeNumbers(std::vector<int> &result)
{
std::vector<int> v(10);


// fill v with numbers


result.swap(v);
}

So anything with a swap function (and any complex value type should have one) can serve as a kind of rebindable reference to some heap data, under a system which guarantees a single owner of that data.

Therefore the modern C++ approach is to never store the address of heap data in naked local pointer variables. All heap allocations must be hidden inside classes.

If you do that, you can think of all variables in your program as if they were simple value types, and forget about the heap altogether (except when writing a new value-like wrapper class for some heap data, which ought to be unusual).

You merely have to retain one special bit of knowledge to help you optimise: where possible, instead of assigning one variable to another like this:

a = b;

swap them like this:

a.swap(b);

because it's much faster and it doesn't throw exceptions. The only requirement is that you don't need b to continue to hold the same value (it's going to get a's value instead, which would be trashed in a = b).

The downside is that this approach forces you to return values from functions via output parameters instead of the actual return value. But they're fixing that in C++0x with rvalue references.

In the most complicated situations of all, you would take this idea to the general extreme and use a smart pointer class such as shared_ptr which is already in tr1. (Although I'd argue that if you seem to need it, you've possibly moved outside Standard C++'s sweet spot of applicability.)

For completeness, you may read Miro Samek's article about the problems of using the heap in the context of embedded software.

A Heap of Problems

Stack is more efficient, and easier to managed scoped data.

But heap should be used for anything larger than a few KB (it's easy in C++, just create a boost::scoped_ptr on the stack to hold a pointer to the allocated memory).

Consider a recursive algorithm that keeps calling into itself. It's Very hard to limit and or guess the total stack usage! Whereas on the heap, the allocator (malloc() or new) can indicate out-of-memory by returning NULL or throw ing.

Source: Linux Kernel whose stack is no larger than 8KB!

probably this has been answered quite well. I would like to point you to the below series of articles to have a deeper understanding of low level details. Alex Darby has a series of articles, where he walks you through with a debugger. Here is Part 3 about the Stack. http://www.altdevblogaday.com/2011/12/14/c-c-low-level-curriculum-part-3-the-stack/