Does using heap memory (malloc/new) create a non-deterministic program?

I started developing software for real-time systems a few months ago in C for space applications, and also for microcontrollers with C++. There's a rule of thumb in such systems that one should never create heap objects (so no malloc/new), because it makes the program non-deterministic. I wasn't able to verify the correctness of this statement when people tell me that. So, is this a correct statement?

What confuses me is that, as far as I know, determinism means that running a program twice will lead to the exact same execution path. From my understanding, this is an issue in multithreaded systems, since running the same program multiple times could have the threads running in a different order every time.


The deal with real-time systems is that the program must strictly meet certain computation and memory restrictions regardless of the execution path taken (which may still vary considerably depending on input). So what does the use of generic dynamic memory allocation (such as malloc/new) mean in this context? It means that at some point the developer is not able to determine the exact memory consumption, and it would be impossible to tell whether the resulting program will be able to meet the requirements, both for memory and for computation power.

Nothing in the C11 standard or in n1570 says that malloc is deterministic (or that it is not); nor does other documentation such as malloc(3) on Linux. BTW, many malloc implementations are free software.

But malloc can (and does) fail, and its performance is not known (a typical call to malloc on my desktop would in practice take less than a microsecond, but I can imagine weird situations where it might take much longer, perhaps many milliseconds on a heavily loaded computer; read about thrashing). And my Linux desktop has ASLR (address space layout randomization), so running the same program twice gives different malloc-ed addresses (in the virtual address space of the process). BTW, here is a deterministic (under specific assumptions that you would need to spell out) but practically useless malloc implementation.
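
To make the failure point concrete, here is a minimal sketch (read_frame and FRAME_SIZE are invented names, not anything from this answer): every call site has to have a defined, bounded-time answer to "what if malloc returns NULL?", which is exactly the question that is hard to answer in hard real-time code.

    #include <stdlib.h>

    /* Hypothetical sketch: read_frame() and FRAME_SIZE are invented names. */
    #define FRAME_SIZE 1024

    int read_frame(void)
    {
        unsigned char *frame = malloc(FRAME_SIZE);
        if (frame == NULL) {
            /* malloc failed: in a hard real-time system you must already
             * know what the defined, bounded-time reaction is here. */
            return -1;
        }
        /* ... fill and process the frame ... */
        free(frame);
        return 0;
    }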

determinism means that running a program twice will lead to the exact same execution path

This is practically wrong in most embedded systems, because the physical environment is changing; for example, the software driving a rocket engine cannot expect that the thrust, or the drag, or the wind speed, etc... is exactly the same from one launch to the next one.

(so I am surprised that you believe or wish that real-time systems are deterministic; they never are! Perhaps you care about WCET, which is increasingly difficult to predict because of caches)

BTW some "real-time" or "embedded" systems are implementing their own malloc (or some variant of it). C++ programs can have their allocator-s, usable by standard containers. See also this and that, etc, etc.....

And high-level layers of embedded software (think of an autonomous automobile and its planning software) are certainly using heap allocation and perhaps even garbage collection techniques (some of which are "real-time"), but are generally not considered safety critical.

Yes, it is correct. For the kind of applications you mention, everything that can occur must be specified in detail. The program must handle the worst-case scenario according to the specification and set aside exactly that much memory, no more, no less. The situation "we don't know how many inputs we get" does not exist. The worst-case scenario is specified with fixed numbers.

Your program must be deterministic in a sense that it can handle everything up to the worst-case scenario.

The very purpose of the heap is to allow several unrelated applications to share RAM, as on a PC, where the number of programs/processes/threads running isn't deterministic. This scenario does not exist in a real-time system.

In addition, the heap is non-deterministic by its nature, as segments get added or removed over time.
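
As a minimal sketch of that approach (the names and the number 64 are invented; in practice the number would come from the requirements document), all storage is reserved up front for the specified worst case:

    /* MAX_TRACKED_OBJECTS and struct object are invented for illustration. */
    #define MAX_TRACKED_OBJECTS 64

    struct object {
        int   id;
        float x, y, z;
    };

    /* All storage is reserved up front; there is nothing left to fail at runtime. */
    static struct object objects[MAX_TRACKED_OBJECTS];
    static unsigned      object_count;

    int add_object(struct object o)
    {
        if (object_count >= MAX_TRACKED_OBJECTS)
            return -1;   /* exceeding the specified worst case is itself a spec violation */
        objects[object_count++] = o;
        return 0;
    }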

More info here: https://electronics.stackexchange.com/a/171581/6102

In the context of realtime systems, there is more to determinism than a repeatable "execution path". Another required property is that timing of key events is bounded. In hard realtime systems, an event that occurs outside its allowed time interval (either before the start of that interval, or after the end) represents a system failure.

In this context, use of dynamic memory allocation can cause non-determinism, particularly if the program has a varying pattern of allocating, deallocating, and reallocating. The timing of allocations, deallocations, and reallocations can vary over time, which makes the timing of the system as a whole unpredictable.
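
If you want to observe this variability for yourself on a desktop Linux machine (not on the target), a quick experiment like the following, which is my own sketch with arbitrary sizes and iteration counts, times repeated malloc/free pairs and reports the worst case seen:

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Quick desktop experiment (POSIX clock_gettime), not target code:
     * the cost of malloc is neither constant nor easy to bound. */
    int main(void)
    {
        long worst_ns = 0;
        for (int i = 0; i < 100000; ++i) {
            size_t size = (size_t)(rand() % 4096) + 1;
            struct timespec t0, t1;
            clock_gettime(CLOCK_MONOTONIC, &t0);
            void *p = malloc(size);
            clock_gettime(CLOCK_MONOTONIC, &t1);
            free(p);
            long ns = (t1.tv_sec - t0.tv_sec) * 1000000000L
                      + (t1.tv_nsec - t0.tv_nsec);
            if (ns > worst_ns)
                worst_ns = ns;
        }
        printf("worst observed malloc latency: %ld ns\n", worst_ns);
        return 0;
    }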

There is always a trade-off. The program's running environment and the tasks it performs should be the basis for deciding whether the heap should be used or not.

A heap object is convenient when you want to share data between multiple function calls: you just pass the pointer, since the heap is globally accessible. There are disadvantages as well: some function might free this memory while references to it still exist elsewhere.

If heap memory is not freed after its work is done and the program keeps allocating more, at some point the heap will run out of memory, which affects the deterministic character of the program.
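
A small hypothetical example of both pitfalls (the function and variable names are made up):

    #include <stdlib.h>
    #include <stdio.h>

    static int *shared_reading;                 /* second reference kept elsewhere */

    void producer(void)
    {
        shared_reading = malloc(sizeof *shared_reading);
        if (shared_reading)
            *shared_reading = 42;
    }

    void consumer_bug(void)
    {
        free(shared_reading);                   /* frees the memory ...              */
        /* ... but shared_reading still points there: any later use through it is
         * undefined behaviour (a dangling reference). */
    }

    void leak_bug(void)
    {
        for (int i = 0; i < 1000; ++i) {
            int *p = malloc(1024);
            if (!p) {
                puts("heap exhausted");         /* the failure mode described above */
                return;
            }
            /* p is never freed: each iteration leaks 1024 bytes */
        }
    }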

The comment, as stated, is incorrect.

Using a heap manager with non-deterministic behavior creates a program with non-deterministic behavior. But that is obvious.

Slightly less obvious is the existence of heap managers with deterministic behavior. Perhaps the best-known example is the pool allocator. It has an array of N*M bytes and an available[] mask of N bits. To allocate, it checks for the first available entry (bit test, O(N), with a deterministic upper bound). To deallocate, it sets the available bit (O(1)). malloc(X) rounds X up to the next biggest value of M to choose the right pool.

This might not be very efficient, especially if your choices of N and M are too high. And if you choose them too low, your program can fail. But the limits for N and M can be lower than they would be for an equivalent program without dynamic memory allocation.
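
Here is a minimal sketch of such a pool allocator, under some simplifying assumptions of my own: a single pool instead of several, and a byte per block instead of a packed bit mask, to keep it short. POOL_N, POOL_M and the function names are invented.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define POOL_N 32               /* number of blocks            */
    #define POOL_M 64               /* size of each block in bytes */

    static uint8_t pool[POOL_N * POOL_M];
    static uint8_t available[POOL_N];      /* 1 = free, 0 = in use */

    void pool_init(void)
    {
        memset(available, 1, sizeof available);
    }

    void *pool_alloc(size_t size)
    {
        if (size > POOL_M)
            return NULL;                       /* request does not fit a block    */
        for (int i = 0; i < POOL_N; ++i) {     /* O(N): deterministic upper bound */
            if (available[i]) {
                available[i] = 0;
                return &pool[(size_t)i * POOL_M];
            }
        }
        return NULL;                           /* pool exhausted */
    }

    void pool_free(void *p)
    {
        if (p == NULL)
            return;
        size_t offset = (size_t)((uint8_t *)p - pool);
        available[offset / POOL_M] = 1;        /* O(1) */
    }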

The problem with using the heap in hard real-time software is that heap allocations can fail. What do you do when you run out of heap?

You are talking about space applications. You have pretty hard no-fail requirements. You must have no possibility of leaking memory to the point where there is not enough left for at least the safe-mode code to run. You must not fall over. You must not throw exceptions that have no catch block. You probably don't have an OS with protected memory, so one crashing application can in theory take out everything.

You probably don't want to use the heap at all. The benefits don't outweigh the whole-program costs.

Non-deterministic normally means something else, but in this case the best reading is that they want the entire program behavior to be completely predictable.

Even if your heap allocator has repeatable behavior (the same sequence of allocation and free calls yield the same sequence of blocks, hence (hopefully) the same internal heap state), the state of the heap may vary drastically if the sequence of calls is changed, potentially leading to fragmentation that will cause memory allocation failures in an unpredictable way.

The reason heap allocation is frowned upon or downright forbidden in embedded systems, especially mission-critical systems such as aircraft or spacecraft guidance or life-support systems, is that there is no way to test all possible variations in the sequence of malloc/free calls that can happen in response to intrinsically asynchronous events.

The solution is for each handler to have its own memory set aside for its purpose; then it no longer matters (at least as far as memory use is concerned) in what order these handlers are invoked.
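
A hedged sketch of that idea, with invented handler names and buffer sizes: because each handler only ever touches its own statically reserved storage, the order in which the asynchronous events arrive cannot change how much memory any of them has.

    #include <stdint.h>
    #include <stddef.h>

    static uint8_t uart_rx_buf[256];     /* owned by the UART receive handler    */
    static size_t  uart_rx_len;

    static uint8_t sensor_frame[128];    /* owned by the sensor sampling handler */

    void uart_rx_handler(uint8_t byte)
    {
        if (uart_rx_len < sizeof uart_rx_buf)
            uart_rx_buf[uart_rx_len++] = byte;
        /* else: overflow policy defined by the spec (drop, flag error, ...) */
    }

    void sensor_handler(const uint8_t *raw, size_t n)
    {
        if (n > sizeof sensor_frame)
            n = sizeof sensor_frame;     /* clamp to the reserved worst case */
        for (size_t i = 0; i < n; ++i)
            sensor_frame[i] = raw[i];
    }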

tl;dr: It's not that dynamic memory allocation is inherently non-deterministic (as you defined it in terms of identical execution paths); it's that it generally makes your program unpredictable. Specifically, you can't predict whether the allocator might fail in the face of an arbitrary sequence of inputs.

You could have a non-deterministic allocator. This is actually common outside of your real-time world, where operating systems use things like address layout randomization. Of course, that would make your program non-deterministic.

But that's not an interesting case, so let's assume a perfectly deterministic allocator: the same sequence of allocations and deallocations will always result in the same blocks in the same locations and those allocations and deallocations will always have a bounded running time.

Now your program can be deterministic: the same set of inputs will lead to exactly the same execution path.

The problem is that if you're allocating and freeing memory in response to inputs, you can't predict whether an allocation will ever fail (and failure is not an option).

First, your program could leak memory. So if it needs to run indefinitely, eventually an allocation will fail.

But even if you can prove there are no leaks, you would need to know that there's never an input sequence that could demand more memory than is available.

But even if you can prove that the program will never need more memory than is available, the allocator might, depending on the sequence of allocations and frees, fragment memory and thus eventually be unable to find a contiguous block to satisfy an allocation, even though there is enough free memory overall.

It's very difficult to prove that there's no sequence of inputs that will lead to pathological fragmentation.
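
A toy simulation of that effect (invented for illustration, not a real allocator): a "heap" of 8 equal units managed first-fit, where a request fails purely because the free units are not contiguous, even though enough total memory is free.

    #include <stdio.h>
    #include <string.h>

    #define HEAP_UNITS 8
    static char used[HEAP_UNITS];            /* 1 = in use, 0 = free */

    static int alloc_units(int n)            /* first-fit; returns start index or -1 */
    {
        for (int start = 0; start + n <= HEAP_UNITS; ++start) {
            int free_run = 1;
            for (int i = start; i < start + n; ++i)
                if (used[i]) { free_run = 0; break; }
            if (free_run) {
                memset(used + start, 1, (size_t)n);
                return start;
            }
        }
        return -1;
    }

    static void free_units(int start, int n)
    {
        memset(used + start, 0, (size_t)n);
    }

    int main(void)
    {
        int a = alloc_units(2);              /* units 0-1 */
        int b = alloc_units(2);              /* units 2-3 */
        int c = alloc_units(2);              /* units 4-5 */
        int d = alloc_units(2);              /* units 6-7 */
        free_units(a, 2);
        free_units(c, 2);                    /* 4 of 8 units are now free ...       */
        (void)b; (void)d;
        /* ... but they are not contiguous, so a 3-unit request fails anyway: */
        printf("alloc_units(3) -> %d\n", alloc_units(3));   /* prints -1 */
        return 0;
    }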

You can design allocators to guarantee there won't be fragmentation (e.g., by allocating blocks of only one size), but that puts a substantial constraint on the caller and possibly increases the amount of memory required due to the waste. And the caller must still prove that there are no leaks and that there's a satisfiable upper bound on the total memory required regardless of the sequence of inputs. This burden is so high that it's actually simpler to design the system so that it doesn't use dynamic memory allocation.

Short answer

There are effects on the data values, or on their statistical uncertainty distributions, of e.g. a first- or second-level trigger scintillator device, that can derive from the non-reproducible amount of time you may have to wait for malloc/free.

The worst aspect is that these effects are related neither to the physical phenomenon nor to the hardware, but somehow to the state of the memory (and to its history).

Your goal, in that case, is to reconstruct the original sequence of events from data affected by those errors. The reconstructed/guessed sequence will be affected by errors too. This iteration will not always converge on a stable solution, and it is not guaranteed to be the correct one; your data are no longer independent... You risk a logical short circuit...

Longer answer

You stated "I wasn't able to verify the correctness of this statement when people tell me that".
I will try to give you a purely hypothetical situation/case study.

Let's imagine you deal with a CCD, or with some first- and second-level scintillator triggers, on a system that has to economize resources (you're in space).
The acquisition rate will be set so that the background sits at x% of MAXBINCOUNT.

  • There's a burst: you have a spike in the counts and an overflow in the bin counter.
    I want it all: you switch to the maximum acquisition rate and your buffer fills up.
    You go to free/allocate more memory while you finish the extra buffer.
    What will you do? (see the sketch after this list)

    1. Will you keep the counter active, risking the overflow (the second level will try to count the timing of the data packages properly), but in that case underestimate the counts for that period?
    2. Will you stop the counter, introducing a hole in the time series?

    Note that:

    • While waiting for the allocation you will lose the transient (or at least its beginning).
    • Whatever you do, it depends on the state of your memory and it is not reproducible.
  • Now suppose instead that the signal varies around the maxbincount at the maximum acquisition rate your hardware allows, and the event is longer than usual.
    You run out of space and ask for more... and meanwhile you run into the same problem as above.
    Overflow and systematic underestimation of the peak counts, or holes in the time series?
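
One common way to avoid ever waiting on the allocator in an acquisition path like this (a sketch under my own assumptions; BUF_SAMPLES and the function names are invented, not taken from this answer) is to pre-allocate two fixed buffers and swap between them, so the acquisition side never blocks on malloc:

    #include <stdint.h>
    #include <stddef.h>

    #define BUF_SAMPLES 4096

    static uint16_t buffers[2][BUF_SAMPLES];
    static size_t   fill;          /* samples written into the active buffer */
    static int      active;        /* which buffer acquisition writes into   */

    /* Called from the acquisition interrupt: bounded time, never allocates. */
    void acquire_sample(uint16_t sample)
    {
        if (fill == BUF_SAMPLES) {          /* active buffer is full: swap */
            active ^= 1;
            fill = 0;
            /* signal the background task that buffers[active ^ 1] is ready */
        }
        buffers[active][fill++] = sample;
    }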

Now let's move to the second level (it can happen on the first-level trigger too).

From your hardware, you receive more data than you can store or transmit.
You have to cluster the data in time or space (2x2, 4x4, ... 16x16 ... 256x256... pixel scaling...).

The uncertainty from the previous problem may affect the error distribution.
There are CCD settings for which the pixels at the border have counts close to the maxbincount (it depends on "where" you want to see better).
Now you can have a shower on your CCD, or a single big spot, with the same total number of counts but with a different statistical uncertainty (the part introduced by the waiting time)...

So, for example, where you were expecting a Lorentzian profile you can obtain its convolution with a Gaussian one (a Voigt profile), or, if the second is really dominant, a dirty Gaussian...

Consider the Integrity RTOS from GHS:

https://www.ghs.com/products/rtos/integrity.html

and LynxOS:

http://www.lynx.com/products/real-time-operating-systems/lynxos-178-rtos-for-do-178b-software-certification/

LynxOS and the Integrity RTOS are among the operating systems used in space applications, missiles, aircraft, etc., as many others are not approved or certified by the authorities (e.g., the FAA).

https://www.ghs.com/news/230210r.html

To meet the stringent criteria of space applications, the Integrity RTOS actually provides formal verification, i.e., mathematically proven logic, that its software behaves according to its specification.

These criteria are described here:

https://en.wikipedia.org/wiki/Integrity_(operating_system)

and here:

Green Hills Integrity Dynamic memory allocation


I am not a specialist in formal methods, but perhaps one of the requirements for this verification is to remove the uncertainty in the timing required for memory allocation. In an RTOS, every event is precisely planned, milliseconds apart from each other. And dynamic memory allocation always has a problem with the timing it requires.

Mathematically, you really need to prove that everything works, starting from the most fundamental assumptions about timing and the amount of memory.

And if you think of the alternative to heap memory: static memory. The address is fixed, the size allocated is fixed, the position in memory is fixed. So it is very easy to reason about memory sufficiency, reliability, availability, etc.
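
As a small sketch of why that reasoning becomes easy (the buffer names and the 32 KiB budget are invented): with purely static allocation the total footprint is a compile-time constant, so "will it fit?" can be answered by the compiler (or read off the linker map) instead of at runtime.

    #include <assert.h>   /* for static_assert (C11) */
    #include <stdint.h>

    #define RAM_BUDGET_BYTES (32u * 1024u)

    static uint8_t  telemetry_queue[8 * 1024];
    static uint8_t  command_buffer [4 * 1024];
    static int32_t  attitude_state [512];

    /* The whole memory question is settled at compile time. */
    static_assert(sizeof telemetry_queue + sizeof command_buffer
                  + sizeof attitude_state <= RAM_BUDGET_BYTES,
                  "static buffers exceed the RAM budget");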