Why is C++ template use not recommended in a space/radiated environment?

By reading this question, I understood, for instance, why dynamic allocation or exceptions are not recommended in environments where radiation is high, like in space or in a nuclear power plant. Concerning templates, I don't see why. Could you explain it to me?

Considering this answer, it says that it is quite safe to use.

Note: I'm not talking about complex standard library stuff, but purpose-made custom templates.

12878 次浏览

Notice that space-compatible (radiation-hardened, aeronautics compliant) computing devices are very expensive (including to launch in space, since their weight exceeds kilograms), and that a single space mission costs perhaps hundred million € or US$. Losing the mission because of software or computer concerns has generally a prohibitive cost so is unacceptable and justifies costly development methods and procedures that you won't even dream using for developing your mobile phone applet, and using probabilistic reasoning and engineering approaches is recommended, since cosmic rays are still somehow an "unusual" event. From a high-level point of view, a cosmic ray and the bit flip it produces can be considered as noise in some abstract form of signal or of input. You could look at that "random bit-flip" problem as a signal-to-noise ratio problem, then randomized algorithms may provide a useful conceptual framework (notably at the meta level, that is when analyzing your safety-critical source code or compiled binary, but also, at critical system run-time, in some sophisticated kernel or thread scheduler), with an information theory viewpoint.

Why C++ template use is not recommended in space/radiated environment?

That recommendation is a generalization, to C++, of MISRA C coding rules and of Embedded C++ rules, and of DO178C recommendations, and it is not related to radiation, but to embedded systems. Because of radiation and vibration constraints, the embedded hardware of any space rocket computer has to be very small (e.g. for economical and energy-consumption reasons, it is more -in computer power- a Raspberry Pi-like system than a big x86 server system). Space hardened chips cost 1000x much as their civilian counterparts. And computing the WCET on space-embedded computers is still a technical challenge (e.g. because of CPU cache related issues). Hence, heap allocation is frowned upon in safety-critical embedded software-intensive systems (how would you handle out-of-memory conditions in these? Or how would you prove that you have enough RAM for MISRA C0 real run time cases?)

Remember that in the safety-critical software world, you not only somehow "guarantee" or "promise", and certainly assess (often with some clever probabilistic reasoning), the quality of your own software, but also of all the software tools used to build it (in particular: your compiler and your linker; Boeing or Airbus won't change their version of GCC cross-compiler used to compile their flight control software without prior written approval from e.g. FAA or DGAC). Most of your software tools need to be somehow approved or certified.

Be aware that, in practice, most C++ (but certainly not all) templates internally use the heap. And standard C++ containers certainly do. Writing templates which never use the heap is a difficult exercise. If you are capable of that, you can use templates safely (assuming you do trust your C++ compiler and its template expansion machinery, which is the trickiest part of the C++ front-end of most recent C++ compilers, such as GCC or Clang).

I guess that for similar (toolset reliability) reasons, it is frowned upon to use many source code generation tools (doing some kind of metaprogramming, e.g. emitting C++ or C code). Observe, for example, that if you use bison (or RPCGEN) in some safety critical software (compiled by make and gcc), you need to assess (and perhaps exhaustively test) not only gcc and make, but also bison. This is an engineering reason, not a scientific one. Notice that some embedded systems may use make0, in particular to cleverly deal with make1 input signals (perhaps even random bit flips due to rare-enough cosmic rays). Proving, testing, or analyzing (or just assessing) such random-based algorithms is a quite difficult topic.

Look also into Frama-Clang and CompCert and observe the following:

  • C++11 (or following) is an horribly complex programming language. It has no complete formal semantics. The people expert enough in C++ are only a few dozens worldwide (probably, most of them are in its standard committee). I am capable of coding in C++, but not of explaining all the subtle corner cases of move semantics, or of the C++ memory model. Also, C++ requires in practice many optimizations to be used efficiently.

  • It is very difficult to make an error-free C++ compiler, in particular because C++ practically requires tricky optimizations, and because of the complexity of the C++ specification. But current ones (like recent GCC or Clang) are in practice quite good, and they have few (but still some) residual compiler bugs. There is no CompCert++ for C++ yet, and making one requires several millions of € or US$ (but if you can collect such an amount of money, please contact me by email, e.g. to basile.starynkevitch@cea.fr, my work email). And the space software industry is extremely conservative.

  • It is difficult to make a good C or C++ heap memory allocator. Coding one is a matter of trade-offs. As a joke, consider adapting this C heap allocator to C++.

  • proving safety properties (in particular, lack of race conditions or undefined behavior such as buffer overflow at run-time) of template-related C++ code is still, in 2Q2019, slightly ahead of the state of the art of static program analysis of C++ code. My draft Bismon technical report (it is a draft H2020 deliverable, so please skip pages for European bureaucrats) has several pages explaining this in more details. Be aware of Rice's theorem.

  • a whole system C++ embedded software test could require a rocket launch (a la Ariane 5 test flight 501, or at least complex and heavy experimentation in lab). It is very expensive. Even testing, on Earth, a Mars rover takes a lot of money.

Think of it: you are coding some safety-critical embedded software (e.g. for train braking, autonomous vehicles, autonomous drones, big oil platform or oil refinery, missiles, etc...). You naively use some C++ standard container, e.g. some std::map<std::string,long>. What should happen for out of memory conditions? How do you "prove", or at least "convince", to the people working in organizations funding a 100M€ space rocket, that your embedded software (including the compiler used to build it) is good enough? A decade-year old rule was to forbid any kind of dynamic heap allocation.

I'm not talking about complex standard library stuff but purposed-made custom templates.

Even these are difficult to prove, or more generally to assess their quality (and you'll probably want to use your own allocator inside them). In space, the code space is a strong constraint. So you would compile with, for example, g++ -Os -Wall or clang++ -Os -Wall. But how did you prove -or simply test- all the subtle optimizations done by -Os (and these are specific to your version of GCC or of Clang)? Your space funding organization will ask you that, since any run-time bug in embedded C++ space software can crash the mission (read again about Ariane 5 first flight failure - coded in some dialect of Ada which had at that time a "better" and "safer" type system than C++17 today), but don't laugh too much at Europeans. Boeing 737 MAX with its MACS is a similar mess).


My personal recommendation (but please don't take it too seriously. In 2019 it is more a pun than anything else) would be to consider coding your space embedded software in Rust. Because it is slightly safer than C++. Of course, you'll have to spend 5 to 10 M€ (or MUS$) in 5 or 7 years to get a fine Rust compiler, suitable for space computers (again, please contact me professionally, if you are capable of spending that much on a free software Compcert/Rust like compiler). But that is just a matter of software engineering and software project managements (read both the Mythical Man-Month and Bullshit jobs for more, be also aware of Dilbert principle: it applies as much to space software industry, or embedded compiler industry, as to anything else).

My strong and personal opinion is that the European Commission should fund (e.g. through Horizon Europe) a free software CompCert++ (or even better, a Compcert/Rust) like project (and such a project would need more than 5 years and more than 5 top-class, PhD researchers). But, at the age of 60, I sadly know it is not going to happen (because the E.C. ideology -mostly inspired by German policies for obvious reasons- is still the illusion of the End of History, so H2020 and Horizon Europe are, in practice, mostly a way to implement tax optimizations for corporations in Europe through European tax havens), and that after several private discussions with several members of CompCert project. I sadly expect DARPA or NASA to be much more likely to fund some future CompCert/Rust project (than the E.C. funding it).


NB. The European avionics industry (mostly Airbus) is using much more formal methods approaches that the North American one (Boeing). Hence some (not all) unit tests are avoided (since replaced by formal proofs of source code, perhaps with tools like Frama-C or Astrée - neither have been certified for C++, only for a subset of C forbidding C dynamic memory allocation and several other features of C). And this is permitted by DO-178C (not by the predecessor DO-178B) and approved by the French regulator, DGAC (and I guess by other European regulators).

Also notice that many SIGPLAN conferences are indirectly related to the OP's question.

The argumentation against the usage of templates in safety code is that they are considered to increase the complexity of your code without real benefit. This argumentation is valid if you have bad tooling and a classic idea of safety. Take the following example:

template<class T>  fun(T t){
do_some_thing(t);
}

In the classic way to specify a safety system you have to provide a complete description of each and every function and structure of your code. That means you are not allowed to have any code without specification. That means you have to give a complete description of the functionality of the template in its general form. For obvious reasons that is not possible. That is BTW the same reason why function-like macros are also forbidden. If you change the idea in a way that you describe all actual instantiations of this template, you overcome this limitation, but you need proper tooling to prove that you really described all of them.

The second problem is that one:

fun(b);

This line is not a self-contained line. You need to look up the type of b to know which function is actually called. Proper tooling which understands templates helps here. But in this case it is true that it makes the code harder to check manually.

This statement about templates being a cause of vulnerability seems completely surrealistic to me. For two main reasons:

  • templates are "compiled away", i.e. instantiated and code-generated like any other function/member, and there is no behavior specific to them. Just as if they never existed;

  • no construction in any language is neither safe or vulnerable; if an ionizing particule changes a single bit of memory, be it in code or in data, anything is possible (from no noticeable problem occurring up to processor crash). The way to shield a system against this is by adding hardware memory error detection/correction capabilities. Not by modifying the code !