“似是而非”原则到底是什么?

正如标题所说:

“似是而非”原则到底是什么?

一个典型的答案是:

允许不改变程序可观察行为的任何和所有代码转换的规则

我们不时地从某些实现中获得行为,这些行为归因于这条规则,但很多时候是错误的。

那么,这条规则到底是什么?标准没有明确提到这条规则作为一个章节或段落,那么究竟什么属于这条规则的职权范围?

在我看来,这似乎是一个没有被标准详细定义的灰色区域。谁能详细说明一下细节,引用一下标准中的参考文献?

注意: 将这两种语言都标记为 C 和 C + + ,因为它们都与这两种语言相关。

11401 次浏览

What is the "as-if" rule?

The "as-if" rule basically defines what transformations an implementation is allowed to perform on a legal C++ program. In short, all transformations that do not affect a program's "observable behavior" (see below for a precise definition) are allowed.

The goal is to give implementations freedom to perform optimizations as long as the behavior of the program remains compliant with the semantics specified by the C++ Standard in terms of an abstract machine.


Where does the Standard introduce this rule?

The C++11 Standard introduces the "as-if" rule in Paragraph 1.9/1:

The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.

Also, an explanatory footnote adds:

This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance, an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the observable behavior of the program are produced.


What does the rule mandate exactly?

Paragraph 1.9/5 further specifies:

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

It is worth stressing that this constraint applies when "executing a well-formed program" only, and that the possible outcomes of executing a program which contains undefined behavior are unconstrained. This is made explicit in Paragraph 1.9/4 as well:

Certain other operations are described in this International Standard as undefined (for example, the effect of attempting to modify a const object). [ Note: This International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]

Finally, concerning the definition of "observable behavior", Paragraph 1.9/8 goes as follows:

The least requirements on a conforming implementation are:

— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

— At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

— The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.

These collectively are referred to as the observable behavior of the program. [ Note: More stringent correspondences between abstract and actual semantics may be defined by each implementation. —end note ]


Are there situations where this rule does not apply?

To the best of my knowledge, the only exception to the "as-if" rule is copy/move elision, which is allowed even though the copy constructor, move constructor, or destructor of a class have side effects. The exact conditions for this are specified in Paragraph 12.8/31:

When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the constructor selected for the copy/move operation and/or the destructor for the object have side effects. [...]

In C11 the rule is never called by that name. However C, just like C++, defines the behaviour in terms of abstract machine. The as-if rule is in C11 5.1.2.3p4 and p6:

  1. In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

  2. [...]

  3. The least requirements on a conforming implementation are:

    • Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
    • At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
    • The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.

     

    This is the observable behavior of the program.

In C, C++, Ada, Java, SML... in any programming language well specified by describing the (usually many possible, non-deterministic) behavior(s) of a program (exposed to series of interactions on I/O ports), there is no distinct as-if rule.

An example of distinct rule is the one that says that a division by zero raises an exception (Ada, Caml) or a null dereference raises an exception (Java). You could change the rule to specify something else and you would end up with a different language (that some people would rather call a "dialect"(*). A distinct rule is there to specify some distinct uses of a programming language like a distinct grammatical rule cover some syntax constructs.

(*) A dialect according to some linguists is a language with an "army". in that context, that could mean a programming language without a committee and a specific industry of compiler editors.

The as-if rule is not a distinct rule; it doesn't cover any program in particular and is not even a rule that could be discussed, removed, or altered in any way: the so called "rule" simply reiterates that program semantics is defined, and can only be portably (universally) defined, in term of the visible interactions of an execution of the program with the "external" world.

The external world can be I/O interfaces (stdio), a GUI, even an interactive interpreter that output the resulting value of a pure applicative language. In C and C++ is includes the (vaguely specified) accesses to volatile objects, which is another way of saying that some objects at given point must be represented in memory strictly according to the ABI (Application Binary Interface) without ever mentioning the ABI explicitly.

The definition of what is a trace of execution, also called the visible or observable behavior defines what is meant by "as-if rule". The as-if rule tries to explain it, but by doing so, it confuses people more than it clarifies things as it gives the expression of being an additional semantic rule giving more leeway to the implementation.

Summary:

  • The so called "as-if rule" does not relax any constraints on implementations.
  • You cannot remove the as-if rule in any programming language specified in term of visible behavior (execution traces composed for interaction with the external world) to get a distinct dialect.
  • You cannot add the as-if rule to any programming language not specified in term of visible behavior.