纯函数: “无副作用”是否意味着“总是相同的输出,给予相同的输入”?

将函数定义为 pure的两个条件如下:

  1. 没有副作用(即只允许对本地范围进行更改)
  2. 给定相同的输入,总是返回相同的输出

如果第一个条件总是为真,那么第二个条件有没有不为真的时候呢?

也就是说,真的只有在第一个条件下才有必要吗?

8834 次浏览

Here are a few counterexamples that do not change the outer scope but are still considered impure:

  • function a() { return Date.now(); }
  • function b() { return window.globalMutableVar; }
  • function c() { return document.getElementById("myInput").value; }
  • function d() { return Math.random(); } (which admittedly does change the PRNG, but is not considered observable)

Accessing non-constant non-local variables is enough to be able to violate the second condition.

I always think of the two conditions for purity as complementary:

  • the result evaluation must not have effects on side state
  • the evaluation result must not be affected by side state

The term side effect only refers to the first, the function modifying the non-local state. However, sometimes read operations are considered as side effects as well: when they are operations and involve writing as well, even if their primary purpose is to access a value. Examples for that are generating a pseudo-random number that modifies the generator's internal state, reading from an input stream that advances the read position, or reading from an external sensor that involves a "take measurement" command.

There could be a source of randomness from outside the system. Suppose that part of your calculation includes the room temperature. Then executing the function will yield different results each time depending on the random external element of room temperature. The state is not changed by executing the program.

All I can think of, anyway.

It seems to me that the second condition you have described is a weaker constraint than the first.

Let me give you an example, suppose you have a function to add one that also logs to the console:

function addOneAndLog(x) {
console.log(x);
return x + 1;
}

The second condition you supplied is satisfied: this function always returns the same output when given the same input. It is, however, not a pure function because it includes the side effect of logging to the console.

A pure function is, strictly speaking, a function that satisfies the property of referential transparency. That is the property that we can replace a function application with the value it produces without changing the behaviour of the program.

Suppose we have a function that simply adds:

function addOne(x) {
return x + 1;
}

We can replace addOne(5) with 6 anywhere in our program and nothing will change.

By contrast, we cannot replace addOneAndLog(x) with the value 6 anywhere in our program without changing behaviour because the first expression results in something being written to the console whereas the second one does not.

We consider any of this extra behaviour that addOneAndLog(x) performs besides returning output as a side-effect.

The "normal" way of phrasing what a pure function is, is in terms of referential transparency. A function is pure if it is referentially transparent.

Referential Transparency, roughly, means that you can replace the call to the function with its return value or vice versa at any point in the program, without changing the meaning of the program.

So, for example, if C's printf were referentially transparent, these two programs should have the same meaning:

printf("Hello");

and

5;

and all of the following programs should have the same meaning:

5 + 5;


printf("Hello") + 5;


printf("Hello") + printf("Hello");

Because printf returns the number of characters written, in this case 5.

It gets even more obvious with void functions. If I have a function void foo, then

foo(bar, baz, quux);

should be the same as

;

I.e. since foo returns nothing, I should be able to replace it with nothing without changing the meaning of the program.

It is clear, then, that neither printf nor foo are referentially transparent, and thus neither of them are pure. In fact, a void function can never be referentially transparent, unless it is a no-op.

I find this definition much easier to handle as the one you gave. It also allows you to apply it at any granularity you want: you can apply it to individual expressions, to functions, to entire programs. It allows you, for example, to talk about a function like this:

func fib(n):
return memo[n] if memo.has_key?(n)
return 1 if n <= 1
return memo[n] = fib(n-1) + fib(n-2)

We can analyze the expressions that make up the function and easily conclude that they are not referentially transparent and thus not pure, since they use a mutable data structure, namely the memo array. However, we can also look at the function and can see that it is referentially transparent and thus pure. This is sometimes called external purity, i.e. a function that appears pure to the outside world, but is implemented impure internally.

Such functions are still useful, because while impurity infects everything around it, the external pure interface builds a kind of "purity barrier", where the impurity only infects the three lines of the function, but does not leak out into the rest of the program. These three lines are much easier to analyze for correctness than the entire program.

If the first condition is always true, are there any times the second condition is not true?

Yes

Consider simple code snippet below

public int Sum(int a, int b) {
Random rnd = new Random();
return rnd.Next(1, 10);
}

This code will return random output for same given set of inputs - however it does not have any side effect.

The overall effect of both the points #1 & #2 you mentioned when combined together means : At any point of time if function Sum with same i/p is replaced with its result in a program, overall meaning of program does not change. This is nothing but Referential transparency.

Problem with FP definitions is that they are very artificial. Each evaluation/calculation has side-effects on the evaluator. It's theoretically true. A deny of this shows only that FP apologists ignore philosophy and logic: an "evaluation" means changing of the state of some intelligent environment (machine, brain, etc). This is the nature of the evaluation process. No changes - no "calculi". The effect can be very visible: heating the CPU or its failure, shutting down the motherboard in case of overheating, and so on.

When you talk about referential transparency, you should understand that information about such transparency is available for human as a creator of the whole system and holder of semantical information and may be not available for the compiler. For example, a function can read some external resource and it will have IO monad in its signature but it will return the same value all the time (for example, the result of current_year > 0). The compiler does not know that function will return always the same result, so the function is impure but has referentially transparent property and can be substituted with True constant.

So, to avoid such inaccuracy, we should distinguish mathematical functions and "functions" in programming languages. Functions in Haskell are always impure and definition of purity related to them is always very conditionally: they are running on real hardware with real side-effects and physical properties, which is wrong for mathematical functions. This means that the example with "printf" function is totally incorrect.

But not all mathematical functions are pure too: each function which has t (time) as a parameter may be impure: t holds all effects and stochastic nature of the function: in common case you have input signal and have not idea about actual values, it can be even a noise.