What is the motivation for Scala assignment evaluating to Unit rather than the value assigned?

A common pattern in I/O programming is to do things like this:

while ((bytesRead = in.read(buffer)) != -1) { ...

But this is not possible in Scala, because...

bytesRead = in.read(buffer)

...returns Unit, not the new value of bytesRead.

It seems like an interesting thing to leave out of a functional language. I am wondering why it was done this way?

I'd guess this is in order to keep the program / the language free of side effects.

What you describe is the intentional use of a side effect which in the general case is considered a bad thing.

It is not the best style to use an assignment as a boolean expression. You do two things at the same time, which often leads to errors. Scala's restriction also rules out the accidental use of "=" instead of "==".

I'm not privy to inside information on the actual reasons, but my suspicion is very simple. Scala makes side-effectful loops awkward to use so that programmers will naturally prefer for-comprehensions.

It does this in many ways. For instance, you don't have a for loop where you declare and mutate a variable. You can't (easily) mutate state on a while loop at the same time you test the condition, which means you often have to repeat the mutation just before it, and at the end of it. Variables declared inside a while block are not visible from the while test condition, which makes do { ... } while (...) much less useful. And so on.

Workaround:

while ({bytesRead = in.read(buffer); bytesRead != -1}) { ...

For whatever it is worth.
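For illustration, here is a minimal, self-contained sketch of that workaround in a complete copy loop, next to a variant that avoids the var by pulling reads from an iterator. The object name CopyDemo, both method names, the 4-byte buffer, and the in-memory streams are assumptions made for the example, not something from the question.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}

// Hypothetical demo object; names and buffer size are illustrative only.
object CopyDemo {
  // The workaround: a block expression as the loop condition.
  // The block performs the assignment, then yields the Boolean test.
  def copyWithBlock(in: InputStream, out: OutputStream): Long = {
    val buffer = new Array[Byte](4)
    var bytesRead = 0
    var total = 0L
    while ({ bytesRead = in.read(buffer); bytesRead != -1 }) {
      out.write(buffer, 0, bytesRead)
      total += bytesRead
    }
    total
  }

  // An alternative without a mutable counter: read lazily until -1 appears.
  def copyWithIterator(in: InputStream, out: OutputStream): Long = {
    val buffer = new Array[Byte](4)
    Iterator
      .continually(in.read(buffer))
      .takeWhile(_ != -1)
      .map { n => out.write(buffer, 0, n); n.toLong }
      .sum
  }

  def main(args: Array[String]): Unit = {
    val data = "hello, scala".getBytes("UTF-8")
    val out1 = new ByteArrayOutputStream()
    val out2 = new ByteArrayOutputStream()
    println(copyWithBlock(new ByteArrayInputStream(data), out1))
    println(copyWithIterator(new ByteArrayInputStream(data), out2))
    println(out1.toString("UTF-8") == out2.toString("UTF-8"))
  }
}
```

Both versions still rely on the side effect of in.read filling the buffer; the iterator form only hides the loop variable.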

As an alternate explanation, perhaps Martin Odersky had to face a few very ugly bugs deriving from such usage, and decided to outlaw it from his language.

EDIT

David Pollack has answered with some actual facts. Martin Odersky himself commented on that answer, which lends credence to the performance argument Pollack describes.

This happened as part of Scala having a more "formally correct" type system. Formally-speaking, assignment is a purely side-effecting statement and therefore should return Unit. This does have some nice consequences; for example:

class MyBean {
  private var internalState: String = _

  def state = internalState

  def state_=(state: String) = internalState = state
}

The state_= method returns Unit (as would be expected for a setter) precisely because assignment returns Unit.
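A small sketch may make this concrete. It reuses the MyBean shape from above, with explicit type annotations and an empty-string initializer added as assumptions so that it compiles the same way on Scala 2 and Scala 3; the enclosing object name is hypothetical.

```scala
// Hypothetical wrapper object for the demonstration.
object AssignmentIsUnit {
  class MyBean {
    private var internalState: String = ""

    def state: String = internalState

    // The body is an assignment, so the result type is Unit.
    def state_=(state: String): Unit = internalState = state
  }

  def main(args: Array[String]): Unit = {
    var x = 0
    // The block's last expression is an assignment, so the block has type Unit.
    val result: Unit = { x = 5 }
    println(x)
    println(result == ())

    val bean = new MyBean
    // Sugar for bean.state_=("ready"); this expression is also of type Unit.
    bean.state = "ready"
    println(bean.state)
  }
}
```

Because the assignment yields Unit, a condition such as (bytesRead = in.read(buffer)) != -1 ends up comparing a Unit value with an Int, which is why the C-style idiom from the question cannot work as written.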

I agree that for C-style patterns like copying a stream or similar, this particular design decision can be a bit troublesome. However, it's actually relatively unproblematic in general and really contributes to the overall consistency of the type system.

I advocated for having assignments return the value assigned rather than Unit. Martin and I went back and forth on it, but his argument was that putting a value on the stack just to pop it off 95% of the time was a waste of bytecodes and would have a negative impact on performance.

Perhaps this is due to the command-query separation principle?

CQS tends to be popular at the intersection of OO and functional programming styles, as it creates an obvious distinction between object methods that do or do not have side-effects (i.e., that alter the object). Applying CQS to variable assignments is taking it further than usual, but the same idea applies.

A short illustration of why CQS is useful: Consider a hypothetical hybrid F/OO language with a List class that has methods Sort, Append, First, and Length. In imperative OO style, one might want to write a function like this:

func foo(x):
    var list = new List(4, -2, 3, 1)
    list.Append(x)
    list.Sort()
    # list now holds a sorted, five-element list
    var smallest = list.First()
    return smallest + list.Length()

Whereas in more functional style, one would more likely write something like this:

func bar(x):
    var list = new List(4, -2, 3, 1)
    var smallest = list.Append(x).Sort().First()
    # list still holds an unsorted, four-element list
    return smallest + list.Length()

These seem to be trying to do the same thing, but obviously one of the two is incorrect, and without knowing more about the behavior of the methods, we can't tell which one.

Using CQS, however, we would insist that if Append and Sort alter the list, they must return the unit type, thus preventing us from creating bugs by using the second form when we shouldn't. The presence of side effects therefore also becomes implicit in the method signature.
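As a rough Scala rendering of the hypothetical example above: IntList below is an invented class (not a standard library type) whose commands return Unit, and bar uses the immutable standard List, where appending and sorting return new lists. All names are assumptions for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object CqsDemo {
  // Hypothetical mutable list that follows CQS:
  // commands mutate and return Unit, queries return values and do not mutate.
  final class IntList(initial: Int*) {
    private val items = ArrayBuffer.empty[Int]
    items ++= initial

    def append(x: Int): Unit = items += x // command
    def sort(): Unit = {                  // command
      val sorted = items.sorted
      items.clear()
      items ++= sorted
    }
    def first: Int = items.head           // query
    def length: Int = items.length        // query
  }

  // Imperative style: the commands cannot be chained, because
  // list.append(x).sort() would not compile (Unit has no sort method).
  def foo(x: Int): Int = {
    val list = new IntList(4, -2, 3, 1)
    list.append(x)
    list.sort()
    // list now holds a sorted, five-element list
    list.first + list.length
  }

  // Functional style with the immutable List: chaining is fine,
  // and the original list is left untouched.
  def bar(x: Int): Int = {
    val list = List(4, -2, 3, 1)
    val smallest = (list :+ x).sorted.head
    // list still holds an unsorted, four-element list
    smallest + list.length
  }

  def main(args: Array[String]): Unit = {
    println(foo(0))
    println(bar(0))
  }
}
```

With this split, mixing the two styles by accident shows up as a compile error rather than as a wrong length at run time.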

By the way: I find the initial while-trick stupid, even in Java. Why not something like this?

for (int bytesRead = in.read(buffer); bytesRead != -1; bytesRead = in.read(buffer)) {
    // do something
}

Granted, the assignment appears twice, but at least bytesRead is in the scope it belongs to, and I'm not playing with funny assignment tricks...

You can have a workaround for this as long as you have a reference type for indirection. In a naïve implementation, you can use the following for arbitrary types.

case class Ref[T](var value: T) {
  def := (newval: => T)(pred: T => Boolean): Boolean = {
    this.value = newval
    pred(this.value)
  }
}

Then, under the constraint that you’ll have to use ref.value to access the reference afterwards, you can write your while predicate as

val bytesRead = Ref(0) // maybe there is a way to get rid of this line

while ((bytesRead := in.read(buffer)) (_ != -1)) { // ...
  println(bytesRead.value)
}

and you can do the checking against bytesRead in a more implicit manner without having to type it.