State,ST,IORef 和 MVar 之间的差异

我通过 在48小时内给自己写一个计划工作(我大约85小时) ,我已经得到了关于 添加变量和赋值的部分。在本章中有一个很大的概念跳跃,我希望它是在两个步骤中完成的,中间有一个良好的重构,而不是直接跳到最终的解决方案。不管怎样。

我已经迷失在一些不同的类,似乎服务于相同的目的: StateSTIORef,和 MVar。文中提到了前三个问题,而最后一个问题似乎是关于前三个问题的许多 StackOverflow 问题的最佳答案。它们似乎都在连续调用之间带有一种状态。

这些都是什么? 它们彼此之间有什么不同?


特别是这些句子没有意义:

相反,我们使用一个称为 状态线程的特性,让 Haskell 为我们管理聚合状态。这使我们可以像对待其他编程语言一样对待可变变量,使用函数来获取或设置变量。

还有

IORef 模块允许使用有状态变量 在 IO 单子内

所有这些都使得行 type ENV = IORef [(String, IORef LispVal)]令人困惑-为什么是第二个 IORef? 如果我改写为 type ENV = State [(String, LispVal)]将会破坏什么?

14269 次浏览

The State Monad : a model of mutable state

The State monad is a purely functional environment for programs with state, with a simple API:

  • get
  • put

Documentation in the mtl package.

The State monad is commonly used when needing state in a single thread of control. It doesn't actually use mutable state in its implementation. Instead, the program is parameterized by the state value (i.e. the state is an additional parameter to all computations). The state only appears to be mutated in a single thread (and cannot be shared between threads).

The ST monad and STRefs

The ST monad is the restricted cousin of the IO monad.

It allows arbitrary mutable state, implemented as actual mutable memory on the machine. The API is made safe in side-effect-free programs, as the rank-2 type parameter prevents values that depend on mutable state from escaping local scope.

It thus allows for controlled mutability in otherwise pure programs.

Commonly used for mutable arrays and other data structures that are mutated, then frozen. It is also very efficient, since the mutable state is "hardware accelerated".

Primary API:

  • Control.Monad.ST
  • runST -- start a new memory-effect computation.
  • And STRefs: pointers to (local) mutable cells.
  • ST-based arrays (such as vector) are also common.

Think of it as the less dangerous sibling of the IO monad. Or IO, where you can only read and write to memory.

IORef : STRefs in IO

These are STRefs (see above) in the IO monad. They don't have the same safety guarantees as STRefs about locality.

MVars : IORefs with locks

Like STRefs or IORefs, but with a lock attached, for safe concurrent access from multiple threads. IORefs and STRefs are only safe in a multi-threaded setting when using atomicModifyIORef (a compare-and-swap atomic operation). MVars are a more general mechanism for safely sharing mutable state.

Generally, in Haskell, use MVars or TVars (STM-based mutable cells), over STRef or IORef.

Ok, I'll start with IORef. IORef provides a value which is mutable in the IO monad. It's just a reference to some data, and like any reference, there are functions which allow you to change the data it refers to. In Haskell, all of those functions operate in IO. You can think of it like a database, file, or other external data store - you can get and set the data in it, but doing so requires going through IO. The reason IO is necessary at all is because Haskell is pure; the compiler needs a way to know which data the reference points to at any given time (read sigfpe's "You could have invented monads" blogpost).

MVars are basically the same thing as an IORef, except for two very important differences. MVar is a concurrency primitive, so it's designed for access from multiple threads. The second difference is that an MVar is a box which can be full or empty. So where an IORef Int always has an Int (or is bottom), an MVar Int may have an Int or it may be empty. If a thread tries to read a value from an empty MVar, it will block until the MVar gets filled (by another thread). Basically an MVar a is equivalent to an MVar0 with extra semantics that are useful for concurrency.

State is a monad which provides mutable state, not necessarily with IO. In fact, it's particularly useful for pure computations. If you have an algorithm that uses state but not IO, a State monad is often an elegant solution.

There is also a monad transformer version of State, StateT. This frequently gets used to hold program configuration data, or "game-world-state" types of state in applications.

ST is something slightly different. The main data structure in ST is the STRef, which is like an IORef but with a different monad. The ST monad uses type system trickery (the "state threads" the docs mention) to ensure that mutable data can't escape the monad; that is, when you run an ST computation you get a pure result. The reason ST is interesting is that it's a primitive monad like IO, allowing computations to perform low-level manipulations on bytearrays and pointers. This means that ST can provide a pure interface while using low-level operations on mutable data, meaning it's very fast. From the perspective of the program, it's as if the ST computation runs in a separate thread with thread-local storage.

Others have done the core things, but to answer the direct question:

All this makes the line type ENV = IORef [(String, IORef LispVal)] confusing. Why the second IORef? What will break if I do type ENV = State [(String, LispVal)] instead?

Lisp is a functional language with mutable state and lexical scope. Imagine you've closed over a mutable variable. Now you've got a reference to this variable hanging around inside some other function -- say (in haskell-style pseudocode) (printIt, setIt) = let x = 5 in (\ () -> print x, \y -> set x y). You now have two functions -- one prints x, and one sets its value. When you evaluate printIt, you want to lookup the name of x in the initial environment in which printIt was defined, but you want to lookup the value that name is bound to in the environment in which printIt is called (after setIt may have been called any number of times).

There are ways besids the two IORefs to do this, but you certainly need more than the latter type you've proposed, which doesn't allow you to alter the values that names are bound to in a lexically-scoped fashion. Google the "funargs problem" for a whole lot of interesting prehistory.