State，ST，IORef 和 MVar 之间的差异

小开

最佳答案

The State Monad : a model of mutable state

The State monad is a purely functional environment for programs with state, with a simple API:

get
put

The State monad is commonly used when needing state in a single thread of control. It doesn't actually use mutable state in its implementation. Instead, the program is parameterized by the state value (i.e. the state is an additional parameter to all computations). The state only appears to be mutated in a single thread (and cannot be shared between threads).

The ST monad and STRefs

The ST monad is the restricted cousin of the IO monad.

It allows arbitrary mutable state, implemented as actual mutable memory on the machine. The API is made safe in side-effect-free programs, as the rank-2 type parameter prevents values that depend on mutable state from escaping local scope.

It thus allows for controlled mutability in otherwise pure programs.

Commonly used for mutable arrays and other data structures that are mutated, then frozen. It is also very efficient, since the mutable state is "hardware accelerated".

Primary API:

Control.Monad.ST
runST -- start a new memory-effect computation.
And STRefs: pointers to (local) mutable cells.
ST-based arrays (such as vector) are also common.

Think of it as the less dangerous sibling of the IO monad. Or IO, where you can only read and write to memory.

IORef : STRefs in IO

These are STRefs (see above) in the IO monad. They don't have the same safety guarantees as STRefs about locality.

MVars : IORefs with locks

Like STRefs or IORefs, but with a lock attached, for safe concurrent access from multiple threads. IORefs and STRefs are only safe in a multi-threaded setting when using atomicModifyIORef (a compare-and-swap atomic operation). MVars are a more general mechanism for safely sharing mutable state.

Generally, in Haskell, use MVars or TVars (STM-based mutable cells), over STRef or IORef.

小开

Ok, I'll start with IORef. IORef provides a value which is mutable in the IO monad. It's just a reference to some data, and like any reference, there are functions which allow you to change the data it refers to. In Haskell, all of those functions operate in IO. You can think of it like a database, file, or other external data store - you can get and set the data in it, but doing so requires going through IO. The reason IO is necessary at all is because Haskell is pure; the compiler needs a way to know which data the reference points to at any given time (read sigfpe's "You could have invented monads" blogpost).

MVars are basically the same thing as an IORef, except for two very important differences. MVar is a concurrency primitive, so it's designed for access from multiple threads. The second difference is that an MVar is a box which can be full or empty. So where an IORef Int always has an Int (or is bottom), an MVar Int may have an Int or it may be empty. If a thread tries to read a value from an empty MVar, it will block until the MVar gets filled (by another thread). Basically an MVar a is equivalent to an MVar0 with extra semantics that are useful for concurrency.

State is a monad which provides mutable state, not necessarily with IO. In fact, it's particularly useful for pure computations. If you have an algorithm that uses state but not IO, a State monad is often an elegant solution.

There is also a monad transformer version of State, StateT. This frequently gets used to hold program configuration data, or "game-world-state" types of state in applications.

ST is something slightly different. The main data structure in ST is the STRef, which is like an IORef but with a different monad. The ST monad uses type system trickery (the "state threads" the docs mention) to ensure that mutable data can't escape the monad; that is, when you run an ST computation you get a pure result. The reason ST is interesting is that it's a primitive monad like IO, allowing computations to perform low-level manipulations on bytearrays and pointers. This means that ST can provide a pure interface while using low-level operations on mutable data, meaning it's very fast. From the perspective of the program, it's as if the ST computation runs in a separate thread with thread-local storage.

小开

Others have done the core things, but to answer the direct question:

All this makes the line type ENV = IORef [(String, IORef LispVal)] confusing. Why the second IORef? What will break if I do type ENV = State [(String, LispVal)] instead?

Lisp is a functional language with mutable state and lexical scope. Imagine you've closed over a mutable variable. Now you've got a reference to this variable hanging around inside some other function -- say (in haskell-style pseudocode) (printIt, setIt) = let x = 5 in (\ () -> print x, \y -> set x y). You now have two functions -- one prints x, and one sets its value. When you evaluate printIt, you want to lookup the name of x in the initial environment in which printIt was defined, but you want to lookup the value that name is bound to in the environment in which printIt is called (after setIt may have been called any number of times).

There are ways besids the two IORefs to do this, but you certainly need more than the latter type you've proposed, which doesn't allow you to alter the values that names are bound to in a lexically-scoped fashion. Google the "funargs problem" for a whole lot of interesting prehistory.