Why are async state machines classes (and not structs) in Roslyn?

Let's consider this very simple async method:

static async Task myMethodAsync()
{
    await Task.Delay(500);
}

When I compile this with VS2013 (the pre-Roslyn compiler), the generated state machine is a struct.

private struct <myMethodAsync>d__0 : IAsyncStateMachine
{
    ...
    void IAsyncStateMachine.MoveNext()
    {
        ...
    }
}

When I compile it with VS2015 (Roslyn), the generated code looks like this:

private sealed class <myMethodAsync>d__1 : IAsyncStateMachine
{
    ...
    void IAsyncStateMachine.MoveNext()
    {
        ...
    }
}

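As you can see, Roslyn generates a class (not a struct). If I remember correctly, the first implementation of async/await support in the old compiler (CTP 2012) also generated classes, and it was later changed to structs for performance reasons. (In some cases you can completely avoid boxing and heap allocation...) (See this)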

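Does anyone know why this was changed again in Roslyn? (I don't have any problem with it; I know the change is transparent and doesn't alter the behaviour of any code. I'm just curious.)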

Edit:

The answer from @Damien_The_Unbeliever (and the source code :) ) explains everything, IMHO. The Roslyn behaviour described above only applies to debug builds (and it is needed because of the CLR limitation mentioned in the comment). In Release builds Roslyn also generates a struct (with all the benefits of that). So this seems to be a very clever way to support both Edit and Continue and better performance in production. Interesting stuff, thanks to everyone who participated!


I didn't have any foreknowledge of this, but since Roslyn is open-source these days, we can go hunting through the code for an explanation.

And here, on line 60 of the AsyncRewriter, we find:

// The CLR doesn't support adding fields to structs, so in order to enable EnC in an async method we need to generate a class.
var typeKind = compilationState.Compilation.Options.EnableEditAndContinue ? TypeKind.Class : TypeKind.Struct;

So, whilst there's some appeal to using structs, allowing Edit and Continue to work within async methods was evidently judged the bigger win whenever EnC is enabled. When it is not enabled, the compiler still generates a struct.
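If you want to check which kind of type the compiler produced for a particular build configuration, one option is to ask reflection instead of decompiling. The sketch below is only an illustration: the class name `StateMachineProbe` and its helper methods are made up for this example, and it relies on the compiler marking the async method with `AsyncStateMachineAttribute`, whose `StateMachineType` property points at the generated type. The result depends on how the assembly was compiled, so no particular output is claimed here.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical helper class, not part of Roslyn or the BCL.
static class StateMachineProbe
{
    // The same trivial async method as in the question.
    static async Task myMethodAsync()
    {
        await Task.Delay(500);
    }

    // Returns the compiler-generated state machine type for myMethodAsync.
    public static Type GetStateMachineType()
    {
        MethodInfo method = typeof(StateMachineProbe).GetMethod(
            nameof(myMethodAsync), BindingFlags.NonPublic | BindingFlags.Static);

        // The compiler attaches this attribute to every async method it rewrites.
        var attribute = method.GetCustomAttribute<AsyncStateMachineAttribute>();
        return attribute.StateMachineType;
    }

    // Describes the generated type as either a struct or a class.
    public static string Describe()
    {
        Type type = GetStateMachineType();
        return type.Name + " is a " + (type.IsValueType ? "struct" : "class");
    }
}
```

Printing `StateMachineProbe.Describe()` from a Debug build and again from a Release build should make the difference described above visible.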

It's hard to give a definitive answer for something like this (unless someone from the compiler team drops in :)), but there are a few points you can consider:

The performance "bonus" of structs is always a tradeoff. Basically, you get the following:

  • Value semantics
  • Possible stack (maybe even register?) allocation
  • Avoiding indirection

What does this mean in the await case? Well, actually... nothing. There's only a very short time period during which the state machine is on the stack - remember, await effectively does a return, so the method stack dies; the state machine must be preserved somewhere, and that "somewhere" is definitely on the heap. Stack lifetime doesn't fit asynchronous code well :)

Apart from this, the state machine violates some good guidelines for defining structs:

  • structs should be at most 16 bytes in size - the state machine contains two pointers, which on their own fill the 16-byte limit neatly on 64-bit. Apart from that, there's the state itself, so it goes over the "limit". This is not a big deal, since it's quite likely only ever passed by reference, but note how that doesn't quite fit the use case for structs - it's a struct that's basically a reference type.
  • structs should be immutable - well, this probably doesn't need much of a comment. It's a state machine. Again, this is not a big deal, since the struct is auto-generated code and private, but...
  • structs should logically represent a single value. Definitely not the case here, but that already kind of follows from having a mutable state in the first place.
  • It shouldn't be boxed frequently - not a problem here, since we're using generics everywhere. The state is ultimately somewhere on the heap, but at least it's not being boxed (automatically). Again, the fact that it's only used internally makes this pretty much void.
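To make the "passed by reference" and "not boxed, because of generics" points a little more concrete, here is a minimal sketch. `CounterStateMachine` and the three `Step...` helpers are hypothetical stand-ins written for this example; they are not what the compiler actually generates. The only real API used is the `IAsyncStateMachine` interface.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical hand-written stand-in for a generated state machine struct.
struct CounterStateMachine : IAsyncStateMachine
{
    public int State;

    // Each call advances the (mutable) state by one.
    public void MoveNext() => State++;

    public void SetStateMachine(IAsyncStateMachine stateMachine) { }
}

static class StructStateMachineDemo
{
    // Generic + ref: no boxing, and the caller's struct is mutated in place.
    static void StepByRef<T>(ref T sm) where T : IAsyncStateMachine => sm.MoveNext();

    // Generic by value: no boxing, but only a copy is mutated.
    static void StepByValue<T>(T sm) where T : IAsyncStateMachine => sm.MoveNext();

    // Interface parameter: the struct is boxed and only the boxed copy is mutated.
    static void StepBoxed(IAsyncStateMachine sm) => sm.MoveNext();

    // Returns the caller-visible State after each kind of call.
    public static int[] Run()
    {
        var sm = new CounterStateMachine();

        StepByRef(ref sm);
        int afterRef = sm.State;     // the original was advanced

        StepByValue(sm);
        int afterValue = sm.State;   // unchanged: a copy was advanced

        StepBoxed(sm);
        int afterBoxed = sm.State;   // unchanged: a boxed copy was advanced

        return new[] { afterRef, afterValue, afterBoxed };
    }
}
```

Only the `ref` call changes the caller's copy, which is why a mutable struct like this only works when it is consistently handed around by reference, and at that point it behaves much like a reference type anyway.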

And of course, all this is in a case where there are no closures. When you have locals (or fields) that have to survive across the awaits, the state is further inflated, limiting the usefulness of using a struct.

Given all this, the class approach is definitely cleaner, and I wouldn't expect any noticeable performance increase from using a struct instead. All of the objects involved have similar lifetime, so the only way to improve memory performance would be to make all of them structs (store in some buffer, for example) - which is impossible in the general case, of course. And most cases where you'd use await in the first place (that is, some asynchronous I/O work) already involve other classes - for example, data buffers, strings... It's rather unlikely you would await something that simply returns 42 without doing any heap allocations.

In the end, I'd say the only place where you'd see a real performance difference would be benchmarks. And optimizing for benchmarks is a silly idea, to say the least...