Roslyn SyntaxNodes 是否重用？

小开

最佳答案

更新: 这个问题是这是我2012年6月8日博客的主题。谢谢你的好问题！

Great question. We debated the issues you raise for a long, long time.

We would like to have a data structure that has the following characteristics:

一成不变。
树的形状。
从子节点对父节点的廉价访问。
可以从树中的节点映射到文本中的字符偏移量。
坚持。

我所说的 坚持不懈是指对文本缓冲区进行编辑时 重用树中的大部分现有节点的能力。因为节点是不可变的，所以重用它们没有障碍。为了提高性能，我们需要这样做; 我们不能在每次按下一个键时重新解析文件的大量片段。我们只需要对受到编辑影响的树的部分进行 relex 和 reparse。

现在，当您试图将所有这五个东西放入一个数据结构中时，您会立即遇到问题:

首先，如何构建一个节点？父元素和子元素都互相引用，并且是不可变的，那么哪一个先构建呢？

Supposing you manage to solve that problem: how do you make it persistent? You cannot re-use a child node in a different parent because that would involve telling the child that it has a new parent. But the child is immutable.

假设您设法解决了这个问题: 当您在编辑缓冲区中插入一个新字符时，every node that is mapped to a position after that point的绝对位置会发生变化。这使得制作持久性数据结构非常困难，因为任何编辑都可以改变大多数节点的范围！

But on the Roslyn team we routinely do impossible things. We actually do the impossible by keeping 二 parse trees. The "green" tree is immutable, persistent, has no parent references, is built "bottom-up", and every node tracks its 宽度 but not its 绝对位置. When an edit happens we rebuild only the portions of the green tree that were affected by the edit, which is typically about O(log n) of the total parse nodes in the tree.

“红色”树是一个不可变的正面，它是围绕绿色树构建的; 它是“自顶向下”构建的 随叫随到，每次编辑时都会丢弃它。它通过 当你从树顶下来的时候，根据需要制造它们计算父引用。当你下降时，它通过从宽度计算来制造绝对位置。

You, the user, only ever see the red tree; the green tree is an implementation detail. If you peer into the internal state of a parse node you'll in fact see that there is a reference to 另一个 parse node in there of a different type; that's the green tree node.

顺便说一句，这些被称为“红/绿树”，因为这些是我们在设计会议上用来绘制数据结构的白板标记颜色。这些颜色没有别的含义。

这种策略的好处是我们得到了所有这些好东西: 不变性、持久性、父引用等等。这样做的代价是，这个系统很复杂，如果“红色”外观变大，就会消耗大量内存。我们目前正在做实验，看看能否在不损失收益的情况下降低一些成本。