How is pattern matching in Scala implemented at the bytecode level?

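Is it like a series of if (x instanceof Foo) constructs, or something else? What are its performance implications?

For example, given the following code (from Scala By Example, pages 46-48), what would the equivalent Java code for the eval method look like?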

abstract class Expr
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr


def eval(e: Expr): Int = e match {
  case Number(x) => x
  case Sum(l, r) => eval(l) + eval(r)
}

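P.S. I can read Java bytecode, so a bytecode representation would be good enough for me, but for other readers it would probably be better to see what it looks like as Java code.

P.P.S. Does the book Programming in Scala answer this and similar questions about how Scala is implemented? I have ordered the book, but it has not arrived yet.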

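The low level can be explored with a disassembler, but the short answer is that it's a chain of if/else tests where the predicate depends on the pattern: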

case Sum(l, r) // instanceof check, then fetch the two arguments and bind them to the variables l and r (but see below about custom extractors)
case "hello" // equality check
case _: Foo // instanceof check
case x => // assignment to a fresh variable
case _ => // do nothing; this is the final else of the if/else chain
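To make the mapping above concrete, here is a minimal sketch that pairs a match using each pattern kind with a hand-written if/else chain performing roughly the same tests. The names PatternSketch, describe and describeByHand are made up for illustration, and the hand-written version only approximates what the compiler emits; it is not actual compiler output.

```scala
object PatternSketch {
  abstract class Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  // One case per pattern kind from the list above.
  def describe(x: Any): String = x match {
    case Sum(l, r) => s"sum of $l and $r"
    case "hello"   => "greeting"
    case _: Number => "a number"
    case other     => s"something else: $other"
  }

  // Hand-written approximation (an assumption, not compiler output).
  def describeByHand(x: Any): String =
    if (x.isInstanceOf[Sum]) {          // type test
      val s = x.asInstanceOf[Sum]       // cast
      val l = s.e1                      // fetch and bind the fields
      val r = s.e2
      s"sum of $l and $r"
    } else if ("hello" == x) "greeting" // equality check
    else if (x.isInstanceOf[Number]) "a number"
    else {
      val other = x                     // bind to a fresh variable
      s"something else: $other"
    }

  def main(args: Array[String]): Unit = {
    val inputs: List[Any] = List(Sum(Number(1), Number(2)), "hello", Number(3), 42)
    inputs.foreach { i =>
      assert(describe(i) == describeByHand(i))
      println(describe(i))
    }
  }
}
```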

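There's much more you can do with patterns, such as or-patterns and combinations like "case Foo(45, x)", but generally those are just logical extensions of what I just described. Patterns can also have guards, which are additional constraints on the predicates. There are also cases where the compiler can optimize pattern matching; for example, when several cases overlap, it may coalesce some of the tests. Advanced patterns and optimization are an active area of work in the compiler, so don't be surprised if the bytecode improves substantially over these basic rules in current and future versions of Scala.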
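As an illustration of or-patterns, nested constants, and guards, here is a small sketch. The names GuardSketch, classify and isNegativeByHand are hypothetical, and the hand-written helper only suggests how a guard could act as an extra condition on the type test; the real generated code may differ.

```scala
object GuardSketch {
  abstract class Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  def classify(e: Expr): String = e match {
    case Number(0) | Sum(Number(0), Number(0)) => "zero"        // or-pattern with nested constants
    case Number(n) if n < 0                    => "negative"    // guard on a bound variable
    case Sum(Number(45), x)                    => s"45 plus $x" // constant combined with a binding
    case _                                     => "other"
  }

  // Rough hand-written equivalent of the guarded case only:
  // the guard becomes one more conjunct after the type test.
  def isNegativeByHand(e: Expr): Boolean =
    e.isInstanceOf[Number] && e.asInstanceOf[Number].n < 0

  def main(args: Array[String]): Unit = {
    println(classify(Number(0)))
    println(classify(Number(-3)))
    println(classify(Sum(Number(45), Number(1))))
    println(classify(Number(7)))
  }
}
```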

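In addition to all that, you can write your own custom extractors, either alongside or instead of the default ones Scala uses for case classes. If you do, the cost of the pattern match is the cost of whatever the extractor does. A good overview can be found at http://lamp.epfl.ch/~emir/written/MatchingObjectsWithPatterns-TR.pdf

James (above) said it best. However, if you're curious, looking at the disassembled bytecode is always a good exercise. You can also invoke scalac with the -print option, which prints your program with all Scala-specific features removed. It's basically Java in Scala's clothing. Here is the relevant scalac -print output for the code snippet you gave: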
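A custom extractor is just an object with an unapply method, so the match pays for whatever that method does. The following sketch uses a made-up extractor named Even, with a call counter added purely to make the cost visible; none of these names come from the paper linked above.

```scala
object ExtractorSketch {
  // Hypothetical extractor: matches even integers and yields half of the value.
  object Even {
    var calls = 0 // illustration only: counts how often the match invokes unapply

    def unapply(n: Int): Option[Int] = {
      calls += 1
      if (n % 2 == 0) Some(n / 2) else None
    }
  }

  def halve(n: Int): String = n match {
    case Even(h) => s"half is $h" // calls Even.unapply(n) and tests the resulting Option
    case _       => "odd"
  }

  def main(args: Array[String]): Unit = {
    println(halve(10))
    println(halve(7))
    println(s"unapply was called ${Even.calls} times")
  }
}
```

If unapply did something expensive, such as parsing a string or querying a map, each attempt to match that case would incur that cost.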

def eval(e: Expr): Int = {
  <synthetic> val temp10: Expr = e;
  if (temp10.$isInstanceOf[Number]())
    temp10.$asInstanceOf[Number]().n()
  else
    if (temp10.$isInstanceOf[Sum]())
      {
        <synthetic> val temp13: Sum = temp10.$asInstanceOf[Sum]();
        Main.this.eval(temp13.e1()).+(Main.this.eval(temp13.e2()))
      }
    else
      throw new MatchError(temp10)
};
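The -print output above is not valid source code as it stands, but it can be transcribed into ordinary Scala. The sketch below does that by hand (evalByHand and DesugaredEval are illustrative names) and checks that it agrees with the pattern-matching version on a sample expression.

```scala
object DesugaredEval {
  abstract class Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  // Original version from the question.
  def eval(e: Expr): Int = e match {
    case Number(x) => x
    case Sum(l, r) => eval(l) + eval(r)
  }

  // Hand transcription of the scalac -print output into plain Scala.
  def evalByHand(e: Expr): Int =
    if (e.isInstanceOf[Number])
      e.asInstanceOf[Number].n
    else if (e.isInstanceOf[Sum]) {
      val s = e.asInstanceOf[Sum]
      evalByHand(s.e1) + evalByHand(s.e2)
    } else
      throw new MatchError(e) // no case matched

  def main(args: Array[String]): Unit = {
    val expr = Sum(Number(1), Sum(Number(2), Number(3)))
    assert(eval(expr) == evalByHand(expr))
    println(evalByHand(expr)) // prints 6
  }
}
```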

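Since version 2.8, Scala has had the @switch annotation. Its purpose is to ensure that a pattern match is compiled into a tableswitch or lookupswitch instead of a series of conditional if statements.

To expand on @Zifre's comment: if you are reading this in the future, the Scala compiler has adopted new compilation strategies, and you want to know what they are, here is how to find out what it does.

Copy-paste your match code into a self-contained example file. Run scalac on that file, then run javap -v -c theClassName$.class

For example, I put the following into /tmp/question.scala: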
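A minimal sketch of the annotation in use follows; SwitchSketch and dayName are made-up names. The annotation is applied to the scrutinee, and the cases are integer literals so that a switch instruction is possible. If the match cannot be compiled to a switch, the compiler is expected to report it (as a warning or an error, depending on the compiler version).

```scala
import scala.annotation.switch

object SwitchSketch {
  // Literal integer cases are candidates for tableswitch/lookupswitch.
  def dayName(day: Int): String = (day: @switch) match {
    case 1 => "Mon"
    case 2 => "Tue"
    case 3 => "Wed"
    case _ => "other"
  }

  def main(args: Array[String]): Unit = {
    println(dayName(2))
    println(dayName(9))
  }
}
```

Running javap -c on the compiled class is one way to check which instruction was actually emitted.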

object question {
  abstract class Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  def eval(e: Expr): Int = e match {
    case Number(x) => x
    case Sum(l, r) => eval(l) + eval(r)
  }
}

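Then I ran scalac question.scala, which produced a bunch of *.class files. Poking around a bit, I found the match statement inside question$.class. The output of javap -c -v question$.class is given below.

Since we are looking for a conditional control flow construct, knowledge of the Java bytecode instruction set suggests that searching for "if" is a good place to start.

In two places we find a pair of consecutive lines of the form instanceof <something>; ifeq <somewhere>, which means: if the most recently computed value is not an instance of something, then jump to somewhere. (ifeq means "jump if zero", and instanceof yields 0 to represent false.)

If you follow the control flow, you'll see that it agrees with the answer given by @Jorge Ortiz: we do if (blah instanceof something) { ... } else if (blah instanceof somethingelse) { ... }

Here is the output of javap -c -v question$.class: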
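When neither instanceof test succeeds, the second ifeq jumps to the code that constructs and throws a scala.MatchError. A small sketch of that fall-through path is shown here; the extra subclass Neg and the helper failsWithMatchError are hypothetical additions, introduced only to give eval an input that no case covers.

```scala
object FallThroughSketch {
  abstract class Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr
  case class Neg(e: Expr) extends Expr // hypothetical: not handled by eval

  def eval(e: Expr): Int = e match {
    case Number(x) => x
    case Sum(l, r) => eval(l) + eval(r)
  }

  // Returns true when eval falls through every case and throws MatchError.
  def failsWithMatchError(e: Expr): Boolean =
    try { eval(e); false }
    catch { case _: MatchError => true }

  def main(args: Array[String]): Unit = {
    println(failsWithMatchError(Sum(Number(1), Number(2)))) // false
    println(failsWithMatchError(Neg(Number(1))))            // true
  }
}
```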

Classfile /tmp/question$.class
Last modified Nov 20, 2020; size 956 bytes
MD5 checksum cfc788d4c847dad0863a797d980ad2f3
Compiled from "question.scala"
public final class question$
minor version: 0
major version: 50
flags: (0x0031) ACC_PUBLIC, ACC_FINAL, ACC_SUPER
this_class: #2                          // question$
super_class: #4                         // java/lang/Object
interfaces: 0, fields: 1, methods: 3, attributes: 4
Constant pool:
#1 = Utf8               question$
#2 = Class              #1             // question$
#3 = Utf8               java/lang/Object
#4 = Class              #3             // java/lang/Object
#5 = Utf8               question.scala
#6 = Utf8               MODULE$
#7 = Utf8               Lquestion$;
#8 = Utf8               <clinit>
#9 = Utf8               ()V
#10 = Utf8               <init>
#11 = NameAndType        #10:#9         // "<init>":()V
#12 = Methodref          #2.#11         // question$."<init>":()V
#13 = Utf8               eval
#14 = Utf8               (Lquestion$Expr;)I
#15 = Utf8               question$Number
#16 = Class              #15            // question$Number
#17 = Utf8               n
#18 = Utf8               ()I
#19 = NameAndType        #17:#18        // n:()I
#20 = Methodref          #16.#19        // question$Number.n:()I
#21 = Utf8               question$Sum
#22 = Class              #21            // question$Sum
#23 = Utf8               e1
#24 = Utf8               ()Lquestion$Expr;
#25 = NameAndType        #23:#24        // e1:()Lquestion$Expr;
#26 = Methodref          #22.#25        // question$Sum.e1:()Lquestion$Expr;
#27 = Utf8               e2
#28 = NameAndType        #27:#24        // e2:()Lquestion$Expr;
#29 = Methodref          #22.#28        // question$Sum.e2:()Lquestion$Expr;
#30 = NameAndType        #13:#14        // eval:(Lquestion$Expr;)I
#31 = Methodref          #2.#30         // question$.eval:(Lquestion$Expr;)I
#32 = Utf8               scala/MatchError
#33 = Class              #32            // scala/MatchError
#34 = Utf8               (Ljava/lang/Object;)V
#35 = NameAndType        #10:#34        // "<init>":(Ljava/lang/Object;)V
#36 = Methodref          #33.#35        // scala/MatchError."<init>":(Ljava/lang/Object;)V
#37 = Utf8               this
#38 = Utf8               e
#39 = Utf8               Lquestion$Expr;
#40 = Utf8               x
#41 = Utf8               I
#42 = Utf8               l
#43 = Utf8               r
#44 = Utf8               question$Expr
#45 = Class              #44            // question$Expr
#46 = Methodref          #4.#11         // java/lang/Object."<init>":()V
#47 = NameAndType        #6:#7          // MODULE$:Lquestion$;
#48 = Fieldref           #2.#47         // question$.MODULE$:Lquestion$;
#49 = Utf8               question
#50 = Class              #49            // question
#51 = Utf8               Sum
#52 = Utf8               Expr
#53 = Utf8               Number
#54 = Utf8               Code
#55 = Utf8               LocalVariableTable
#56 = Utf8               LineNumberTable
#57 = Utf8               StackMapTable
#58 = Utf8               SourceFile
#59 = Utf8               InnerClasses
#60 = Utf8               ScalaInlineInfo
#61 = Utf8               Scala
{
public static final question$ MODULE$;
descriptor: Lquestion$;
flags: (0x0019) ACC_PUBLIC, ACC_STATIC, ACC_FINAL


public static {};
descriptor: ()V
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: new           #2                  // class question$
3: invokespecial #12                 // Method "<init>":()V
6: return


public int eval(question$Expr);
descriptor: (Lquestion$Expr;)I
flags: (0x0001) ACC_PUBLIC
Code:
stack=3, locals=9, args_size=2
0: aload_1
1: astore_2
2: aload_2
3: instanceof    #16                 // class question$Number
6: ifeq          27
9: aload_2
10: checkcast     #16                 // class question$Number
13: astore_3
14: aload_3
15: invokevirtual #20                 // Method question$Number.n:()I
18: istore        4
20: iload         4
22: istore        5
24: goto          69
27: aload_2
28: instanceof    #22                 // class question$Sum
31: ifeq          72
34: aload_2
35: checkcast     #22                 // class question$Sum
38: astore        6
40: aload         6
42: invokevirtual #26                 // Method question$Sum.e1:()Lquestion$Expr;
45: astore        7
47: aload         6
49: invokevirtual #29                 // Method question$Sum.e2:()Lquestion$Expr;
52: astore        8
54: aload_0
55: aload         7
57: invokevirtual #31                 // Method eval:(Lquestion$Expr;)I
60: aload_0
61: aload         8
63: invokevirtual #31                 // Method eval:(Lquestion$Expr;)I
66: iadd
67: istore        5
69: iload         5
71: ireturn
72: new           #33                 // class scala/MatchError
75: dup
76: aload_2
77: invokespecial #36                 // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
80: athrow
LocalVariableTable:
Start  Length  Slot  Name   Signature
0      81     0  this   Lquestion$;
0      81     1     e   Lquestion$Expr;
20      61     4     x   I
47      34     7     l   Lquestion$Expr;
54      27     8     r   Lquestion$Expr;
LineNumberTable:
line 6: 0
line 7: 2
line 8: 27
line 6: 69
StackMapTable: number_of_entries = 3
frame_type = 252 /* append */
offset_delta = 27
locals = [ class question$Expr ]
frame_type = 254 /* append */
offset_delta = 41
locals = [ top, top, int ]
frame_type = 248 /* chop */
offset_delta = 2
}
SourceFile: "question.scala"
InnerClasses:
public static #51= #22 of #50;          // Sum=class question$Sum of class question
public static abstract #52= #45 of #50; // Expr=class question$Expr of class question
public static #53= #16 of #50;          // Number=class question$Number of class question
ScalaInlineInfo: length = 0xE (unknown attribute)
01 01 00 02 00 0A 00 09 01 00 0D 00 0E 01
Scala: length = 0x0 (unknown attribute)