为什么 JVM 仍然不支持尾部调用优化？

小开

Java is the least functional language you could possibly imagine (well, OK, perhaps not!) but this would be a great advantage for JVM languages, like Scala, which are.

My observations are that making the JVM a platform for other languages has never seemed to be at the top of the priority list for Sun and I guess, now for Oracle.

小开

Perhaps you know this already, but the feature is not as trivial as it may sound since the Java language actually exposes the stack trace to the programmer.

Consider the following program:

public class Test {


public static String f() {
String s = Math.random() > .5 ? f() : g();
return s;
}


public static String g() {
if (Math.random() > .9) {
StackTraceElement[] ste = new Throwable().getStackTrace();
return ste[ste.length / 2].getMethodName();
}
return f();
}


public static void main(String[] args) {
System.out.println(f());
}
}

Even though this has a "tail-call" it may not be optimized. (If it is optimized, it still requires book-keeping of the entire call-stack since the semantics of the program relies on it.)

Basically, this means that it's hard to support this while still being backward compatible.

小开

最佳答案

Diagnosing Java Code: Improving the Performance of Your Java Code (alt) explains why the JVM does not support tail-call optimization.

But although it is well known how to automatically transform a tail-recursive function into a simple loop, the Java specification doesn't require that this transformation be made. Presumably, one reason it is not a requirement is that, in general, the transformation can't be made statically in an object-oriented language. Instead, the transformation from tail-recursive function to simple loop must be done dynamically by a JIT compiler.

It then gives an example of Java code that won't transform.

So, as the example in Listing 3 shows, we cannot expect static compilers to perform transformation of tail recursion on Java code while preserving the semantics of the language. Instead, we must rely on dynamic compilation by the JIT. Depending on the JVM, the JIT may or may not do this.

Then it gives a test you can use to figure out if your JIT does this.

Naturally, since this is an IBM paper, it includes a plug:

I ran this program with a couple of the Java SDKs, and the results were surprising. Running on Sun's Hotspot JVM for version 1.3 reveals that Hotspot doesn't perform the transformation. At default settings, the stack space is exhausted in less than a second on my machine. On the other hand, IBM's JVM for version 1.3 purrs along without a problem, indicating that it does transform the code in this way.

小开

One reason I've seen in the past for not implementing TCO (and it being seen as difficult) in Java is that the permission model in the JVM is stack-sensitive and thus tail-calls must handle the security aspects.

I believe this was shown to not be an obstacle by Clements and Felleisen [1] [2] and I'm pretty sure the MLVM patch mentioned in the question deals with it as well.

I realize this does not answer your question; just adding interesting information.

小开

It's not a Java problem... it's one of the JVM. Java is just the grand-grand-ol'-pa of JVM languages.

Making a TCO is jumping to the next Stack Frame while deleting the current one, in between the running program and current stack calling vars should be somewhere else... ;)

The best way would be to have a new special call opcode for a jump-call in other frames that makes the stuff. They already did that for the virtual call. Not really a problem in interpretation, JIT perhaps rises other problems, and the JVM is bloated enough.

In Java or other languages, as there no proper TCO, the other way is trampolining, but it adds a lot of code. Or using specific exceptions, but it messes a lot. And it is in your code, but not in others' libraries...

Ah! If Rich Hickey added a (recur...) stuff (it's not a function), it's because of lack of real TCO, he doesn't want people to think there was one. He could very easily make an automatic TCO in an internal tail call. It also helps to detect bad tail calls that are not in tail position.

There's also a (trampoline...) stuff for external TCO, but it's messy (as a trampoline), and is quite not used except in awful stack situations.

But yes, a lot of VM manage TCO. I've heard that CLR will. I've even seen a paying JVM that manages it (a time ago, don't remember...)

Trampolining example in js: https://codeinjavascript.com/2020/06/13/tail-call-optimization-tco/

An old Thesis on TCO on HotSpot VM with Frame overwrite: https://ssw.jku.at/Research/Papers/Schwaighofer09Master/schwaighofer09master.pdf