Compiling Lambda Expressions: Scala vs. Java 8

By Tal Weiss —  January 16, 2014 — 18 Comments

lambda Compiling Lambda Expressions: Scala vs. Java 8

Lambda expressions have taken the programming world by storm in the last few years. Most modern languages have adopted them as a fundamental part of functional programming. JVM based languages such as Scala, Groovy and Clojure have integrated them as key part of the language. And now, Java 8 is (finally) joining in on the fun.

What’s interesting about Lambda expressions is that from the JVM’s perspective they’re completely invisible. It has no notion of what an anonymous function or a Lambda expression is. It only knows bytecode which is a strict OO specification. It’s up to the makers of the language and its compiler to work within these constraints to create newer, more advanced language elements.

We first encountered this when we were working on adding Scala support to Takipi and had to dive deep into the Scala compiler. With Java 8 right around the corner, I thought it would be interesting to see how the Scala and Java compilers implement Lambda expressions. The results were pretty surprising.

To get things going I took a simple Lambda expression that converts a list of Strings to a list of their lengths.

In Java -


List names = Arrays.asList("1", "2", "3");
Stream lengths = names.stream().map(name -> name.length());

In Scala -


val names = List("1", "2", "3")
val lengths = names.map(name => name.length)

Don’t be tricked its simplicity – behind the scenes some complex stuff is going on.

Let’s start with Scala

SCalaLam 1 Compiling Lambda Expressions: Scala vs. Java 8

The Code

I used javap to view the bytecode contents of the .class produced by the Scala compiler. Let’s look at the result bytecode (this is what the JVM will actually execute).


// this loads the names var into the stack (the JVM thinks
// of it as variable #2).
// It’s going to stay there for a while till it gets used
// by the <em>.map</em> function.

aload_2

Next, things gets more interesting – a new instance of a synthetic class generated by the compiler is created and initialized. This is the object that from the JVM’s perspective holds the Lambda method. It’s funny that while the Lambda is defined as an integral part of our method, in reality it lives completely outside of our class.

new myLambdas/Lambda1$$anonfun$1 //instantiate the Lambda object
dup //put it into the stack again

// finally, invoke the c’tor. Remember - it’s just a plain object
// from the JVM’s perspective.
invokespecial myLambdas/Lambda1$$anonfun$1/()V

// these two (long) lines loads the immutable.List CanBuildFrom factory
// which will create the new list. This factory pattern is a part of
// Scala’s collections architecture
getstatic scala/collection/immutable/List$/MODULE$
Lscala/collection/immutable/List$;
invokevirtual scala/collection/immutable/List$/canBuildFrom()
Lscala/collection/generic/CanBuildFrom;

// Now we have on the stack the Lambda object and the factory.
// The next phase is to call the .<em>map</em>() function.
// If you remember, we loaded the <em>names</em> var onto
// the stack in the beginning. Now it’ll gets used as the
// this for the .<em>map</em>() call, which will also
// accept the Lambda object and the factory to produce the
// new list of lengths.

invokevirtual scala/collection/immutable/List/map(Lscala/Function1;
Lscala/collection/generic/CanBuildFrom;)Ljava/lang/Object;

But hold on – what’s going on inside that Lambda object?

The Lambda object

The Lambda class is derived from scala.runtime.AbstractFunction1. Through this the map() function can polymorphically invoke the overridden apply() whose code is below -

// this code loads this and the target object on which to act,
// checks that it’s a String, and then calls another apply overload
// to do the actual work and boxes its return value.
aload_0 //load this
aload_1 //load the string arg
checkcast java/lang/String //make sure it’s a String - we got an Object

// call another apply() method in the synthetic class
invokevirtual myLambdas/Lambda1$$anonfun$1/apply(Ljava/lang/String;)I

//box the result
invokestatic scala/runtime/BoxesRunTime/boxToInteger(I)Ljava/lang/Integer
areturn

The actual code to perform the .length() operation is nested in that additional apply method which simply returns the length of the String as we expected.

Phew.. it was quite a long way to get here icon smile Compiling Lambda Expressions: Scala vs. Java 8

aload_1
invokevirtual java/lang/String/length()I
ireturn

For a line as simple as we write above, quite a lot of bytecode is generated – an additional class and a bunch of new methods. This of course isn’t meant to dissuade us from using Lambdas (we’re writing in Scala, not C). It just goes to show the complexity behind these constructs. Just think of the amount of code and complexity that goes into compiling complex chains of Lambda expressions!

I was quite expecting Java 8 to implement this the same way, but was quite surprised to see that they took another approach completely.

Java 8 – A new approach

JavaLam 1 Compiling Lambda Expressions: Scala vs. Java 8

The bytecode here is a bit shorter but does something rather surprising. It begins quite simply by loading the names var and invokes its .stream() method, but then it does something quite elegant. Instead of creating a new object that will wrap the Lambda function, it uses the new invokeDynamic instruction which was added in Java 7 to dynamically link this call site to the actual Lambda function.

aload_1 //load the names var

// call its stream() func
invokeinterface java/util/List.stream:()Ljava/util/stream/Stream;

//invokeDynamic magic!
invokedynamic #0:apply:()Ljava/util/function/Function;

//call the map() func
invokeinterface java/util/stream/Stream.map:
(Ljava/util/function/Function;)Ljava/util/stream/Stream;

InvokeDynamic magic. This JVM instruction was added in Java 7 to make the JVM less strict, and allows dynamic languages to bind symbols at run-time, vs. doing all the linkage statically when the code is compiled by the JVM.

Dynamic Linking. If you look at the actual invokedynamic instruction you’ll see there’s no reference of the actual Lambda function (called lambda$0). The answer lies in the way invokedynamic is designed (which in itself deserves a full post), but the short answer is the name and signature of the Lambda, which in our case is -

// a function named lamda$0 that gets a String and returns an Integer
lambdas/Lambda1.lambda$0:(Ljava/lang/String;)Ljava/lang/Integer;

are stored in an entry in a separate table in the .class to which the #0 parameter passed to the instruction points. This new table actually changed the structure of the bytecode specification for the first time after a good few years, requiring us to adapt Takipi’s error analysis engine to it as well.

The Lambda code

This is the code for the actual Lambda expression. It’s very cookie-cutter – simply load the String parameter, call length() and box the result. Notice it was compiled as a static function to avoid having to pass an additional this object to it like we saw in Scala.

aload_0
invokevirtual java/lang/String.length:()
invokestatic java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
areturn

This is another advantage of the invokedynamic approach, as it allows us to invoke the method in a way which is polymorphic from the .map() function’s perspective, but without having to allocate a wrapper object or invoke a virtual override method. Pretty cool!

Summary. It’s fascinating to see how Java, the most “strict” of modern languages is now using dynamic linkage to power its new Lambda expressions. It’s also an efficient approach, as no additional class loading and compilation is needed – the Lambda method is simply another private method in our class.

Java 8 has really done a very elegant job in using new technology introduced in Java 7 to implement Lambda expressions in what is a very straightforward way. It’s pleasant in a way to see that even a “venerable” lady such as Java can teach us all some new tricks icon smile Compiling Lambda Expressions: Scala vs. Java 8

This post is also available in German, Spanish, French and Portuguese

 

fredForBlog Compiling Lambda Expressions: Scala vs. Java 8

Takipi detects all your exceptions and errors and tells you why they happen. Even across multiple threads and machines. Installs in 1min. Less than 2% overhead – Try it free

Duke T shirt 268x300 Compiling Lambda Expressions: Scala vs. Java 8

Java developers - Deploy Takipi now and get a free T-shirt

Tal Weiss

Posts Twitter

Tal is the CEO of Takipi. Tal has been designing scalable, real-time Java and C++ applications for the past 15 years. He still enjoys analyzing a good bug though, and instrumenting code. In his free time Tal plays Jazz drums.
  • Stefan Wagner

    Off topic: The .js-Code on this site is blocking my browser (FF 26.0, 32bit, Linux), flooding me with notification boxes. Works on chromium.

    • http://www.takipi.com/ Tal Weiss

      Thanks. We’ll have a look and see if we can sort it out. Too many people here on Chrome :)

  • theromansky

    Another important aspect of lambdas is how they also create a closure inside them, somehow this point is missing from your writeup, can you explain which facility you mentioned above takes care of this? Or is the compiler producing additional code when some of the values inside are not provided by “applying”?

    • http://www.takipi.com/ Tal Weiss

      Good question. See the discussion here on Closures in J8- http://stackoverflow.com/questions/17204279/does-java-8-support-closures. If you post a specific construct you’re interested in, with regards to scoping of variables, post the snippet and I’ll update to reflect how it’s implemented from the compiler’s perspective.

      • theromansky

        Let me know if I’m close- In the case of Scala, the “new ..” construct probably creates the closure (via the JVM), in the case of Java, there is no code that does something similar and by introducing a scoped final variable it will generate additional code to “lift” it outside of the lambda.
        What about the example in your SO link (provided the int is final), wonder how that would look :)

  • Grzegorz Kossakowski

    Hi Tal,

    The Java 8′s lambda translation strategy is indeed very clever and it’s described in full detail here: http://cr.openjdk.java.net/~briangoetz/lambda/lambda-translation.html. However, I think it’s clever for other reasons than you mention in the blog post.

    From that document you can see that Java 8 does create an anonymous class that implements SAM interface for for each lambda separately. The difference between Scala and Java 8 is that Java 8 will create those anonymous classes at runtime using LambdaMetaFactory. The class is named “meta factory” for two reasons:

    1. LambdaMetaFactory is being called during program linkage (so at meta level).
    2. LambdaMetaFactory is “a factory of factories” which means, it creates classes that have constructors and those constructors are factories for each lambda they represent.

    Therefore, the invokedynamic instruction is there to get the code that will create an instance of a lambda. As mentioned above, that invokedynamic instruction will get expanded to a call to a constructor of anonymous class that LambdaMetaFactory creates at runtime.

    This means that at runtime the bytecode you get for Scala and Java lambdas looks very similar. It also means that Java 8 doesn’t have any more efficient way of implementing lambdas if you care about runtime performance. Java 8′s strategy does result in smaller JAR files because all anonymous classes are created on the fly.

    However, the key thing about invokedynamic is that it’s essentially a JVM-level macro that defers lambda translation strategy to LambdaMetaFactory which is a library class. If Java 9 or 10 gets more direct (and performant) support for lambdas that won’t require classes and object allocations then it can swap implementation of LambdaMetaFactory and all _existing_ code written for Java 8 will get performance boost. That is the brilliance of using invokedynamic in context of translating lambdas. We’ll have to wait for future versions of Java to experience those benefits, though.

    Grzegorz,
    Scala compiler hacker at Typesafe

    • http://www.takipi.com/ Tal Weiss

      Thanks for the great feedback. You’re totally right that the LambdaFactory which is defined by the Java 8 compiler as the class that will generate the call site for the invokeDynamic instruction will generate an internal class that will make the call into the Lamda function. Looking at the final bytecode result from the JVM’s perspective, both approaches are similiar in the sense that an internal class is generated behind the scenes whether at the compiler level as with Scala, or at the framework level as with Java 8. One noteworthy aspect of the Java 8 approach is that the bootstrap method is called once before the binding by the JVM is made to the target method. As a result there will only be one Lambda holder object which will make the call into the Lambda function, regardless of how many times its containing method is called, so there’s some value to using the invokeDynamic approach there. With the Scala approach you’re allocating a new Lambda holder object every time you enter the scope. Correct me if I’m wrong here. Regardless, it’s not a material difference.

      • Grzegorz Kossakowski

        When it comes to allocation of lambda objects that’s a bit different story. Both Java 8 and Scala will allocate lambda’s objects each time you enter given scope if lambda closes over some variable. You can see this in Java done here:

        http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/3e9b46280c16/src/share/classes/java/lang/invoke/InnerClassLambdaMetafactory.java (I can’t link to specific line, check buildCallSite() method).

        Not, you can see that buildCallSite checks length of the dynamic argument list passed to invokedynamic instruction. If it’s zero, it will allocate lambda object once and return ConstantCallSite that simply points at that instance. That’s an optimization that Scala compiler currently doesn’t perform.

        I believe we’ve neglected this optimization because short-lived objects are cheap. They are not free so it makes sense to perform this optimization. However, we can do that (and we’ll) in Scala using plain old Java’s Singleton pattern (implemented by Scala’s object construct). There’s no need for invokedynamic here.

        The most important point I wanted to make is that you don’t gain any performance benefits from using invokedynamic instruction for lambda translation. Invokedynamic is there so future Java version can swap different implementation of lambda representation (not class based). As things stand now, Java 8 and Scala have very comparable performance (modulo optimization for lambdas that do not close over the environment).

        • http://harmeetsingh13.blogspot.in/ Harmeet Singh

          Hello Grzegorz Kossakowski, you information is really wonder full and with the help of this information my several doubts are clear. But still i have some confusion about InvokeDynamic, how the InvokeDynamic work and what the changes are done in InvokeDynamic for Java 8. If you give some detail about this, i am really thank full to you.

          • Esko Luontola

            There were no changes to invokedynamic in Java 8.

            With invokestatic et al. the instruction itself tells directly that to what method the instruction should be linked. But the invokedynamic instruction tells it indirectly by telling what bootstrap method to call, and that boostrap method will then return the actual method to which the instruction should be linked.

            You can read more about invokedynamic here:

            http://docs.oracle.com/javase/7/docs/api/java/lang/invoke/package-summary.html

            http://docs.oracle.com/javase/7/docs/technotes/guides/vm/multiple-language-support.html

          • http://harmeetsingh13.blogspot.in/ Harmeet Singh

            @eskoluontola:disqus thanks for reply, but in java 8 they introduce lambdas for that, there is also some improvements are done for invokedynamic. In java 7 the invoke dynamic is not mature, there are many loopholes are in there, but in java 8 many changes are done. so i thought, in java 8 the invoke dynamic work is different.

          • Esko Luontola

            In http://openjdk.java.net/projects/jdk8/features there is listed JEP 160 which includes performance and quality improvements to the implementation of method handles, but the API is still the same as in Java 7. At http://openjdk.java.net/jeps/160 it even says that one of JEP 160′s goals is “Complete compatibility with the Java SE 7 specification for JSR 292″ and later it also says that “We expect to back-port this code to JDK 7.”

  • serguei_p

    I am not convinced Java necessary had to jump on this bandwagon.

    • LukasEder

      I’m sure, thousands of .NET developers said the same when C# 3.0 introduced all those features that hardly any .NET developer nowadays wants to miss out on…

  • http://www.fromdev.com/ Java Developer

    Thanks for excellent details. This makes me think of Java being more usable against scala in coming days. I like scala to some extent however I am a java developer at heart :)

  • Eko Suhariyadi

    Hi Tal, sorry this is OOT,
    I love the artworks used in this post (Duke lambda, and “Scala Guy” lambda).

    May i used it, to be printed on t-shirt?

    I live in Indonesia

  • Дмитрий Александров

    1) still really misunderstood why to use dynamicinvoke (why not just make it syntax shugar in compiler for anonymous classes)
    2) the great tool in .NET is ExpressionTrees which represent not invokable code but AST still absent (easy to implement at compiler level as syntax shugar)

  • Gal Levinsky

    Excellent post