Uploaded image for project: 'Scala Programming Language'
  1. Scala Programming Language
  2. SI-10049

Substantial slowdown in groupBy (all collections)

    Details

    • Type: Bug
    • Status: CLOSED
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Scala 2.12.0
    • Fix Version/s: Scala 2.12.1
    • Component/s: Library (Misc)
    • Environment:

      Intel Core7 4790K CPU

      openjdk version "1.8.0_111"
      OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-2ubuntu0.16.04.2-b14)
      OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)

      Description

      groupBy is substantially slower in 2.12, with throughput down to as little as 60% of 2.11 on my machine. This seems to affect all collections (Vector, List, and Array tested).

      Manually inlining the groupBy code does not seem to help. Indeed, benchmarking of HashMap.getOrElseUpdate shows a considerable decrease in throughput, possibly enough to account for the entire effect. AnyRefMap does not appear to be affected.

      Manually inlining the get, match, and update calls from getOrElseUpdate does appear to solve the issue (at least mostly).

      The old bytecode in scala.collection.mutable.MapLike$class was:

        public static java.lang.Object getOrElseUpdate(scala.collection.mutable.MapLike, java.lang.Object, scala.Function0);
          Code:
             0: aload_0
             1: aload_1
             2: invokeinterface #35,  2           // InterfaceMethod scala/collection/mutable/MapLike.get:(Ljava/lang/Object;)Lscala/Option;
             7: astore_3
             8: aload_3
             9: instanceof    #119                // class scala/Some
            12: ifeq          31
            15: aload_3
            16: checkcast     #119                // class scala/Some
            19: astore        4
            21: aload         4
            23: invokevirtual #123                // Method scala/Some.x:()Ljava/lang/Object;
            26: astore        5
            28: goto          62
            31: getstatic     #128                // Field scala/None$.MODULE$:Lscala/None$;
            34: aload_3
            35: invokevirtual #132                // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
            38: ifeq          65
            41: aload_2
            42: invokeinterface #137,  1          // InterfaceMethod scala/Function0.apply:()Ljava/lang/Object;
            47: astore        6
            49: aload_0
            50: aload_1
            51: aload         6
            53: invokeinterface #39,  3           // InterfaceMethod scala/collection/mutable/MapLike.update:(Ljava/lang/Object;Ljava/lang/Object;)V
            58: aload         6
            60: astore        5
            62: aload         5
            64: areturn
            65: new           #139                // class scala/MatchError
            68: dup
            69: aload_3
            70: invokespecial #142                // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
            73: athrow
      

      Now, however, the bytecode is as follows, with extra instructions highlighted with >>>>>:

        public V getOrElseUpdate(K, scala.Function0<V>);
          Code:
             0: aload_0
             1: aload_1
             2: invokeinterface #109,  2          // InterfaceMethod get:(Ljava/lang/Object;)Lscala/Option;
             7: astore        4
             9: aload         4
            11: instanceof    #221                // class scala/Some
            14: ifeq          37
            17: aload         4
            19: checkcast     #221                // class scala/Some
            22: astore        5
            24: aload         5
            26: invokevirtual #224                // Method scala/Some.value:()Ljava/lang/Object;
            29: astore        6
      >>>>> 31: aload         6
      >>>>> 33: astore_3
            34: goto          87
      >>>>> 37: goto          40
            40: getstatic     #229                // Field scala/None$.MODULE$:Lscala/None$;
            43: aload         4
            45: invokevirtual #233                // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
            48: ifeq          74
            51: aload_2
            52: invokeinterface #237,  1          // InterfaceMethod scala/Function0.apply:()Ljava/lang/Object;
            57: astore        7
            59: aload_0
            60: aload_1
            61: aload         7
            63: invokeinterface #113,  3          // InterfaceMethod update:(Ljava/lang/Object;Ljava/lang/Object;)V
            68: aload         7
            70: astore_3
      >>>>> 71: goto          87
      >>>>> 74: goto          77
            77: new           #239                // class scala/MatchError
            80: dup
            81: aload         4
            83: invokespecial #242                // Method scala/MatchError."<init>":(Ljava/lang/Object;)V
            86: athrow
            87: aload_3
            88: areturn
      

      Whether these extra jumps themselves are the extra cost, or whether they prevent some JIT optimization, I have not yet investigated. However, the bytecode appears less efficiently organized (note the swap of the exception vs return which requires one of the extra jumps in the success case).

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                paplorinc Pap Lőrinc
                Reporter:
                ichoran Rex Kerr
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: