Scala Programming Language
  1. Scala Programming Language
  2. SI-4633

For loops with -optimize have much worse performance than while loops

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: Scala 2.8.1, Scala 2.9.0
    • Fix Version/s: Backlog
    • Component/s: Optimizer
    • Labels:
    • Environment:

      Ubuntu 10.10 with Sun Java 1.6.0_24-b07

      Description

      According to Tiark Rompf, "With -optimize, scalac should compile simple for comprehensions to the equivalent of a while loop. If it does not it's a bug (either in the library or in the compiler) and it needs fixing. So if there is indeed a bug, gather solid performance numbers [and] submit a ticket"

      Here are performance numbers for loops nested 1, 3, 5, or a variable number of levels deep; in every case, the innermost operation occurs one billion times. The leftmost number is a checksum (the first three should be the same; the fourth is different):

      // These are while loops
      -1243309312 While1 Elapsed: 0.314 s
      -1243309312 While3 Elapsed: 0.301 s
      -1243309312 While5 Elapsed: 0.316 s
      -17609343 While? Elapsed: 1.151 s

      // These are for loops with ranges
      -1243309312 Limit1 Elapsed: 0.923 s
      -1243309312 Limit3 Elapsed: 0.885 s
      -1243309312 Limit5 Elapsed: 2.527 s
      -17609343 Limit? Elapsed: 3.308 s

      // These use a C-for-like-method that takes an anonymous function
      // Note that the JVM gets confused about how to optimize repeated uses
      -1243309312 Cfor1 Elapsed: 9.794 s
      -1243309312 Cfor3 Elapsed: 0.324 s
      -1243309312 Cfor5 Elapsed: 0.450 s
      -17609343 Cfor?? Elapsed: 5.662 s

      Code is attached.

      1. Cruel.scala
        4 kB
        Rex Kerr
      2. ForWhile.scala
        1 kB
        Rex Kerr

        Activity

        Hide
        Matthew Pocock added a comment -

        The simple case of a for loop over a range is so common and so slow that it deserves optimizing, ideally in a standard scalac run, but as a last resort, when -optimize is turned on. The current situation makes scala look bad.

        Show
        Matthew Pocock added a comment - The simple case of a for loop over a range is so common and so slow that it deserves optimizing, ideally in a standard scalac run, but as a last resort, when -optimize is turned on. The current situation makes scala look bad.
        Hide
        Rex Kerr added a comment -

        Attached is another benchmark demonstrating the difference, including array access also.

        With -optimise, I get timings:

        while range 1 : -536870912 took 0.339 s
        while range 2 : -536870912 took 0.328 s
        while array 1024: -536870912 took 0.497 s
        for range 1 : -536870912 took 0.658 s
        for range 2 : -536870912 took 1.335 s
        for array 1024: -536870912 took 4.533 s

        demonstrating a 2x-9x slowdown using for instead of while. Without -optimise:

        while range 1 : -536870912 took 0.337 s
        while range 2 : -536870912 took 0.321 s
        while array 1024: -536870912 took 0.494 s
        for range 1 : -536870912 took 0.997 s
        for range 2 : -536870912 took 1.061 s
        for array 1024: -536870912 took 3.375 s

        gives a 3x-7x slowdown.

        (These timings are 2.9.0, again Ubuntu 10.10 with Sun Java 1.6.0_24-b07; build 1.7.0-ea-b143 gives similar results.)

        Incidentally, JRockit (Oracle JRockit(R) (build R28.0.1-21-133393-1.6.0_20-20100512-2126-linux-x86_64, compiled mode) makes it look like the for loop is often as good as the while loop:

        while range 1 : -536870912 took 0.928 s
        while range 2 : -536870912 took 0.903 s
        while array 1024: -536870912 took 1.199 s
        for range 1 : -536870912 took 0.898 s
        for range 2 : -536870912 took 2.132 s
        for array 1024: -536870912 took 0.953 s

        but of course the dirty little secret here is that even the while loops are horribly slow--everything is 3x worse than it needs to be.

        Show
        Rex Kerr added a comment - Attached is another benchmark demonstrating the difference, including array access also. With -optimise, I get timings: while range 1 : -536870912 took 0.339 s while range 2 : -536870912 took 0.328 s while array 1024: -536870912 took 0.497 s for range 1 : -536870912 took 0.658 s for range 2 : -536870912 took 1.335 s for array 1024: -536870912 took 4.533 s demonstrating a 2x-9x slowdown using for instead of while. Without -optimise: while range 1 : -536870912 took 0.337 s while range 2 : -536870912 took 0.321 s while array 1024: -536870912 took 0.494 s for range 1 : -536870912 took 0.997 s for range 2 : -536870912 took 1.061 s for array 1024: -536870912 took 3.375 s gives a 3x-7x slowdown. (These timings are 2.9.0, again Ubuntu 10.10 with Sun Java 1.6.0_24-b07; build 1.7.0-ea-b143 gives similar results.) Incidentally, JRockit (Oracle JRockit(R) (build R28.0.1-21-133393-1.6.0_20-20100512-2126-linux-x86_64, compiled mode) makes it look like the for loop is often as good as the while loop: while range 1 : -536870912 took 0.928 s while range 2 : -536870912 took 0.903 s while array 1024: -536870912 took 1.199 s for range 1 : -536870912 took 0.898 s for range 2 : -536870912 took 2.132 s for array 1024: -536870912 took 0.953 s but of course the dirty little secret here is that even the while loops are horribly slow--everything is 3x worse than it needs to be.
        Hide
        Ismael Juma added a comment -

        An option is to simply include the part of the ScalaCL plugin that tranforms certain operations to while loops with the Scala distribution. This way the compiler doesn't become more complex and the fear that the plugin would not evolve together with Scala would go away.

        Show
        Ismael Juma added a comment - An option is to simply include the part of the ScalaCL plugin that tranforms certain operations to while loops with the Scala distribution. This way the compiler doesn't become more complex and the fear that the plugin would not evolve together with Scala would go away.
        Hide
        Ismael Juma added a comment -
        Show
        Ismael Juma added a comment - I forgot the link: http://code.google.com/p/scalacl/wiki/ScalaCLPlugin
        Hide
        Ryan Schmitt added a comment -

        I agree with Ismael. The ScalaCL plugin is a useful start is a useful modification to the compiler.

        Show
        Ryan Schmitt added a comment - I agree with Ismael. The ScalaCL plugin is a useful start is a useful modification to the compiler.
        Hide
        Ryan Schmitt added a comment -

        Accidental click

        Show
        Ryan Schmitt added a comment - Accidental click
        Hide
        Ryan Schmitt added a comment -

        fixed the mistake

        Show
        Ryan Schmitt added a comment - fixed the mistake
        Hide
        Adriaan Moors added a comment -

        Unassigning as milestone deadline was reached.

        Show
        Adriaan Moors added a comment - Unassigning as milestone deadline was reached.

          People

          • Assignee:
            Unassigned
            Reporter:
            Rex Kerr
          • Votes:
            36 Vote for this issue
            Watchers:
            31 Start watching this issue

            Dates

            • Created:
              Updated:

              Development