Details

      Description

      The version of ForkJoin that we bundle [1] appears to prone to underutilize available cores when using `fork` (as opposed to `execute`). Our ExecutionContextImpl does exactly this as an optimization in nested parallelism.

      A test case that demonstrates the problem with ForkJoin:

      https://gist.github.com/retronym/5420028

      import annotation.tailrec
      import java.util.concurrent.CountDownLatch
       
      object ConcurrentTest {
       
        import scala.concurrent.forkjoin._
       
        def banner(str: String) = {
          val line = "=" * 80
          println(s"\n$line\n$str\n$line")
        }
       
        def expensiveCalc() {
          @tailrec
          def loop(i: Int, accum: Double): Double =
            if (i == 0) accum else loop(i - 1, math.pow(math.sqrt(accum), 2.00001))
          loop(500000000, 42)
        }
       
        val N = Runtime.getRuntime.availableProcessors()
        val numTasks = N
       
        def nestedParallelismBench(pool: Pool, useFork: Boolean) {
          import pool.forkJoin
          val countDown = new CountDownLatch(numTasks)
          forkJoin.execute(new RecursiveAction {
            def compute() {
              (1 to numTasks) foreach { i =>
                val action = new RecursiveAction {
                  def compute() {
                    println(s"i = $i: ${Thread.currentThread.getName} ${forkJoin.toString.dropWhile(_ != '[')}")
                    expensiveCalc()
                    countDown.countDown()
                  }
                }
                // See https://github.com/scala/scala/blob/2.10.1/src/library/scala/concurrent/impl/ExecutionContextImpl.scala#L118-#L120
                // The default ExecutionContext in Scala uses `fork` for nested parallelism.
                // This seems to leave a lot of cores idle. Why is this?
                if (useFork)
                  action.fork()
                else
                  forkJoin.execute(action)
              }
            }
          })
          countDown.await()
        }
        def nestedParallelismBenchRepeat(pool: Pool, useFork: Boolean) {
          (1 to 3).foreach { i =>
            banner(s"${pool.id}, useFork = $useFork, iteration $i")
            nestedParallelismBench(pool, useFork = useFork)
          }
        }
       
        case class Pool(forkJoin: ForkJoinPool, id: String)
       
        val pool = Pool(new ForkJoinPool(N), s"new ForkJoinPool($N)")
       
        def main(args: Array[String]) {
          nestedParallelismBenchRepeat(pool, useFork = false)
          nestedParallelismBenchRepeat(pool, useFork = true)
          nestedParallelismBenchRepeat(pool, useFork = false)
        }
       
      }
      

      The most recent packages of jsr166e and jsr166y do not suffer from this bug.

      Suggestion:

      • On Scala 2.10.x, change our code to avoid using `fork`. We could restore the this with a system property in case someone really needs that behaviour back.
      • On master, we should update to the latest jsr166 code. We should also consider just using the official version bundled with the JRE 7.

      [1]

      https://github.com/scala/scala/pull/407
      
      This was this commit, from http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/jsr166y/ForkJoinTask.java?view=log
      
      Revision 1.89 - (view) (download) (annotate) - [select for diffs] 
      Mon Apr 9 13:11:44 2012 UTC (12 months, 2 weeks ago) by dl 
      Branch: MAIN 
      Changes since 1.88: +66 -46 lines 
      Diff to previous 1.88
      Add CountedCompleter; improve tryHelpStealer
      

        Issue Links

          Activity

          Hide
          Simon Ochsenreither added a comment -

          Would upgrading the library be too disruptive for 2.10? Having to maintain different ForkJoin versions for different versions of Scala doesn't sound too great to me.

          Considering that the JSR166 stuff is all in public domain, starting to use the bundled version of ForkJoin (where available) seems fine (imho).

          Show
          Simon Ochsenreither added a comment - Would upgrading the library be too disruptive for 2.10? Having to maintain different ForkJoin versions for different versions of Scala doesn't sound too great to me. Considering that the JSR166 stuff is all in public domain, starting to use the bundled version of ForkJoin (where available) seems fine (imho).
          Hide
          Jason Zaugg added a comment -

          We might update FJ for 2.10.2. But we must carefully weigh the risks of doing that in a point release, as compared to either cherry picking a targetted fix for this bug, or just avoiding the calls to fork.

          Show
          Jason Zaugg added a comment - We might update FJ for 2.10.2. But we must carefully weigh the risks of doing that in a point release, as compared to either cherry picking a targetted fix for this bug, or just avoiding the calls to fork .
          Hide
          Jason Zaugg added a comment -

          The problem may have been the same as: http://cs.oswego.edu/pipermail/concurrency-interest/2012-May/009487.html

          On 05/26/12 04:55, Peter De Maeyer wrote:
          > We have a multithreaded application which relies heavily on the F/J framework.
          > When threadprofiling the application we've observed that not all F/J threads are
          > kept busy. Some threads spend a lot of time waiting, thus making suboptimal use
          > of the available CPUs in the system.

          As far as I can tell by running your example code, the underlying issues
          have been addressed for upcoming jdk8 version, as well as in the
          standalone jsr166y version that you can use/run now by grabbing
          jar file and changing imports from "java.util.concurrent" to "jsr166y".
          (See http://gee.cs.oswego.edu/dl/concurrency-interest/index.html)
          Please try this and let me know if not.

          The main underlying issue is that thread compensation was overly
          conservative, which avoided unbounded spare thread construction
          in the worst cases of misuse, but at the expense of overly limiting
          parallelism in other cases.

          -Doug

          Show
          Jason Zaugg added a comment - The problem may have been the same as: http://cs.oswego.edu/pipermail/concurrency-interest/2012-May/009487.html On 05/26/12 04:55, Peter De Maeyer wrote: > We have a multithreaded application which relies heavily on the F/J framework. > When threadprofiling the application we've observed that not all F/J threads are > kept busy. Some threads spend a lot of time waiting, thus making suboptimal use > of the available CPUs in the system. As far as I can tell by running your example code, the underlying issues have been addressed for upcoming jdk8 version, as well as in the standalone jsr166y version that you can use/run now by grabbing jar file and changing imports from "java.util.concurrent" to "jsr166y". (See http://gee.cs.oswego.edu/dl/concurrency-interest/index.html ) Please try this and let me know if not. The main underlying issue is that thread compensation was overly conservative, which avoided unbounded spare thread construction in the worst cases of misuse, but at the expense of overly limiting parallelism in other cases. -Doug
          Hide
          Viktor Klang added a comment - - edited

          "We should also consider just using the official version bundled with the JRE 7" <-- I'd veto that one. The JDK7 version has terrible scalability.

          I propose the following action plan:

          1) Switch to using straight execute for 2.10.2 and 2.9.4 in ExecutionContextImpl
          2) Update to the JDK8 version for Scala 2.11 (making sure to retain our Unsafe Android detection)
          3) Drop the embedded one once Java6 support is discontinued and Java8 support exists

          Thoughts?

          Show
          Viktor Klang added a comment - - edited "We should also consider just using the official version bundled with the JRE 7" <-- I'd veto that one. The JDK7 version has terrible scalability. I propose the following action plan: 1) Switch to using straight execute for 2.10.2 and 2.9.4 in ExecutionContextImpl 2) Update to the JDK8 version for Scala 2.11 (making sure to retain our Unsafe Android detection) 3) Drop the embedded one once Java6 support is discontinued and Java8 support exists Thoughts?
          Hide
          Jason Zaugg added a comment -

          I like that plan. Can you comment on what workloads / benchmarks might be negatively affected by just using execute?

          The JDK7 version has terrible scalability.

          Can you point to particular versions/commits that are critical? I'd like to be sure that they exist in our fork.

          Would we have the same problem with all JDK7 builds, or just the earlier ones?

          Sorry if these questions are a bit basic, but I'm not a jsr166 expert, so I need precision in the advice here.

          Show
          Jason Zaugg added a comment - I like that plan. Can you comment on what workloads / benchmarks might be negatively affected by just using execute ? The JDK7 version has terrible scalability. Can you point to particular versions/commits that are critical? I'd like to be sure that they exist in our fork. Would we have the same problem with all JDK7 builds, or just the earlier ones? Sorry if these questions are a bit basic, but I'm not a jsr166 expert, so I need precision in the advice here.
          Hide
          Philipp Haller added a comment -
          Show
          Philipp Haller added a comment - I followed up on concurrency-interest: http://cs.oswego.edu/pipermail/concurrency-interest/2013-May/011254.html
          Show
          Jason Zaugg added a comment - https://github.com/scala/scala/pull/2482
          Hide
          Jason Zaugg added a comment -

          Spawned SI-7442 for the JSR166 update.

          Show
          Jason Zaugg added a comment - Spawned SI-7442 for the JSR166 update.
          Show
          Jason Zaugg added a comment - https://github.com/scala/scala/pull/2500

            People

            • Assignee:
              Philipp Haller
              Reporter:
              Jason Zaugg
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development