Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fork Join's idle hands #7438

Closed
scabug opened this issue Apr 30, 2013 · 10 comments
Closed

Fork Join's idle hands #7438

scabug opened this issue Apr 30, 2013 · 10 comments
Assignees
Milestone

Comments

@scabug
Copy link

scabug commented Apr 30, 2013

The version of ForkJoin that we bundle [1] appears to prone to underutilize available cores when using fork (as opposed to execute). Our ExecutionContextImpl does exactly this as an optimization in nested parallelism.

A test case that demonstrates the problem with ForkJoin:

https://gist.github.com/retronym/5420028

import annotation.tailrec
import java.util.concurrent.CountDownLatch
 
object ConcurrentTest {
 
  import scala.concurrent.forkjoin._
 
  def banner(str: String) = {
    val line = "=" * 80
    println(s"\n$line\n$str\n$line")
  }
 
  def expensiveCalc() {
    @tailrec
    def loop(i: Int, accum: Double): Double =
      if (i == 0) accum else loop(i - 1, math.pow(math.sqrt(accum), 2.00001))
    loop(500000000, 42)
  }
 
  val N = Runtime.getRuntime.availableProcessors()
  val numTasks = N
 
  def nestedParallelismBench(pool: Pool, useFork: Boolean) {
    import pool.forkJoin
    val countDown = new CountDownLatch(numTasks)
    forkJoin.execute(new RecursiveAction {
      def compute() {
        (1 to numTasks) foreach { i =>
          val action = new RecursiveAction {
            def compute() {
              println(s"i = $i: ${Thread.currentThread.getName} ${forkJoin.toString.dropWhile(_ != '[')}")
              expensiveCalc()
              countDown.countDown()
            }
          }
          // See https://github.com/scala/scala/blob/2.10.1/src/library/scala/concurrent/impl/ExecutionContextImpl.scala#L118-#L120
          // The default ExecutionContext in Scala uses `fork` for nested parallelism.
          // This seems to leave a lot of cores idle. Why is this?
          if (useFork)
            action.fork()
          else
            forkJoin.execute(action)
        }
      }
    })
    countDown.await()
  }
  def nestedParallelismBenchRepeat(pool: Pool, useFork: Boolean) {
    (1 to 3).foreach { i =>
      banner(s"${pool.id}, useFork = $useFork, iteration $i")
      nestedParallelismBench(pool, useFork = useFork)
    }
  }
 
  case class Pool(forkJoin: ForkJoinPool, id: String)
 
  val pool = Pool(new ForkJoinPool(N), s"new ForkJoinPool($N)")
 
  def main(args: Array[String]) {
    nestedParallelismBenchRepeat(pool, useFork = false)
    nestedParallelismBenchRepeat(pool, useFork = true)
    nestedParallelismBenchRepeat(pool, useFork = false)
  }
 
}

The most recent packages of jsr166e and jsr166y do not suffer from this bug.

Suggestion:

  • On Scala 2.10.x, change our code to avoid using fork. We could restore the this with a system property in case someone really needs that behaviour back.
  • On master, we should update to the latest jsr166 code. We should also consider just using the official version bundled with the JRE 7.

[1]

https://github.com/scala/scala/pull/407

This was this commit, from http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/jsr166y/ForkJoinTask.java?view=log

Revision 1.89 - (view) (download) (annotate) - [select for diffs] 
Mon Apr 9 13:11:44 2012 UTC (12 months, 2 weeks ago) by dl 
Branch: MAIN 
Changes since 1.88: +66 -46 lines 
Diff to previous 1.88
Add CountedCompleter; improve tryHelpStealer
@scabug
Copy link
Author

scabug commented Apr 30, 2013

Imported From: https://issues.scala-lang.org/browse/SI-7438?orig=1
Reporter: @retronym
Affected Versions: 2.10.1
See #7442

@scabug
Copy link
Author

scabug commented Apr 30, 2013

@soc said:
Would upgrading the library be too disruptive for 2.10? Having to maintain different ForkJoin versions for different versions of Scala doesn't sound too great to me.

Considering that the JSR166 stuff is all in public domain, starting to use the bundled version of ForkJoin (where available) seems fine (imho).

@scabug
Copy link
Author

scabug commented May 1, 2013

@retronym said:
We might update FJ for 2.10.2. But we must carefully weigh the risks of doing that in a point release, as compared to either cherry picking a targetted fix for this bug, or just avoiding the calls to fork.

@scabug
Copy link
Author

scabug commented May 1, 2013

@retronym said:
The problem may have been the same as: http://cs.oswego.edu/pipermail/concurrency-interest/2012-May/009487.html

On 05/26/12 04:55, Peter De Maeyer wrote:
> We have a multithreaded application which relies heavily on the F/J framework.
> When threadprofiling the application we've observed that not all F/J threads are
> kept busy. Some threads spend a lot of time waiting, thus making suboptimal use
> of the available CPUs in the system.

As far as I can tell by running your example code, the underlying issues
have been addressed for upcoming jdk8 version, as well as in the
standalone jsr166y version that you can use/run now by grabbing
jar file and changing imports from "java.util.concurrent" to "jsr166y".
(See http://gee.cs.oswego.edu/dl/concurrency-interest/index.html)
Please try this and let me know if not.

The main underlying issue is that thread compensation was overly
conservative, which avoided unbounded spare thread construction
in the worst cases of misuse, but at the expense of overly limiting
parallelism in other cases.

-Doug

@scabug
Copy link
Author

scabug commented May 1, 2013

@viktorklang said (edited on May 1, 2013 6:15:16 AM UTC):
"We should also consider just using the official version bundled with the JRE 7" <-- I'd veto that one. The JDK7 version has terrible scalability.

I propose the following action plan:

  1. Switch to using straight execute for 2.10.2 and 2.9.4 in ExecutionContextImpl
  2. Update to the JDK8 version for Scala 2.11 (making sure to retain our Unsafe Android detection)
  3. Drop the embedded one once Java6 support is discontinued and Java8 support exists

Thoughts?

@scabug
Copy link
Author

scabug commented May 1, 2013

@retronym said:
I like that plan. Can you comment on what workloads / benchmarks might be negatively affected by just using execute?

The JDK7 version has terrible scalability.

Can you point to particular versions/commits that are critical? I'd like to be sure that they exist in our fork.

Would we have the same problem with all JDK7 builds, or just the earlier ones?

Sorry if these questions are a bit basic, but I'm not a jsr166 expert, so I need precision in the advice here.

@scabug
Copy link
Author

scabug commented May 1, 2013

@phaller said:
I followed up on concurrency-interest: http://cs.oswego.edu/pipermail/concurrency-interest/2013-May/011254.html

@scabug
Copy link
Author

scabug commented May 1, 2013

@retronym said:
scala/scala#2482

@scabug
Copy link
Author

scabug commented May 1, 2013

@retronym said:
Spawned #7442 for the JSR166 update.

@scabug
Copy link
Author

scabug commented May 12, 2013

@retronym said:
scala/scala#2500

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants