Scala Programming Language
  1. Scala Programming Language
  2. SI-4678

flatMap is broken for parallel collections in 2.9.0 & 2.9.0-1

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: Scala 2.9.0
    • Fix Version/s: None
    • Component/s: Misc Library
    • Labels:
      None
    • Environment:

      RHEL 5
      Java version 1.6.0_21
      Scala 2.9.0.1

      Description

      When the attached program below is run, it crashes with a horrific stack trace:

      scala.collection.parallel.package$CompositeThrowable: Multiple exceptions thrown during a parallel computation: (java.lang.ArrayIndexOutOfBoundsException: 1074347069
      . . .

      at scala.collection.parallel.package$$anon$3.alongWith(package.scala:114)
      at scala.collection.parallel.Tasks$Task$class.mergeThrowables(Tasks.scala:98)
      at scala.collection.parallel.mutable.ParHashSetCombiner$FillBlocks.mergeThrowables(ParHashSet.scala:230)
      at scala.collection.parallel.Tasks$Task$class.tryMerge(Tasks.scala:84)
      at scala.collection.parallel.mutable.ParHashSetCombiner$FillBlocks.tryMerge(ParHashSet.scala:230)
      at scala.collection.parallel.AdaptiveWorkStealingTasks$TaskImpl$class.internal(Tasks.scala:186)
      at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$TaskImpl.internal(Tasks.scala:509)
      at scala.collection.parallel.AdaptiveWorkStealingTasks$TaskImpl$class.compute(Tasks.scala:164)
      at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$TaskImpl.compute(Tasks.scala:509)
      at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:147)
      at scala.concurrent.forkjoin.ForkJoinTask.tryExec(ForkJoinTask.java:404)
      at scala.concurrent.forkjoin.ForkJoinTask.join(ForkJoinTask.java:494)
      at scala.collection.parallel.ForkJoinTasks$TaskImpl$class.sync(Tasks.scala:439)
      at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$TaskImpl.sync(Tasks.scala:509)
      at scala.collection.parallel.ForkJoinTasks$class.executeAndWaitResult(Tasks.scala:487)
      at scala.collection.parallel.ForkJoinTaskSupport.executeAndWaitResult(TaskSupport.scala:20)
      at scala.collection.parallel.mutable.ParHashSetCombiner.parPopulate(ParHashSet.scala:145)
      at scala.collection.parallel.mutable.ParHashSetCombiner.result(ParHashSet.scala:138)
      at scala.collection.parallel.mutable.ParHashSetCombiner.result(ParHashSet.scala:116)
      at scala.collection.parallel.ParIterableLike$$anonfun$flatMap$1$$anonfun$apply$3.apply(ParIterableLike.scala:513)
      at scala.collection.parallel.ParIterableLike$$anonfun$flatMap$1$$anonfun$apply$3.apply(ParIterableLike.scala:513)
      at scala.collection.parallel.ParIterableLike$$anon$13$$anon$4.map(ParIterableLike.scala:297)
      at scala.collection.parallel.ParIterableLike$ResultMapping.leaf(ParIterableLike.scala:884)
      at scala.collection.parallel.Tasks$Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:66)
      at scala.collection.parallel.Tasks$Task$class.tryLeaf(Tasks.scala:68)
      at scala.collection.parallel.ParIterableLike$ResultMapping.tryLeaf(ParIterableLike.scala:879)
      at scala.collection.parallel.AdaptiveWorkStealingTasks$TaskImpl$class.compute(Tasks.scala:164)
      at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$TaskImpl.compute(Tasks.scala:509)
      at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:147)
      at scala.concurrent.forkjoin.ForkJoinTask.quietlyExec(ForkJoinTask.java:422)
      at scala.concurrent.forkjoin.ForkJoinWorkerThread.mainLoop(ForkJoinWorkerThread.java:340)
      at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:325)

      Where the ellipses denote around 100 lines of exceptions. Playing with the N, and M values in the attached program affect the behavior. If they are both significantly smaller, then the program runs to completion and produces the expected results. I imagine there is some sort of error condition in the parallel logic, because, behind the scenes there are attempted accesses to outrageous array indices.

        Activity

        Hide
        Rob Patro added a comment -

        Note that the exact same code behaves as expected if using sequential collections.

        Show
        Rob Patro added a comment - Note that the exact same code behaves as expected if using sequential collections.
        Hide
        Daniel Sobral added a comment -

        Note that decreasing either N or M to 1400 is enough to get this code to work. I'd venture there's an Int wrap-around happening somewhere, or, perhaps, just a memory exaustion of some type being masked by the out of range errors.

        Show
        Daniel Sobral added a comment - Note that decreasing either N or M to 1400 is enough to get this code to work. I'd venture there's an Int wrap-around happening somewhere, or, perhaps, just a memory exaustion of some type being masked by the out of range errors.
        Hide
        Rob Patro added a comment -

        I've tested the same code replacing the mutable ParHashMap with an immutable ParMap, and this also solves the problem. I also ran this test on a fairly large server, and set the JVM heap space to 32Gb; the same error listed above occurs, so I'd wager against memory exhaustion. I'm not sure how the parallel hashset combiners work in the backend, so an Int wrap-around is possible, but it would have to be something that doesn't occur in the immutable ParMap case.

        Show
        Rob Patro added a comment - I've tested the same code replacing the mutable ParHashMap with an immutable ParMap, and this also solves the problem. I also ran this test on a fairly large server, and set the JVM heap space to 32Gb; the same error listed above occurs, so I'd wager against memory exhaustion. I'm not sure how the parallel hashset combiners work in the backend, so an Int wrap-around is possible, but it would have to be something that doesn't occur in the immutable ParMap case.
        Hide
        Aleksandar Prokopec added a comment -

        This is an issue with the `sizeForThreshold` method in `FlatHashTable`s as far as I can tell - the integer load factor and it's denominator rounding causes an arithmetic overflow, so the target size for the hashtable array is miscomputed.

        Show
        Aleksandar Prokopec added a comment - This is an issue with the `sizeForThreshold` method in `FlatHashTable`s as far as I can tell - the integer load factor and it's denominator rounding causes an arithmetic overflow, so the target size for the hashtable array is miscomputed.
        Hide
        Commit Message Bot added a comment -

        (prokopec in r25112) Fixed an overflow which occurs in hashtable size computations. Fixes #4678.

        No review.

        Show
        Commit Message Bot added a comment - (prokopec in r25112 ) Fixed an overflow which occurs in hashtable size computations. Fixes #4678. No review.
        Hide
        Dan Peebles added a comment -

        What version was this fixed in? I'm encountering a very similar-looking issue:

        scala> scala.util.Properties.versionString
        res35: java.lang.String = version 2.9.1.final
        
        scala> HashSet(1 to 100: _*).par.map(identity)
        java.lang.ArrayIndexOutOfBoundsException: 256
                at scala.collection.parallel.mutable.ParFlatHashTable$ParFlatHashTableIterator.scan(ParFlatHashTable.scala:36)
                at scala.collection.parallel.mutable.ParFlatHashTable$ParFlatHashTableIterator.next(ParFlatHashTable.scala:53)
                at scala.collection.parallel.AugmentedIterableIterator$class.map2combiner(RemainsIterator.scala:115)
                at scala.collection.parallel.mutable.ParFlatHashTable$ParFlatHashTableIterator.map2combiner(ParFlatHashTable.scala:26)
                at scala.collection.parallel.ParIterableLike$Map.leaf(ParIterableLike.scala:971)
                at scala.collection.parallel.Tasks$Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:66)
                at scala.collection.parallel.Tasks$Task$class.tryLeaf(Tasks.scala:68)
                at scala.collection.parallel.ParIterableLike$Map.tryLeaf(ParIterableLike.scala:968)
                at scala.collection.parallel.AdaptiveWorkStealingTasks$TaskImpl$class.internal(Tasks.scala:179)
                at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$TaskImpl.internal(Tasks.scala:509)
                at scala.collection.parallel.AdaptiveWorkStealingTasks$TaskImpl$class.compute(Tasks.scala:164)
                at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$TaskImpl.compute(Tasks.scala:509)
                at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:147)
                at scala.concurrent.forkjoin.ForkJoinTask.quietlyExec(ForkJoinTask.java:422)
                at scala.concurrent.forkjoin.ForkJoinWorkerThread.mainLoop(ForkJoinWorkerThread.java:340)
                at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:325)
        
        Show
        Dan Peebles added a comment - What version was this fixed in? I'm encountering a very similar-looking issue: scala> scala.util.Properties.versionString res35: java.lang.String = version 2.9.1.final scala> HashSet(1 to 100: _*).par.map(identity) java.lang.ArrayIndexOutOfBoundsException: 256 at scala.collection.parallel.mutable.ParFlatHashTable$ParFlatHashTableIterator.scan(ParFlatHashTable.scala:36) at scala.collection.parallel.mutable.ParFlatHashTable$ParFlatHashTableIterator.next(ParFlatHashTable.scala:53) at scala.collection.parallel.AugmentedIterableIterator$class.map2combiner(RemainsIterator.scala:115) at scala.collection.parallel.mutable.ParFlatHashTable$ParFlatHashTableIterator.map2combiner(ParFlatHashTable.scala:26) at scala.collection.parallel.ParIterableLike$Map.leaf(ParIterableLike.scala:971) at scala.collection.parallel.Tasks$Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:66) at scala.collection.parallel.Tasks$Task$class.tryLeaf(Tasks.scala:68) at scala.collection.parallel.ParIterableLike$Map.tryLeaf(ParIterableLike.scala:968) at scala.collection.parallel.AdaptiveWorkStealingTasks$TaskImpl$class.internal(Tasks.scala:179) at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$TaskImpl.internal(Tasks.scala:509) at scala.collection.parallel.AdaptiveWorkStealingTasks$TaskImpl$class.compute(Tasks.scala:164) at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$TaskImpl.compute(Tasks.scala:509) at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:147) at scala.concurrent.forkjoin.ForkJoinTask.quietlyExec(ForkJoinTask.java:422) at scala.concurrent.forkjoin.ForkJoinWorkerThread.mainLoop(ForkJoinWorkerThread.java:340) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:325)

          People

          • Assignee:
            Aleksandar Prokopec
            Reporter:
            Rob Patro
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development