Scala Programming Language / SI-5632

deserializing large mutable.HashMaps results in ArrayIndexOutOfBoundsExceptions

    Details

      Description

      Basically: put 2147484 entries in a mutable.HashMap, write it to disk, then try to read it back from disk. Kaboom: ArrayIndexOutOfBoundsException. 2147483 entries is fine. Yeah, I did the binary search.

      
      import scala.collection.mutable.HashMap
      import java.io._
      import java.util.zip._
      
      case class Feature(x: Int) 
      
      object Foo extends App {
        def test(size: Int): Unit = {
          println(size)
          // Build a map with `size` distinct keys.
          val map = new HashMap[Feature,Double]() ++= (0 until size map {Feature(_) -> 1.0})
          // Serialize the map to a gzipped file.
          val oout = new ObjectOutputStream(new GZIPOutputStream(new BufferedOutputStream(new FileOutputStream("test.ser.gz"))))
          oout.writeObject(map)
          oout.close()
          // Read it back; for the larger size, readObject throws.
          val in = new ObjectInputStream(new GZIPInputStream(new BufferedInputStream(new FileInputStream("test.ser.gz"))))
          in.readObject()
          in.close()
        }
        test(2147483) // succeeds
        test(2147484) // throws ArrayIndexOutOfBoundsException in readObject
      }
      
      java.lang.ArrayIndexOutOfBoundsException: -3733
      	at scala.collection.mutable.HashTable$class.addEntry(HashTable.scala:119)
      	at scala.collection.mutable.HashMap.addEntry(HashMap.scala:43)
      	at scala.collection.mutable.HashTable$class.init(HashTable.scala:81)
      	at scala.collection.mutable.HashMap.init(HashMap.scala:43)
      	at scala.collection.mutable.HashMap.readObject(HashMap.scala:130)
      

        Activity

        Daniel Sobral added a comment -

        Collections of size greater than Int.MaxValue are technically illegal, even if some of them work for some things.

        david hall added a comment - edited

        That's all well and good, but this is 0.1% of that value (almost exactly, so this is probably the problem!)

        scala> 2147484 * 1.0 / Int.MaxValue
        res5: Double = 0.0010000001643784345
        
        david hall added a comment -

        Bingo. sizeForThreshold multiplies by loadFactorDenum ( == 1000) without converting to a Long.

          // HashTable.scala
          private[collection] final def defaultLoadFactor: Int = 750 // corresponds to 75%
          private[collection] final def loadFactorDenum = 1000;  
        
          // ...
          
          private[collection] final def newThreshold(_loadFactor: Int, size: Int) = ((size.toLong * _loadFactor) / loadFactorDenum).toInt
          
          private[collection] final def sizeForThreshold(_loadFactor: Int, thr: Int) = thr * loadFactorDenum / _loadFactor
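
        To make the overflow concrete, here is a minimal sketch that copies the quoted formula into a standalone object and compares it with a Long-widened variant. The object name ThresholdOverflow and the two method names are made up for illustration, and the widened version is only an assumed fix modeled on newThreshold, not necessarily the patch that was merged. It also assumes the entry count is what gets passed as thr during deserialization, which the 2147484 threshold suggests.

```scala
object ThresholdOverflow {
  // Constants as quoted from HashTable.scala in the comment above.
  val loadFactorDenum = 1000
  val defaultLoadFactor = 750

  // Formula as quoted: the multiplication happens in Int and can wrap around.
  def sizeForThresholdInt(loadFactor: Int, thr: Int): Int =
    thr * loadFactorDenum / loadFactor

  // Hypothetical fix: widen to Long before multiplying, mirroring newThreshold.
  def sizeForThresholdLong(loadFactor: Int, thr: Int): Int =
    ((thr.toLong * loadFactorDenum) / loadFactor).toInt

  def main(args: Array[String]): Unit = {
    for (thr <- Seq(2147483, 2147484)) {
      val asInt = sizeForThresholdInt(defaultLoadFactor, thr)
      val asLong = sizeForThresholdLong(defaultLoadFactor, thr)
      println(s"thr=$thr Int arithmetic=$asInt Long arithmetic=$asLong")
    }
  }
}
```

        With thr = 2147483 the product 2147483000 still fits in an Int, so both versions return 2863310. With thr = 2147484 the product 2147484000 exceeds Int.MaxValue (2147483647) and wraps to -2147483296, so the Int version returns -2863311 while the Long version returns 2863312.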
          
        
        Erik Osheim added a comment -

        Good catch! Seems like an easy fix.

        david hall added a comment -

        Yup. Assuming I can get Scala to build, I'll make a patch and send a pull request real soon now.

        Erik Osheim added a comment -

        OK. Let me know if you need help with any of it.

        I encourage you to write a test that catches this issue, both to confirm that you fixed it and to prevent future regressions. It also makes testing your patch easier on the committer.
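
        One possible shape for such a regression test is sketched below. It is not the test from the pull request: the names RoundTrip, roundTrip and survivesRoundTrip are hypothetical, it serializes to an in-memory buffer instead of a gzipped file, and it uses Int keys rather than the Feature case class to keep the example self-contained.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable

object RoundTrip {
  // Serialize a value to an in-memory buffer and read it back.
  def roundTrip[A <: AnyRef](value: A): A = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }

  // Build a map with `size` Int keys, round-trip it, and compare with the original.
  def survivesRoundTrip(size: Int): Boolean = {
    val original = mutable.HashMap.empty[Int, Double]
    var i = 0
    while (i < size) { original(i) = 1.0; i += 1 }
    roundTrip(original) == original
  }
}
```

        Calling RoundTrip.survivesRoundTrip(2147484) exercises the size reported in this ticket. It needs a generous heap, since both the map and its serialized bytes are held in memory.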

        david hall added a comment -

        https://github.com/scala/scala/pull/345

          People

          • Assignee: david hall
          • Reporter: david hall
          • Votes: 0
          • Watchers: 6