scala.collection.immutable.HashMap.merge does not handle collisions correctly #5879

scabug · 2012-06-04T17:18:58Z

The merge function of scala.collections.immutable.HashMap can result in an inconsistent HashMap when there is a hash code collision. The easiest way to get this behavior is to merge two maps and defining a collision function.

But this also occasionally happens when no collision function is provided!

    import collection.immutable.HashMap
    val a = HashMap(1 -> "1")
    val b = HashMap(1 -> "2")
    def collision(a: (Int, String), b: (Int, String)) = {
      // shouldn't this be called with reversed order?
      println(a)
      a
    }
    val r = a.merge(b, collision)
    println(r)
    println(r(1))

Output:

    (1,2)
    Map(1 -> 2)
    1

Explanation:

HashMap1 caches both the value and a key/value pair. As a result of the merge the value field is inconsistent to the kv field. When printing the map, the kv field is used, so the result is correct. When calling apply, the value field is used, and the result is wrong.

Expected behavior:

HashMap1.value and HashMap1.kv should never be able to become inconsistent. To fix this in HashMap1.updated0:

      if (hash == this.hash && key == this.key ) {
        val newKv = if (merger eq null) kv else merger(this.kv, kv)
        new HashMap1(key, hash, newKv ._2, newKv)
      } else {
        ...

But there might be other places where this issue (inconsistency between kv and value) occurs.

scabug · 2012-06-04T17:18:58Z

Imported From: https://issues.scala-lang.org/browse/SI-5879?orig=1
Reporter: @rklaehn
Affected Versions: 2.9.2

scabug · 2012-08-09T18:31:32Z

@rklaehn said:
There is a bug in the fix. Or at least the behavior of the merged method if the merger is null is pretty nondeterministic and thus useless.

Given two maps a and b, a.merged(b)(null) should only take elements from b if there is no collision, right? So it should be equivalent with b ++ a, where all elements in a replace elements in b. But it is not. See test case.

The reason is the following: if a merger is given, the number of times invert has been called on it keeps track of the number of times the order of the arguments has been reversed. It essentially acts as a flag that is negated every time the arguments are swapped, like e.g. in

    protected override def merge0[B1 >: B](that: HashMap[A, B1], level: Int, merger: Merger[A, B1]): HashMap[A, B1] = {
      that.updated0(key, hash, level, value, kv, if (merger ne null) merger.invert else null)
    }

But when the merger is null there is no way to keep track of the number of flips, so whether you get something from the first or the second map depends on the structure of the map (depth, number of collisions, etc).

Here is some code to reproduce the bug:

  val n = 10
  val keys = (0 until n)
  val sa = keys.map(key => key -> "a")
  val sb = keys.map(key => key -> "b")
  for (i <- 0 until n) {
    val a = HashMap(sa: _*).take(i)
    val b = HashMap(sb: _*).take(n-i)

    val r1 = a ++ b
    val r2 = b.merged(a)(null)

    if (r1 != r2) {
      println(a)
      println(b)
      println(r1)
      println(r2)
    }
    println(r1 == r2)
  }

I would suggest dropping the merger stuff and just having an additional boolean argument "flipped" that just keeps track of whether the order of the arguments was flipped. That way it works even if the merger is null.

I think that would be faster as well since it would not require lifting the function to some helper object.

scabug · 2012-08-09T18:36:04Z

@rklaehn said:
See comment

scabug · 2012-08-09T20:31:36Z

@rklaehn said:
Added a pull request. Not sure if this is the best solution, but it should be reasonably efficient. When the merge function is null, it gets lifted to a default merger that just takes the first argument.

scala/scala#1108

scabug closed this as completed Aug 20, 2012

scabug added this to the 2.10.0-M3 milestone Apr 7, 2017

scabug assigned axel22 Apr 7, 2017

This was referenced Apr 7, 2017

Improvements for HashMap #6261

Closed

immutable.HashMap: Investigate if keeping the kv pair in HashMap1 is worth it #8108

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

scala.collection.immutable.HashMap.merge does not handle collisions correctly #5879

scala.collection.immutable.HashMap.merge does not handle collisions correctly #5879

scabug commented Jun 4, 2012

scabug commented Jun 4, 2012

scabug commented Aug 9, 2012

scabug commented Aug 9, 2012

scabug commented Aug 9, 2012

scala.collection.immutable.HashMap.merge does not handle collisions correctly #5879

scala.collection.immutable.HashMap.merge does not handle collisions correctly #5879

Comments

scabug commented Jun 4, 2012

scabug commented Jun 4, 2012

scabug commented Aug 9, 2012

scabug commented Aug 9, 2012

scabug commented Aug 9, 2012