Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HashMap getOrElseUpdate changed behaviour in 2.12.1 if the callback modifies the map #10187

Closed
scabug opened this issue Feb 13, 2017 · 5 comments

Comments

@scabug
Copy link

scabug commented Feb 13, 2017

scala> {val hm = scala.collection.mutable.HashMap[String, String](); def add() = 1 to 100000 foreach (i => hm(i.toString) =  ""); hm.getOrElseUpdate("0", {        "" }); hm.contains("0")}
res15: Boolean = true

scala> {val hm = scala.collection.mutable.HashMap[String, String](); def add() = 1 to 100000 foreach (i => hm(i.toString) =  ""); hm.getOrElseUpdate("0", { add(); "" }); hm.contains("0")}
res16: Boolean = false

These used to both be true in 2.12.0 and earlier. The change to optimize HashMap#getOrElseUpdate in 2.12.1 uses stale values in the update part.

Note that Java 9 has explicitly specced Map#computeIfAbsent to outlaw mutation of the map in the callback.

@scabug
Copy link
Author

scabug commented Feb 13, 2017

Imported From: https://issues.scala-lang.org/browse/SI-10187?orig=1
Reporter: Jan Vanek (j3vanek)
See #10049

@scabug
Copy link
Author

scabug commented Feb 16, 2017

@retronym said:
@Ichoran @szeiger Could I get your opinion on this?

I'm in favour of explicitly documenting the "don't mutate in the callback" restriction. Maybe this could be documented as a general note for mutable collections that results are undefined if they are mutated during queries or iteration.

But I'd be happy to revert to the current behaviour in the 2.12.x series to be conservative.

@scabug
Copy link
Author

scabug commented Feb 17, 2017

@Ichoran said:
Ugh. I wonder if there's a way to cheaply detect that there is mutation during the closure and fall back to the slow way if not? I'd imagine it's just a couple of eqs, which should be pretty fast. I'm wary of not keeping this behavior consistent with 2.12 since it is a semi-obvious way to do things (i.e. if you don't find something you expect to, run an initialization routine that covers more than just the single one you missed).

@scabug
Copy link
Author

scabug commented Feb 17, 2017

@szeiger said:
For the record, the change in 2.12.1 was scala/scala#5528, a fix for #10049.

Documenting the new behavior as the default in 2.13 (and reverting in 2.12) would be a natural choice. OTOH, getOrElseUpdate is much more convenient than doing the same thing manually, so it's understandable why people would want to use it even though they do not care about the improved performance (with the limitation of not being allowed to modify the map in the callback). The only obvious places in the Scala repo where the map is mutated while computing the default value are in ProdConsAnalyzerImpl.scala and SetMapConsistencyTest.scala, in both cases using an AnyRefMap.

The latter is a regression test for #8213 where the new behavior that we now have in HashMap was classified as a bug and consequently fixed for AnyRefMap. For the sake of consistency I suggest that we revert the change and allow mutation. Since the motivation for #10049 was performance of groupBy, we could explore a more directed fix, e.g. overriding groupBy in HashMap and using a private copy of getOrElseUpdate with the new semantics.

@scabug
Copy link
Author

scabug commented Feb 19, 2017

@retronym said:
Hopefully this will keep this performance gain for the common case: scala/scala#5719

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants