Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work out -XcheckInit issues for serialization of immutable HashMap and HashSet #2751

Closed
scabug opened this issue Dec 2, 2009 · 7 comments
Closed
Assignees

Comments

@scabug
Copy link

scabug commented Dec 2, 2009

The serialization code commited in r19964 causes the tests to fail check -XcheckInit is enabled. Paul committed two changes (r19970 and r19975) to get them to pass again. The second commit is fine (even though it's strange why it's needed) while the first one may cause a reference to the outer map.

I'll take a look and see what's going on.

@scabug
Copy link
Author

scabug commented Dec 2, 2009

Imported From: https://issues.scala-lang.org/browse/SI-2751?orig=1
Reporter: @ijuma

@scabug
Copy link
Author

scabug commented Dec 2, 2009

@ijuma said:
Reassigning to me.

@scabug
Copy link
Author

scabug commented Dec 4, 2009

@ijuma said:
OK, so it doesn't cause a reference to the outer map (as it just references a final local variable instead of a field or method of the outer map).

However it introduces a new non-transient field. The nice thing about this field is that it's set before the call to super unlike any field that is defined inside the inner class body. A negative aspect is that the field is not transient, so it gets serialized.

An alternative is to use early initialization syntax to get the best of both worlds (at the cost of ugly code). The patch is like:

--- a/src/library/scala/collection/immutable/HashMap.scala
+++ b/src/library/scala/collection/immutable/HashMap.scala
@@ -125,12 +125,15 @@ class HashMap[A, +B] extends Map[A,B] with MapLike[A, B, HashMap[A, B]] with mut
   private def logLimit: Int = math.sqrt(table.length).toInt
 
   private[this] def markUpdated(key: A, ov: Option[B], delta: Int) { 
-    val lf = loadFactor
-    later = new HashMap[A, B] {
+    later = new {
+      @transient private var lf = HashMap.this.loadFactor
+    } with HashMap[A, B] {
+      /* 
+       * If this object is deserialized, lf will be 0, so we use _loadFactor. Otherwise we use lf
+       * to avoid initialization order issues.
+       */
+      override def loadFactor = if (lf == 0) _loadFactor else lf
       override def initialSize = 0
-      /* We need to do this to avoid a reference to the outer HashMap */
-      def _newLoadFactor = lf
-      override def loadFactor = _newLoadFactor
       table     = HashMap.this.table
       tableSize = HashMap.this.tableSize
       threshold = HashMap.this.threshold
diff --git a/src/library/scala/collection/immutable/HashSet.scala b/src/library/scala/collection/immutable/HashSet.scala
index 3d66c90..9d2e298 100644
--- a/src/library/scala/collection/immutable/HashSet.scala
+++ b/src/library/scala/collection/immutable/HashSet.scala
@@ -99,12 +99,15 @@ class HashSet[A] extends Set[A]
   private def logLimit: Int = math.sqrt(table.length).toInt
 
   private def markUpdated(elem: A, del: Boolean) { 
-    val lf = loadFactor
-    later = new HashSet[A] {
+    later = new {
+      @transient private var lf = HashSet.this.loadFactor
+    } with HashSet[A] {
+      /* 
+       * If this object is deserialized, lf will be 0, so we use _loadFactor. Otherwise we use lf
+       * to avoid initialization order issues.
+       */
+      override def loadFactor = if (lf == 0) _loadFactor else lf
       override def initialSize = 0
-      /* We need to do this to avoid a reference to the outer HashMap */
-      def _newLoadFactor = lf
-      override def loadFactor = _newLoadFactor
       table     = HashSet.this.table
       tableSize = HashSet.this.tableSize
       threshold = HashSet.this.threshold

This seems to do the job, see the relevant bytecode for the anonymous inner class in immutable.HashSet:

{
private transient int lf;


public scala.collection.immutable.HashSet$$$$anon$$1(scala.collection.immutable.HashSet);
  Code:
   Stack=2, Locals=3, Args_size=2
   0:	aload_1
   1:	invokeinterface	SI-14,  1; //InterfaceMethod scala/collection/mutable/FlatHashTable.loadFactor:()I
   6:	istore_2
   7:	aload_0
   8:	iload_2
   9:	putfield	SI-18; //Field lf:I
   12:	aload_0
   13:	invokespecial	SI-23; //Method scala/collection/immutable/HashSet."<init>":()V
   16:	aload_0
   17:	aload_1
   18:	invokeinterface	SI-27,  1; //InterfaceMethod scala/collection/mutable/FlatHashTable.table:()[Ljava/lang/Object;
   23:	invokeinterface	SI-31,  2; //InterfaceMethod scala/collection/mutable/FlatHashTable.table_$$eq:([Ljava/lang/Object;)V
   28:	aload_0
   29:	aload_1
   30:	invokeinterface	SI-34,  1; //InterfaceMethod scala/collection/mutable/FlatHashTable.tableSize:()I
   35:	invokeinterface	SI-38,  2; //InterfaceMethod scala/collection/mutable/FlatHashTable.tableSize_$$eq:(I)V
   40:	aload_0
   41:	aload_1
   42:	invokeinterface	SI-41,  1; //InterfaceMethod scala/collection/mutable/FlatHashTable.threshold:()I
   47:	invokeinterface	SI-44,  2; //InterfaceMethod scala/collection/mutable/FlatHashTable.threshold_$$eq:(I)V
   52:	return
  LineNumberTable: 
   line 103: 0
   line 102: 12
   line 111: 16
   line 112: 28
   line 113: 40

  LocalVariableTable: 
   Start  Length  Slot  Name   Signature
   0      53      0    this       Lscala/collection/immutable/HashSet$$$$anon$$1;
   0      53      1    $$outer       Lscala/collection/immutable/HashSet;
   7      45      2    lf       I

  Signature: length = 0x2
   00 32 

public int initialSize();
  Code:
   Stack=1, Locals=1, Args_size=1
   0:	iconst_0
   1:	ireturn
  LineNumberTable: 
   line 110: 0


public int loadFactor();
  Code:
   Stack=2, Locals=1, Args_size=1
   0:	aload_0
   1:	invokespecial	SI-54; //Method lf:()I
   4:	iconst_0
   5:	if_icmpne	17
   8:	aload_0
   9:	invokeinterface	SI-57,  1; //InterfaceMethod scala/collection/mutable/FlatHashTable._loadFactor:()I
   14:	goto	21
   17:	aload_0
   18:	invokespecial	SI-54; //Method lf:()I
   21:	ireturn
  LineNumberTable: 
   line 109: 0


private void lf_$$eq(int);
  Code:
   Stack=2, Locals=2, Args_size=2
   0:	aload_0
   1:	iload_1
   2:	putfield	SI-18; //Field lf:I
   5:	return
  LineNumberTable: 
   line 103: 0


private int lf();
  Code:
   Stack=1, Locals=1, Args_size=1
   0:	aload_0
   1:	getfield	SI-18; //Field lf:I
   4:	ireturn
  LineNumberTable: 
   line 103: 0


}

@scabug
Copy link
Author

scabug commented Dec 4, 2009

@paulp said:
Replying to [comment:2 ijuma]:

OK, so it doesn't cause a reference to the outer map (as it just references a final local variable instead of a field or method of the outer map).

In that case, more important than the choice of solution here is to at least document the situation. If it's this tricky to avoid both making the outer map uncollectable and inducing out of order initialization, not too many are going to get it right. I am inclined to leave the code as it is right now if we're only looking at a bonus int being serialized.

@scabug
Copy link
Author

scabug commented Dec 7, 2009

@ijuma said:
Weird, I never received a notification for Paul's comment.

Replying to [comment:3 extempore]:

In that case, more important than the choice of solution here is to at least document the situation.

Yes, the comment that is in the code at the moment (added by me before you fixed the initialisation problem) is misleading, so we should fix that no matter what.

If it's this tricky to avoid both making the outer map uncollectable and inducing out of order initialization, not too many are going to get it right. I am inclined to leave the code as it is right now if we're only looking at a bonus int being serialized.

The main issue with this is that it may restrict evolution. Let's say we get a real and nice implementation of an immutable HashMap. That field may disappear and we don't have control on how it gets inserted into the serialization stream. I am not clear on what the behaviour is if the field were to disappear from the class, but I can find out.

By the way, the main difficulty here is that deserialization does not invoke the constructor so we have different paths during normal instantiation and deserialization. Further, there is no way to set the _loadFactor variable from the subclass before the trait (HashTable or FlatHashTable) "constructor" has access to it (i.e. no constructor parameters).

@scabug
Copy link
Author

scabug commented Mar 8, 2010

@ijuma said:
Yay, the old immutable.HashMap has been removed in r21091. immutable.HashSet is still there though, I asked in ticket #1610 whether there are plans to remove that too.

@scabug
Copy link
Author

scabug commented Mar 15, 2010

@ijuma said:
And a new immutable.HashSet lands in r21177, so we can close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants