Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SortedSet/SortedMap defect: invariance violation at Empty #3796

Closed
scabug opened this issue Aug 25, 2010 · 22 comments
Closed

SortedSet/SortedMap defect: invariance violation at Empty #3796

scabug opened this issue Aug 25, 2010 · 22 comments
Assignees

Comments

@scabug
Copy link

scabug commented Aug 25, 2010

This is Scala 2.8.0.final or equally a more recent snapshot like scala-2.8.0.r22830-b20100824020153.

The following example makes the Redblack implementation behind SortedSet/SortedMap crash:

import scala.collection.immutable.SortedSet

abstract class OP
case class A(i: Int) extends OP
case class D(i: Int) extends OP
case class R(i: Int, j: Int) extends OP

val ops = List(A(6), A(10), A(12), A(14), A(16), A(17), A(19), A(21), A(22), A(24), A(29), D(10), A(10), D(10), A(10), D(12), A(12), D(12), A(12), R(16,30), D(17), D(19))

(SortedSet.empty[Int] /: ops) { case (s, A(i)) => s + i case (s, D(i)) => s - i case (s, R(i, j)) => s.range(i, j) }

Here is the JVM dump:

java.lang.RuntimeException: Defect: invariance violation at Empty
	at scala.collection.immutable.RedBlack$$NonEmpty.balRight$$1(RedBlack.scala:120)
	at scala.collection.immutable.RedBlack$$NonEmpty.delRight$$1(RedBlack.scala:127)
	at scala.collection.immutable.RedBlack$$NonEmpty.del(RedBlack.scala:149)
	at scala.collection.immutable.RedBlack$$NonEmpty.delLeft$$1(RedBlack.scala:123)
	at scala.collection.immutable.RedBlack$$NonEmpty.del(RedBlack.scala:148)
	at scala.collection.immutable.RedBlack$$Tree.delete(RedBlack.scala:36)
	at scala.collection.immutable.TreeSet.$$minus(TreeSet.scala:93)
	at scala.collection.immutable.TreeSet.$$minus(TreeSet.scala:47)
	at $$anonfun$$1.apply(<console>:17)
	at $$anonfun$$1.apply(<console>:17)
	at scala.collection.LinearSeqOptimized$$class.foldLeft(LinearSeqOptimized.sca...

Maybe some mistake from some literature has been copied here. Something like this has happened to other libraries before.

It might help to look at the following proven implementation
http://isabelle.in.tum.de/dist/library/HOL/Library/RBT_Impl.html

@scabug
Copy link
Author

scabug commented Aug 25, 2010

Imported From: https://issues.scala-lang.org/browse/SI-3796?orig=1
Reporter: Makarius (makarius)
Attachments:

  • patch.txt (created on Sep 4, 2010 9:32:41 PM UTC, 11638 bytes)
  • redblack.scala (created on Oct 9, 2010 8:14:51 PM UTC, 7743 bytes)

@scabug
Copy link
Author

scabug commented Aug 31, 2010

@lrytz said:
Daniel, can you have a look if this is related to SI-2849?

@scabug
Copy link
Author

scabug commented Aug 31, 2010

@dcsobral said:
I have looked at it, and this is not related to #2849, except to the extent that the patch I submitted had a few sanity checks, which were triggered here.

The problem is solely with the naive implementation of range. Before calling that method, the tree is balanced:

scala> res49
res54: scala.collection.immutable.RedBlack[Int]#BlackTree[Int] = BlackTree(17,(),BlackTree(14,(),BlackTree(10,(),RedTree(6,(),Empty,Empty),RedTree(12,(),Empty,Empty)),BlackTree(16,(),Empty,Empty)),BlackTree(21,(),BlackTree(19,(),Empty,Empty),RedTree(24,(),BlackTree(22,(),Empty,Empty),BlackTree(29,(),Empty,Empty))))

scala> res49.left
res57: scala.collection.immutable.RedBlack[Int]#Tree[Int] = BlackTree(14,(),BlackTree(10,(),RedTree(6,(),Empty,Empty),RedTree(12,(),Empty,Empty)),BlackTree(16,(),Empty,Empty))

scala> res49.right
res58: scala.collection.immutable.RedBlack[Int]#Tree[Int] = BlackTree(21,(),BlackTree(19,(),Empty,Empty),RedTree(24,(),BlackTree(22,(),Empty,Empty),BlackTree(29,(),Empty,Empty)))

Now, when range(Some(16), Some(30)) is called on it, all the conditions at the start are false, so the code path beginning with val newLeft = left.range(from, None) is taken:

176	   override def range(from: Option[A], until: Option[A]): Tree[B] = {
177	      if (from == None && until == None) return this
178	      if (from != None && isSmaller(key, from.get)) return right.range(from, until);
179	      if (until != None && (isSmaller(until.get,key) || !isSmaller(key,until.get)))
180	        return left.range(from, until);
181	      val newLeft = left.range(from, None)
182	      val newRight = right.range(None, until)
183	      if ((newLeft eq left) && (newRight eq right)) this
184	      else if (newLeft eq Empty) newRight.upd(key, value);
185	      else if (newRight eq Empty) newLeft.upd(key, value);
186	      else mkTree(isBlack, key, value, newLeft, newRight)
187	    }

None of the conditions following it are true either, so mkTree is called, which simply creates a new tree with the branches as indicated. However, the branches returned by newLeft and newRight are not balanced:

scala> res49.left.range(Some(16), None)
res55: scala.collection.immutable.RedBlack[Int]#Tree[Int] = BlackTree(16,(),Empty,Empty)

scala> res49.right.range(None, Some(30))
res56: scala.collection.immutable.RedBlack[Int]#Tree[Int] = BlackTree(21,(),BlackTree(19,(),Empty,Empty),RedTree(24,(),BlackTree(22,(),Empty,Empty),BlackTree(29,(),Empty,Empty)))

In fact, such a thing could happen for trees of any depth, which makes me doubtful such a join can be easily and efficiently done. The append on del depends on both subtrees having the same depth. However, I haven't researched it beyond identifying the bug.

@scabug
Copy link
Author

scabug commented Aug 31, 2010

@dcsobral said:
By the way, the proven implementation given doesn't contain an implementation of range either.

@scabug
Copy link
Author

scabug commented Sep 1, 2010

@lrytz said:
thanks for your investigation, daniel

@scabug
Copy link
Author

scabug commented Sep 1, 2010

Makarius (makarius) said:
The discussion of range above touches general library design issues. I was first looking for a way to get the next entry after a given one, or enumerate an interval between two entries. I was then surprised to see the full range operation that does the job together with toStream. Later I became more ambitious and started using range more agressively to tune my application, which is here BTW: http://isabelle.in.tum.de/repos/isabelle/file/a4a465dc89d9/src/Pure/PIDE/markup_tree.scala#l59

From a user's point of view, I would say plain enumeration of ranges would be fine as a start ...

@scabug
Copy link
Author

scabug commented Sep 1, 2010

Makarius (makarius) said:
... and for the record, this one is the real situation that was producing the crash: http://isabelle.in.tum.de/repos/isabelle/rev/9f63135f3cbb

@scabug
Copy link
Author

scabug commented Sep 1, 2010

@dcsobral said:
Replying to [comment:6 makarius]:

The discussion of range above touches general library design issues. I was first looking for a way to get the next entry after a given one, or enumerate an interval between two entries. I was then surprised to see the full range operation that does the job together with toStream. Later I became more ambitious and started using range more agressively to tune my application, which is here BTW: http://isabelle.in.tum.de/repos/isabelle/file/a4a465dc89d9/src/Pure/PIDE/markup_tree.scala#l59

From a user's point of view, I would say plain enumeration of ranges would be fine as a start ...

You can get an iterator for the SerotedSet, on which you can apply dropWhile and takeWhile, which would result in the same effect. For instance:

import scala.collection.immutable.SortedSet

def SortedSetRangeIterator[A](set: SortedSet[A], from: A, until: A)(implicit ord: Ordering[A]) = {
  import ord._
  set.iterator.dropWhile(_ < from).takeWhile(_ < until)
}

Testing this I noticed that TreeSet's range is broken. All range methods in Scala take an inclusive lower bound and exclusive upper bound, which is not happening here.

I wonder if I should open a new ticket, or leave this to be fixed here, which we'll have to touch range extensively anyway.

@scabug
Copy link
Author

scabug commented Sep 1, 2010

Makarius (makarius) said:

From a user's point of view, I would say plain enumeration of ranges would be fine as a start ...

You can get an iterator for the SerotedSet, on which you can apply dropWhile and takeWhile, which would result in the same effect. For instance:

import scala.collection.immutable.SortedSet

def SortedSetRangeIterator[A](set: SortedSet[A], from: A, until: A)(implicit ord: Ordering[A]) = {
  import ord._
  set.iterator.dropWhile(_ < from).takeWhile(_ < until)
}
}}

Err, I wanted to exploit the tree structure, navigating quickly to the starting point and then continuing from there in a linear fashion. The dropWhile above incurs a linear penalty in the start, right?

@scabug
Copy link
Author

scabug commented Sep 1, 2010

Makarius (makarius) said:

Testing this I noticed that {{TreeSet}}'s {{range}} is broken. All {{range}} methods in Scala take an inclusive lower bound and exclusive upper bound, which is not happening here.

I wonder if I should open a new ticket, or leave this to be fixed here, which we'll have to touch range extensively anyway.

Yes, I did stumble across this unexpected behaviour, too. Although I thought it was intentional. Exclusive end range is better of course, and it will save me one special case in chopping off my tree.

@scabug
Copy link
Author

scabug commented Sep 4, 2010

@dcsobral said:
Fix for range

@scabug
Copy link
Author

scabug commented Sep 4, 2010

@dcsobral said:
I have produced a patch for this, taking advantage of some methods used for del, which have now been moved outside that method, and an algorithm I devised for appending trees of arbitrary depth differences. I've tested this with a scalacheck suite of tests, also attached.

Since I have worked on this patch on a cloned github repo, I have also made a pull request there for the patch, in the change that that's easier.

@scabug
Copy link
Author

scabug commented Sep 5, 2010

@dcsobral said:
Ignore that patch. The tests were not correctly written and did not detect other errors still present in the code.

@scabug
Copy link
Author

scabug commented Oct 5, 2010

@dcsobral said:
I have [http://github.com/dcsobral/scala/commit/3fd85f6f52e5f5685ce22cb5cbbfd5d406a107f4 committed] a simple fix on github that relies on dropWhile, takeWhile, foldLeft and update to produce a new range.

The performance shouldn't be particularly bad, though it does create a whole new tree instead of taking advantage of existing elements of the existing tree. Still, this ought to be better than the broken implementation presently available.

It is still my intention to pursue a superior fix, but I lost the window to 2.8.1 already, so I'd prefer to have this fix in.

@scabug
Copy link
Author

scabug commented Oct 6, 2010

@axel22 said:
I've added the fix to trunk.

Thanks!

@scabug
Copy link
Author

scabug commented Oct 6, 2010

@dcsobral said:
Replying to [comment:16 prokopec]:

I've added the fix to trunk.

Thanks!

It is buggy. The first "forall" needs to be an "exists", I think. I have always hated figuring forall/exists logic for empty collections... I'm working on fixing the test suit to detect this problem before I work on fixing it (which ought to be trivial, but my record with this ticket has been lousy).

@scabug
Copy link
Author

scabug commented Oct 8, 2010

@dcsobral said:
I just attached a new scalacheck test suite, which should be much superior to the one I had before. Also, it tests the range method.

@scabug
Copy link
Author

scabug commented Oct 9, 2010

@dcsobral said:
ScalaCheck test suite for RedBlack

@scabug
Copy link
Author

scabug commented Oct 9, 2010

@dcsobral said:
I warned you you should have waited for my new test suite. As it happens, there was still one bug left. I made a pull request on the fix here: http://github.com/dcsobral/scala/pull/1

@scabug
Copy link
Author

scabug commented Oct 10, 2010

@dcsobral said:
Curiously enough, once I had a proper test suite in place, doing the actual work was pretty quick. I reused most of the stuff from the previous broken patch, with just a few minor modifications.

Patch is here: http://github.com/dcsobral/scala/commit/e87ed79e78378b864eacbc498554f224de177ccc

However, even though it passed all tests from my test suite (and it didn't pass some when I had something wrong -- which gives me some confidence on the test suite itself :), it is rather big and a bit complex. Commit with care.

@scabug
Copy link
Author

scabug commented Nov 13, 2010

@dcsobral said:
I had to delete my scala fork and start over, because of a corruption problem. So here is the thew new link to a more efficient implementaton of range.

dcsobral/scala@6aa05ed

@scabug
Copy link
Author

scabug commented Nov 19, 2010

@axel22 said:
(In r23551) Applying patch by Daniel Sobral for #3796. Closes #3796.

No review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants