Performance of immutable TreeMap/TreeSet implementations #5331

scabug · 2011-12-21T20:49:29Z

Recently I wanted to use the collection.immutable.TreeMap implementation with some fairly large trees (10k - 200k elements). Unfortunately it turns out that only the basic lookup/add/delete operations perform well. All others are usually O(n), even simple operations like head.

There seem to be three root causes:

Most operations use the default implementations from IterableLike/TraversableLike, so many of these operations are defined in terms of 'iterator'. If you look at the definition of iterator in TreeSet/TreeMap you'll find that they use RedBlack#Tree.toStream, which constructs a fully realized stream of elements in the tree upon creation (so it is an O(n) space/time operation even before starting any iteration).
The range related operations (range, from, to, until) construct a new tree by splitting the existing tree. Unfortunately, the size of the new tree is calculated by calling RedBlack#Tree.count. This is an O(n) operation (also see TreeMap from, to and range methods are O(n) instead O(log n) #4026).
The RedBlack implementation uses nested classes and is extended by TreeMap/TreeSet. That means that any node in the tree has a pointer to the root of the tree itself. This is an obvious source of space leaks, especially when using structural sharing. See for example collection.immutable.TreeMap leaks memory on removing the only element #4084, although that particular issue may have been fixed already.

To fix the first issue I created a custom iterator for RedBlack trees (RedBlack#TreeIterator) that is constructed on O(n log n) time/space and is also much quicker at iteration. I've also added custom implementations for head/last/headOption/lastOption/init/tail. This can be found on the "trees-iterator" branch at http://github.com/erikrozendaal/scala. The diff is here: erikrozendaal/scala@master...trees-iterator

Here's a quick overview of the performance difference of various operations. Take a large pinch of salt when looking at the numbers, since getting consistent benchmarks out of the JVM seems to be a black art. The benchmark code can be found at https://gist.github.com/1507127

Master build number is 2.10.0.rdev-4019-2011-12-20-gbba3b00.

TreeSet containing 10000 random ints.

foreach: master: 2443 ops/sec, trees-iterator: same.
head: master: 354 ops/sec, trees-iterator: 10M ops/sec
last: master: 322 ops/sec, trees-iterator: 10M ops/sec
tail: master: 41 ops/sec, trees-iterator: 180K ops/sec
init: master: 89 ops/sec, trees-iterator: 138K ops/sec
iterator: master: 360 ops/sec, trees-iterator: 1.8M ops/sec
iterator.size: master: 41 ops/sec, trees-iterator: 1428 ops/sec

(Note that 'iterator.size' is much slower than the 'last' operation, which is almost as fast as 'head'. This is due to 'head' and 'iterator.size' using an iterator, while 'last' is defined in TraversableLike and uses the faster RedBlack#Tree.foreach operation directly. 'last' would be even faster if it didn't call 'head' first).

The quick fix for the second issue is to turn the count method in RedBlack#NonEmpty into a val. This speeds up range operations and also allows O(log n) versions of drop/take/dropRight/takeRight/slice/splitAt by using structural tree sharing. Also takeWhile/dropWhile/span can be implemented more efficiently. This change may effect serialization compatibility though (is this a problem?) and also increases the memory used per RedBlack node. See the trees-ranges branch (diff:erikrozendaal/scala@trees-iterator...trees-ranges).

Compared to master and trees-iterator the performance changes are:

TreeSet containing 10000 random ints.

drop(5) : master: 33 ops/sec, trees-iterator: 113 ops/sec, trees-ranges: 270K ops/sec
take(5) : master: 381 ops/sec, trees-iterator: 1M ops/sec, trees-ranges: 1M ops/sec
drop(5000) : master: 39 ops/sec, trees-iterator: 225 ops/sec, trees-ranges: 326K ops/sec
take(5000) : master: 59 ops/sec, trees-iterator: 262 ops/sec, trees-ranges: 356K ops/sec
dropWhile(_ < 0): master: 241 ops/sec, trees-iterator: 242 ops/sec, trees-ranges: 2822 ops/sec
takeWhile(_ < 0): master: 61 ops/sec, trees-iterator: 265 ops/sec, trees-ranges: 2812 ops/sec

The only way I see to fix issue 3 is to changed the nested RedBlack classes into ordinary classes. This obviously breaks binary and source compatibility (TreeMap/TreeSet can no longer extend this class, but my suspicion is that RedBlack is not meant to be used outside of the scala library). But this does fix the space leak and brings back down the memory used by a RedBlack node. The code can be found in the trees-redblack branch and also includes some micro-optimizations for head/tail functionality. Seehttps://github.com/erikrozendaal/scala/compare/trees-ranges...trees-redblack. Also note that this breaks test/files/run/t2873.scala, since it uses RedBlack#Empty for testing for the absence of a compiler bug.

Finally, these patches are mainly about getting usable big-O performance out of TreeMap/TreeSet. A lot can still be done to increase efficiency (smarter use of the Ordering class to avoid additional comparisons, etc).

Any chance that these patches can become part of 2.10? What about 2.9 for at least the binary compatible changes?

Regards,
Erik

The text was updated successfully, but these errors were encountered:

scabug · 2011-12-21T20:49:29Z

Imported From: https://issues.scala-lang.org/browse/SI-5331?orig=1
Reporter: Erik Rozendaal (erikrozendaal)
Affected Versions: 2.9.1, 2.9.2, 2.10.0
See #4084

scabug · 2011-12-28T12:22:03Z

Erik Rozendaal (erikrozendaal) said:
Submitted pull request scala/scala#82

scabug · 2012-01-06T13:06:46Z

@dcsobral said:
So, here are my thoughts regarding the latest submission (not mentioned in this ticket). My concerns are:

Correctness
Speed
Backward compatibility at source code level (even if we keep binary here, it will be lost elsewhere)
Style

I have strong stylistic reservations about the code -- I think it will be difficult to maintain and to change, but we shouldn't be changing it much anyway. The methods it provides, particularly with your additions such as an iterator class, should be enough for its users in the Scala library.

It is difficult to assert correctness. The scalacheck test suit covers RB-specific issues, not general methods. Now, most of the new stuff is related to the new Iterator class, so a good scalacheck of that should cover it too. The initiative of providing additional scalacheck tests was really good. So, as far as I'm concerned, this is about as good as it can get.

I'm happy that this changes are being benchmarked, and the numbers are really impressive.

As for source code compatibility, it's been thrown out completely.

So what I suggest is that no change be made to RedBlack in the library. Add a new private implementation, change TreeSet and TreeMap (some compatibility changes there, unfortunately), and deprecate the RedBlack tree in the library. If think that will provide the cleanest break from what exists, and, keeping the new implementation private (as is already being done in the pull request) will allay most of my stylistic concerns: a private class needs much less maintenance than a public one, and can be changed a lot more in said maintenances.

I'd like to hear from Aleksandar Prokopec on this, since he's dealt with all my patches to RedBlack.

scabug · 2012-01-06T14:37:22Z

@paulp said:
Assigning by request! Most of aren't lucky enough to have fans asking for us by name.

scabug · 2012-01-06T22:25:09Z

Erik Rozendaal (erikrozendaal) said:
I restored the old implementation of RedBlack class in the commit erikrozendaal/scala@f656142 and marked the class as deprecated.

This required renaming the {Red,Black}Tree classes in the RedBlack object to avoid naming conflicts. I renamed them to {Red,Black}Node.

scabug · 2012-02-21T11:41:38Z

@soc said:
Would be nice to make sure that http://docs.scala-lang.org/overviews/collections/performance-characteristics.html is updated accordingly if some of characteristics have changed.

scabug · 2012-05-19T22:17:34Z

@retronym said:
I'm going to take the glory of closing this ticket, after the real work is well behind us, as gloriously chronicled in: scala/scala#82

scabug closed this as completed May 19, 2012

scabug added the performance label Apr 7, 2017

scabug added this to the 2.10.0 milestone Apr 7, 2017

scabug assigned phaller Apr 7, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance of immutable TreeMap/TreeSet implementations #5331

Performance of immutable TreeMap/TreeSet implementations #5331

scabug commented Dec 21, 2011

scabug commented Dec 21, 2011

scabug commented Dec 28, 2011

scabug commented Jan 6, 2012

scabug commented Jan 6, 2012

scabug commented Jan 6, 2012

scabug commented Feb 21, 2012

scabug commented May 19, 2012

Performance of immutable TreeMap/TreeSet implementations #5331

Performance of immutable TreeMap/TreeSet implementations #5331

Comments

scabug commented Dec 21, 2011

scabug commented Dec 21, 2011

scabug commented Dec 28, 2011

scabug commented Jan 6, 2012

scabug commented Jan 6, 2012

scabug commented Jan 6, 2012

scabug commented Feb 21, 2012

scabug commented May 19, 2012