New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance: Copying mutable.HashSet (FlatHashTable) 100x slower than java.util.HashSet. #5293
Comments
Imported From: https://issues.scala-lang.org/browse/SI-5293?orig=1
|
@dcsobral said: |
@dcsobral said: commit 2f7197c50b31386118a660833828ea720bf9d239
Author: Aleksandar Pokopec <aleksandar.prokopec@epfl.ch>
Date: Wed Oct 20 20:19:48 2010 +0000
Changed hash code strategy in hash table - now ...
Changed hash code strategy in hash table - now taking most significant bits when transforming the hash code into an index. And // Note:
// we take the most significant bits of the hashcode, not the lower ones
// this is of crucial importance when populating the table in parallel I'm surprised |
takuo said: I have misunderstood that, so that I corrected the summary. |
@axel22 said: I would be in favour of implementing Parallel flat hash table on the other hand requires that the elements are divided into groups according to certain bits in the hash code (and it's unclear which bits this should be until the size of the hash table is known, so the choice has to be agnostic in the size of the resulting hash table). Second, elements in the same groups have to end up in the same contiguous block of the resulting hash table, so that these blocks can be populated in parallel by different processors without synchronization. The problem with that is that traversing the elements in a cache-friendly way seems to always result in the elements with the same hashcode prefix being traversed together. The solution is not yet obvious to me, but I think something can be done with segmenting the hashcode into 2 parts - by taking first 5 upper bits and the rest of the necessary bits from the bottom. I have to think about this a little bit, before I proceed. |
@axel22 said: Since this value would be hash-table instance specific (could also be recomputed upon table growth), that would solve the above problem. The only issue is producing the XOR value - having a global random generator might kill scalability. A thread local random generator might help here. |
takuo said: XORing with random values seems good. protected var mask: Int = _
def addEntry(elem: A) : Boolean = {
if (tableSize == 0) {
mask = ...
// improve(mask ^ elem.hashCode) == 0xc0000000
}
...
} This guarantee the element to be placed at the third quater of the table, ensureing the value changes every time. |
@axel22 said: The random value based rotating seems to solve the problem, and it doesn't seem to affect performance. To avoid random values, one could compute this constant from the size of the table. |
Short description: Copying mutable.HashSet (adding all elements of a mutable.HashSet to an empty mutable.HashSet) is very slow.
Cause: Excessive collisions results from the characteristic of the FlatHashTable#index method and lack of the sizeHint method.
Possible fix: Modify FlatHashTable#index method and/or implement sizeHint method.
Detailed Description:
Running attached benchmark script, copying mutable.HashSet is 100x slower than java.util.HashSet.
This only occurs when adding all elements of a mutable.HashSet to an empty HashSet.
If we sort the values before adding to a HashSet, it runs fast.
After careful inspection, I found that:
The FlatHashTable#index method is modified at 2.9.x.
At 2.8.x, it uses the lower bits of improved hash code, so that the problem does not occurred.
A possible fix is to use the lower bits of improved hash code, but I am not sure about other impacts on characteristics of the method.
Another possible fix is to implement sizeHint method and call it from ++= and other methods.
The text was updated successfully, but these errors were encountered: