New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vector concatenation is unreasonably slow (O(n^2), should be O(n)) #7725
Comments
Imported From: https://issues.scala-lang.org/browse/SI-7725?orig=1 |
@SethTisue said: |
@adriaanm said: |
@non said: This should be easy to test... I'll do it myself if I have time to get to it. |
@SethTisue said (edited on Aug 13, 2013 7:09:06 PM UTC):
But I don't think this should bother us more than slightly. I think getting asymptotic speedup on |
@non said: I'm going to try to get more data and find the crossover points here. Ideally we could dispatch to the right implementation based on the size of LHS/RHS. Thoughts? |
@SethTisue said: I would argue that the performance of |
@non said (edited on Aug 13, 2013 9:57:05 PM UTC): In practice, the crossover point is closer to 4:1. That is, concatenation is faster even if the LHS is 4x longer than the RHS. But to be pessimistic we could use a 1:1 heuristic, which would err on the side of doing folds. If you want a big-O justification, I'd argue that if LHS < RHS, then both concat and the fold are O( n ), since there will be O(RHS) fold iterations, and less than O(2RHS) builder iterations. (Of course, this argument works equally well for any constant instead of 2, like 5 in the 4:1 heuristic.) |
@Sciss said: |
@SethTisue said: |
@SethTisue said: |
@Ichoran said: |
@SethTisue said: |
@SethTisue said: |
@non said: I can review your fix, or collaborate on it, or whatever. |
@Ichoran said: I'll probably go with something like this barring radically superior ideas. (Note: I'm converting to Vector in the non-re-Traversable case in the hope that we'll get better-than-O(n) concat in Vector someday. Maybe from me, but don't count on it.)
|
@Ichoran said: |
@non said: I've been reviewing it. Your characterization seems right although I'm really trying to tweak the patch to see if I can preserve the previous performance for concatenating two large collections. If I don't make any progress I'll sign off later tonight. |
@pchiusano said: |
repeated ++ of a single-item collection onto a Vector is outrageously slow compared to repeatedly adding one item with + :
Paul Chiusano pointed this out over at #4442, where he comments: "Since Vector has a constant time snoc operation, it seems like the default implementation of ++ should be repeated calls to :+, which doesn't require any internal changes to Vector."
The text was updated successfully, but these errors were encountered: