Details

      Description

      See http://stackoverflow.com/questions/7500081/scala-what-is-the-best-way-to-append-an-element-to-an-array

      Using the ':+' operator on an array is about 2-3x slower than a simple append() method that uses Array.copy(), both for arrays containing object instances and for arrays containing Long or Int.

      Couldn't ':+' in SeqLike[T] be overridden with a custom implementation for arrays? The default implementation creates a builder, adds the original array, then adds the element, then renders the result into a new array.

        Activity

        Hide
        Gregor Scheidt added a comment -

        First off, I am a relative Scala newbie, so my sincerest apologies if this is either completely naive or misguided (e.g. because there is a good reason that this makes absolutely no sense or because there is no way to get the required information at the right place or because there is no way to get this routine called for Array[T] within the generic structure of the collection classes).

        However, on the remote chance that this is simply a minor thing that's slipped through the cracks and it would be possible to do it, here is a little bit more information to clarify the issue:

        (1) the default implementation for :+() that seems to be called for an Array[T] is on scala.collection.SeqLike. The implementation in 2.9.0.1 is:

        def :+[B >: A, That](elem: B)(implicit bf: CanBuildFrom[Repr,B,That]): That = {
           val b = bf(repr)
           b ++= thisCollection
           b += elem
           b.result
        }
        

        This constructs a builder, adds the original elements and then the extension element. Seems to not be optimal for an array.

        (2) here is a sketch of a specialized implementation for Array[T] that uses Array.copy(). I know that this does not have the same structure as the one on SeqLike (simply because I do not know how to get it called for an array - as I said, I'm a newbie), so this is for illustration purposes only:

        def :+[T: ClassManifest]( xs: Array[T], x: T ) : Array[T] = {
           val oldlength = xs.length
           val extended = Array.ofDim[T]( oldlength + 1 )
           Array.copy( xs, 0, 0, oldlength )
           extended( oldlength ) = x
           extended
        }
        

        (3) should something remotely resembling the implementation based on Array.copy() be possible, the gains would possibly be attractive. Here is a naive mini-benchmark:

        @Test def test() {
           val before = Array(1,2,3,4,5,6,7,8)
           for( j <- 0 until 10 ) {    // do 10 runs, consider only best time out of ten
              val t = System.currentTimeMillis
              for( i <- 0L until 10000000L ) 
                 before +: 9         // (a) SeqLike
                 // +:( before, 9 )  // (b) proposed
              println( System.currentTimeMillis - t ) 
           }
        }
        

        The results are as follows (with -server, best out of 10 runs to eliminate GC effects and allow optimizer to run):

        (a) SeqLike: 1496 ms
        (b) proposed: 633 ms

        Show
        Gregor Scheidt added a comment - First off, I am a relative Scala newbie, so my sincerest apologies if this is either completely naive or misguided (e.g. because there is a good reason that this makes absolutely no sense or because there is no way to get the required information at the right place or because there is no way to get this routine called for Array [T] within the generic structure of the collection classes). However, on the remote chance that this is simply a minor thing that's slipped through the cracks and it would be possible to do it, here is a little bit more information to clarify the issue: (1) the default implementation for :+() that seems to be called for an Array [T] is on scala.collection.SeqLike. The implementation in 2.9.0.1 is: def :+[B >: A, That](elem: B)(implicit bf: CanBuildFrom[Repr,B,That]): That = { val b = bf(repr) b ++= thisCollection b += elem b.result } This constructs a builder, adds the original elements and then the extension element. Seems to not be optimal for an array. (2) here is a sketch of a specialized implementation for Array [T] that uses Array.copy(). I know that this does not have the same structure as the one on SeqLike (simply because I do not know how to get it called for an array - as I said, I'm a newbie), so this is for illustration purposes only: def :+[T: ClassManifest]( xs: Array[T], x: T ) : Array[T] = { val oldlength = xs.length val extended = Array.ofDim[T]( oldlength + 1 ) Array.copy( xs, 0, 0, oldlength ) extended( oldlength ) = x extended } (3) should something remotely resembling the implementation based on Array.copy() be possible, the gains would possibly be attractive. Here is a naive mini-benchmark: @Test def test() { val before = Array(1,2,3,4,5,6,7,8) for( j <- 0 until 10 ) { // do 10 runs, consider only best time out of ten val t = System.currentTimeMillis for( i <- 0L until 10000000L ) before +: 9 // (a) SeqLike // +:( before, 9 ) // (b) proposed println( System.currentTimeMillis - t ) } } The results are as follows (with -server, best out of 10 runs to eliminate GC effects and allow optimizer to run): (a) SeqLike: 1496 ms (b) proposed: 633 ms
        Hide
        Paolo G. Giarrusso added a comment - - edited

        I believe the generic implementation could be sped up by adding, before b ++= thisCollection, also b.sizeHint(length + 1). Considering that this will be a WrappedArray (IIRC), b should be a WrappedArrayBuilder. To finish the implementation, one should add ++= to WrappedArrayBuilder - for instance by copying or reusing ArrayBuffer.++=. I believe the combined effect of these optimizations should be close to or indistinguishable from your proposed implementation, while being more general.

        Of course, the same change as for :+ should also be done for +:.

        Show
        Paolo G. Giarrusso added a comment - - edited I believe the generic implementation could be sped up by adding, before b ++= thisCollection , also b.sizeHint(length + 1) . Considering that this will be a WrappedArray (IIRC), b should be a WrappedArrayBuilder . To finish the implementation, one should add ++= to WrappedArrayBuilder - for instance by copying or reusing ArrayBuffer.++= . I believe the combined effect of these optimizations should be close to or indistinguishable from your proposed implementation, while being more general. Of course, the same change as for :+ should also be done for +: .
        Hide
        Paolo G. Giarrusso added a comment -

        A correction: I discussed WrappedArray before, but your example will actually wrap the Array inside ArrayOps, so one should consider that code as well - including the multiple manual specialization of ArrayBuilder. It seems that all ArrayBuilder specializations already has a specialized implementation of ++=, so adding the sizeHint to SeqLike could be enough.

        Show
        Paolo G. Giarrusso added a comment - A correction: I discussed WrappedArray before, but your example will actually wrap the Array inside ArrayOps , so one should consider that code as well - including the multiple manual specialization of ArrayBuilder . It seems that all ArrayBuilder specializations already has a specialized implementation of ++= , so adding the sizeHint to SeqLike could be enough.
        Show
        Jason Zaugg added a comment - https://github.com/scala/scala/pull/1739
        Hide
        Jason Zaugg added a comment -

        We'll have to revert this in 2.10.x as the change is not forward binary compatible; libraries compiled against this method in 2.10.1 will not work in 2.10.0.

        Show
        Jason Zaugg added a comment - We'll have to revert this in 2.10.x as the change is not forward binary compatible; libraries compiled against this method in 2.10.1 will not work in 2.10.0.
        Hide
        Jason Zaugg added a comment -

        Elevated to blocker for the bin-compat reversion.

        Show
        Jason Zaugg added a comment - Elevated to blocker for the bin-compat reversion.
        Hide
        James Iry added a comment -

        My bad, I should have caught that in review.

        Show
        James Iry added a comment - My bad, I should have caught that in review.
        Hide
        James Iry added a comment -
        Show
        James Iry added a comment - revert for 2.10.x is https://github.com/scala/scala/pull/2080

          People

          • Assignee:
            James Iry
            Reporter:
            Gregor Scheidt
          • Votes:
            2 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development