Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement in scala.runtime.ArrayCharSequence.toString #8589

Closed
scabug opened this issue May 14, 2014 · 4 comments
Closed

Performance improvement in scala.runtime.ArrayCharSequence.toString #8589

scabug opened this issue May 14, 2014 · 4 comments

Comments

@scabug
Copy link

scabug commented May 14, 2014

While using a csv parser to process large char arrays i ran into the problem that the toString method of ArrayCharSequence was very slow on large buffers (and, as it appears, also on small buffers)

The problematic method:

override def toString = xs drop start take length mkString ""

I created a faster version which is already almost 50% faster on an empty charsequence and only becomes relatively faster when the array gets bigger (already 20x as fast in the test below). It appears the implementation is used in 2.10 (and maybe earlier) until the latest version.

Code (it's all about the toString method):

final class FastArrayCharSequence(val xs: Array[Char], start: Int, end: Int) extends CharSequence {
  // yikes
  // java.lang.VerifyError: (class: scala/runtime/ArrayCharSequence, method: <init> signature: ([C)V)
  //   Constructor must call super() or this()
  //
  // def this(xs: Array[Char]) = this(xs, 0, xs.length)

  def length: Int = math.max(0, end - start)
  def charAt(index: Int): Char = {
    if (0 <= index && index < length)
      xs(start + index)
    else throw new ArrayIndexOutOfBoundsException(index)
  }
  def subSequence(start0: Int, end0: Int): CharSequence = {
    if (start0 < 0) throw new ArrayIndexOutOfBoundsException(start0)
    else if (end0 > length) throw new ArrayIndexOutOfBoundsException(end0)
    else if (end0 <= start0) new FastArrayCharSequence(xs, 0, 0)
    else {
      val newlen = end0 - start0
      val start1 = start + start0
      new FastArrayCharSequence(xs, start1, start1 + newlen)
    }
  }
  override def toString = {
    if (start >= end)
      ""
    else
      new String(xs, start, length)
  }
}

object FastArrayCharSequence {
  implicit class FastArrayCharSequenceArrayWrapper(val xs: Array[Char]) extends AnyVal {
    def fastSubSequence(start: Int, end: Int) : CharSequence = {
      new FastArrayCharSequence(xs, start, Math.min(end, xs.length))
    }
  }
}


object Test extends App {
  import FastArrayCharSequence._

  val chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".toCharArray

  val tests = for {
    i <- 0 to chars.length
    k <- 0 to chars.length
  } yield {
    chars.subSequence(i, k).toString == chars.fastSubSequence(i, k).toString
  }

  println("All equal: " + tests.forall(eq => eq))

  var index = 0
  while(index < chars.length ) {
    val someChars = new String(chars).substring(0, index).toCharArray
    var start = System.currentTimeMillis()

    println(s"Size: $index")

    for {
      c <- 0 to 1000
      i <- 0 to someChars.length
      k <- 0 to someChars.length
    } yield {
      someChars.subSequence(i, k).toString.length
    }

    val oldMillis = System.currentTimeMillis() - start

    start = System.currentTimeMillis()

    for {
      c <- 0 to 1000
      i <- 0 to someChars.length
      k <- 0 to someChars.length
    } yield {
      someChars.fastSubSequence(i, k).toString.length
    }

    val newMillis = System.currentTimeMillis() - start

    println(s"String size: $index, current: $oldMillis, new: $newMillis, improvment: ${((oldMillis.toDouble / newMillis.toDouble)).toInt}")

    index = index + 1
  }
}

@scabug
Copy link
Author

scabug commented May 14, 2014

Imported From: https://issues.scala-lang.org/browse/SI-8589?orig=1
Reporter: @jtvoorde
Assignee: @jtvoorde
Affected Versions: 2.10.0, 2.10.1, 2.10.2, 2.10.3, 2.10.4, 2.11.0

@scabug
Copy link
Author

scabug commented May 14, 2014

@retronym said:
Thanks for reporting!

Would you be interested in submitting a pull request with a patch? More details on the process: https://github.com/scala/scala/blob/2.12.x/CONTRIBUTING.md

@scabug
Copy link
Author

scabug commented May 14, 2014

@jtvoorde said:
I did! scala/scala#3750

Is this something that could be merged back to 2.10 / 2.11? Do i need to create separate pull requests for that?

Thanks,
Jeroen

@scabug
Copy link
Author

scabug commented Jul 4, 2014

@adriaanm said:
scala/scala#3752

@scabug scabug closed this as completed Jul 4, 2014
@scabug scabug added this to the 2.10.5 milestone Apr 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant