Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser Combinators - parsing String is worse than StringReader #8542

Closed
scabug opened this issue Apr 27, 2014 · 5 comments
Closed

Parser Combinators - parsing String is worse than StringReader #8542

scabug opened this issue Apr 27, 2014 · 5 comments

Comments

@scabug
Copy link

scabug commented Apr 27, 2014

When parsing s: String, using parseAll(p, new java.io.StringReader(s)) is much, much more efficient than using just parseAll(p, s) - difference in runtime's order of growth is stunning.
I've encoutered this while working on my project, so this is a real-world problem.

I've made sort of a benchmark from it, it is attached to the issue. Here are the results for me:

parseAll(new StringReader(String))
For 100 items: 7 ms
For 500 items: 33 ms
For 1000 items: 47 ms
For 5000 items: 92 ms
For 10000 items: 155 ms
For 50000 items: 1358 ms
===
parseAll(String)
For 100 items: 2 ms
For 500 items: 34 ms
For 1000 items: 97 ms
For 5000 items: 1681 ms
For 10000 items: 6116 ms
For 50000 items: 175927 ms

P.S.
Btw, gap between 10,000 and 50,000 is strange

Duplicate of #7710.

@scabug
Copy link
Author

scabug commented Apr 27, 2014

Imported From: https://issues.scala-lang.org/browse/SI-8542?orig=1
Reporter: Alexander Abdugafarov (frozenspider)
Affected Versions: 2.10.3
Duplicates #7710
Attachments:

@scabug
Copy link
Author

scabug commented Apr 28, 2014

@soc said:
Not sure whether the official status of scala-parser-combinators has been changed to reflect its de-facto status (of being virtually unmaintained) since it has been split from the main library, but I'd recommend not bothering with it and using an alternative like parboiled2 instead.

@scabug
Copy link
Author

scabug commented Apr 28, 2014

@dcsobral said:
I bet that's caused by the change in substring behavior introduced early on Java 1.7 life, which made parsers quadratic, generally speaking. You could confirm it by running the benchmark against Java 1.6.

If that's the case, then it's simply a matter of finding intances of substringing in the code, or simply introducing the workaround of always using StringReader on a String input.

@scabug
Copy link
Author

scabug commented Apr 28, 2014

Alexander Abdugafarov (frozenspider) said:
Confirmation, parseAll(String) is much faster on Java 1.6:

parseAll(new StringReader(String))
For 100 items: 12 ms
For 500 items: 48 ms
For 1000 items: 64 ms
For 5000 items: 157 ms
For 10000 items: 147 ms
For 50000 items: 1470 ms
===
parseAll(String)
For 100 items: 0 ms
For 500 items: 2 ms
For 1000 items: 4 ms
For 5000 items: 17 ms
For 10000 items: 49 ms
For 50000 items: 186 ms

@scabug
Copy link
Author

scabug commented Apr 28, 2014

@gourlaysama said (edited on Apr 28, 2014 6:08:46 PM UTC):
I had an old fix for the duplicate ticket (#7710) lying around for some time, so here we go: scala/scala-parser-combinators#17

I am closing this one as a duplicate of #7710.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant