Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scala.util.matching.regex.replaceAllIn(String, (Match) => String) escapes backslashes #4750

Closed
scabug opened this issue Jul 1, 2011 · 6 comments
Assignees
Milestone

Comments

@scabug
Copy link

scabug commented Jul 1, 2011

scala.util.matching.regex.replaceAllIn(String, (Match) => String) seems to escape backslashes in the string to be operated on. The attached example shows how scala.util.matching.regex.replaceAllIn(String, (Match) => String) has different behavior compared to scala.util.matching.regex.replaceAllIn(String, String). In my environment I get the following output:

c:\projects\j2s\Java2Scala>scala RegexTest
rs = 'public foo "gh\dooh"'
r1 = '"gh\dooh"'
r2 = '"gh\dooh"'

But I would have expected r1 and r2 to be identical.

@scabug
Copy link
Author

scabug commented Jul 1, 2011

Imported From: https://issues.scala-lang.org/browse/SI-4750?orig=1
Reporter: Jeff Tsay (jctsay)
Attachments:

@scabug
Copy link
Author

scabug commented Jul 1, 2011

Jeff Tsay (jctsay) said:
Update: I found out that java.util.regex.Matcher.appendReplacement() has the same behavior, and is most likely the source of this problem. appendReplacement() is documented like this:

...
Note that backslashes () and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
...

In the case of regex.replaceAllIn(String, (Match) => String) the API user probably wouldn't be aware of this behavior and it would probably not be the common case to want to have replaceAllIn() to do processing of the returned string (for $1 references etc.). So I think replaceAllIn() should call Matcher.quoteReplacement() on the String result before passing it to Matcher.appendReplacement().

Alternatively, the behavior can be left the same, and the user required to call Matcher.quoteReplacement() himself, but then the documentation for replaceAllIn(String, (Match) => String) should be augmented to mention the string processing that is done by appendReplacement().

@scabug
Copy link
Author

scabug commented Jul 1, 2011

@dcsobral said:
FWIW, I concur. After all, all groups are already available through Match, which is the very reason why the function is Match => String, instead of String => String.

@scabug
Copy link
Author

scabug commented Jan 25, 2012

@dcsobral said:
Pull request 139 closes the issue by documenting the behavior and providing a quoteReplacement method on Scala's Regex object.

The behavior makes sense for the other replace methods, and I think it would be more surprising to have this one behave differently than having it behave like it does. The workaround is not too inconvenient, and proper documentation goes a long way towards addressing the issue.

@scabug
Copy link
Author

scabug commented Mar 12, 2012

@dcsobral said:
I have now seen three questions on Stack Overflow about this problem, all in the past six months. It's past time this was fixed.

@scabug
Copy link
Author

scabug commented Mar 26, 2012

@dcsobral said:
Fixed on 2.9.2-RC2 with commit 883cb14b12873062fc3a122a18df5429834ad3a9.

Fixed on master with commit 479dd13148c380619d3e9156ef1913467decc05c.

@scabug scabug closed this as completed Mar 26, 2012
@scabug scabug added the quickfix label Apr 7, 2017
@scabug scabug added this to the 2.9.2 milestone Apr 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants