Methods diff and intersect in String object with erroneous response(Scala 2.11.8) #9877

Closed
scabug opened this issue Jul 29, 2016 · 13 comments

scabug commented Jul 29, 2016

The intersect and diff methods on collections and String objects do not return the correct value.
To reproduce the problem, run this test:

/**
 * Issue with the diff and intersect methods on String and collections.
 *
 * To test, run this sample with:
 *
 * Input for diff: ("SOSSSSSSSSSSSSO" diff "SOSSOSSOSSOSSOS") length
 * Output should be 5, BUT THE RESPONSE HERE IS 3.
 *
 * Input for intersect: "SOSSSSSSSSSSSSO" intersect "SOSSOSSOSSOSSOS"
 * Output should be "SOSSSSSSSS", BUT THE RESPONSE HERE IS "SOSSOSSSSSSS".
 */

object MarsExploration {
  def main(args: Array[String]): Unit = {
    // Read the first line of standard input.
    val sc = io.Source.stdin.getLines()
    val S = sc.next()
    // Build "SOS" repeated S.length / 3 times.
    val s = "SOS" * (S.length / 3)
    println((s diff S).length)
    println(s intersect S)
  }
}

scabug commented Jul 29, 2016

Imported From: https://issues.scala-lang.org/browse/SI-9877?orig=1
Reporter: Josenildo SIlva (josenildo.silva)
Affected Versions: 2.11.8

scabug commented Aug 2, 2016

@SethTisue said:
why erroneous? it appears to me the answers satisfy the definitions of multiset difference and intersection given in the Scaladoc for these methods.
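To make the multiset reading concrete, here is a small sketch that uses only the standard `diff` and `intersect` methods on the two strings from the report, with the operands in the order the report's code uses them. The object and value names are invented for illustration, and the `.mkString` calls are only a defensive conversion so the results are plain Strings whatever the exact return type is.

```scala
object MultisetSemanticsDemo {
  val repeated = "SOSSOSSOSSOSSOS" // "SOS" * 5, the left operand in the report's code
  val received = "SOSSSSSSSSSSSSO" // the input line from the report

  // diff cancels one occurrence on the left for each occurrence on the right:
  // all ten S are cancelled, and two of the five O are cancelled.
  val difference: String = repeated.diff(received).mkString

  // intersect keeps, per character, at most as many occurrences as the right has:
  // all ten S survive, but only the first two O.
  val intersection: String = repeated.intersect(received).mkString
}
```

Here `difference` has length 3 and `intersection` is the 12-character string from the report, which is what counting occurrences per character predicts.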

scabug commented Aug 2, 2016

Josenildo SIlva (josenildo.silva) said:
Have you tested the sample I sent? When there are repetitions and variation between the characters, the response is incorrect. I do not understand how that can be correct.

scabug commented Aug 2, 2016

@SethTisue said:
Let's take "intersect" for example. The Scaladoc for "intersect" says:

  • If an element value x appears ''n'' times in that, then the first ''n'' occurrences of x will be retained in the result, but any following occurrences will be omitted.

Consider "SOSSSSSSSSSSSSO" intersect "SOSSOSSOSSOSSOS" and the letter O. O occurs at least twice on the right, so both O's on the left should be retained. You're telling me the answer ought to be "SOSSSSSSSS", but that only has one O in it.
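The rule quoted from the Scaladoc can be spelled out by hand. The following is a minimal sketch, not the library's actual implementation; `MultisetIntersectSketch` and `multisetIntersect` are hypothetical names chosen for this example.

```scala
import scala.collection.mutable

object MultisetIntersectSketch {
  // Keep a character of `left` only while `right` still has an unused
  // occurrence of it, so the first n occurrences survive and later ones drop.
  def multisetIntersect(left: String, right: String): String = {
    // How many occurrences of each character are still available on the right.
    val remaining = mutable.Map.empty[Char, Int].withDefaultValue(0)
    right.foreach(c => remaining(c) = remaining(c) + 1)

    val kept = new StringBuilder
    left.foreach { c =>
      if (remaining(c) > 0) {
        kept.append(c)
        remaining(c) = remaining(c) - 1
      }
    }
    kept.toString
  }
}
```

Called as `multisetIntersect("SOSSSSSSSSSSSSO", "SOSSOSSOSSOSSOS")`, the sketch keeps both O's together with the first ten S's, in their original left-hand order.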

scabug commented Aug 2, 2016

Josenildo SIlva (josenildo.silva) said:
There is a simple confusion here. If intersection compares the items at the same position in each String, then only one 'O' should be returned, no more. I will test this in Java, C++, and Python and compare the results. If I get the same result as Scala, close this issue; if not, there is some misinterpretation, possibly mine. Thanks for now, but please do not close this until I have run the tests.

scabug commented Aug 2, 2016

Josenildo SIlva (josenildo.silva) said:
Okay. As I said, it's incorrect. Look at this:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>

using namespace std;

int main()
{
    string a = "SOSSSSSSSSSSSSO";
    string b = "SOSSOSSOSSOSSOS";
    string c;

    // Note: a and b are passed as-is, without sorting.
    set_intersection(
        a.begin(), a.end(),
        b.begin(), b.end(),
        back_inserter(c));

    cout << c << '\n';
}

Response:
Success time: 0 memory: 3460 signal:0
SOSSSSSSSSS

scabug commented Aug 2, 2016

@SethTisue said:
it appears to me from a quick glance at http://www.cplusplus.com/reference/algorithm/set_intersection/ that the STL's set_intersection method implements set intersection on sorted ranges. Scala's intersection method, on the other hand, implements multiset intersection and doesn't require sorted input. So you shouldn't expect the same answer.
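As a rough illustration of why sortedness matters, here is a Scala sketch of the merge-style walk that an intersection over sorted ranges typically performs. It is not the STL's code, and `sortedRangeIntersect` is a hypothetical name. Because it always advances whichever side holds the smaller element, it is only meaningful when both inputs are sorted; on unsorted input it silently skips elements.

```scala
object SortedRangeIntersectSketch {
  // Walk both strings in lockstep, assuming each is sorted ascending.
  def sortedRangeIntersect(a: String, b: String): String = {
    val out = new StringBuilder
    var i = 0
    var j = 0
    while (i < a.length && j < b.length) {
      if (a.charAt(i) < b.charAt(j)) i += 1      // a's element cannot match later in b
      else if (b.charAt(j) < a.charAt(i)) j += 1 // b's element cannot match later in a
      else {                                     // equal: emit once, advance both sides
        out.append(a.charAt(i))
        i += 1
        j += 1
      }
    }
    out.toString
  }
}
```

On sorted copies of the two strings from this thread the walk agrees with Scala's `intersect`; on the unsorted originals it drops one of the O's, so the two approaches diverge exactly when the precondition is violated.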

scabug commented Aug 2, 2016

Josenildo SIlva (josenildo.silva) said:
No. The input is not sorted. If the input were sorted, fine, but it is not. I can look for other implementations in Java, Python, and Haskell. Let's compare all of those platforms.

scabug commented Aug 2, 2016

@SethTisue said:

~ % ghci
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
Prelude> import Data.List
Prelude Data.List> intersect "SOSSSSSSSSSSSSO" "SOSSOSSOSSOSSOS"
"SOSSSSSSSSSSSSO"

I think our methods are working as designed and documented. If you think they ought to be designed differently, I'd suggest that you start a mailing list thread about it and see if there is consensus. (But personally I think you'd have an uphill battle there.)
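For comparison, Haskell's `Data.List.intersect` keeps every element of the first list that occurs anywhere in the second, without counting occurrences, which is why the GHCi session returns the whole left string. A hedged Scala equivalent of that membership-only behaviour (the names below are made up for this sketch):

```scala
object MembershipIntersectSketch {
  // Keep each character of `left` that appears at least once in `right`,
  // ignoring how many times it appears there.
  def membershipIntersect(left: String, right: String): String = {
    val allowed: Set[Char] = right.toSet
    left.filter(allowed)
  }
}
```

So three readings of "intersection" are in play in this thread: membership only (Haskell), per-character counts (Scala), and a merge over sorted ranges (C++).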

@scabug scabug closed this as completed Aug 2, 2016

scabug commented Aug 2, 2016

@SethTisue said:
And/or, you might (via pull request) suggest improvements to the documentation, if you think it isn't sufficiently explicit and clear.

scabug commented Aug 2, 2016

Josenildo SIlva (josenildo.silva) said:
No need. For this case I will just use C++ rather than Scala or Haskell, because here I do not sort the elements: I pass them unordered and need a positional result, where a position belongs to the intersection if the characters there are equal and is left out if they are not.
When searching for new proteins or protein sequences, you cannot sort the protein; you have to get the positional similarity for a sequence. That is why I thought this was a bug. But if the language is designed that way and this is your specification, nothing needs to change.

scabug commented Aug 2, 2016

@SethTisue said:
I'm guessing, but perhaps this will be helpful for your use case?

scala> ("SOSSSSSSSSSSSSO" zip "SOSSOSSOSSOSSOS").count{case (a, b) => a != b}
res12: Int = 5

scabug commented Aug 2, 2016

Josenildo SIlva (josenildo.silva) said:
Sorry, but if you have sequences with 10^100 characters, that is not efficient. The problem here is performance: more direct access, without any kind of combination step, is more efficient. Thank you for your attention and help. I will solve this another way. Thanks for everything.
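On the efficiency point, a positional comparison does not need to build a zipped collection of pairs. The following is a sketch under that assumption, using plain index loops; `PositionalCompareSketch`, `mismatchCount`, and `positionalMatches` are hypothetical names, not library methods.

```scala
object PositionalCompareSketch {
  // Count positions where the two strings differ, in a single pass and
  // without allocating an intermediate collection of pairs.
  def mismatchCount(a: String, b: String): Int = {
    val n = math.min(a.length, b.length) // like zip, ignore any unmatched tail
    var count = 0
    var i = 0
    while (i < n) {
      if (a.charAt(i) != b.charAt(i)) count += 1
      i += 1
    }
    count
  }

  // Keep only the characters that agree at the same position.
  def positionalMatches(a: String, b: String): String = {
    val n = math.min(a.length, b.length)
    val kept = new StringBuilder
    var i = 0
    while (i < n) {
      if (a.charAt(i) == b.charAt(i)) kept.append(a.charAt(i))
      i += 1
    }
    kept.toString
  }
}
```

For the two strings in this thread these return 5 and the 10-character "SOSSSSSSSS", the values the original report expected from `diff` and `intersect`.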
