Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML sum method is not summing numeric textnodes correctly #4551

Closed
scabug opened this issue May 7, 2011 · 7 comments
Closed

XML sum method is not summing numeric textnodes correctly #4551

scabug opened this issue May 7, 2011 · 7 comments

Comments

@scabug
Copy link

scabug commented May 7, 2011

When summing over only one numeric textnode the output is correct: its's the numeric sum of the numeric values. 1 for instance is not converted to 49 but stays 1. But when summing over more than one node, there is a conversion done to the ASCII value (1 is converted to 49, 2 to 50)and these are summed (i.e. 99) or to char it is 'c'.

Another thing is that
val total2 = (someXML \ "VALUE").text.sum.toString.toInt is generating an error while
val total2 = (someXML \ "VALUE").text.sum.toInt.toString
is not (it outputs 99)
Error when for the case above
java.lang.NumberFormatException: For input string: "c"
at java.lang.NumberFormatException.forInputString(NumberFormatException.
java:48)
at java.lang.Integer.parseInt(Integer.java:449)
at java.lang.Integer.parseInt(Integer.java:499)
at scala.collection.immutable.StringLike$$class.toInt(StringLike.scala:23

Tested on:

java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) Client VM (build 16.0-b13, mixed mode, sharing)
Scala version 2.9.RC3
Windows Home Premium 32 bit

Code:

package xmlsum2
import xml.{XML, Node, NodeSeq}

object Main extends App {
val someXML = 12

var total = 0 ; (someXML \ "VALUE").foreach(total += .text.toInt)
val total2 = (someXML \ "VALUE").text.sum
val total2number = (someXML \ "VALUE").count(
=> true)
println("Total1: " + total)
println("Total2: " + total2)
println("Total2number: " + total2number)

val someXML2 = 4

var total3 = 0 ; (someXML2 \ "VALUE").foreach(total3 += .text.toInt)
val total4 = (someXML2 \ "VALUE").text.sum
val total4number = (someXML2 \ "VALUE").count(
=> true)
println("Total3: " + total3)
println("Total4: " + total4)
println("Total4number: " + total4number)
}

Output:

C:\scala-2.9.0.RC3\examples>scala xmlsum2.Main
java version "1.6.0_18"
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) Client VM (build 16.0-b13, mixed mode, sharing)

Total1: 3
Total2: c
Total2number: 2
Total3: 4
Total4: 4
Total4number: 1

@scabug
Copy link
Author

scabug commented May 7, 2011

Imported From: https://issues.scala-lang.org/browse/SI-4551?orig=1
Reporter: DaveScala (davescala)
Attachments:

  • xmlsum2.scala (created on May 7, 2011 3:34:08 PM UTC, 729 bytes)

@scabug
Copy link
Author

scabug commented May 7, 2011

DaveScala (davescala) said:
xmlsum2.scala

@scabug
Copy link
Author

scabug commented May 9, 2011

@paulp said:
I am not sure where you got the idea that sum is intended for anything like what you are doing, but in all cases it is adding the char values, because a string is a sequence of chars and sum is a collections method.

@scabug
Copy link
Author

scabug commented May 9, 2011

DaveScala (davescala) said:
-But then there is still the inconsistency for the single textnode case with for instance "1". This should return then 49 or 'a'

  • I got the idea from Xpath sum method. Of course usually there is a DTD/schema to which is not in here. (Is there a possibility for this?)

In any case summing ascii values or summing the numeric values it should be consistent for every number of textnodes.

So there is still a bug.

@scabug
Copy link
Author

scabug commented May 9, 2011

DaveScala (davescala) said:
"or 'a'."

Sorry this should be '1' which it is, so it is correct

Is there a way to enforce the summing numerically?

@scabug
Copy link
Author

scabug commented May 9, 2011

@paulp said:
Replying to [comment:2 DaveScala]:

-But then there is still the inconsistency for the single textnode case with for instance "1". This should return then 49 or 'a'

If you really think there is a bug involving the one-element case, you should produce the code which handles that case differently. Or does your next comment indicate you realize there is no such issue? In any case please don't reopen this, there's no bug.

  def sum[B >: A](implicit num: Numeric[B]): B = foldLeft(num.zero)(num.plus)

@scabug
Copy link
Author

scabug commented May 9, 2011

DaveScala (davescala) said:

If 'sum' has to convert a numerical textnode to char array and sum its ascii value then it works correctly but then Scala xml sum function does not work as an Xpath sum function. 
So what I mean is: it works according Scala spec if the above describes Scala spec but Scala spec does not work as Xpath spec.


For instance Xpath expression sum(/X/VALUE) for
 
<X><VALUE>1</VALUE><VALUE>2</VALUE></X>

would return 3 and not 99.
sum evaluates strings that can be evaluated as a numeric as numeric value (even without a scheme ) and summing them acordingly (which makes semantically sense for a sum) but Scala sum is evaluating them as a char array and returns a Char which is odd. You can check this in an XPath expression evaluation tool.
Or you can take an online tool like http://www.xmltools.dk/
If sum in the nodes that sum is evaluating finds a char then an error is generated. So there is no ascii conversion at all in Xpath but an error message

<X><VALUE>1</VALUE><VALUE>a</VALUE></X> generates for the same Xpath expression
Cannot convert string {a} to a double

Xpath function sum see specification :http://www.w3.org/TR/xquery-operators/#func-sum 
quote:
'All items in $$arg must be numeric or derived from a single base type. In addition, the type must support addition. Duration values must either all be xs:yearMonthDuration values or must all be xs:dayTimeDuration values. For numeric values, the numeric promotion rules defined in 6.2 Operators on Numeric Values are used to promote all values to a single common type. The sum of a sequence of integers will therefore be an integer, while the sum of a numeric sequence that includes at least one xs:double will be an xs:double.

If the above conditions are not met, a type error is raised [err:FORG0006].'

Which make sense for numeric aggregate functions

@scabug scabug closed this as completed May 18, 2011
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant