BigDecimal#isWhole implementation is very heap intensive #6173
Comments
Imported From: https://issues.scala-lang.org/browse/SI-6173?orig=1
@paulp said: I don't dispute that there's all kinds of silliness in there, but most of it is there for some plausible reason.
@djspiewak said:
@paulp said: So unless you like the idea that somewhere BigInt(X) == BigDecimal(X) but BigInt(X + 1) != BigDecimal(X + 1), the hashCodes of BigDecimals must be coordinated with those of BigInts. Regardless of size. To do this, the BigDecimal must check whether it can conceivably be equal to a BigInt, or the hashCode scheme has to be such that BigDecimals/BigInts provide the same one when they are equal. If there is a better test than "isWhole", which does not have the ring of a heap-exhausting method, then please suggest it.
@djspiewak said (edited on Aug 3, 2012 5:22:05 PM UTC): I would argue that BigInt and BigDecimal are being too aggressive with their large-value hashCodes. Honestly, if someone has a number that large, they've pretty much given up on efficiency. Let them deal with the hash collisions. This is especially valid (imo) in light of the fact that hashes will always collide for non-whole values greater than Double.MaxValue. Given that this avoids the call to isWhole altogether for sufficiently large-magnitude BigDecimal/BigInt values, it bypasses the performance issues with isWhole.
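The collision point above can be checked directly against the JDK: any value beyond Double.MAX_VALUE collapses to infinity when converted to a double, so a double-derived hash cannot distinguish such values anyway. A minimal sketch (the class name is invented for illustration):

```java
import java.math.BigDecimal;

public class CollisionDemo {
    public static void main(String[] args) {
        // Both values exceed Double.MAX_VALUE (~1.8e308), so converting
        // to double loses all information except the sign.
        BigDecimal a = new BigDecimal("1E+400");
        BigDecimal b = new BigDecimal("2.5E+9999");
        System.out.println(a.doubleValue());  // Infinity
        System.out.println(b.doubleValue());  // Infinity
        // Any hash derived from the double value collides for such inputs.
        System.out.println(Double.hashCode(a.doubleValue())
                == Double.hashCode(b.doubleValue()));  // true
    }
}
```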
@paulp said: To be clear, and despite the fact that (see below) the goal is not achieved, the hashCodes for these classes are not modified from the native ones for anything to do with efficiency. There is only one reason: "equal objects must have equal hashCodes." So collision rates are beside the point. I checked, and we're not even doing the thing I talked about, so the thing I thought was a bad idea is already the way that it is, by which I mean:
{{{
scala> BigDecimal(Long.MaxValue) == BigInt(Long.MaxValue)
res9: Boolean = true

scala> (BigDecimal(Long.MaxValue) pow 2) == (BigInt(Long.MaxValue) pow 2)
res10: Boolean = false
}}}
So it's not like there's any wonderful behavior to preserve here. I'm not going to touch any of this; I'm only trying to illuminate it a little for whoever may attempt something.
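The inequality in that transcript can be reproduced against java.math directly: Scala's BigDecimal operations apply MathContext.DECIMAL128 (34 significant digits) by default, so squaring the 19-digit Long.MaxValue produces a 38-digit result that gets rounded. A sketch assuming only the JDK (the class name is invented):

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

public class RoundingDemo {
    public static void main(String[] args) {
        // Exact 38-digit square via BigInteger.
        BigInteger exact = BigInteger.valueOf(Long.MAX_VALUE).pow(2);
        // DECIMAL128 keeps only 34 significant digits, so the same
        // square loses its last four digits to rounding.
        BigDecimal rounded = new BigDecimal(Long.MAX_VALUE).pow(2, MathContext.DECIMAL128);
        System.out.println(exact);
        System.out.println(rounded);
        // Nonzero: the exact square is larger than the rounded one.
        System.out.println(new BigDecimal(exact).compareTo(rounded));  // 1
    }
}
```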
@paulp said: That's why they come out unequal in my previous comment - the BigDecimal version gets rounded.
@paulp said:
@djspiewak said: All of these things are problems, though some are beyond our purview (I'm pretty sure the rounding issue is Java's fault). Maybe we should just focus on trying to make isWhole a little more efficient. I can't think of any reason why that shouldn't be a constant-heap operation.
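A constant-heap whole-number test is indeed possible; here is one way to sketch it against java.math.BigDecimal (the helper name isWhole is ours, not Scala's actual implementation). A value is whole exactly when its scale, after trailing zeros are stripped, is non-positive, and this never materializes 10**scale:

```java
import java.math.BigDecimal;

public class WholeCheck {
    // Sketch of a cheap whole-number test: strip trailing zeros from the
    // unscaled value, then check the scale. No giant power of ten is built.
    // (Zero is special-cased to dodge old-JDK stripTrailingZeros quirks.)
    static boolean isWhole(BigDecimal d) {
        return d.signum() == 0 || d.stripTrailingZeros().scale() <= 0;
    }

    public static void main(String[] args) {
        System.out.println(isWhole(new BigDecimal("42.000")));  // true
        System.out.println(isWhole(new BigDecimal("42.5")));    // false
        // The value from this bug report: a huge whole number,
        // answered without any large allocation.
        System.out.println(isWhole(new BigDecimal("-4.6010652208491219E+1556245283")));  // true
    }
}
```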
@djspiewak said: So the only solution, then, would be to return a constant for sufficiently large values, both in BigInt and in BigDecimal.
@non said: Unfortunately I think this class is just busted. |
@paulp said:
Jeff Skaistis (jeffska) said: The offending code in java.math.BigDecimal builds the power of ten by hand:
{{{
// BigInteger.pow is slow, so make 10**n by constructing a
// BigInteger from a character array.
}}}
Unfortunately, there's no limit check in there, so it's creating a 3 GB array and dumping 1.5 billion zeros into it. Of course, the actual pow() method does have a sanity check, so it would have been better just to call it. I think the lesson here is that BigDecimal shouldn't be treated as if it were an actual floating-point number. Personally, I'd like to see support for the new IEEE decimal floating-point types.
@soc said:
@Ichoran said:
@adriaanm said:
{{{
BigDecimal("-4.6010652208491219E+1556245283").hashCode
}}}
Explodes with an OutOfMemoryError even with a 4 GB heap (tested using PaulP's sbt-extras script with SBT_MEM=4096). The issue is that the BigDecimal#isWhole function is extremely heap-intensive for large numbers. This is actually a bit silly, given that numbers like the one I pasted are never, ever going to be convertible to either Int or Double, which is what the isWhole check is intended to accomplish.
My proposal would be to change the implementation of BigDecimal#hashCode to do a quick sanity check on the size of the value, to see whether it could even possibly fall into the range of Int/Double. If not, check the sign and return the value that Double's ## would have produced anyway (for Double.POSITIVE_INFINITY or Double.NEGATIVE_INFINITY). As things stand, this is literally a security hole in Scala for anyone who parses user-specified text as BigDecimal, even with non-ARBITRARY contexts.
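That proposal might be sketched as follows; the method name safeHash and the cutoff of 310 are illustrative choices, not the actual Scala fix. The idea is that precision() - scale() bounds the decimal exponent, and finite doubles top out just below 1e309, so anything beyond that can short-circuit to the hash of the matching infinity:

```java
import java.math.BigDecimal;

public class SafeHash {
    // Hypothetical guard for the proposed hashCode: values whose magnitude
    // cannot possibly fit in a double get the hash of the corresponding
    // infinity, skipping any isWhole-style check entirely.
    static int safeHash(BigDecimal d) {
        // precision - scale bounds the base-10 exponent of the value.
        long decimalExponent = (long) d.precision() - d.scale();
        if (decimalExponent > 310) {
            double inf = d.signum() < 0 ? Double.NEGATIVE_INFINITY
                                        : Double.POSITIVE_INFINITY;
            return Double.hashCode(inf);
        }
        return d.hashCode();  // small enough: fall back to the normal path
    }

    public static void main(String[] args) {
        // The value from this report: answered in constant time and heap.
        System.out.println(safeHash(new BigDecimal("-4.6010652208491219E+1556245283")));
    }
}
```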