Scala Programming Language
  1. Scala Programming Language
  2. SI-4979

warning on a wider set of provable nonsense equality comparisons

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Misc Compiler
    • Labels:
      None

      Description

      [This is an RFC on a patch I would go off and try to write -- not a request for someone else to do the work.]

      BACKGROUND: My colleagues and I are porting a large codebase to Scala, and we're finding we often use Option for fields or variables that are "optional". (In the Python original of the codebase, these would be variables where None is an acceptable value.) This results in a lot of comparisons like
      {{ if (someUid == someOtherUid) ... }}
      where it's easy to forget that, e.g., someUid is of type Option[Int] while someOtherUid is of type Int.

      PROBLEM: This comparison will always fail, even in the case where the programmer intended it to succeed, but scalac currently emits no warning. In SI-4978 I suggested enabling a comprehensive but imprecise test under -Xlint; that'd help, but it'd be better if we can have a warning for this that is always right so we can turn it on unconditionally.

      SOLUTION PROPOSAL: In nsc/typechecker/RefChecks.scala we have a number of fairly specific checks that try to warn in cases like this. Here's the broadest warning so far:

              // Whether def equals(other: Any) is overridden
              def isUsingDefaultEquals = ...
              def isWarnable           = isReferenceOp || (isUsingDefaultEquals && isUsingDefaultScalaOp)
      ...
              else if (isWarnable) {
      ...
                else if (receiver.isFinal && !(receiver isSubClass actual)) {  // object X, Y; X == Y
                  if (isEitherNullable)
                    nonSensible("non-null ", false)
                  else
                    nonSensible("", false)
                }
              }
      

      IOW, we try to prove (a) that the equality method in question is the default one which just checks reference equality, and (b) that the types on the two sides have no values in common other than null, and if we can prove both (a) and (b) then we warn.

      In the case of Option[Int] and Int, (b) is true, but a bit trickier to prove because Option is not final. Fortunately Option is sealed, so all of its immediate subclasses are known, and its immediate subclasses Some and None are final, so indeed all of its subclasses are known. More complicated is (a). Some and None, as case classes, actually do have their own 'equals' methods, injected by typechecker/SyntheticMethods.scala. These synthetic case-class 'equals' methods still have straightforward semantics when the argument is of some alien type – so long as the argument's isInstanceOf method is also straightforward and returns false in that case.

      So I think it should be feasible to prove both (b) and a version of (a) in a wider set of cases that would include Option[Int] vs Int. My rough plan would be something like

      • extend isFinal to a broader predicate allowing us to prove (b), along the lines of
        noformat
        lazy val isNearFinal = isFinal || (isSealed && for (cls <- /* immediate subclasses */) cls.isNearFinal)
        noformat
      • have SyntheticMethods set a flag when it supplies 'equals', so we can distinguish its 'equals' from a user-supplied 'equals';
      • when isWarnable and receiver.isNearFinal, warn if !(T isSubClass actual) for all subclasses T of receiver;
      • otherwise if we're using the synthetic case-class 'equals' and receiver.isNearFinal, check that actual.isNearFinal and that we'll be using the default 'isInstanceOf', and warn unless one of the subclasses of receiver is also a subclass of actual.

      Does this plan make sense? If it does (or for some version of it that does), I'll go try to carry it out and come back with a patch.

        Activity

        Hide
        Paul Phillips added a comment -

        I think I found the angle which warns what can be warned. I turned it on by default, even.

        The algorithm is: if neither operand is a subclass of the other, then calculate the lub. If the lub >: Object, warn. That will catch truly unrelated types, but allows "siblings" like B == C where B and C are both subclasses of A. And it sidesteps the complicated analysis.

        I'll check it in shortly, let me know if you have examples you think can be caught which it doesn't catch.

        Show
        Paul Phillips added a comment - I think I found the angle which warns what can be warned. I turned it on by default, even. The algorithm is: if neither operand is a subclass of the other, then calculate the lub. If the lub >: Object, warn. That will catch truly unrelated types, but allows "siblings" like B == C where B and C are both subclasses of A. And it sidesteps the complicated analysis. I'll check it in shortly, let me know if you have examples you think can be caught which it doesn't catch.
        Hide
        Commit Message Bot added a comment -

        (extempore in r25638) Brought back unrelated type comparison warning.

        Figured out how to turn it on by default, even.
        Closes SI-4979, no review.

        Show
        Commit Message Bot added a comment - (extempore in r25638 ) Brought back unrelated type comparison warning. Figured out how to turn it on by default, even. Closes SI-4979 , no review.
        Hide
        Greg Price added a comment -

        Hmm. That would cover most of the examples I know of.

        Relative to the warning without the lub check, out of the 15 or so examples I know of, it eliminates some false positives and also one real error.

        It eliminates false positives where, e.g., the two sides are subclasses of GenSeqLike, which overrides equals() so that many of its otherwise unrelated subclasses can compare equal. This covers the only false positive in Quora's codebase, and if I recall correctly the lub check would also cut out the two false positives in scalac itself.

        A false positive is still possible; there's no reason in principle why I might not declare class A overriding equals() to accept instances of B, with ScalaObject the lub. In practice I probably want B's equals() to also accept A, and then perhaps I want to define that equals() only once, which would lead me to give A and B a common base class. If people follow that logic, then this is a pretty sharp heuristic.

        But then the lub check does also eliminate one of the two real errors which the warning found in scalac itself:

        src/compiler/scala/tools/nsc/interpreter/JLineReader.scala:57:
        warning: scala.tools.nsc.interpreter.session.JLineHistory and object
        scala.tools.nsc.interpreter.session.NoHistory are unrelated: probably
        always compare unequal
        if (history ne NoHistory)

        Here the lub is scala.tools.nsc.interpreter.session.History. I don't see such a tight software-engineering argument as I gave above to say why this shouldn't happen; I suspect the lub check will suppress a decent number of real bugs in practice.

        That could argue for an option to make this warning skip the lub check.

        Thanks! Glad you found a simple version that's precise enough to turn on by default.

        Show
        Greg Price added a comment - Hmm. That would cover most of the examples I know of. Relative to the warning without the lub check, out of the 15 or so examples I know of, it eliminates some false positives and also one real error. It eliminates false positives where, e.g., the two sides are subclasses of GenSeqLike, which overrides equals() so that many of its otherwise unrelated subclasses can compare equal. This covers the only false positive in Quora's codebase, and if I recall correctly the lub check would also cut out the two false positives in scalac itself. A false positive is still possible; there's no reason in principle why I might not declare class A overriding equals() to accept instances of B, with ScalaObject the lub. In practice I probably want B's equals() to also accept A, and then perhaps I want to define that equals() only once, which would lead me to give A and B a common base class. If people follow that logic, then this is a pretty sharp heuristic. But then the lub check does also eliminate one of the two real errors which the warning found in scalac itself: src/compiler/scala/tools/nsc/interpreter/JLineReader.scala:57: warning: scala.tools.nsc.interpreter.session.JLineHistory and object scala.tools.nsc.interpreter.session.NoHistory are unrelated: probably always compare unequal if (history ne NoHistory) Here the lub is scala.tools.nsc.interpreter.session.History. I don't see such a tight software-engineering argument as I gave above to say why this shouldn't happen; I suspect the lub check will suppress a decent number of real bugs in practice. That could argue for an option to make this warning skip the lub check. Thanks! Glad you found a simple version that's precise enough to turn on by default.
        Hide
        Paul Phillips added a comment -

        That it doesn't warn when ScalaObject is the lub is a bug - I tested String on the lhs and forgot it would show up on non-java classes. For the purposes of this heuristic, ScalaObject is as good as Object. I'll fix it.

        The history comparison is wrong but it's going to be the exception that one can sensibly warn about a once-valid comparison which becomes invalid after inheritance refines one of the types. I believe that would be a parade of false positives.

        Show
        Paul Phillips added a comment - That it doesn't warn when ScalaObject is the lub is a bug - I tested String on the lhs and forgot it would show up on non-java classes. For the purposes of this heuristic, ScalaObject is as good as Object. I'll fix it. The history comparison is wrong but it's going to be the exception that one can sensibly warn about a once-valid comparison which becomes invalid after inheritance refines one of the types. I believe that would be a parade of false positives.
        Hide
        Greg Price added a comment -

        Cool.

        Yeah, I agree, the majority of the cases that the lub check cuts out are likely to be false positives. I think the description in terms of a once-valid comparison followed by inheritance refining one of the types is narrower than the full set of bugs that the lub check would hide in practice, though. For example, the one place where the warning with the lub check fires in scalac is here:

        src/compiler/scala/tools/nsc/backend/msil/GenMSIL.scala:1266:
        warning: GenMSIL.this.global.icodes.BasicBlock and
        ch.epfl.lamp.compiler.msil.emit.Label are unrelated: probably always
        compare unequal
        if (next != defaultTarget)

        This is a real bug. As it happens, these two classes come from fairly separate parts of the codebase, so their lub is ScalaObject. But the authors of this code clearly think of them as related classes, or they wouldn't make this mistake; they both belong to the MSIL subsystem and are things that describe chunks of MSIL code; so it could have happened that the two classes shared some generic utility class like "MSILItem" that was used by that subsystem generally. In that case, the lub check would have suppressed this warning.

        Perhaps an option to issue this warning without the lub check would fit under -Ywarn-all; the SNR (excluding what the lub check does let through) is probably too low for -Xlint.

        Show
        Greg Price added a comment - Cool. Yeah, I agree, the majority of the cases that the lub check cuts out are likely to be false positives. I think the description in terms of a once-valid comparison followed by inheritance refining one of the types is narrower than the full set of bugs that the lub check would hide in practice, though. For example, the one place where the warning with the lub check fires in scalac is here: src/compiler/scala/tools/nsc/backend/msil/GenMSIL.scala:1266: warning: GenMSIL.this.global.icodes.BasicBlock and ch.epfl.lamp.compiler.msil.emit.Label are unrelated: probably always compare unequal if (next != defaultTarget) This is a real bug. As it happens, these two classes come from fairly separate parts of the codebase, so their lub is ScalaObject. But the authors of this code clearly think of them as related classes, or they wouldn't make this mistake; they both belong to the MSIL subsystem and are things that describe chunks of MSIL code; so it could have happened that the two classes shared some generic utility class like "MSILItem" that was used by that subsystem generally. In that case, the lub check would have suppressed this warning. Perhaps an option to issue this warning without the lub check would fit under -Ywarn-all; the SNR (excluding what the lub check does let through) is probably too low for -Xlint.

          People

          • Assignee:
            Unassigned
            Reporter:
            Greg Price
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development