Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

instances of sealed case class hierarchies with no mutable fields should be represented by singleton objects in memory, have their hashCode memoized, and equality implemented as reference equality #3375

Closed
scabug opened this issue Apr 30, 2010 · 6 comments

Comments

@scabug
Copy link

scabug commented Apr 30, 2010

Hello,
this remark applies to Scala 2.7.7.

I am using sealed, purely functional, case class hierarchies like this one :

sealed abstract class Op
abstract class BinOp extends Op
case object And extends BinOp
case object Or extends BinOp
abstract class UnOp extends Op
case object Not extends UnOp

sealed abstract class Expr
case class Ident(name:String) extends Expr
case class UnExpr( op:Unop, kid:Expression ) extends Expr
case class BinExpr(op: BinOp, left:Expression, right:Expression) extends Expr

to model abstract syntax trees and do some rewriting on them.

I've written my rewriting engine in a purely functional style, using trampolines (instead of mutually tail recursive functions because of the incomplete TCO support) to avoid overflowing the stack when doing various actions in pre/in/post order during traversal of deep trees.

Because of the purely functional coding style, I use Stacks, Maps and HashMaps to bind and unbind data with these AST nodes as the rewriting progresses top down, bottom up, breadth first or depth first.

As nothing can distinguish Expr instances having an identical structure, it would make sense to have all such instances represented by a single object in memory, and to have hashCode memoized at creation time, and equals redefined to be reference equality. This would provide a dramatic speedup, reduce memory footprint, would avoid stack overflow problems when inserting such things in HashTables, etc...

Because of the "bad" behaviour of :

  • memory allocation : instance with identical structure are not shared

  • hashCode : recursive but not tail recursive and not memoized

  • equality test : recursive but not tail recursive

for such case class hierarchies, the code is slow, eats up a lot of memory, and triggers stack overflows when inserting deep terms in Maps or HashMaps (whereas these things cause no problems in CAML for instance).

The only solution I've found to sidestep this is to manually maintain a pool unique instances (doing structural hashing and instance numbering), and to manually overload the hashCode and equals methods, which is a bit boring and non scalable. I'm under the impression that this structural hashing could be automated by the run time, or that the code could be transformed by the compiler to achieve this in the case of sealed hierarchies of case classes with no mutable fields.

This would make scala more efficient for language processing applications of this kind.

The remark would also apply to all generic containers instanciated for sealed case class hierarchies, like lists (lists suffixes could be globally unique in an application)

best regards.

@scabug
Copy link
Author

scabug commented Apr 30, 2010

Imported From: https://issues.scala-lang.org/browse/SI-3375?orig=1
Reporter: rede

@scabug
Copy link
Author

scabug commented May 1, 2010

@paulp said:
I am sympathetic to these issues. It should be addressed in the larger context of making case classes less of an all-or-nothing endeavor; in particular I think we should have a mechanism for autogenerated equals/hashCode without the other baggage. And such a mechanism could be able to address at least a subset of the above.

The direction I'd propose is compiler-recognized marker interfaces.

@scabug
Copy link
Author

scabug commented May 6, 2010

@dubochet said:
One thing to note: there is nothing that prevents a case class from having mutable state.�In the case of your Expr example, there isn't, but that is because none was defined. Furthermore, in some cases, identity remain important even is structure is identical — even with a case class, you can test for identity using method eq. As such, you cannot say, in general, that case classes with the same structure can be shared.

As Paul mentioned, this problem probably needs to be considered in terms of making case classes more configurable and not an all-or-nothing construct like it is now.

@scabug
Copy link
Author

scabug commented May 6, 2010

@dubochet said:
If this is a topic you want to pursue, we would very much like to have a more detailed proposal from you, probably in the form of a SID. If you want to go in that direction, I would recommend you discuss this topic on the mailing list to collect feedback from the user community.

@scabug
Copy link
Author

scabug commented Apr 10, 2012

thsoft said:
Hmm, I was kind of surprised when I saw this issue: I used to think that immutable case classes were implemented the proposed way. What's the obstacle of this?

@SethTisue
Copy link
Member

SethTisue commented Mar 2, 2018

stale ticket. could be revived on https://contributors.scala-lang.org

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants