instances of sealed case class hierarchies with no mutable fields should be represented by singleton objects in memory, have their hashCode memoized, and equality implemented as reference equality #3375

scabug · 2010-04-30T12:58:36Z

Hello,
this remark applies to Scala 2.7.7.

I am using sealed, purely functional, case class hierarchies like this one :

sealed abstract class Op
abstract class BinOp extends Op
case object And extends BinOp
case object Or extends BinOp
abstract class UnOp extends Op
case object Not extends UnOp

sealed abstract class Expr
case class Ident(name:String) extends Expr
case class UnExpr( op:Unop, kid:Expression ) extends Expr
case class BinExpr(op: BinOp, left:Expression, right:Expression) extends Expr

to model abstract syntax trees and do some rewriting on them.

I've written my rewriting engine in a purely functional style, using trampolines (instead of mutually tail recursive functions because of the incomplete TCO support) to avoid overflowing the stack when doing various actions in pre/in/post order during traversal of deep trees.

Because of the purely functional coding style, I use Stacks, Maps and HashMaps to bind and unbind data with these AST nodes as the rewriting progresses top down, bottom up, breadth first or depth first.

As nothing can distinguish Expr instances having an identical structure, it would make sense to have all such instances represented by a single object in memory, and to have hashCode memoized at creation time, and equals redefined to be reference equality. This would provide a dramatic speedup, reduce memory footprint, would avoid stack overflow problems when inserting such things in HashTables, etc...

Because of the "bad" behaviour of :

memory allocation : instance with identical structure are not shared
hashCode : recursive but not tail recursive and not memoized
equality test : recursive but not tail recursive

for such case class hierarchies, the code is slow, eats up a lot of memory, and triggers stack overflows when inserting deep terms in Maps or HashMaps (whereas these things cause no problems in CAML for instance).

The only solution I've found to sidestep this is to manually maintain a pool unique instances (doing structural hashing and instance numbering), and to manually overload the hashCode and equals methods, which is a bit boring and non scalable. I'm under the impression that this structural hashing could be automated by the run time, or that the code could be transformed by the compiler to achieve this in the case of sealed hierarchies of case classes with no mutable fields.

This would make scala more efficient for language processing applications of this kind.

The remark would also apply to all generic containers instanciated for sealed case class hierarchies, like lists (lists suffixes could be globally unique in an application)

best regards.

scabug · 2010-04-30T12:58:36Z

Imported From: https://issues.scala-lang.org/browse/SI-3375?orig=1
Reporter: rede

scabug · 2010-05-01T15:27:16Z

@paulp said:
I am sympathetic to these issues. It should be addressed in the larger context of making case classes less of an all-or-nothing endeavor; in particular I think we should have a mechanism for autogenerated equals/hashCode without the other baggage. And such a mechanism could be able to address at least a subset of the above.

The direction I'd propose is compiler-recognized marker interfaces.

scabug · 2010-05-06T09:36:23Z

@dubochet said:
One thing to note: there is nothing that prevents a case class from having mutable state.�In the case of your Expr example, there isn't, but that is because none was defined. Furthermore, in some cases, identity remain important even is structure is identical — even with a case class, you can test for identity using method eq. As such, you cannot say, in general, that case classes with the same structure can be shared.

As Paul mentioned, this problem probably needs to be considered in terms of making case classes more configurable and not an all-or-nothing construct like it is now.

scabug · 2010-05-06T09:38:33Z

@dubochet said:
If this is a topic you want to pursue, we would very much like to have a more detailed proposal from you, probably in the form of a SID. If you want to go in that direction, I would recommend you discuss this topic on the mailing list to collect feedback from the user community.

scabug · 2012-04-10T21:23:00Z

thsoft said:
Hmm, I was kind of surprised when I saw this issue: I used to think that immutable case classes were implemented the proposed way. What's the obstacle of this?

SethTisue · 2018-03-02T20:55:15Z

stale ticket. could be revived on https://contributors.scala-lang.org

scabug added enhancement help wanted labels Apr 7, 2017

SethTisue closed this as completed Mar 2, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

instances of sealed case class hierarchies with no mutable fields should be represented by singleton objects in memory, have their hashCode memoized, and equality implemented as reference equality #3375

instances of sealed case class hierarchies with no mutable fields should be represented by singleton objects in memory, have their hashCode memoized, and equality implemented as reference equality #3375

scabug commented Apr 30, 2010

scabug commented Apr 30, 2010

scabug commented May 1, 2010

scabug commented May 6, 2010

scabug commented May 6, 2010

scabug commented Apr 10, 2012

SethTisue commented Mar 2, 2018 •

edited

instances of sealed case class hierarchies with no mutable fields should be represented by singleton objects in memory, have their hashCode memoized, and equality implemented as reference equality #3375

instances of sealed case class hierarchies with no mutable fields should be represented by singleton objects in memory, have their hashCode memoized, and equality implemented as reference equality #3375

Comments

scabug commented Apr 30, 2010

scabug commented Apr 30, 2010

scabug commented May 1, 2010

scabug commented May 6, 2010

scabug commented May 6, 2010

scabug commented Apr 10, 2012

SethTisue commented Mar 2, 2018 • edited

SethTisue commented Mar 2, 2018 •

edited