Scala Programming Language
  1. Scala Programming Language
  2. SI-1133

nested extractors generate exponential-space bytecode

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Scala 2.10.0-M4
    • Component/s: Pattern Matcher
    • Labels:
      None
    • Environment:

      extractors, bytecode

      Description

      Pattern matching multiple cases against extractors triggers a combination explosion that quickly overwhelms the maximum method size. This has bitten me quite hard on my in-progress java to scala translator, as I use extractors against the JDT for parsing java source. I have attached a demonstration. You can make a minor tweak to the attached source and see the size of the bytecode drop in half (or double again, if FJBG wouldn't choke.)

      leaf:dfascala paulp$$ scalac *.scala
      Exception in thread "main" java.lang.Error: ch.epfl.lamp.fjbg.JCode$$OffsetTooBigException: offset too big to fit in 16 bits: 36580
      at ch.epfl.lamp.fjbg.JFieldOrMethod.writeTo(JFieldOrMethod.java:114)
      [...]

      The disassembled bytecode reveals enormous nested if-else statements with massive duplicated code.

        Activity

        Hide
        Daniel Sobral added a comment - - edited
        dcs@ayanami:~/tmp$ scalac -Yvirtpatmat  Tokeniser.scala 
        dcs@ayanami:~/tmp$ 
        

        Yay!

        Show
        Daniel Sobral added a comment - - edited dcs@ayanami:~/tmp$ scalac -Yvirtpatmat Tokeniser.scala dcs@ayanami:~/tmp$ Yay!
        Hide
        Daniel Sobral added a comment -

        For the record, I can also compile the original example. Also, I have tested the virtual pattern matcher against the regex example with the code below, and it passes. So, not only it doesn't crash, but it matches correctly so far. Thanks, Adriaan!

        object TokeniserTest extends Properties("Tokeniser") {                                                          
          val hexDigit = Gen oneOf (('0' to '9') ++ ('a' to 'f'))                                                       
          val identifier = Gen listOfN (8, hexDigit) map (_.mkString)                                                   
          val command0 = Gen oneOf ("status", "reset") map (_ + "=0")                                                   
          val command01 = for {                                                                                         
            cmd <- Gen oneOf ("motor", "side", "stream")                                                                
            digit <- Gen oneOf ("0", "1")                                                                               
          } yield cmd + "=" + digit                                                                                     
          val commandX = Gen oneOf ('0' to '9') map ("density="+_)                                                      
          val commandN = for {                                                                                          
            cmd <- Gen oneOf ("min_track", "max_track", "track")                                                        
            number <- Gen.numStr suchThat (_.size > 0)                                                                  
          } yield cmd + "=" + number                                                                                    
          val textGen = for {                                                                                           
            i <- id                                                                                                     
            cmd <- Gen oneOf (command0, command01, commandX, commandN)                                                  
          } yield i + ": " + cmd                                                                                        
                                                                                                                        
                                                                                                                        
          property("Parsing") = forAllNoShrink(textGen) ( text => Tokeniser.go(text)                                    
            matches """(status|reset|density \d+|(min |max )?track \d+|(motor|stream) (on|off)|side \d+)""" )           
        }
        
        Show
        Daniel Sobral added a comment - For the record, I can also compile the original example. Also, I have tested the virtual pattern matcher against the regex example with the code below, and it passes. So, not only it doesn't crash, but it matches correctly so far. Thanks, Adriaan! object TokeniserTest extends Properties("Tokeniser") { val hexDigit = Gen oneOf (('0' to '9') ++ ('a' to 'f')) val identifier = Gen listOfN (8, hexDigit) map (_.mkString) val command0 = Gen oneOf ("status", "reset") map (_ + "=0") val command01 = for { cmd <- Gen oneOf ("motor", "side", "stream") digit <- Gen oneOf ("0", "1") } yield cmd + "=" + digit val commandX = Gen oneOf ('0' to '9') map ("density="+_) val commandN = for { cmd <- Gen oneOf ("min_track", "max_track", "track") number <- Gen.numStr suchThat (_.size > 0) } yield cmd + "=" + number val textGen = for { i <- id cmd <- Gen oneOf (command0, command01, commandX, commandN) } yield i + ": " + cmd property("Parsing") = forAllNoShrink(textGen) ( text => Tokeniser.go(text) matches """(status|reset|density \d+|(min |max )?track \d+|(motor|stream) (on|off)|side \d+)""" ) }
        Hide
        Adriaan Moors added a comment -

        sweet! now let's see if we can speed this up a little...
        I'm getting started with the next phase of project VPM: run-time performance
        I really appreciate -Yvirtpatmat [bug|success] reports and benchmark/test cases!
        please assign them directly to me or just send me an email

        Show
        Adriaan Moors added a comment - sweet! now let's see if we can speed this up a little... I'm getting started with the next phase of project VPM: run-time performance I really appreciate -Yvirtpatmat [bug|success] reports and benchmark/test cases! please assign them directly to me or just send me an email
        Hide
        Matic Potočnik added a comment -

        Works fine for me with both 2.10.0-M1 -Yvirtpatmat, and 2.9.1. -Ypmat-naive.

        The code I had trouble with is way too messy to post, but I also confirm it works with Kierons case.

        Show
        Matic Potočnik added a comment - Works fine for me with both 2.10.0-M1 -Yvirtpatmat, and 2.9.1. -Ypmat-naive. The code I had trouble with is way too messy to post, but I also confirm it works with Kierons case .
        Hide
        Jason Zaugg added a comment -

        Test case pending merge:

        https://github.com/scala/scala/pull/542

        Show
        Jason Zaugg added a comment - Test case pending merge: https://github.com/scala/scala/pull/542

          People

          • Assignee:
            Paul Phillips
            Reporter:
            Paul Phillips
            TracCC:
            Alex Cruise, Daniel Sobral, Dustin Whitney, fmpwizard, Igor Rumiha, Ismael Juma, Johannes Rudolph, Juha Heljoranta, Kieron Wilkinson, Mirko Stocker, Nathan, Paul Phillips, Seth Tisue, tomer.doron, Willis Blackburn
          • Votes:
            8 Vote for this issue
            Watchers:
            13 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development