
Unable to inherit from Hadoop Mapper (new or old API) in 2.9.0 with code that functions in 2.8.1 #4603

Closed
scabug opened this issue May 17, 2011 · 25 comments
scabug commented May 17, 2011

=== What steps will reproduce the problem (please be specific and use wikiformatting)? ===

The following code works in 2.8.1:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{NullWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Mapper, Job}

object MyJob {
  def main(args: Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {

  }
}

=== What do you see instead? ===
I get this compiler error:

error: type mismatch;
found   : java.lang.Class[MyMapper](classOf[MyMapper])
required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
job.setMapperClass(classOf[MyMapper])

=== Additional information ===
Daniel Sobral very helpfully directed me here from my Stack Overflow question at http://stackoverflow.com/questions/6028221/how-does-one-implement-a-hadoop-mapper-in-scala-2-9-0.

The method definition of the failing call (Job.setMapperClass) is this:

public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls) throws java.lang.IllegalStateException { /* compiled code */ }

The method definition of Mapper itself is this:

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>

=== What versions of the following are you using? ===

  • Scala: 2.8.1 and 2.9.0
  • Java: latest
  • Operating system: OSX / Linux
scabug commented May 17, 2011

Imported From: https://issues.scala-lang.org/browse/SI-4603?orig=1
Reporter: Chris Bissell (chris_bissell)

scabug commented May 18, 2011

@lrytz said:
(comment by extempore):

This is induced by the use of a type constructor as a bound. Much simplified:

// J.java
public class J<T> {
  public static void f(java.lang.Class<? extends J> cls) { }
  // correctly it should be like this, and then it would work.
  // unfortunately that doesn't mean we don't have to deal with it.
  // public static void f(java.lang.Class<? extends J<?>> cls) { }
}
// S.scala
class S extends J[AnyRef]

object Test {    
  def main(args:Array[String]) {
    J.f(classOf[S])
  }
}

This results in:

S.scala:5: error: type mismatch;
 found   : java.lang.Class[S](classOf[S])
 required: java.lang.Class[_ <: J]
    J.f(classOf[S])
               ^
one error found

I don't immediately see how to work around it from the scala side alone, because scala is picky about how we express these things.

// nope
J.f(classOf[S]: Class[_ <: J[_]])

S.scala:5: error: type mismatch;
 found   : Class[_$1(in method main)] where type _$1(in method main) <: J[_]
 required: java.lang.Class[_ <: J]
    J.f(classOf[S]: Class[_ <: J[_]])
                  ^
one error found

// nope
J.f(classOf[S]: Class[J[_]])

S.scala:5: error: type mismatch;
 found   : java.lang.Class[S](classOf[S])
 required: Class[J[_]]
Note: S <: J[_] (and java.lang.Class[S](classOf[S]) <: java.lang.Class[S]), but Java-defined class Class is invariant in type T.
You may wish to investigate a wildcard type such as `_ <: J[_]`. (SLS 3.2.10)
    J.f(classOf[S]: Class[J[_]])
               ^
one error found

// naturally, nope
J.f(classOf[S]: Class[J])

S.scala:5: error: class J takes type parameters
    J.f(classOf[S]: Class[J])
                          ^
one error found
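For contrast, the same bound is accepted when it is expressed on the Scala side as an explicit existential rather than arriving as a Java raw type. A minimal, self-contained sketch (the names J2, S2, and f are invented here for illustration; this is not from the thread):

```scala
// Pure-Scala analogue of the J/S example above. Because the bound is
// written as a Scala existential (Class[_ <: J2[_]]) instead of being
// inferred from a Java raw type, classOf[S2] is accepted.
class J2[T]
class S2 extends J2[AnyRef]

object InvarianceDemo {
  def f(cls: Class[_ <: J2[_]]): String = cls.getSimpleName

  def main(args: Array[String]): Unit = {
    // java.lang.Class is invariant in T, so Class[S2] never conforms to
    // Class[J2[_]] itself, only to wildcard-bounded forms like this one.
    println(f(classOf[S2]))
  }
}
```

This suggests the failure is specific to how raw types from Java signatures are translated, not to existential bounds in general.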

scabug commented May 18, 2011

@odersky said:
Paul, can you bisect to see what made the regression happen? Then assign to the one who did the commit (might well be myself).

scabug commented May 21, 2011

@paulp said:
It is indeed yourself, in r24275.

commit d832118727bc6ffc1b87519086bae47e06090bb6
Author: odersky odersky@5e8d7ff9-d8ef-0310-90f0-a4852d11357a
Date: Sun Feb 13 15:04:53 2011 +0000

More tweaks to rawToExistential to avoid pile-up of transformations.

git-svn-id: http://lampsvn.epfl.ch/svn-repos/scala/scala/trunk@24275 5e8d7ff9-d8ef-0310-90f0-a4852d11357a

scabug commented Jun 1, 2011

@ijuma said:
I have verified that this is fixed in 2.9.0-1. Could not find the exact commit that fixed it though.

scabug commented Jun 5, 2011

@leifwickland said:
I was unable to duplicate Ismael Juma's finding that this is fixed in 2.9.0-1. I created an sbt project with build.properties like:

project.scratch=true
project.name=test
sbt.version=0.7.7
project.version=1.0
build.scala.versions=2.9.0-1
project.initialize=false

The build/Project.scala looked like:

import sbt._

class Project(info: ProjectInfo) extends DefaultProject(info) {
  val clouderaRepo = "cloudera release" at "https://repository.cloudera.com/content/repositories/releases"
  val cdhVer = "0.20.2-cdh3u0"
  val hadoopCore = "org.apache.hadoop" % "hadoop-core" % cdhVer % "provided"
}

The project contained exactly one source file:

import org.apache.hadoop._
import org.apache.hadoop.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.mapreduce._

object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {
  }
}

Compiling resulted in the following error message

[error] ....MyJob.scala:9: type mismatch;
[error]  found   : java.lang.Class[MyMapper](classOf[MyMapper])
[error]  required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
[error]     job.setMapperClass(classOf[MyMapper])
[error]                               ^
[error] one error found

scabug commented Jun 5, 2011

@leifwickland said (edited on Jun 5, 2011 6:37:52 PM UTC):
I also saw the same compile error on 2.9.1-SNAPSHOT.

I should've mentioned explicitly that I tested with Cloudera's CDH3, which is based on Hadoop 0.20.2.

scabug commented Jun 5, 2011

@ijuma said:
Hi Leif. Interesting. The difference between your example and mine is that my mapper is an inner class of the object. I tried doing the same with your example, and then it works.

So you're right that this is not actually fixed; it's just that you can work around it by making the mapper an inner class of the object. Coincidentally, I made that change before I tested with 2.9.0-1, and that made me think it had been fixed in 2.9.0-1.

scabug commented Jun 6, 2011

@leifwickland said (edited on Jun 6, 2011 9:31:44 PM UTC):
Ismael,

Thanks for taking another look at this.

I had actually tried the inner class trick before I posted and it failed for me. I discovered experimentally that that's because I put my inner class after the main(). If the inner class MyMapper is declared before the main, it works fine.

It turns out, having MyMapper as a top level class declared in the same file and before MyJob also works around the problem.
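Taken together, the workarounds reported above amount to a layout constraint: declare the mapper before any classOf reference to it, either nested inside the job object or as a top-level class earlier in the same file. A self-contained sketch of that shape follows; note that the Mapper stub here merely stands in for org.apache.hadoop.mapreduce.Mapper, so this illustrates only the declaration order, not the bug itself:

```scala
// Stub standing in for org.apache.hadoop.mapreduce.Mapper (assumption:
// the real Hadoop classes are not on the classpath here).
class Mapper[KEYIN, VALUEIN, KEYOUT, VALUEOUT]

object MyJob {
  // Workaround shape: the mapper is nested in the job object and
  // declared *before* main(), matching the thread's findings.
  class MyMapper extends Mapper[Long, String, String, String]

  def main(args: Array[String]): Unit = {
    // With the real Hadoop Job this would be
    // job.setMapperClass(classOf[MyMapper]).
    val cls: Class[_ <: Mapper[_, _, _, _]] = classOf[MyMapper]
    println(cls.getSimpleName)
  }
}
```

As the later comments show, this ordering trick was fragile (it stopped working in larger multi-file builds), so it was a stopgap rather than a fix.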

scabug commented Jun 20, 2011

@leifwickland said (edited on Jun 20, 2011 7:30:11 PM UTC):
For an unknown reason, the workaround of defining the mapper and reducer before the job stopped working when I added enough files to my project. I didn't notice the problem until I made a change that caused sbt (v0.10) to have to recompile a few files. Now I see an error like:

[error]  found   : java.lang.Class[MyReducer](classOf[MyReducer])
[error]  required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Reducer]
[error]     job.setCombinerClass(classOf[MyReducer])

scabug commented Jun 20, 2011

@leifwickland said:
Going back to Scala 2.8.1 allowed me to avoid this problem.

scabug commented Jul 6, 2011

@paulp said:
See also duplicate #4769. I can see exactly what line is causing this and removing the line does fix this. Unfortunately there is no test case to give me an idea of what the line is supposed to be preventing.

scabug commented Jul 14, 2011

Thomas Themel (themel) said:
Here's what I ran into, which might make a nicer test case than Hadoop because all the required classes come with the JDK:

package something.weird

import javax.xml.bind.annotation.adapters.{XmlAdapter,XmlJavaTypeAdapter}
import scala.reflect.BeanInfo

@BeanInfo class Intermediate {
    var foo : String = _
    var bar: String = _
}

@XmlJavaTypeAdapter(classOf[Adapter]) class Original(val foo: String, val bar: String) { 
}

class Adapter extends XmlAdapter[Intermediate, Original] { 
    def marshal(o: Original) : Intermediate = null
    def unmarshal(i: Intermediate) : Original = null
}

Compiles in 2.8.1, fails in 2.9.0 with

[error] jaxb.scala:11: type mismatch;
[error]  found   : java.lang.Class[something.weird.Adapter](classOf[something.weird.Adapter])
[error]  required: java.lang.Class[_ <: javax.xml.bind.annotation.adapters.XmlAdapter]
[error] @XmlJavaTypeAdapter(classOf[Adapter]) class Original(val foo: String, val bar: String) {

scabug commented Jul 28, 2011

@paulp said:
See also duplicate #4822.

scabug commented Jul 28, 2011

Shai Yallin (electricmonk) said:
We ran into a problem similar to Thomas Themel's (a colleague of mine opened #4822). I think it's safe to assume that the raw-types issue exists across the JDK and third-party libraries, and since we can't get anyone else to fix it, I reckon a fix in the Scala compiler is needed, even though this is not strictly a bug in the compiler...

scabug commented Jul 28, 2011

Commit Message Bot (anonymous) said:
(odersky in r25394) Closes #4603. Review by extempore.

@scabug scabug closed this as completed Jul 28, 2011
scabug commented Jul 28, 2011

@odersky said:
My commit r25394 should fix this.

scabug commented Jul 28, 2011

Commit Message Bot (anonymous) said:
(extempore in r25403) Test case for #4603, no review.

scabug commented Jul 28, 2011

@paulp said:
This will be in the next RC for 2.9.1 as well.

scabug commented Aug 11, 2011

@leifwickland said:
I verified that my problem is resolved in 2.9.1 RC2. Thanks for the fix!

scabug commented Aug 17, 2011

Chris Bissell (chris_bissell) said:
Verified that the problem is fixed in 2.9.1.RC3 for our code. Many thanks! And the compiler is snappier too!

scabug commented Aug 28, 2011

Jiamin Zhao (littledumb) said (edited on Aug 28, 2011 1:49:44 AM UTC):
Will this fix be backported to 2.8.1 or 2.9.0? I ask because I'm using Lift and GridGain, and I have the same problem even with 2.8.1.

scabug commented Nov 12, 2013

Federico Ragona (federico.ragona-at-gmail.com) said (edited on Nov 12, 2013 8:05:18 PM UTC):
Hello,
I'm facing a similar issue in Scala 2.10.3, only when extending a Java class from a Scala class. This issue is not reproduced if AMapper (see code below) is implemented as a Scala class.

Code

Here is my project layout:

src
  main
    java
      myhadoop
        AMapper.java
    scala
      myhadoop
        MyMapper.scala

And these are my classes' sources:

// AMapper.java
package myhadoop;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class AMapper extends Mapper<Text, BytesWritable, Text, Text> {
  public void handleThis(Text key, Mapper<Text, BytesWritable, Text, Text>.Context context, Text text) {
    // not relevant to this test
  }
}
// MyMapper.scala
package myhadoop

import org.apache.hadoop.io.{BytesWritable, Text}
import org.apache.hadoop.mapreduce.Mapper

class MyMapper extends AMapper {
 
  override def map(key: Text, value: BytesWritable, context: Mapper[Text, BytesWritable, Text, Text]#Context) {
    handleThis(key, context, new Text(value.getBytes))
  }
}

Error

Compiling this code will raise the following error:

error: type mismatch;
found: org.apache.hadoop.mapreduce.Mapper[org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text]#Context
required: org.apache.hadoop.mapreduce.Mapper[KEYIN,VALUEIN,KEYOUT,VALUEOUT]#Context

pointing to the handleThis(key, context, new Text(bytes)) line.

Environment

Scala 2.10.3
Java 1.7.0u21
Ubuntu Linux 12.04

scabug commented Nov 12, 2013

@soc said:
Hi Federico, could you provide a self-contained piece of code? That would make it easier and faster to investigate this issue. Thanks!

scabug commented Nov 12, 2013

Federico Ragona (federico.ragona-at-gmail.com) said:
Simon, I have to apologize to you.
I thought that the snippet I pasted in my original comment was enough to reproduce the issue but it's definitely not. I just edited my original comment to provide you with all the information needed to reproduce it. As I mentioned in the comment, the compiler will raise this issue only when extending a Java class.

I hope this helps you.
