build.sbt

 

name := "spark"

version := "1.0"

scalaVersion := "2.11.7"

 

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11"      % "1.5.1",
  "org.apache.spark" % "spark-streaming_2.11" % "1.5.1",
  "org.apache.spark" % "spark-mllib_2.11"     % "1.5.1"
)

 

 

From: Бобров Виктор [mailto:ma...@bk.ru] 
Sent: Thursday, December 10, 2015 2:54 PM
To: 'Harsh J' <ha...@cloudera.com>
Cc: user@spark.apache.org
Subject: RE: Can't filter

 

Spark 1.5.1, thanks for the help.

 

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import scala.io.Source


object SimpleApp {
  def main(args: Array[String]) {
    // Map each line of the input file, split into words, to its line number.
    val A = scala.collection.mutable.Map[Array[String], Int]()
    val filename = "C:\\Users\\bobrov\\IdeaProjects\\spark\\file\\spark1.txt"
    for ((line, i) <- Source.fromFile(filename).getLines().zipWithIndex) {
      val lst = line.split(" ")
      A += (lst -> i)
    }

    // Keep only the pairs whose left element comes from an earlier line.
    def filter1(tp: ((Array[String], Int), (Array[String], Int))): Boolean = {
      tp._1._2 < tp._2._2
    }

    val conf = new SparkConf().setMaster("spark://web01:7077").setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val mail_rdd = sc.parallelize(A.toSeq).cache()
    val step1 = mail_rdd.cartesian(mail_rdd) // every pair of (words, line number)
    val step2 = step1.filter(filter1)
    //step1.collect().foreach(println)
  }
}
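For what it's worth: when a driver launched from an IDE talks to a remote standalone master like spark://web01:7077, the application's classes are not shipped to the executors unless the jar is registered, which is one common way to end up with a "Class not found" during closure cleaning. A minimal sketch of that workaround, where the jar path is an assumption about the local sbt output and should be adjusted to your build:

// Sketch only: the jar path below is an assumption about the sbt build layout.
val conf = new SparkConf()
  .setAppName("Simple Application")
  .setMaster("spark://web01:7077")
  .setJars(Seq("target/scala-2.11/spark_2.11-1.0.jar")) // ship app classes to executors

// Or, while iterating locally, run everything in-process instead:
// val conf = new SparkConf().setAppName("Simple Application").setMaster("local[*]")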

 

From: Harsh J [mailto:ha...@cloudera.com] 
Sent: Thursday, December 10, 2015 2:50 PM
To: Бобров Виктор <ma...@bk.ru>; Ndjido Ardo Bar <ndj...@gmail.com>
Cc: user@spark.apache.org
Subject: Re: Can't filter

 

Are you sure you do not have any messages preceding the trace, such as one 
quoting which class is found to be missing? That would be helpful to see, and 
would suggest what exactly may be going wrong. It appears similar to 
https://issues.apache.org/jira/browse/SPARK-8368, but I cannot tell for certain 
because I don't know whether your code uses SparkSQL features.

 

Also, what version of Spark are you running?

 

I am able to run your program without a problem in Spark 1.5.x (with a sample 
Seq).
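For comparison, a self-contained variant along those lines; the object name, the sample Seq, and the local[*] master are assumptions for illustration, standing in for your file-derived Map and the cluster:

import org.apache.spark.{SparkConf, SparkContext}

object FilterRepro {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("FilterRepro").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical sample data standing in for the (words, line number) pairs.
    val sample: Seq[(Array[String], Int)] =
      Seq(Array("a", "b") -> 0, Array("c", "d") -> 1, Array("e") -> 2)

    val mail_rdd = sc.parallelize(sample).cache()
    val step2 = mail_rdd.cartesian(mail_rdd).filter(tp => tp._1._2 < tp._2._2)
    step2.collect().foreach { case ((_, i), (_, j)) => println(s"$i < $j") }
    sc.stop()
  }
}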

 

On Thu, Dec 10, 2015 at 5:01 PM Бобров Виктор <ma...@bk.ru> wrote:

at com.esotericsoftware.reflectasm.shaded.org.objectweb.asm.ClassReader.a(Unknown Source)
at com.esotericsoftware.reflectasm.shaded.org.objectweb.asm.ClassReader.<init>(Unknown Source)
at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:40)
at org.apache.spark.util.ClosureCleaner$.getInnerClosureClasses(ClosureCleaner.scala:81)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:187)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2030)
at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:331)
at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:330)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.RDD.filter(RDD.scala:330)
at SimpleApp$GeneratedEvaluatorClass$44$1.invoke(FileToCompile0.scala:30)
at SimpleApp$.main(test1.scala:26)
at SimpleApp.main(test1.scala)

 

From: Ndjido Ardo Bar [mailto:ndj...@gmail.com] 
Sent: Thursday, December 10, 2015 2:20 PM
To: Бобров Виктор <ma...@bk.ru>
Cc: user@spark.apache.org
Subject: Re: Can't filter

 

Please send your call stack with the full description of the exception.


On 10 Dec 2015, at 12:10, Бобров Виктор <ma...@bk.ru> wrote:

Hi, I can’t filter my RDD.

 

def filter1(tp: ((Array[String], Int), (Array[String], Int))): Boolean = {
  tp._1._2 > tp._2._2
}
val mail_rdd = sc.parallelize(A.toSeq).cache()
val step1 = mail_rdd.cartesian(mail_rdd)
val step2 = step1.filter(filter1)

 

I get the error “Class not found”. What am I doing wrong? Thanks for the help.
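A side note on the snippet itself: Array[String] compares by reference on the JVM, so a Map keyed by Array treats two lines with identical words as distinct keys. If key equality ever matters, a hedged sketch of a safer variant, reusing filename from the program above:

// Sketch: an immutable Seq key compares element-wise, unlike Array.
val A = scala.collection.mutable.Map[Seq[String], Int]()
for ((line, i) <- Source.fromFile(filename).getLines().zipWithIndex) {
  A += (line.split(" ").toSeq -> i)
}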

 

 

 
