Hi Jozef, all, I seem to be running into another issue. Here is what I did:
I'm running the Spark Streaming - Kafka integration using Spark 2.x and Kafka 0.10. I compiled the program using sbt, and the compilation went through fine. I was also able to import the project into Eclipse and run the program from Eclipse. However, when I run the program using spark-submit, I get the following error:

----------------------------------
> $SPARK_HOME/bin/spark-submit --class
> "structuredStreaming.kafka.StructuredKafkaWordCount1" --master local[2]
> /Users/karanalang/Documents/Technology/Coursera_spark_scala/structuredStreamingKafka/target/scala-2.11/StructuredStreamingKafka-assembly-1.0.jar
>
> java.lang.ClassNotFoundException: structuredStreaming.kafka.StructuredKafkaWordCount1
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:348)
>     at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
>     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:695)
>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
>     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
------------------------------

I've put the jar on the classpath, but I still get the error:

> echo $CLASSPATH
.:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/jopt-simple-3.2.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/kafka-clients-0.9.0.1.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/kafka_2.11-0.9.0.1.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/log4j-1.2.17.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/metrics-core-2.2.0.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/scala-library-2.11.7.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/slf4j-api-1.7.6.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/slf4j-log4j12-1.7.6.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/snappy-java-1.1.1.7.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/zkclient-0.7.jar:
/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/zookeeper-3.4.6.jar:
/Users/karanalang/Documents/Technology/ApacheSpark-v2.1/spark-2.1.0-bin-hadoop2.7/jars/*.jar:
/Users/karanalang/Documents/Technology/kafka/mirrormaker_topic_rename-master/target/mmchangetopic-1.0-SNAPSHOT.jar:
/Users/karanalang/Documents/Technology/Coursera_spark_scala/structuredStreamingKafka/target/scala-2.11/StructuredStreamingKafka-assembly-1.0.jar

When I look inside the jar StructuredStreamingKafka-assembly-1.0.jar, I don't see the file StructuredKafkaWordCount1.class. Attaching my build.sbt. Any ideas on what I need to do?

On Mon, Jun 19, 2017 at 1:39 PM, Jozef.koval <jozef.ko...@protonmail.ch> wrote:

> Hi Karan,
>
> You assume incorrectly: KafkaUtils is part of the (already mentioned)
> spark-streaming-kafka library. I have not seen Eclipse in ages, but I
> would suggest importing the sbt project there; that could make things
> easier for you. Alternatively, using ENSIME with your favorite editor
> will make your life much easier.
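A quick way to check the missing-class question above: a jar is just a zip archive, so its entries can be listed directly. The sketch below builds a small stand-in jar with python3's zipfile module (so it runs without a JDK); on the real build you would list StructuredStreamingKafka-assembly-1.0.jar with `jar tf` or the same zipfile command. The temp directory and demo class file here are hypothetical stand-ins for illustration.

```shell
# Sketch: confirm whether a class actually made it into an assembly jar.
set -e
tmp=$(mktemp -d)
# Build a stand-in jar whose layout mirrors the expected package path.
mkdir -p "$tmp/structuredStreaming/kafka"
: > "$tmp/structuredStreaming/kafka/StructuredKafkaWordCount1.class"
(cd "$tmp" && python3 -m zipfile -c app.jar structuredStreaming)
# List the jar's entries and grep for the class; no output would mean the
# class was never packaged into the jar.
python3 -m zipfile -l "$tmp/app.jar" | grep StructuredKafkaWordCount1
```

If the class does not appear in the real assembly jar, the usual causes are a source file sitting outside src/main/scala (sbt's default source root) or a package declaration that does not match the --class name given to spark-submit.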
>
> Jozef
>
> Sent from ProtonMail <https://protonmail.ch>, encrypted email based in
> Switzerland.
>
> -------- Original Message --------
> Subject: Re: Kafka-Spark Integration - build failing with sbt
> Local Time: June 19, 2017 8:02 PM
> UTC Time: June 19, 2017 6:02 PM
> From: karan.al...@gmail.com
> To: Jozef.koval <jozef.ko...@protonmail.ch>, users@kafka.apache.org
>
> Hi Jozef - I do have an additional basic question.
> When I tried to compile the code in Eclipse, I was not able to; e.g.
>
> import org.apache.spark.streaming.kafka.KafkaUtils
>
> gave errors saying KafkaUtils was not part of the package.
> However, when I used sbt to compile, the compilation went through fine.
>
> So I assume additional libraries are downloaded when I list the
> appropriate packages in libraryDependencies?
> Which ones would have helped compile this?
>
> On Sat, Jun 17, 2017 at 2:52 PM, karan alang <karan.al...@gmail.com>
> wrote:
>
>> Hey Jozef,
>>
>> Thanks for the quick response.
>> Yes, you are right: the spark-sql dependency was missing. Added that
>> and it worked fine.
>>
>> regds,
>> Karan Alang
>>
>> On Sat, Jun 17, 2017 at 2:24 PM, Jozef.koval <jozef.ko...@protonmail.ch>
>> wrote:
>>
>>> Hey Karan,
>>> I believe you are missing the spark-sql dependency.
>>>
>>> Jozef
>>>
>>> Sent from ProtonMail <https://protonmail.ch>, encrypted email based in
>>> Switzerland.
>>>
>>> -------- Original Message --------
>>> Subject: Re: Kafka-Spark Integration - build failing with sbt
>>> Local Time: June 17, 2017 10:52 PM
>>> UTC Time: June 17, 2017 8:52 PM
>>> From: karan.al...@gmail.com
>>> To: users@kafka.apache.org, Jozef.koval <jozef.ko...@protonmail.ch>
>>>
>>> Thanks, I was able to get this working.
>>> Here is what I added in the build.sbt file:
>>> ----------------------------------------------------------------
>>>
>>> scalaVersion := "2.11.7"
>>>
>>> val sparkVers = "2.1.0"
>>>
>>> // Base Spark-provided dependencies
>>> libraryDependencies ++= Seq(
>>>   "org.apache.spark" %% "spark-core" % sparkVers % "provided",
>>>   "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
>>>   "org.apache.spark" % "spark-streaming-kafka-0-8_2.11" % sparkVers)
>>>
>>> ----------------------------------------------------------------
>>>
>>> However, I'm running into an additional issue when compiling the file
>>> with sbt; it gives the errors shown below.
>>>
>>> Please note: I've added the jars in Eclipse and it works there; however,
>>> when I use sbt to compile, it fails.
>>>
>>> What needs to be done?
>>>
>>> I've also tried adding the jars to CLASSPATH, but I still get the same
>>> error.
>>>
>>> ----------------------------------------------------------------
>>>
>>> [info] Compiling 1 Scala source to /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/target/scala-2.11/classes...
>>>
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:6: object SQLContext is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.SQLContext
>>> [error]        ^
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:10: object SparkSession is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.SparkSession
>>> [error]        ^
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:11: object types is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.types.StructField
>>> [error]        ^
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:12: object types is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.types.StringType
>>> [error]        ^
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:13: object types is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.types.StructType
>>> [error]        ^
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:14: object Row is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.Row
>>> [error]        ^
>>>
>>> On Sat, Jun 17, 2017 at 12:35 AM, Jozef.koval <jozef.ko...@protonmail.ch>
>>> wrote:
>>>
>>>> Hi Karan,
>>>>
>>>> spark-streaming-kafka is for old Spark (version < 1.6.3);
>>>> spark-streaming-kafka-0-8 is for current Spark (version 2.0+).
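As a side note, the `is not a member of package org.apache.spark.sql` errors above are the classic symptom of the missing spark-sql dependency mentioned earlier in this thread; a minimal build.sbt sketch (version number assumed from the thread):

```scala
// build.sbt fragment (sketch): spark-sql provides SQLContext, SparkSession,
// Row, and the org.apache.spark.sql.types package used by the failing imports.
val sparkVers = "2.1.0"  // assumed from the thread
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVers % "provided"
```

Marking it "provided" keeps it out of the assembly jar, since spark-submit puts Spark's own jars on the classpath at run time.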
>>>> Jozef
>>>>
>>>> n.b. there is also a version for Kafka 0.10+; see
>>>> https://spark.apache.org/docs/latest/streaming-kafka-integration.html
>>>>
>>>> Sent from ProtonMail (https://protonmail.ch), encrypted email based
>>>> in Switzerland.
>>>>
>>>> -------- Original Message --------
>>>> Subject: Kafka-Spark Integration - build failing with sbt
>>>> Local Time: June 17, 2017 1:50 AM
>>>> UTC Time: June 16, 2017 11:50 PM
>>>> From: karan.al...@gmail.com
>>>> To: users@kafka.apache.org
>>>>
>>>> I'm trying to compile Kafka & Spark Streaming integration code, i.e.
>>>> reading from Kafka using Spark Streaming,
>>>> and the sbt build is failing with the error:
>>>>
>>>> [error] (*:update) sbt.ResolveException: unresolved dependency:
>>>> org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>>>>
>>>> Scala version -> 2.11.7
>>>> Spark version -> 2.1.0
>>>> Kafka version -> 0.9
>>>> sbt version -> 0.13
>>>>
>>>> The contents of the sbt files are shown below:
>>>>
>>>> 1)
>>>> vi spark_kafka_code/project/plugins.sbt
>>>>
>>>> addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
>>>>
>>>> 2)
>>>> vi spark_kafka_code/sparkkafka.sbt
>>>>
>>>> import AssemblyKeys._
>>>> assemblySettings
>>>>
>>>> name := "SparkKafka Project"
>>>>
>>>> version := "1.0"
>>>> scalaVersion := "2.11.7"
>>>>
>>>> val sparkVers = "2.1.0"
>>>>
>>>> // Base Spark-provided dependencies
>>>> libraryDependencies ++= Seq(
>>>>   "org.apache.spark" %% "spark-core" % sparkVers % "provided",
>>>>   "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
>>>>   "org.apache.spark" %% "spark-streaming-kafka" % sparkVers)
>>>>
>>>> mergeStrategy in assembly := {
>>>>   case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
>>>>   // compare against lowercase, otherwise this case can never match
>>>>   case m if m.toLowerCase.startsWith("meta-inf") => MergeStrategy.discard
>>>>   case "reference.conf" => MergeStrategy.concat
>>>>   case m if m.endsWith("UnusedStubClass.class") => MergeStrategy.discard
>>>>   case _ => MergeStrategy.first
>>>> }
>>>>
>>>> I launch sbt and then try to create an Eclipse project; the complete
>>>> error is shown below.
>>>>
>>>> ---------------------
>>>>
>>>> sbt
>>>> [info] Loading global plugins from /Users/karanalang/.sbt/0.13/plugins
>>>> [info] Loading project definition from /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/project
>>>> [info] Set current project to SparkKafka Project (in build file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/)
>>>> > eclipse
>>>> [info] About to create Eclipse project files for your project(s).
>>>> [info] Updating {file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/}spark_kafka_code...
>>>> [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
>>>> [warn] module not found: org.apache.spark#spark-streaming-kafka_2.11;2.1.0
>>>> [warn] ==== local: tried
>>>> [warn]  /Users/karanalang/.ivy2/local/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>>> [warn] ==== activator-launcher-local: tried
>>>> [warn]  /Users/karanalang/.activator/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>>> [warn] ==== activator-local: tried
>>>> [warn]  /Users/karanalang/Documents/Technology/SCALA/activator-dist-1.3.10/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>>> [warn] ==== public: tried
>>>> [warn]  https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
>>>> [warn] ==== typesafe-releases: tried
>>>> [warn]  http://repo.typesafe.com/typesafe/releases/org/apache/spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
>>>> [warn] ==== typesafe-ivy-releasez: tried
>>>> [warn]  http://repo.typesafe.com/typesafe/ivy-releases/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>>> [info] Resolving jline#jline;2.12.1 ...
>>>> [warn] ::::::::::::::::::::::::::::::::::::::::::::::
>>>> [warn] ::          UNRESOLVED DEPENDENCIES         ::
>>>> [warn] ::::::::::::::::::::::::::::::::::::::::::::::
>>>> [warn] :: org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>>>> [warn] ::::::::::::::::::::::::::::::::::::::::::::::
>>>> [warn]
>>>> [warn] Note: Unresolved dependencies path:
>>>> [warn]   org.apache.spark:spark-streaming-kafka_2.11:2.1.0 (/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/sparkkafka.sbt#L12-16)
>>>> [warn]   +- sparkkafka-project:sparkkafka-project_2.11:1.0
>>>> [trace] Stack trace suppressed: run last *:update for the full output.
>>>> [error] (*:update) sbt.ResolveException: unresolved dependency: org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>>>> [info] Updating {file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/}spark_kafka_code...
>>>> [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
>>>> [... the same "module not found" resolver warnings and UNRESOLVED
>>>> DEPENDENCIES block are printed verbatim a second time ...]
>>>> [error] (*:update) sbt.ResolveException: unresolved dependency: org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>>>> [error] Could not create Eclipse project files:
>>>> [error] Error evaluating task "scalacOptions": error
>>>> [error] Error evaluating task "externalDependencyClasspath": error
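For completeness, the unresolved dependency above stems from the artifact name: `spark-streaming-kafka` without a suffix only exists up to Spark 1.6.x, while the Spark 2.x artifacts embed the supported Kafka API version in their name. A dependency block that should resolve from Maven Central, assuming Spark 2.1.0 on Scala 2.11 (pick the 0-8 or 0-10 variant to match your broker):

```scala
// build.sbt fragment (sketch): Spark 2.x Kafka integration artifacts
val sparkVers = "2.1.0"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % sparkVers % "provided",
  "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
  // for the Kafka 0.8 consumer API:
  "org.apache.spark" %% "spark-streaming-kafka-0-8" % sparkVers
  // or, for the Kafka 0.10+ consumer API:
  // "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVers
)
```

The `%%` operator appends the Scala binary version, so with scalaVersion 2.11.x this resolves spark-streaming-kafka-0-8_2.11, which is what Maven Central actually hosts for Spark 2.1.0.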