Hi Jozef, all,

I seem to be running into another issue. Here is what I did:

I'm running Spark Streaming - Kafka integration using Spark 2.x & Kafka 0.10.

I compiled the program using sbt, and the compilation went through fine.
I was able to import this into Eclipse & run the program from Eclipse.

However, when I run the program using spark-submit, I'm getting the
following error:

----------------------------------

>  $SPARK_HOME/bin/spark-submit --class
> "structuredStreaming.kafka.StructuredKafkaWordCount1" --master local[2]
> /Users/karanalang/Documents/Technology/Coursera_spark_scala/structuredStreamingKafka/target/scala-2.11/StructuredStreamingKafka-assembly-1.0.jar



> java.lang.ClassNotFoundException: structuredStreaming.kafka.StructuredKafkaWordCount1
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:695)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
------------------------------

I've put the jar in the CLASSPATH, but I still get the error:

echo $CLASSPATH

.:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/jopt-simple-3.2.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/kafka-clients-0.9.0.1.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/kafka_2.11-0.9.0.1.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/log4j-1.2.17.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/metrics-core-2.2.0.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/scala-library-2.11.7.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/slf4j-api-1.7.6.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/slf4j-log4j12-1.7.6.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/snappy-java-1.1.1.7.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/zkclient-0.7.jar:/Users/karanalang/Documents/Technology/kafka/kafka_2.11-0.9.0.1/lib/zookeeper-3.4.6.jar:/Users/karanalang/Documents/Technology/ApacheSpark-v2.1/spark-2.1.0-bin-hadoop2.7/jars/*.jar:/Users/karanalang/Documents/Technology/kafka/mirrormaker_topic_rename-master/target/mmchangetopic-1.0-SNAPSHOT.jar:/Users/karanalang/Documents/Technology/Coursera_spark_scala/structuredStreamingKafka/target/scala-2.11/StructuredStreamingKafka-assembly-1.0.jar


When I look inside the jar StructuredStreamingKafka-assembly-1.0.jar, I
don't see the file "StructuredKafkaWordCount1.class".
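As a sanity check, the jar's contents can be listed directly - a jar is just a zip archive. The snippet below is a sketch: it builds a tiny stand-in archive so the commands run as-is; on the real assembly jar you would point the listing at its path and grep for the fully-qualified class file.

```shell
# Sketch: verify that a class file was actually packaged into an assembly jar.
# A stand-in archive is created here so the listing step is demonstrable;
# substitute the real jar path in practice.
mkdir -p demo/structuredStreaming/kafka
touch demo/structuredStreaming/kafka/StructuredKafkaWordCount1.class
python3 -m zipfile -c demo-assembly.jar demo/
# List the archive entries and look for the expected class file path
python3 -m zipfile -l demo-assembly.jar | grep 'StructuredKafkaWordCount1.class'
```

If the grep comes back empty on the real jar, the class was never compiled in - typically because the source file is not under src/main/scala.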

Attaching my build.sbt.

Any ideas on what I need to do?
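For reference, here is the shape of the build I'm aiming for - a sketch only, where the coordinates and the assembly mainClass line are illustrative rather than a verbatim copy of my file:

```scala
// build.sbt sketch - names and coordinates are assumptions, not my exact file
import AssemblyKeys._
assemblySettings

name := "StructuredStreamingKafka"
version := "1.0"
scalaVersion := "2.11.7"

val sparkVers = "2.1.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVers % "provided",
  "org.apache.spark" %% "spark-sql" % sparkVers % "provided",
  "org.apache.spark" % "spark-sql-kafka-0-10_2.11" % sparkVers)

// The class named in spark-submit must end up in the jar: sources have to
// live under src/main/scala for sbt to compile them, and declaring the main
// class makes the assembly jar advertise it in its manifest.
mainClass in assembly := Some("structuredStreaming.kafka.StructuredKafkaWordCount1")
```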


On Mon, Jun 19, 2017 at 1:39 PM, Jozef.koval <jozef.ko...@protonmail.ch>
wrote:

> Hi Karan,
>
> You assume incorrectly - KafkaUtils is part of the (already mentioned)
> spark-streaming-kafka library. I have not seen Eclipse in ages, but I would
> suggest importing the sbt project there; that could make things easier for
> you. Alternatively, using ENSIME with your favorite editor will make your
> life much easier.
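> A sketch of the corresponding sbt line (Spark version assumed to match
> yours):
>
> ```scala
> // org.apache.spark.streaming.kafka.KafkaUtils ships in the 0-8 connector
> libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-8_2.11" % "2.1.0"
> ```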
>
> Jozef
>
> Sent from ProtonMail <https://protonmail.ch>, encrypted email based in
> Switzerland.
>
>
> -------- Original Message --------
> Subject: Re: Kafka-Spark Integration - build failing with sbt
> Local Time: June 19, 2017 8:02 PM
> UTC Time: June 19, 2017 6:02 PM
> From: karan.al...@gmail.com
> To: Jozef.koval <jozef.ko...@protonmail.ch>
> users@kafka.apache.org <users@kafka.apache.org>
>
> Hi Jozef - I do have an additional basic question.
> When I tried to compile the code in Eclipse, I was not able to do so.
>
> e.g.
> import org.apache.spark.streaming.kafka.KafkaUtils
>
> gave errors saying KafkaUtils was not part of the package.
> However, when I used sbt to compile, the compilation went through fine.
>
> So, I assume additional libraries are being downloaded when I provide the
> appropriate packages in libraryDependencies?
> Which ones would have helped compile this?
>
> On Sat, Jun 17, 2017 at 2:52 PM, karan alang <karan.al...@gmail.com>
> wrote:
>
>> Hey Jozef,
>>
>> Thanks for the quick response.
>> Yes, you are right - the spark-sql dependency was missing. I added that & it
>> worked fine.
>>
>> regds,
>> Karan Alang
>>
>> On Sat, Jun 17, 2017 at 2:24 PM, Jozef.koval <jozef.ko...@protonmail.ch>
>> wrote:
>>
>>> Hey Karan,
>>> I believe you are missing spark-sql dependency.
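>>> For example, a sketch (reusing your sparkVers value):
>>>
>>> ```scala
>>> libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVers % "provided"
>>> ```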
>>>
>>> Jozef
>>>
>>>
>>>
>>> -------- Original Message --------
>>> Subject: Re: Kafka-Spark Integration - build failing with sbt
>>> Local Time: June 17, 2017 10:52 PM
>>> UTC Time: June 17, 2017 8:52 PM
>>> From: karan.al...@gmail.com
>>> To: users@kafka.apache.org, Jozef.koval <jozef.ko...@protonmail.ch>
>>>
>>>
>>> Thanks, I was able to get this working.
>>> Here is what I added in the build.sbt file:
>>> ----------------------------------------------------------------------------------------------
>>>
>>> scalaVersion := "2.11.7"
>>>
>>> val sparkVers = "2.1.0"
>>>
>>> // Base Spark-provided dependencies
>>>
>>> libraryDependencies ++= Seq(
>>>
>>>   "org.apache.spark" %% "spark-core" % sparkVers % "provided",
>>>
>>>   "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
>>>
>>>   "org.apache.spark" % "spark-streaming-kafka-0-8_2.11" % sparkVers)
>>>
>>> ------------------------------------------------------------------------------------------------
>>>
>>> However, I'm running into an additional issue when compiling the file using
>>> sbt .. it gives errors as shown below:
>>>
>>> Please note - I've added the jars in Eclipse, and it works; however, when
>>> I use sbt to compile, it fails.
>>>
>>> What needs to be done?
>>>
>>> I've also tried adding the jars to CLASSPATH, but I still get the same
>>> error.
>>>
>>> -------------------------------------------------------------------------------------------------
>>>
>>> [info] Compiling 1 Scala source to /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/target/scala-2.11/classes...
>>>
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:6: object SQLContext is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.SQLContext
>>> [error]        ^
>>>
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:10: object SparkSession is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.SparkSession
>>> [error]        ^
>>>
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:11: object types is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.types.StructField
>>> [error]                             ^
>>>
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:12: object types is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.types.StringType
>>> [error]                             ^
>>>
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:13: object types is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.types.StructType
>>> [error]                             ^
>>>
>>> [error] /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/src/main/scala/spark/kafka/SparkKafkaDS.scala:14: object Row is not a member of package org.apache.spark.sql
>>> [error] import org.apache.spark.sql.Row
>>> [error]        ^
>>>
>>>
>>>
>>> On Sat, Jun 17, 2017 at 12:35 AM, Jozef.koval <jozef.ko...@protonmail.ch
>>> > wrote:
>>>
>>>> Hi Karan,
>>>>
>>>> spark-streaming-kafka is for old Spark (version < 1.6.3)
>>>> spark-streaming-kafka-0-8 is for current Spark (version > 2.0)
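>>>> e.g. a sketch of the coordinate that should resolve (Spark 2.1.0 and
>>>> Scala 2.11 assumed):
>>>>
>>>> ```scala
>>>> libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.1.0"
>>>> ```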
>>>>
>>>> Jozef
>>>>
>>>> n.b. there is also a version for Kafka 0.10+, see [this](
>>>> https://spark.apache.org/docs/latest/streaming-kafka-integration.html)
>>>>
>>>>
>>>>
>>>> -------- Original Message --------
>>>> Subject: Kafka-Spark Integration - build failing with sbt
>>>> Local Time: June 17, 2017 1:50 AM
>>>> UTC Time: June 16, 2017 11:50 PM
>>>> From: karan.al...@gmail.com
>>>> To: users@kafka.apache.org
>>>>
>>>> I'm trying to compile Kafka & Spark Streaming integration code, i.e.
>>>> reading from Kafka using Spark Streaming,
>>>> and the sbt build is failing with the error -
>>>>
>>>> [error] (*:update) sbt.ResolveException: unresolved dependency:
>>>> org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>>>>
>>>> Scala version -> 2.11.7
>>>> Spark Version -> 2.1.0
>>>> Kafka version -> 0.9
>>>> sbt version -> 0.13
>>>>
>>>> Contents of sbt files is as shown below ->
>>>>
>>>> 1)
>>>> vi spark_kafka_code/project/plugins.sbt
>>>>
>>>> addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
>>>>
>>>> 2)
>>>> vi spark_kafka_code/sparkkafka.sbt
>>>>
>>>> import AssemblyKeys._
>>>> assemblySettings
>>>>
>>>> name := "SparkKafka Project"
>>>>
>>>> version := "1.0"
>>>> scalaVersion := "2.11.7"
>>>>
>>>> val sparkVers = "2.1.0"
>>>>
>>>> // Base Spark-provided dependencies
>>>> libraryDependencies ++= Seq(
>>>>   "org.apache.spark" %% "spark-core" % sparkVers % "provided",
>>>>   "org.apache.spark" %% "spark-streaming" % sparkVers % "provided",
>>>>   "org.apache.spark" %% "spark-streaming-kafka" % sparkVers)
>>>>
>>>> mergeStrategy in assembly := {
>>>>   case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
>>>>   case m if m.toLowerCase.startsWith("meta-inf") => MergeStrategy.discard
>>>>   case "reference.conf" => MergeStrategy.concat
>>>>   case m if m.endsWith("UnusedStubClass.class") => MergeStrategy.discard
>>>>   case _ => MergeStrategy.first
>>>> }
>>>>
>>>> I launch sbt, and then try to create an Eclipse project; the complete
>>>> error is as shown below -
>>>>
>>>> ---------------------
>>>>
>>>> sbt
>>>> [info] Loading global plugins from /Users/karanalang/.sbt/0.13/plugins
>>>> [info] Loading project definition from /Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/project
>>>> [info] Set current project to SparkKafka Project (in build file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/)
>>>> > eclipse
>>>> [info] About to create Eclipse project files for your project(s).
>>>> [info] Updating {file:/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/}spark_kafka_code...
>>>> [info] Resolving org.apache.spark#spark-streaming-kafka_2.11;2.1.0 ...
>>>> [warn] module not found: org.apache.spark#spark-streaming-kafka_2.11;2.1.0
>>>> [warn] ==== local: tried
>>>> [warn]   /Users/karanalang/.ivy2/local/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>>> [warn] ==== activator-launcher-local: tried
>>>> [warn]   /Users/karanalang/.activator/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>>> [warn] ==== activator-local: tried
>>>> [warn]   /Users/karanalang/Documents/Technology/SCALA/activator-dist-1.3.10/repository/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>>> [warn] ==== public: tried
>>>> [warn]   https://repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
>>>> [warn] ==== typesafe-releases: tried
>>>> [warn]   http://repo.typesafe.com/typesafe/releases/org/apache/spark/spark-streaming-kafka_2.11/2.1.0/spark-streaming-kafka_2.11-2.1.0.pom
>>>> [warn] ==== typesafe-ivy-releasez: tried
>>>> [warn]   http://repo.typesafe.com/typesafe/ivy-releases/org.apache.spark/spark-streaming-kafka_2.11/2.1.0/ivys/ivy.xml
>>>> [info] Resolving jline#jline;2.12.1 ...
>>>> [warn] ::::::::::::::::::::::::::::::::::::::::::::::
>>>> [warn] ::          UNRESOLVED DEPENDENCIES         ::
>>>> [warn] ::::::::::::::::::::::::::::::::::::::::::::::
>>>> [warn] :: org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>>>> [warn] ::::::::::::::::::::::::::::::::::::::::::::::
>>>> [warn]
>>>> [warn] Note: Unresolved dependencies path:
>>>> [warn]   org.apache.spark:spark-streaming-kafka_2.11:2.1.0 (/Users/karanalang/Documents/Technology/Coursera_spark_scala/spark_kafka_code/sparkkafka.sbt#L12-16)
>>>> [warn]     +- sparkkafka-project:sparkkafka-project_2.11:1.0
>>>> [trace] Stack trace suppressed: run last *:update for the full output.
>>>> [error] (*:update) sbt.ResolveException: unresolved dependency: org.apache.spark#spark-streaming-kafka_2.11;2.1.0: not found
>>>> [error] Could not create Eclipse project files:
>>>> [error] Error evaluating task "scalacOptions": error
>>>> [error] Error evaluating task "externalDependencyClasspath": error
>>>> >
>>>>
>>>
>>>
>
