"--package" will add transitive dependencies that are not "$SPARK_HOME/external/kafka-0-10-sql/target/*.jar".
> i have tried building the jar with dependencies, but still face the same error.

What's the command you used?

On Wed, Jun 28, 2017 at 12:00 PM, satyajit vegesna <satyajit.apas...@gmail.com> wrote:

> Hi All,
>
> I am trying to build the kafka-0-10-sql module under the external folder in
> the Apache Spark source code.
> Once I generate the jar file using
>
>   build/mvn package -DskipTests -pl external/kafka-0-10-sql
>
> the jar file is created under external/kafka-0-10-sql/target.
>
> I then run spark-shell with the jars created in the target folder:
>
>   bin/spark-shell --jars $SPARK_HOME/external/kafka-0-10-sql/target/*.jar
>
> and get the error below:
>
> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
> 17/06/28 11:54:03 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Spark context Web UI available at http://10.1.10.241:4040
> Spark context available as 'sc' (master = local[*], app id = local-1498676043936).
> Spark session available as 'spark'.
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/ '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 2.3.0-SNAPSHOT
>       /_/
>
> Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
> Type in expressions to have them evaluated.
> Type :help for more information.
>
> scala> val lines = spark.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option("subscribe", "test").load()
>
> java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
>   at org.apache.spark.sql.kafka010.KafkaSourceProvider$.<init>(KafkaSourceProvider.scala:378)
>   at org.apache.spark.sql.kafka010.KafkaSourceProvider$.<clinit>(KafkaSourceProvider.scala)
>   at org.apache.spark.sql.kafka010.KafkaSourceProvider.validateStreamOptions(KafkaSourceProvider.scala:325)
>   at org.apache.spark.sql.kafka010.KafkaSourceProvider.sourceSchema(KafkaSourceProvider.scala:60)
>   at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:192)
>   at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:87)
>   at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:87)
>   at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:30)
>   at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:150)
>   ... 48 elided
> Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.serialization.ByteArrayDeserializer
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 57 more
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> I have tried building the jar with dependencies, but still face the same error.
>
> But when I use --packages with spark-shell, as in
>
>   bin/spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.1.0
>
> it works fine.
> The reason I am trying to build from source is that I want to try pushing
> DataFrame data into a Kafka topic, based on
> https://github.com/apache/spark/commit/b0a5cd89097c563e9949d8cfcf84d18b03b8d24c,
> which doesn't work with version 2.1.0.
>
> Any help would be highly appreciated.
>
> Regards,
> Satyajit.
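For context, a minimal sketch of the DataFrame-to-Kafka write that commit enables, following the Kafka sink conventions that later shipped in Spark 2.2 (the topic name and bootstrap server are placeholders, and the patched spark-sql-kafka and kafka-clients jars are assumed to be on the spark-shell classpath):

  // In spark-shell, spark.implicits._ is already imported, so toDF works directly.
  // The sink reads the "key" and "value" columns and writes to the named topic.
  val df = Seq(("k1", "v1"), ("k2", "v2")).toDF("key", "value")
  df.write
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("topic", "test")
    .save()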