Hi Sean,

For years now, I'd say, you have had to specify --packages to get the Kafka-related jars on the classpath. I simply got used to this annoyance (as did others). Could it be because it's considered an external package (although it is an integral part of Spark)?!

I'm very glad you've brought it up, since I think the Kafka data source is so important that it should be included in spark-shell and spark-submit by default. THANKS!

Best regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
Follow me at https://twitter.com/jaceklaskowski

On Sat, Aug 4, 2018 at 9:56 PM, Sean Owen <sro...@gmail.com> wrote:

> Let's take this to https://issues.apache.org/jira/browse/SPARK-25026 -- I
> provisionally marked this a Blocker, as if it's correct, then the release
> is missing an important piece and we'll want to remedy that ASAP. I still
> have this feeling I am missing something. The classes really aren't there
> in the release but ... *nobody* noticed all this time? I guess maybe
> Spark-Kafka users may be using a vendor distro that does package these bits.
>
>
> On Sat, Aug 4, 2018 at 10:48 AM Sean Owen <sro...@gmail.com> wrote:
>
>> I was debugging why a Kafka-based streaming app doesn't seem to find
>> Kafka-related integration classes when run standalone from our latest 2.3.1
>> release, and noticed that there doesn't seem to be any Kafka-related jars
>> from Spark in the distro.
>> In jars/, I see:
>>
>> spark-catalyst_2.11-2.3.1.jar
>> spark-core_2.11-2.3.1.jar
>> spark-graphx_2.11-2.3.1.jar
>> spark-hive-thriftserver_2.11-2.3.1.jar
>> spark-hive_2.11-2.3.1.jar
>> spark-kubernetes_2.11-2.3.1.jar
>> spark-kvstore_2.11-2.3.1.jar
>> spark-launcher_2.11-2.3.1.jar
>> spark-mesos_2.11-2.3.1.jar
>> spark-mllib-local_2.11-2.3.1.jar
>> spark-mllib_2.11-2.3.1.jar
>> spark-network-common_2.11-2.3.1.jar
>> spark-network-shuffle_2.11-2.3.1.jar
>> spark-repl_2.11-2.3.1.jar
>> spark-sketch_2.11-2.3.1.jar
>> spark-sql_2.11-2.3.1.jar
>> spark-streaming_2.11-2.3.1.jar
>> spark-tags_2.11-2.3.1.jar
>> spark-unsafe_2.11-2.3.1.jar
>> spark-yarn_2.11-2.3.1.jar
>>
>> I checked make-distribution.sh, and it copies a bunch of JARs into the
>> distro, but does not seem to touch the kafka modules.
>>
>> Am I crazy or missing something obvious -- those should be in the
>> release, right?
>>
>
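
For anyone wanting to repeat the check on their own distro, here is a rough sketch in Python. It parses the jar names from the listing in this thread and reports whether any Kafka module is among them. The helper names (`spark_modules`, `kafka_modules`, `packages_coordinate`) are made up for illustration, and the `spark-<module>_<scala>-<version>.jar` naming pattern is an assumption inferred from the file names in the listing, not something taken from make-distribution.sh.

```python
import re

# Jar names copied from the jars/ listing quoted in this thread.
JARS = [
    "spark-catalyst_2.11-2.3.1.jar", "spark-core_2.11-2.3.1.jar",
    "spark-graphx_2.11-2.3.1.jar", "spark-hive-thriftserver_2.11-2.3.1.jar",
    "spark-hive_2.11-2.3.1.jar", "spark-kubernetes_2.11-2.3.1.jar",
    "spark-kvstore_2.11-2.3.1.jar", "spark-launcher_2.11-2.3.1.jar",
    "spark-mesos_2.11-2.3.1.jar", "spark-mllib-local_2.11-2.3.1.jar",
    "spark-mllib_2.11-2.3.1.jar", "spark-network-common_2.11-2.3.1.jar",
    "spark-network-shuffle_2.11-2.3.1.jar", "spark-repl_2.11-2.3.1.jar",
    "spark-sketch_2.11-2.3.1.jar", "spark-sql_2.11-2.3.1.jar",
    "spark-streaming_2.11-2.3.1.jar", "spark-tags_2.11-2.3.1.jar",
    "spark-unsafe_2.11-2.3.1.jar", "spark-yarn_2.11-2.3.1.jar",
]

# Assumed naming pattern: spark-<module>_<scala version>-<spark version>.jar
SPARK_JAR = re.compile(
    r"^spark-(?P<module>.+)_(?P<scala>\d+\.\d+)-(?P<version>[\w.]+)\.jar$"
)


def spark_modules(jar_names):
    """Return the set of Spark module names found among the jar file names."""
    modules = set()
    for name in jar_names:
        match = SPARK_JAR.match(name)
        if match:
            modules.add(match.group("module"))
    return modules


def kafka_modules(jar_names):
    """Return the sorted Spark module names that mention Kafka."""
    return sorted(m for m in spark_modules(jar_names) if "kafka" in m)


def packages_coordinate(module, scala, version):
    """Build a Maven coordinate suitable for passing to --packages."""
    return f"org.apache.spark:spark-{module}_{scala}:{version}"


if __name__ == "__main__":
    found = kafka_modules(JARS)
    if found:
        print("Kafka modules in distro:", ", ".join(found))
    else:
        print("No Kafka modules in distro; pass something like:")
        print("  --packages", packages_coordinate("sql-kafka-0-10", "2.11", "2.3.1"))
```

Run against the listing above, `kafka_modules(JARS)` comes back empty, which matches what Sean describes. The coordinate built at the end is the one typically passed to --packages for the Structured Streaming Kafka source on Scala 2.11 and Spark 2.3.1; adjust the module, Scala and Spark versions to your build.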