One of the packages contains just the streaming-kafka code.  The other,
the "-assembly" package, contains that code plus everything it depends on
(Kafka and its transitive dependencies).  That's what "assembly" typically
means in JVM land.
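
To check this for a particular jar, a small script along the following
lines can list the matching entries.  This is only a sketch: the function
name find_entries is made up for illustration, and it relies on nothing
more than a jar being an ordinary zip archive.

```python
import zipfile


def find_entries(jar_path, *needles):
    """Return the entries of a jar whose path contains every needle."""
    # A jar is a zip archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist()
                if all(needle in name for needle in needles)]


if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name):
    #   python find_entries.py <path-to-jar> yammer Gauge
    for entry in find_entries(sys.argv[1], *sys.argv[2:]):
        print(entry)
```

Run against the assembly jar with the needles "yammer" and "Gauge", this
should print the same class path that `jar tvf ... | grep` reports in the
quoted message below.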

Java/Scala users are accustomed to using their own build tool to include
necessary dependencies.  JVM dependency management is (thankfully)
different from Python dependency management.

As far as I can tell, there is no core issue, upstream or otherwise.

On Tue, May 12, 2015 at 11:39 AM, Lee McFadden <[email protected]> wrote:

> Thanks again for all the help folks.
>
> I can confirm that simply switching to `--packages
> org.apache.spark:spark-streaming-kafka-assembly_2.10:1.3.1` makes
> everything work as intended.
>
> I'm not sure what the difference is between the two packages honestly, or
> why one should be used over the other, but the documentation is currently
> not intuitive in this matter.  If you follow the instructions, initially it
> will seem broken.  Is there any reason why the docs for Python users (or,
> in fact, all users - Java/Scala users will run into this too except they
> are armed with the ability to build their own jar with the dependencies
> included) should not be changed to using the assembly package by default?
>
> Additionally, after a few google searches yesterday combined with your
> help I'm wondering if the core issue is upstream in Kafka's dependency
> chain?
>
> On Tue, May 12, 2015 at 8:53 AM Ted Yu <[email protected]> wrote:
>
>> bq. it is already in the assembly
>>
>> Yes. Verified:
>>
>> $ jar tvf ~/Downloads/spark-streaming-kafka-assembly_2.10-1.3.1.jar | grep yammer | grep Gauge
>>   1329 Sat Apr 11 04:25:50 PDT 2015 com/yammer/metrics/core/Gauge.class
>>
>>
>> On Tue, May 12, 2015 at 8:05 AM, Sean Owen <[email protected]> wrote:
>>
>>> It doesn't depend directly on yammer metrics; Kafka does. It wouldn't
>>> be correct to declare that it does; it is already in the assembly
>>> anyway.
>>>
>>> On Tue, May 12, 2015 at 3:50 PM, Ted Yu <[email protected]> wrote:
>>> > Currently external/kafka/pom.xml doesn't cite yammer metrics as a
>>> > dependency.
>>> >
>>> > $ ls -l ~/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar
>>> > -rw-r--r--  1 tyu  staff  82123 Dec 17  2013 /Users/tyu/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar
>>> >
>>> > Including the metrics-core jar would not increase the size of the final
>>> > release artifact much.
>>> >
>>> > My two cents.
>>>
>>
>>
