Hi Jeremy,

If you are using *addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.4")*
in "project/plugin.sbt", you also need a "project/project/build.scala" that
pins the same sbt-assembly version (0.11.4), for example:

import sbt._

// Builds sbt-assembly from source at the tagged git revision,
// rather than resolving a published artifact.
object Plugins extends Build {
  lazy val root = Project("root", file(".")) dependsOn (
    uri("git://github.com/sbt/sbt-assembly.git#0.11.4")
  )
}


Then try *sbt assembly*.
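For reference, a single consolidated build.sbt for this project might look
roughly like the sketch below. Artifact names and version numbers are taken
from this thread; the sbt-assembly 0.x `AssemblyKeys`/`assemblySettings`
usage and the "provided" scoping (to keep Spark core out of the uber jar)
are assumptions, not something verified here:

```scala
// Sketch of a build.sbt for sbt-assembly 0.11.x (assumed sbt 0.13 style).
// Spark itself is marked "provided" (assumption): the cluster supplies it,
// so only the external Twitter bits land in the uber jar.
import AssemblyKeys._   // brought in by the sbt-assembly 0.x plugin

assemblySettings

name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"              % "1.0.0" % "provided",
  "org.apache.spark" %% "spark-streaming"         % "1.0.0" % "provided",
  "org.apache.spark" %% "spark-streaming-twitter" % "1.0.0",
  "org.twitter4j"    %  "twitter4j-stream"        % "3.0.3"
)
```

With that in place, *sbt assembly* should emit one jar under
target/scala-2.10/ that you hand directly to spark-submit.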

Let me know whether it works.

Regards,
prabeesh



On Thu, Jun 5, 2014 at 1:16 PM, Nick Pentreath <nick.pentre...@gmail.com>
wrote:

> Great - well we do hope we hear from you, since the user list is for
> interesting success stories and anecdotes, as well as blog posts etc too :)
>
>
> On Thu, Jun 5, 2014 at 9:40 AM, Jeremy Lee <unorthodox.engine...@gmail.com
> > wrote:
>
>> Oh. Yes of course. *facepalm*
>>
>> I'm sure I typed that at first, but at some point my fingers decided to
>> grammar-check me. Stupid fingers. I wonder what "sbt assemble" does? (apart
>> from error) It certainly takes a while to do it.
>>
>> Thanks for the maven offer, but I'm not scheduled to learn that until
>> after Scala, streaming, graphx, mllib, HDFS, sbt, Python, and yarn. I'll
>> probably need to know it for yarn, but I'm really hoping to put it off
>> until then. (fortunately I already knew about linux, AWS, eclipse, git,
>> java, distributed programming and ssh keyfiles, or I would have been in
>> real trouble)
>>
>> Ha! OK, that worked for the Kafka project... fails on the other old 0.9
>> Twitter project, but who cares... now for mine....
>>
>> HAHA! YES!! Oh thank you! I have the equivalent of "hello world" that
>> uses one external library! Now the compiler and I can have a _proper_
>> conversation.
>>
>> Hopefully you won't be hearing from me for a while.
>>
>>
>>
>> On Thu, Jun 5, 2014 at 3:06 PM, Nick Pentreath <nick.pentre...@gmail.com>
>> wrote:
>>
>>> The "magic incantation" is "sbt assembly" (not "assemble").
>>>
>>> Actually I find maven with their assembly plugins to be very easy (mvn
>>> package). I can send a Pom.xml for a skeleton project if you need
>>> —
>>> Sent from Mailbox <https://www.dropbox.com/mailbox>
>>>
>>>
>>> On Thu, Jun 5, 2014 at 6:59 AM, Jeremy Lee <
>>> unorthodox.engine...@gmail.com> wrote:
>>>
>>>> Hmm.. That's not working so well for me. First, I needed to add a
>>>> "project/plugin.sbt" file with the contents:
>>>>
>>>> addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.4")
>>>>
>>>> Before 'sbt/sbt assemble' worked at all. And I'm not sure about that
>>>> version number, but "0.9.1" isn't working much better and "11.4" is the
>>>> latest one recommended by the sbt project site. Where did you get your
>>>> version from?
>>>>
>>>> Second, even when I do get it to build a .jar, spark-submit is still
>>>> telling me the external.twitter library is missing.
>>>>
>>>> I tried using your github project as-is, but it also complained about
>> the missing plugin. I'm trying it with various versions now to see if I
>>>> can get that working, even though I don't know anything about kafka. Hmm,
>>>> and no. Here's what I get:
>>>>
>>>>  [info] Set current project to Simple Project (in build
>>>> file:/home/ubuntu/spark-1.0.0/SparkKafka/)
>>>> [error] Not a valid command: assemble
>>>> [error] Not a valid project ID: assemble
>>>> [error] Expected ':' (if selecting a configuration)
>>>> [error] Not a valid key: assemble (similar: assembly, assemblyJarName,
>>>> assemblyDirectory)
>>>> [error] assemble
>>>> [error]
>>>>
>>>> I also found this project which seemed to be exactly what I was after:
>>>>  https://github.com/prabeesh/SparkTwitterAnalysis
>>>>
>>>> ...but it was for Spark 0.9, and though I updated all the version
>>>> references to "1.0.0", that one doesn't work either. I can't even get it to
>>>> build.
>>>>
>>>> *sigh*
>>>>
>>>> Is it going to be easier to just copy the external/ source code into my
>>>> own project? Because I will... especially if creating "Uberjars" takes this
>>>> long every... single... time...
>>>>
>>>>
>>>>
>>>> On Thu, Jun 5, 2014 at 8:52 AM, Jeremy Lee <
>>>> unorthodox.engine...@gmail.com> wrote:
>>>>
>>>>> Thanks Patrick!
>>>>>
>>>>> Uberjars. Cool. I'd actually heard of them. And thanks for the link to
>>>>> the example! I shall work through that today.
>>>>>
>>>>> I'm still learning sbt and its many options... the last new framework
>>>>> I learned was node.js, and I think I've been rather spoiled by "npm".
>>>>>
>>>>> At least it's not maven. Please, oh please don't make me learn maven
>>>>> too. (The only people who seem to like it have Software Stockholm 
>>>>> Syndrome:
>>>>> "I know maven kidnapped me and beat me up, but if you spend long enough
>>>>> with it, you eventually start to sympathize and see its point of view".)
>>>>>
>>>>>
>>>>> On Thu, Jun 5, 2014 at 3:39 AM, Patrick Wendell <pwend...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hey Jeremy,
>>>>>>
>>>>>> The issue is that you are using one of the external libraries and
>>>>>> these aren't actually packaged with Spark on the cluster, so you need
>>>>>> to create an uber jar that includes them.
>>>>>>
>>>>>> You can look at the example here (I recently did this for a kafka
>>>>>> project and the idea is the same):
>>>>>>
>>>>>> https://github.com/pwendell/kafka-spark-example
>>>>>>
>>>>>> You'll want to make an uber jar that includes these packages (run sbt
>>>>>> assembly) and then submit that jar to spark-submit. Also, I'd try
>>>>>> running it locally first (if you aren't already) just to make the
>>>>>> debugging simpler.
>>>>>>
>>>>>> - Patrick
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 4, 2014 at 6:16 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>>> > Ah sorry, this may be the thing I learned for the day. The issue is
>>>>>> > that classes from that particular artifact are missing though. Worth
>>>>>> > interrogating the resulting .jar file with "jar tf" to see if it
>>>>>> > made it in?
>>>>>> >
>>>>>> > On Wed, Jun 4, 2014 at 2:12 PM, Nick Pentreath <
>>>>>> nick.pentre...@gmail.com> wrote:
>>>>>> >> @Sean, the %% syntax in SBT should automatically add the Scala
>>>>>> >> major version qualifier (_2.10, _2.11 etc) for you, so that does
>>>>>> >> appear to be correct syntax for the build.
>>>>>> >>
>>>>>> >> I seemed to run into this issue with some missing Jackson deps,
>>>>>> >> and solved it by including the jar explicitly on the driver class
>>>>>> >> path:
>>>>>> >>
>>>>>> >> bin/spark-submit --driver-class-path
>>>>>> >> SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar
>>>>>> >> --class "SimpleApp"
>>>>>> >> SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar
>>>>>> >>
>>>>>> >> Seems redundant to me since I thought that the JAR as argument is
>>>>>> >> copied to the driver and made available. But this solved it for me,
>>>>>> >> so perhaps give it a try?
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Wed, Jun 4, 2014 at 3:01 PM, Sean Owen <so...@cloudera.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>> Those aren't the names of the artifacts:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22spark-streaming-twitter_2.10%22
>>>>>> >>>
>>>>>> >>> The name is "spark-streaming-twitter_2.10"
>>>>>> >>>
>>>>>> >>> On Wed, Jun 4, 2014 at 1:49 PM, Jeremy Lee
>>>>>> >>> <unorthodox.engine...@gmail.com> wrote:
>>>>>> >>> > Man, this has been hard going. Six days, and I finally got a
>>>>>> >>> > "Hello World" App working that I wrote myself.
>>>>>> >>> >
>>>>>> >>> > Now I'm trying to make a minimal streaming app based on the
>>>>>> >>> > Twitter examples (running standalone right now while learning),
>>>>>> >>> > and when running it like this:
>>>>>> >>> >
>>>>>> >>> > bin/spark-submit --class "SimpleApp"
>>>>>> >>> > SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar
>>>>>> >>> >
>>>>>> >>> > I'm getting this error:
>>>>>> >>> >
>>>>>> >>> > Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>> >>> > org/apache/spark/streaming/twitter/TwitterUtils$
>>>>>> >>> >
>>>>>> >>> > Which I'm guessing is because I haven't put in a dependency to
>>>>>> >>> > "external/twitter" in the .sbt, but _how_? I can't find any
>>>>>> >>> > docs on it. Here's my build file so far:
>>>>>> >>> >
>>>>>> >>> > simple.sbt
>>>>>> >>> > ------------------------------------------
>>>>>> >>> > name := "Simple Project"
>>>>>> >>> >
>>>>>> >>> > version := "1.0"
>>>>>> >>> >
>>>>>> >>> > scalaVersion := "2.10.4"
>>>>>> >>> >
>>>>>> >>> > libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
>>>>>> >>> >
>>>>>> >>> > libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.0.0"
>>>>>> >>> >
>>>>>> >>> > libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" % "1.0.0"
>>>>>> >>> >
>>>>>> >>> > libraryDependencies += "org.twitter4j" % "twitter4j-stream" % "3.0.3"
>>>>>> >>> >
>>>>>> >>> > resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
>>>>>> >>> > ------------------------------------------
>>>>>> >>> >
>>>>>> >>> > I've tried a few obvious things like adding:
>>>>>> >>> >
>>>>>> >>> > libraryDependencies += "org.apache.spark" %% "spark-external" % "1.0.0"
>>>>>> >>> >
>>>>>> >>> > libraryDependencies += "org.apache.spark" %% "spark-external-twitter" % "1.0.0"
>>>>>> >>> >
>>>>>> >>> > because, well, that would match the naming scheme implied so
>>>>>> >>> > far, but it errors.
>>>>>> >>> >
>>>>>> >>> >
>>>>>> >>> > Also, I just realized I don't completely understand if:
>>>>>> >>> > (a) the "spark-submit" command _sends_ the .jar to all the
>>>>>> >>> > workers, or
>>>>>> >>> > (b) the "spark-submit" command sends a _job_ to the workers, which
>>>>>> >>> > are supposed to already have the jar file installed (or in hdfs), or
>>>>>> >>> > (c) the Context is supposed to list the jars to be distributed.
>>>>>> >>> > (Is that deprecated?)
>>>>>> >>> >
>>>>>> >>> > One part of the documentation says:
>>>>>> >>> >
>>>>>> >>> > "Once you have an assembled jar you can call the bin/spark-submit
>>>>>> >>> > script as shown here while passing your jar."
>>>>>> >>> >
>>>>>> >>> > but another says:
>>>>>> >>> >
>>>>>> >>> > "application-jar: Path to a bundled jar including your application
>>>>>> >>> > and all dependencies. The URL must be globally visible inside of
>>>>>> >>> > your cluster, for instance, an hdfs:// path or a file:// path that
>>>>>> >>> > is present on all nodes."
>>>>>> >>> >
>>>>>> >>> > I suppose both could be correct if you take a certain point
>>>>>> >>> > of view.
>>>>>> >>> >
>>>>>> >>> > --
>>>>>> >>> > Jeremy Lee  BCompSci(Hons)
>>>>>> >>> >   The Unorthodox Engineers
>>>>>> >>
>>>>>> >>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jeremy Lee  BCompSci(Hons)
>>>>>   The Unorthodox Engineers
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Jeremy Lee  BCompSci(Hons)
>>>>   The Unorthodox Engineers
>>>>
>>>
>>>
>>
>>
>> --
>> Jeremy Lee  BCompSci(Hons)
>>   The Unorthodox Engineers
>>
>
>

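For reference, Nick's point about %% earlier in the thread means the two
dependency lines below resolve to the same artifact (versions as used in the
thread; this is an illustrative fragment, not something verified against a
repository):

```scala
// %% appends the Scala binary version to the artifact name, so with
// scalaVersion := "2.10.4" these two lines are equivalent:
libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" % "1.0.0"

libraryDependencies += "org.apache.spark" % "spark-streaming-twitter_2.10" % "1.0.0"
```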