Hi Hari,
Now I am trying out the same FlumeEventCount example with spark-submit
instead of run-example. The steps I followed: I exported
JavaFlumeEventCount.java into a jar.
The command I used is
./bin/spark-submit --jars lib/spark-examples-1.1.0-hadoop1.0.4.jar --master
local --class org.JavaFlumeEventCount bin/flumeeventcnt2.jar localhost 2323
The output is
14/11/12 17:55:02 INFO scheduler.ReceiverTracker: Stream 0 received 1 blocks
14/11/12 17:55:02 INFO scheduler.JobScheduler: Added jobs for time 1415795102000
If I use this command
./bin/spark-submit --master local --class org.JavaFlumeEventCount
bin/flumeeventcnt2.jar localhost 2323
Then I get an error:
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/examples/streaming/StreamingExamples
    at org.JavaFlumeEventCount.main(JavaFlumeEventCount.java:22)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.examples.streaming.StreamingExamples
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
    ... 8 more
I just wanted to ask: why is it able to find spark-assembly.jar but not
spark-examples.jar?
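For my own understanding, here is a minimal standalone sketch (the class name ClasspathDemo is my own, not from Spark) showing how a class that is absent from the runtime classpath surfaces as the error above. Without the spark-examples jar on the classpath, loading the class named in the stack trace fails:

```java
// Minimal illustration of a class missing from the runtime classpath
// (hypothetical demo class, not part of Spark itself).
public class ClasspathDemo {
    public static void main(String[] args) {
        try {
            // Attempt to load the class named in the stack trace above,
            // mimicking what happens when spark-examples.jar is not
            // supplied via --jars at submit time.
            Class.forName("org.apache.spark.examples.streaming.StreamingExamples");
            System.out.println("class found");
        } catch (ClassNotFoundException e) {
            System.out.println("class not found: " + e.getMessage());
        }
    }
}
```

So my guess is that spark-submit puts only the assembly jar and my application jar on the classpath, which is why passing the examples jar explicitly with --jars makes the error go away, but I would like confirmation.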
My next doubt: while running the FlumeEventCount example through run-example,
I get output such as
Received 4 flume events.
14/11/12 18:30:14 INFO scheduler.JobScheduler: Finished job streaming job 1415797214000 ms.0 from job set of time 1415797214000 ms
14/11/12 18:30:14 INFO rdd.MappedRDD: Removing RDD 70 from persistence list
But if I run the same program through spark-submit, I get output such as
14/11/12 17:55:02 INFO scheduler.ReceiverTracker: Stream 0 received 1 blocks
14/11/12 17:55:02 INFO scheduler.JobScheduler: Added jobs for time 1415795102000
So I need a clarification: in the program, the print statement is written as
"Received n flume events." How come I am able to see "Stream 0 received n
blocks" instead?
Also, what is the difference between running the program through spark-submit
and through run-example?
Awaiting your kind reply.
Regards,
Jeniba Johnson