Hi, I'm using Spark 1.3 and trying some sample code:
when I run:
all works well, but with
it falls over and I get a whole heap of errors:
Is anyone else experiencing this? I've tried different graphs and always end
up with the same results.
thanks
In Spark, every action (foreach, collect, etc.) gets converted into a Spark
job, and jobs are executed sequentially.
You may want to refactor your code in calculateUseCase? to just run
transformations (map, flatMap) and call a single action at the end.
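For what it's worth, a minimal Scala sketch of that shape (the original code in
this thread is Java; the source and the per-use-case logic below are made-up
placeholders, not the original functions):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SingleActionSketch {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("single-action-sketch").setMaster("local[2]"),
      Seconds(10))

    // Hypothetical source; in the original code this came from KafkaUtils.createStream.
    val messages = ssc.socketTextStream("localhost", 9999)

    // Keep the per-use-case logic as transformations only...
    val filtered = messages.filter(_.nonEmpty)
    val useCase1 = filtered.map(m => ("useCase1", m))
    val useCase2 = filtered.map(m => ("useCase2", m))

    // ...and finish with a single action, so each batch becomes one job
    // instead of one job per output operation.
    useCase1.union(useCase2).foreachRDD { rdd =>
      rdd.take(10).foreach(println)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}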
On Sun, Aug 16, 2015 at 3:19 PM, mohanaugust
Hi, I have a basic Spark SQL join run in local mode. I checked the UI and
see that two jobs are run. Their DAG graphs are pasted at the end.
I have several questions here:
1. It looks like Job0 and Job1 have the same DAG stages, but stage 3 and
stage 4 are skipped. I would ask wha
To make it clear, Spark Standalone is similar to YARN as a simple cluster
management system:
Spark Master <---> YARN ResourceManager
Spark Worker <---> YARN NodeManager
On Mon, Aug 17, 2015 at 4:59 AM, Ruslan Dautkhanov
wrote:
> There is no Spark master in YARN mode. It's standalone mo
Check the examples module's dependencies (right-click examples and click Open
Module Settings); by default scala-library is provided, and you need to change
it to compile to run SparkPi in IntelliJ. As I remember, you also need to
change the guava and jetty related libraries to compile too.
On Mon, Aug 17, 2015 a
Thanks Ted, it may be a bug. Here is the JIRA ticket:
https://issues.apache.org/jira/browse/SPARK-10039
Kevin
--- Original Message ---
Sender : Ted Yu
Date : 2015-08-16 11:29 (GMT+09:00)
Title : Re: Can't find directory after resetting REPL state
I tried with master branch and got the fol
There is no Spark master in YARN mode; that is Standalone-mode terminology.
In YARN cluster mode, Spark's ApplicationMaster (the Spark driver runs in it)
will be restarted automatically by the RM up to
yarn.resourcemanager.am.max-retries times (default is 2).
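If it helps, a minimal sketch of capping the number of attempts from the Spark
side (this assumes your Spark version supports spark.yarn.maxAppAttempts;
YARN's own maximum remains an upper bound on whatever you set):

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: in yarn-client mode, setting this before creating the
// SparkContext takes effect; in yarn-cluster mode pass it at submit time
// with --conf spark.yarn.maxAppAttempts=1 instead.
val conf = new SparkConf()
  .setAppName("yarn-app")
  .set("spark.yarn.maxAppAttempts", "1")  // fail fast rather than letting the RM restart the AM
val sc = new SparkContext(conf)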
--
Ruslan Dautkhanov
On Fri, Jul 17, 2015 at 1:
Can you tell us more about your environment? I understand you are running it
on a single machine, but is a firewall enabled?
On Sun, Aug 16, 2015 at 5:47 AM, t4ng0 wrote:
> Hi
>
> I am new to spark and trying to run standalone application using
> spark-submit. Whatever i could understood, from logs is
Hi, I have a Spark driver program with a loop that iterates around 2000
times, and each of those 2000 iterations executes a job on YARN. Since the
loop does the work serially, I want to introduce parallelism. If I create 2000
tasks/runnables/callables in my Spark driver program, will it get executed i
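For what it's worth, a minimal Scala sketch of submitting jobs from the driver
in parallel (the per-iteration work below is a made-up stand-in; SparkContext
is thread-safe, so jobs submitted from different threads can run concurrently,
subject to available executors and the scheduler):

import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global
import org.apache.spark.{SparkConf, SparkContext}

object ParallelJobsSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("parallel-jobs-sketch"))

    val futures = (1 to 2000).map { i =>
      Future {
        // Each iteration becomes its own Spark job; replace this body with
        // the real per-iteration work.
        sc.parallelize(1 to 1000).map(_ * i).sum()
      }
    }

    val results = Await.result(Future.sequence(futures), Duration.Inf)
    println(s"finished ${results.size} jobs")
    sc.stop()
  }
}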
Hi, I have written a Spark job which seems to work fine for almost an hour,
and after that executors start getting lost because of timeouts. I see the
following in the logs:
15/08/16 12:26:46 WARN spark.HeartbeatReceiver: Removing executor 10 with no
recent heartbeats: 1051638 ms exceeds timeou
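For reference, one common stopgap is to raise the heartbeat-related timeouts
while you track down the real cause (long GC pauses, overloaded executors,
network problems). A minimal sketch, assuming the standard property names for
your version (the time-with-units syntax needs Spark 1.4+):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.executor.heartbeatInterval", "30s")  // how often executors ping the driver
  .set("spark.network.timeout", "600s")            // driver-side limit before an executor is dropped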
Hi,
I am trying to run SparkPi in IntelliJ and getting a NoClassDefFoundError.
Has anyone else seen this issue before?
Exception in thread "main" java.lang.NoClassDefFoundError:
scala/collection/Seq
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0
I am building Spark with the following options - most notably **scala-2.11**:
. dev/switch-to-scala-2.11.sh
mvn -Phive -Pyarn -Phadoop-2.6 -Dhadoop2.6.2 -Pscala-2.11 -DskipTests
-Dmaven.javadoc.skip=true clean package
The build goes pretty far but fails in one of the minor modules *repl
I did check it out, and although I got a general understanding of the various
classes used to implement the Sort and Hash shuffles, these slides lack
details as to how they are implemented and why sort generally has better
performance than hash.
On Sun, Aug 16, 2015 at 4:31 AM, Ravi Kiran
w
Hi,
I have been trying to run a standalone application using spark-submit, but
somehow, while Spark started the HTTP server and added the jar file to it, it
is unable to fetch the jar file. I am running the Spark cluster on localhost.
If anyone can help me find what I am missing here, thanks in advance.
Try --jars rather than --class to submit the jar.
On Fri, Aug 14, 2015 at 6:19 AM, Stephen Boesch wrote:
> The NoClassDefFoundError differs from ClassNotFoundException: it
> indicates an error while initializing that class, but the class is found on
> the classpath. Please provide the full st
DataFrames, in simple terms, are RDDs combined with a schema. In reality they
are much more than that and provide a very fine level of optimization;
check out project Tungsten.
In your case it was one column because you chose it. By default, it keeps the
same columns as in the RDD (the same as the fields of a case class if yo
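A minimal sketch of that case-class-to-DataFrame path (the class and values
here are made up for illustration):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// The resulting DataFrame's columns are the fields of the case class.
case class Person(name: String, age: Int)

object DataFrameFromRddSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("df-sketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val rdd = sc.parallelize(Seq(Person("Ann", 30), Person("Bob", 25)))
    val df = rdd.toDF()        // schema inferred from the case class fields
    df.select("name").show()   // selecting one column keeps just that column
  }
}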
Hi,
I am new to Spark and trying to run a standalone application using
spark-submit. What I could understand from the logs is that Spark can't
fetch the jar file after adding it to the HTTP server. Do I need to
configure proxy settings for Spark individually, if that is the problem?
Otherwise please
Hi Mohit,
It depends on whether dynamic allocation is turned on. If not, the number
of executors is specified by the user with the --num-executors option. If
dynamic allocation is turned on, refer to the doc for details:
https://spark.apache.org/docs/1.4.0/job-scheduling.html#dynamic-resource-al
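For reference, a minimal sketch of what turning dynamic allocation on looks
like (property names as documented; on YARN this also requires the external
shuffle service to be set up on the NodeManagers, and the executor counts here
are just examples):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")      // external shuffle service, needed on YARN
  .set("spark.dynamicAllocation.minExecutors", "2")
  .set("spark.dynamicAllocation.maxExecutors", "50")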
JavaPairReceiverInputDStream messages = KafkaUtils.createStream(...);
JavaPairDStream filteredMessages = filterValidMessages(messages);
JavaDStream useCase1 = calculateUseCase1(filteredMessages);
JavaDStream useCase2 = calculateUseCase2(filteredMessages);
JavaDStream useCase3 = calculateUseCase3(f
Thanks Andrew.
On Sun, Aug 16, 2015 at 1:53 PM, Andrew Or wrote:
> Hi Canan, TestSQLContext is no longer a singleton but now a class. It is
> never meant to be a fully public API, but if you wish to use it you can
> just instantiate a new one:
>
> val sqlContext = new TestSQLContext
>
> or jus
Hi,
I'm building a Spark application in which I load some data from an
Elasticsearch cluster (using the latest elasticsearch-hadoop connector) and
continue to perform some calculations on the Spark cluster.
In one case, I call collect on the RDD as soon as it is created (loaded from
ES).
However, it is