I'm trying to set up a simple iterative message/update problem in GraphX
(Spark 1.2.0), but I'm running into issues with the caching and
re-calculation of data. I'm trying to follow the example found in the
Pregel implementation of materializing and caching messages and graphs and
then unpersisting them.
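For reference, the Pregel loop caches and materializes each new graph and message RDD before unpersisting the previous iteration's copies. A minimal sketch of that cache-materialize-unpersist pattern in Java, using plain RDDs and a toy transformation (illustrative only, not the actual GraphX internals):

    import java.util.Arrays;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    // Inside some driver method: each round derives a new RDD, caches and
    // materializes it, then unpersists the previous round's copy.
    JavaSparkContext sc = new JavaSparkContext("local[4]", "CacheUnpersistSketch");
    JavaRDD<Integer> current = sc.parallelize(Arrays.asList(1, 2, 3, 4)).cache();
    current.count();  // materialize before anything depending on it is unpersisted

    for (int i = 0; i < 5; i++) {
        JavaRDD<Integer> next = current.map(x -> x + 1).cache();
        next.count();              // force materialization of the new RDD
        current.unpersist(false);  // non-blocking unpersist of the old copy
        current = next;
    }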
Hi,
I submitted a Spark job to an EC2 cluster using spark-submit. At a worker
node, there is an exception of 'no space left on device', as follows.
==
15/02/08 01:53:38 ERROR logging.FileAppender: Error writing stream to file
/root/spark/work/app-2015020
Hi there,
Spark version: 1.2
/home/hadoop/spark/bin/spark-submit
--class com.litb.bi.CSLog2ES
--master yarn
--executor-memory 1G
--jars
/mnt/external/kafka/target/spark-streaming-kafka_2.10-1.2.0.jar,/mnt/external/kafka/target/zkclient-0.3.jar,/mnt/external/kafka/target/metrics-core-2.2.0.jar,
Hello people, I have an issue where my streaming receiver is laggy on YARN.
Can anyone reply to my question on StackOverflow?
http://stackoverflow.com/questions/28370362/spark-streaming-receiver-particularly-slow-on-yarn
Thanks
Jong Wook
So, can I increase the number of threads by coding it manually in the Spark
code?
On Sat, Feb 7, 2015 at 6:52 PM, Sean Owen wrote:
> If you look at the threads, the other 30 are almost surely not Spark
> worker threads. They're the JVM finalizer, GC threads, Jetty
> listeners, etc. Nothing wrong with this.
Sorry for the many typos, as I was typing from my cell phone. Hope you can
still get the idea.
On Sat, Feb 7, 2015 at 1:55 PM, Chester @work wrote:
>
> I just implemented this in our application. The impersonation is done
> before the job is submitted. In Spark on YARN (we are using yarn-cluster mode)
Hello,
I'm new to Spark, and tried to set up a Spark cluster of 1 master VM (SparkV1)
and 1 worker VM (SparkV4); the error is the same if I have 2 workers. They
are connected without a problem now. But when I submit a job (as in
https://spark.apache.org/docs/latest/quick-start.html) at the master:
>s
I just implemented this in our application. The impersonation is done before
the job is submitted. In Spark on YARN (we are using yarn-cluster mode), it just
takes the current user from UserGroupInformation and submits it to the YARN
resource manager.
If one uses kinit from the command line, the whole JVM
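For context, a rough sketch of what pre-submission impersonation can look like with Hadoop's UserGroupInformation (this assumes the submitting principal is allowed to proxy the target user; endUser and submitToYarn() are illustrative placeholders, not our actual code):

    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.security.UserGroupInformation;

    // Wrap the submission in a doAs so YARN sees the impersonated user.
    UserGroupInformation realUser = UserGroupInformation.getLoginUser();
    UserGroupInformation proxy =
        UserGroupInformation.createProxyUser("endUser", realUser);
    proxy.doAs((PrivilegedExceptionAction<Void>) () -> {
        submitToYarn();  // hypothetical: whatever hands the app to the RM
        return null;
    });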
Yes. You need to create xiaobogu under /user and grant the right permissions
to xiaobogu.
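Something along these lines, run as the HDFS superuser (the group name is illustrative):

    sudo -u hdfs hdfs dfs -mkdir -p /user/xiaobogu
    sudo -u hdfs hdfs dfs -chown xiaobogu:xiaobogu /user/xiaobogu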
Thanks.
Zhan Zhang
On Feb 7, 2015, at 8:15 AM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:
Hi Zhan Zhang,
With the pre-built version 1.2.0 of Spark against the YARN cluster installed by
Ambari 1.7.0,
https://issues.apache.org/jira/browse/SPARK-5493 currently tracks this.
-Sandy
On Mon, Feb 2, 2015 at 9:37 PM, Zhan Zhang wrote:
> I think you can configure Hadoop/Hive to do impersonation. There is no
> difference between a secure and an insecure Hadoop cluster when using kinit.
>
> Thanks.
>
> Zhan Zhang
Hi Ted,
I’ve seen the code; I am using JavaKafkaWordCount.java, but I would like to
reproduce in Java what I’ve done in Scala. Is it possible to do the same
thing in Java that the Scala code does?
Mainly the code below, or something like it:
> val KafkaDStreams = (1 to numStreams) map {_ =>
Can you take a look at:
./examples/scala-2.10/src/main/java/org/apache/spark/examples/streaming/JavaKafkaWordCount.java
./external/kafka/src/test/java/org/apache/spark/streaming/kafka/JavaKafkaStreamSuite.java
Cheers
On Sat, Feb 7, 2015 at 9:45 AM, Eduardo Costa Alfaia wrote:
> Hi Guys,
>
> Ho
Hi Sachin,
In your YARN configuration, either yarn.nodemanager.resource.memory-mb is
1024 on your nodes or yarn.scheduler.maximum-allocation-mb is set to 1024.
If you have more than 1024 MB on each node, you should bump these
properties. Otherwise, you should request fewer resources by setting
--executor-memory to a smaller value.
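For example, a yarn-site.xml snippet raising both limits (2048 is an arbitrary illustrative value; size it to your nodes):

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>2048</value>
    </property>

Alternatively, shrink the request so executor memory plus the 384 MB overhead stays under the 1024 MB cap, e.g. spark-submit --executor-memory 512m ...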
Hi,
when I am trying to execute my program as
spark-submit --master yarn --class com.mytestpack.analysis.SparkTest
sparktest-1.jar
I am getting the error below:
java.lang.IllegalArgumentException: Required executor memory (1024+384 MB)
is above the max threshold (1024 MB) of this cluster!
Hi Guys,
How could I do in Java the Scala code below?
val KafkaDStreams = (1 to numStreams) map {_ =>
KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](ssc,
kafkaParams, topicMap, storageLevel = StorageLevel.MEMORY_ONLY).map(_._2)
}
val unifiedStream =
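A rough Java equivalent under Spark Streaming 1.2's Java API, assuming jssc (a JavaStreamingContext) and topicMap (a Map<String, Integer>) are already defined, with placeholder ZooKeeper quorum and group id:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.storage.StorageLevel;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import org.apache.spark.streaming.kafka.KafkaUtils;
    import scala.Tuple2;

    // Create numStreams receivers, union them, then keep only the message
    // values, mirroring the Scala (1 to numStreams) map { ... } pattern above.
    int numStreams = 3;
    List<JavaPairDStream<String, String>> streams = new ArrayList<>();
    for (int i = 0; i < numStreams; i++) {
        streams.add(KafkaUtils.createStream(jssc, "zkHost:2181", "myGroup",
            topicMap, StorageLevel.MEMORY_ONLY()));
    }
    JavaPairDStream<String, String> unified =
        jssc.union(streams.get(0), streams.subList(1, streams.size()));
    JavaDStream<String> lines =
        unified.map(new Function<Tuple2<String, String>, String>() {
            public String call(Tuple2<String, String> t) { return t._2(); }
        });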
Caused by:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
Permission denied: user=xiaobogu, access=WRITE,
inode="/user":hdfs:hdfs:drwxr-xr-x
Looks like a permission issue. Can you give access to 'xiaobogu'?
Cheers
On Sat, Feb 7, 2015 at 8:15 AM, guxiaob
Hi Zhan Zhang,
With the pre-built version 1.2.0 of Spark against the YARN cluster installed by
Ambari 1.7.0, I get the following errors:
[xiaobogu@lix1 spark]$ ./bin/spark-submit --class
org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3
--driver-memory 512m
https://github.com/apache/spark/blob/master/dev/create-release/create-release.sh#L217
Yes, except the 'without hive' version.
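If you do end up building yourself, something along these lines should produce a Hive-enabled build on the 1.2 branch (profile names as documented for that era; the Hadoop version flag is illustrative):

    ./make-distribution.sh --tgz --name custom -Pyarn -Phadoop-2.4 -Phive -Phive-thriftserver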
On Sat, Feb 7, 2015 at 3:45 PM, guxiaobo1982 wrote:
> Hi,
>
> After various problems with the binaries built by myself, I want to try the
> pre-built binary, but I want t
Hi,
After various problems with the binaries I built myself, I want to try the
pre-built binary, but I want to know whether it is built with the --hive option.
Thanks.
If you look at the threads, the other 30 are almost surely not Spark
worker threads. They're the JVM finalizer, GC threads, Jetty
listeners, etc. Nothing wrong with this. Your OS has hundreds of
threads running now, most of which are idle, and up to 4 of which can
be executing. In a one-machine cl
You have 4 CPU cores and 34 threads (system-wide you likely have many more,
by the way).
Think of it as having 4 espresso machines and 34 baristas. Does the fact
that you have only 4 espresso machines mean you can only have 4 baristas? Of
course not; there's plenty more work other than making espresso
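To make that concrete, a tiny illustrative sketch of a JVM hosting far more threads than cores; most of them simply wait their turn on a core:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // 4 cores can still schedule 34 threads: the OS time-slices them.
    int cores = Runtime.getRuntime().availableProcessors();
    ExecutorService pool = Executors.newFixedThreadPool(34);
    for (int i = 0; i < 34; i++) {
        pool.submit(() -> {
            try { Thread.sleep(1000); } catch (InterruptedException e) { }
        });
    }
    System.out.println("cores=" + cores + ", pooled threads=34");
    pool.shutdown();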
Hi,
I am using the YourKit tool to profile Spark jobs run in my single-node
Spark cluster.
When I look at the YourKit performance charts, the thread count always
remains at
All threads: 34
Daemon threads: 32
Here are my questions:
1. My system can run only 4 threads simultaneously, and obvious