Yep, and it works fine for operations that do not involve any shuffle
(like foreach, count, etc.), while operations that involve a shuffle end
up in an infinite loop. Spark should somehow indicate this instead of going
into an infinite loop.
Thanks
Best Regards
On Thu, Aug 13, 2015 at 11:37 P
Hi, Sea
Problem solved. It turned out that I had updated the Spark cluster to
1.4.1 but the client had not been updated.
Thank you so much.
Sea <261810...@qq.com> wrote on Friday, August 14, 2015 at 1:01 PM:
> I have no idea... We use Scala. You upgraded to 1.4 so quickly... are you
> using Spark in p
I have no idea... We use Scala. You upgraded to 1.4 so quickly... are you
using Spark in production? Spark 1.3 is better than Spark 1.4.
-- Original Message --
From: "周千昊";
Sent: August 14, 2015 (Friday), 11:14
To: "Sea"<261810...@qq.com>; "dev@spark.apach
OK, thanks, probably just myself…
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Friday, August 14, 2015 11:04 AM
To: Cheng, Hao
Cc: Josh Rosen; dev
Subject: Re: Automatically deleting pull request comments left by AmplabJenkins
I tried accessing just now.
It took several seconds before the page
Hi Sea
I have updated Spark to 1.4.1; however, the problem still exists. Any
idea?
Sea <261810...@qq.com> wrote on Friday, August 14, 2015 at 12:36 AM:
> Yes, I guess so. I have seen this bug before.
>
>
> -- Original Message --
> *From:* "周千昊";
> *Sent:* August 13, 2015 (Thursday), 9:30 PM
> *To:* "Sea"<261810.
I tried accessing just now.
It took several seconds before the page showed up.
FYI
On Thu, Aug 13, 2015 at 7:56 PM, Cheng, Hao wrote:
> I found that https://spark-prs.appspot.com/ is super slow when I open it in
> a new window recently; not sure if it's just me or everybody experiences the
> same. Is
I found that https://spark-prs.appspot.com/ is super slow when I open it in a new
window recently; not sure if it's just me or everybody experiences the same. Is
there any way to speed it up?
From: Josh Rosen [mailto:rosenvi...@gmail.com]
Sent: Friday, August 14, 2015 10:21 AM
To: dev
Subject: Re: Automat
Thanks Josh for the initiative.
I think reducing the redundancy in QA bot posts would make discussion in the
GitHub UI more focused.
Cheers
On Thu, Aug 13, 2015 at 7:21 PM, Josh Rosen wrote:
> Prototype is at https://github.com/databricks/spark-pr-dashboard/pull/59
>
> On Wed, Aug 12, 2015 at 7:51
Prototype is at https://github.com/databricks/spark-pr-dashboard/pull/59
On Wed, Aug 12, 2015 at 7:51 PM, Josh Rosen wrote:
> *TL;DR*: would anyone object if I wrote a script to auto-delete pull
> request comments from AmplabJenkins?
>
> Currently there are two bots which post Jenkins test resul
Hi Tom,
Not sure how much this helps, but are you aware that you can build Spark
with the -Phadoop-provided profile to avoid packaging Hadoop dependencies
in the assembly jar?
-Sandy
On Fri, Aug 14, 2015 at 6:08 AM, Thomas Dudziak wrote:
> Unfortunately it doesn't because our version of Hive h
Unfortunately it doesn't because our version of Hive has different syntax
elements and thus I need to patch them in (and a few other minor things).
It would be great if there were a developer API at a somewhat higher
level.
On Thu, Aug 13, 2015 at 2:19 PM, Reynold Xin wrote:
> I believe for
That works.
I believe for Hive, there is already a client interface that can be used to
build clients for different Hive metastores. That should also work for your
heavily forked one.
For Hadoop, it is definitely a bigger project to refactor. A good way to
start evaluating this is to list what needs to be cha
Is this through Java system properties? For Java properties, you can pass them
using spark.executor.extraJavaOptions.
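For example, a rough sketch (the property name "my.registry.url" and the URL
are made up for illustration, not anything Spark defines):

import org.apache.spark.{SparkConf, SparkContext}

// Driver side: forward a custom JVM system property to every executor.
val conf = new SparkConf()
  .setAppName("extra-java-options-example")
  .set("spark.executor.extraJavaOptions", "-Dmy.registry.url=http://example.com:8081")
val sc = new SparkContext(conf)

// Executor side: code running inside tasks can read the property back.
sc.parallelize(1 to 2).foreach { _ =>
  println(System.getProperty("my.registry.url"))
}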
On Thu, Aug 13, 2015 at 2:11 PM, rfarrjr wrote:
> Thanks for the response.
>
> In this particular case we passed a url that would be leveraged when
> configuring some serialization s
Thanks for the response.
In this particular case we passed a URL that would be leveraged when
configuring some serialization support for Kryo. We are using a schema
registry and leveraging it to efficiently serialize Avro objects without the
need to register specific records or schemas up front.
Hi,
I have asked this before but didn't receive any comments; with the
impending release of 1.5 I wanted to bring this up again.
Right now, Spark is very tightly coupled with OSS Hive & Hadoop, which
causes me a lot of work every time there is a new version because I don't
run OSS Hive/Hadoop v
Thanks Marcelo!
The reason I was asking that question is that I was expecting my Spark job
to be a "map only" job. In other words, it should finish after the
mapPartitions run for all partitions. This is because the job is only
mapPartitions() plus count(), where mapPartitions only yields one integ
(I tried to send this last night but somehow ASF mailing list rejected my
mail)
In order to facilitate community testing of the 1.5.0 release, I've built a
preview package. This is not a release candidate, so there is no voting
involved. However, it'd be great if community members can start testi
Hi Naga,
If you are trying to use classes from this jar, you will need to call the
addJar method on the SparkContext, which will put this jar in all the
workers' context, even when you execute it in standalone mode.
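A minimal sketch of that (the jar path is only a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("add-jar-example"))
// Ships the jar to every executor so its classes can be loaded by tasks.
sc.addJar("/path/to/your-extra-classes.jar")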
2015-08-13 16:02 GMT-03:00 Naga Vij :
> Hi Dirceu,
>
> Thanks for getting back to me
Cool, seems like the designs are very close.
Here is my latest blog on my work with HBase and Spark. Let me know if you
have any questions. There should be two more blogs next month talking
about bulk load through Spark (14150), which is committed, and SparkSQL (14181),
which should be done next week.
Retry sending this again ...
-- Forwarded message --
From: Reynold Xin
Date: Thu, Aug 13, 2015 at 12:15 AM
Subject: [ANNOUNCE] Spark 1.5.0-preview package
To: "dev@spark.apache.org"
In order to facilitate community testing of the 1.5.0 release, I've built a
preview package. Thi
Hello,
Any idea on why this is happening?
Thanks
Naga
-- Forwarded message --
From: Naga Vij
Date: Wed, Aug 12, 2015 at 5:47 PM
Subject: - Spark 1.4.1 - run-example SparkPi - Failure ...
To: u...@spark.apache.org
Hi,
I am evaluating Spark 1.4.1
Any idea on why run-example Sp
Hi Naga,
This happened here sometimes when the memory of the Spark cluster wasn't
enough and the Java GC entered an infinite loop trying to free some memory.
To fix this I just added more memory to the workers of my cluster; or you
can increase the number of partitions of your RDD, using the repar
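For the repartitioning option, a rough sketch (the input path and the factor
of 4 are arbitrary placeholders):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("repartition-example"))
val rdd = sc.textFile("hdfs:///some/input")   // placeholder input
// More, smaller partitions mean each task holds less data in memory at once.
val repartitioned = rdd.repartition(rdd.partitions.length * 4)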
Has anyone run into this?
-- Forwarded message --
From: Naga Vij
Date: Wed, Aug 12, 2015 at 5:47 PM
Subject: - Spark 1.4.1 - run-example SparkPi - Failure ...
To: u...@spark.apache.org
Hi,
I am evaluating Spark 1.4.1
Any idea on why run-example SparkPi fails?
Here's what I am
See first section on https://spark.apache.org/community
On Thu, Aug 13, 2015 at 9:44 AM, Naga Vij wrote:
> subscribe
>
subscribe
That was intentional - what's your use case that requires configs not
starting with "spark."?
On Thu, Aug 13, 2015 at 8:16 AM, rfarrjr wrote:
> Ran into an issue setting a property on the SparkConf that wasn't made
> available on the worker. After some digging[1] I noticed that only
> properties t
Oh I see, you are defining your own RDD & Partition types, and you had a
bug where partition.index did not line up with the partition's slot in
rdd.getPartitions. Is that correct?
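For reference, a toy sketch of the contract being described (not the code from
this thread): the Partition placed at slot i of getPartitions must report
index == i.

import org.apache.spark.{Partition, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

class MyPartition(override val index: Int) extends Partition

class MyRDD(sc: SparkContext, numParts: Int) extends RDD[Int](sc, Nil) {
  // The partition stored at slot i must have index == i; if they disagree,
  // the scheduler and the RDD end up talking about different splits.
  override def getPartitions: Array[Partition] =
    Array.tabulate[Partition](numParts)(i => new MyPartition(i))

  override def compute(split: Partition, context: TaskContext): Iterator[Int] =
    Iterator(split.index)
}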
On Thu, Aug 13, 2015 at 2:40 AM, Akhil Das
wrote:
> I figured that out, and these are my findings:
>
> -> It just en
Ran into an issue setting a property on the SparkConf that wasn't made
available on the worker. After some digging[1] I noticed that only
properties that start with "spark." are sent by the scheduler. I'm not
sure if this was intended behavior or not.
Using Spark Streaming 1.4.1 running on Java
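If that is indeed the behavior, one common workaround is to prefix your own
keys with "spark." so the scheduler forwards them with the conf; a rough
sketch (the key "spark.myapp.registry.url" is made up for illustration):

import org.apache.spark.{SparkConf, SparkContext, SparkEnv}

val conf = new SparkConf()
  .setAppName("conf-propagation-example")
  .set("spark.myapp.registry.url", "http://example.com:8081")
val sc = new SparkContext(conf)

// Inside a task, read the value back from the executor-side conf.
sc.parallelize(1 to 2).foreach { _ =>
  println(SparkEnv.get.conf.get("spark.myapp.registry.url"))
}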
That's not a program; it's just a class in the Java library. Spark looks at
the call stack and uses it to describe the job in the UI. If you look at
the whole stack trace, you'll see more things that might tell you what's
really going on in that job.
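If the goal is just a clearer name in the UI, one option is to set the
description yourself before triggering the job; a small sketch:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("job-description-example"))
val rdd = sc.parallelize(1 to 100, 4)

// Replaces the call-site-derived text shown for the next job(s) in the web UI.
sc.setJobDescription("count after mapPartitions")
val total = rdd.mapPartitions(it => Iterator(it.size)).count()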
On Thu, Aug 13, 2015 at 9:13 AM, freedafeng wro
Yes, I guess so. I have seen this bug before.
-- Original Message --
From: "周千昊";
Sent: August 13, 2015 (Thursday), 9:30 PM
To: "Sea"<261810...@qq.com>; "dev@spark.apache.org";
Subject: Re: please help with ClassNotFoundException
Hi sea
Is it the same issue as
h
I am running a Spark job with only two operations: mapPartitions and then
collect(). The output data size of mapPartitions is very small: one integer
per partition. I saw there is a stage 2 for this job that runs this Java
program. I am not a Java programmer. Could anyone please let me know what
this
Hi,
sampledVertices is a HashSet of vertices
var sampledVertices: HashSet[VertexId] = HashSet()
In each iteration, I am making a list of neighborVertexIds
val neighborVertexIds = burnEdges.map((e:Edge[Int]) => e.dstId)
I want to add this neighborVertexIds to the sampledVertices Has
Have a look at spark.shuffle.manager; you can switch between sort and hash
with this configuration.
spark.shuffle.manager (default: sort) - Implementation to use for shuffling data. There
are two implementations available: sort and hash. Sort-based shuffle is more
memory-efficient and is the default option starti
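If you are setting it in code rather than on the command line or in
spark-defaults, a minimal sketch:

import org.apache.spark.{SparkConf, SparkContext}

// Switch from the default sort-based shuffle to the hash-based one.
val conf = new SparkConf()
  .setAppName("hash-shuffle-example")
  .set("spark.shuffle.manager", "hash")
val sc = new SparkContext(conf)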
Hi sea
Is it the same issue as https://issues.apache.org/jira/browse/SPARK-8368
Sea <261810...@qq.com> wrote on Thursday, August 13, 2015 at 6:52 PM:
> Are you using 1.4.0? If yes, use 1.4.1
>
>
> -- Original Message --
> *From:* "周千昊";
> *Sent:* August 13, 2015 (Thursday), 6:04 PM
> *To:* "dev";
> *Subject:* pl
Hi Cheez,
You can set the parameter spark.shuffle.manager when you submit the Spark
job.
--conf spark.shuffle.manager=hash
Thank you,
Ranjana
On Thu, Aug 13, 2015 at 2:26 AM, cheez <11besemja...@seecs.edu.pk> wrote:
> I understand that the current master branch of Spark uses Sort based
> shuff
I understand that the current master branch of Spark uses Sort based shuffle.
Is there a way to change that to Hash based shuffle, just for experimental
purposes, by modifying the source code?
Are you using 1.4.0? If yes, use 1.4.1
-- Original Message --
From: "周千昊";
Sent: August 13, 2015 (Thursday), 6:04 PM
To: "dev";
Subject: please help with ClassNotFoundException
Hi, I am using Spark 1.4 and have run into an issue.
I am trying to use the
Hi,
I am using Spark 1.4 and have run into an issue.
I am trying to use the aggregate function:
JavaRDD rdd = some rdd;
HashMap zeroValue = new HashMap();
// add initial key-value pair for zeroValue
rdd.aggregate(zeroValue,
new Function2,
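For comparison, a minimal Scala sketch of RDD.aggregate with a map-typed zero
value (not the poster's actual code, just an illustration of the API shape):

import scala.collection.mutable
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("aggregate-example"))
val words = sc.parallelize(Seq("a", "b", "a"))

val counts = words.aggregate(mutable.HashMap.empty[String, Int])(
  // seqOp: fold one element into the per-partition accumulator
  (acc, w) => { acc(w) = acc.getOrElse(w, 0) + 1; acc },
  // combOp: merge two per-partition accumulators
  (m1, m2) => { m2.foreach { case (k, v) => m1(k) = m1.getOrElse(k, 0) + v }; m1 }
)
println(counts)   // e.g. Map(a -> 2, b -> 1)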