Well, that's weird. I don't see this thread in my mailbox as sent to the user
list. Maybe because I also subscribe to the incubator mailing list? I do see
mails sent to the incubator mailing list that no one replies to. I thought
that was because people no longer subscribe to the incubator list.
--
Ye Xianjin
I think the mails to spark.incubator.apache.org will be forwarded to
spark.apache.org.
Here is the header of the first mail:
from: redocpot
to: u...@spark.incubator.apache.org
date: Mon, Sep 8, 2014 at 7:29 AM
subject: groupBy gives non deterministic results
mailing list: user.spark.apache.org
| Do the two mailing lists share messages?
I don't think so. I didn't receive this message from the user list. I am not
at Databricks, so I can't answer your other questions. Maybe Davies Liu
can answer them?
--
Ye Xianjin
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
On Wednesday
Hi, Xianjin
I checked user@spark.apache.org, and found my post there:
http://mail-archives.apache.org/mod_mbox/spark-user/201409.mbox/browser
I am using Nabble to send this mail, which indicates that the mail will be
sent from my email address to the u...@spark.incubator.apache.org mailing
list.
Ah, thank you. I did not notice that.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/groupBy-gives-non-deterministic-results-tp13698p13871.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---
Great. You should ask questions on the user@spark.apache.org mailing list. I
believe many people no longer subscribe to the incubator mailing list.
--
Ye Xianjin
On Wednesday, September 10, 2014 at 6:03 PM, redocpot wrote:
Hi,
I am using Spark 1.0.0. The bug is fixed in 1.0.1.
Hao
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/groupBy-gives-non-deterministic-results-tp13698p13864.html
---
Which version of Spark are you using?
This bug has been fixed in 0.9.2, 1.0.2 and 1.1. Could you upgrade to
one of these versions to verify it?
Davies
On Tue, Sep 9, 2014 at 7:03 AM, redocpot wrote:
Thank you for your replies.
More details here:
The program is executed in local mode (single node). Default environment
parameters are used.
The test code and the result are in this gist:
https://gist.github.com/coderh/0147467f0b185462048c
Here are the first 10 lines of the data: 3 fields in each row, the delimiter i
Can you provide a small sample or test data that reproduces this problem? And
what is your environment setup: single node or cluster?
> On September 8, 2014, at 22:29, redocpot wrote:
>
> Hi,
>
> I have a key-value RDD called rdd below. After a groupBy, I tried to count
> rows.
> But the resu
What's the type of the key?
If the hash of a key differs across slaves, you could get confusing results
like these. We saw similar results in Python, because the hash of None
differs across machines.
Davies
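
To illustrate the point about hashes, here is a small simulation in plain Python. It is not Spark code: the two worker hash functions, the values 1000 and 1003, and the helper name partition_for are all made up for illustration. It only assumes that a hash partitioner picks a partition as the key's hash modulo the number of partitions.

```python
# Illustrative simulation only; this is not Spark code.

def partition_for(key, num_partitions, hash_fn):
    """Pick a partition as hash(key) modulo the number of partitions."""
    return hash_fn(key) % num_partitions

# Two hypothetical workers whose hash functions agree on every key except None.
def worker_a_hash(key):
    return 1000 if key is None else hash(key)

def worker_b_hash(key):
    return 1003 if key is None else hash(key)

NUM_PARTITIONS = 8
keys = [None, 1, 2, None, 3]

placement_a = [partition_for(k, NUM_PARTITIONS, worker_a_hash) for k in keys]
placement_b = [partition_for(k, NUM_PARTITIONS, worker_b_hash) for k in keys]

# Keys that the two workers would send to different partitions.
disagreements = [k for k, a, b in zip(keys, placement_a, placement_b) if a != b]
print(disagreements)
```

In this sketch the integer keys land in the same partition on both workers, while the None key does not, so records with that key would be split across partitions and a later group or count would see them as separate groups.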
On Mon, Sep 8, 2014 at 8:16 AM, redocpot wrote:
Update:
Just tested with HashPartitioner(8) and counted on each partition:
List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
*(5,657591)*, *(6,658327)*, *(7,658434)*),
List((0,657824), (1,658549), (2,659199), (3,658684), (4,659394),
*(5,657594)*, (6,658326), *(7,658434)*),
List((0,65
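
The two complete result lists above can be compared mechanically. A minimal sketch in plain Python, using only the counts quoted in this message; the helper name diff_counts is hypothetical and this is not the code from the gist.

```python
# Per-partition counts copied from the two complete runs quoted above.
run_1 = [(0, 657824), (1, 658549), (2, 659199), (3, 658684),
         (4, 659394), (5, 657591), (6, 658327), (7, 658434)]
run_2 = [(0, 657824), (1, 658549), (2, 659199), (3, 658684),
         (4, 659394), (5, 657594), (6, 658326), (7, 658434)]

def diff_counts(first, second):
    """Return (partition, count_in_first, count_in_second) where counts differ."""
    second_by_partition = dict(second)
    return [(p, c, second_by_partition.get(p))
            for p, c in first
            if second_by_partition.get(p) != c]

changed = diff_counts(run_1, run_2)
total_1 = sum(count for _, count in run_1)
total_2 = sum(count for _, count in run_2)

print(changed)
print(total_1, total_2)
```

For these two runs only partitions 5 and 6 differ, and the totals differ as well, so the discrepancy is not just records moving from one partition to another.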