I'm seeing a lot of lost tasks with this build in a large Mesos cluster.
Happens with both hash and sort shuffles.
14/11/20 18:08:38 WARN TaskSetManager: Lost task 9.1 in stage 1.0 (TID 897,
i-d4d6553a.inst.aws.airbnb.com): FetchFailed(null, shuffleId=1, mapId=-1,
reduceId=9, message=
org.apache.s
I'm still seeing the fetch failed error and updated
https://issues.apache.org/jira/browse/SPARK-3633
On Thu, Nov 20, 2014 at 10:21 AM, Marcelo Vanzin
wrote:
> +1 (non-binding)
>
> . ran simple things on spark-shell
> . ran jobs in yarn client & cluster modes, and standalone cluster mode
>
> On W
I think it is a race condition caused by Netty deactivating a channel while
it is still in use.
Switching to nio works fine:
--conf spark.shuffle.blockTransferService=nio
On Thu, Nov 20, 2014 at 10:44 AM, Hector Yee wrote:
> I'm still seeing the fetch failed error and updated
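For anyone reproducing the workaround above: a minimal sketch of passing the flag at submit time, assuming one of the 1.2 preview builds where spark.shuffle.blockTransferService applied (the class and jar names here are placeholders).

```shell
# Workaround sketch: force the shuffle transport to nio instead of netty.
# com.example.MyJob / my-job.jar are placeholders for your own job.
spark-submit \
  --conf spark.shuffle.blockTransferService=nio \
  --class com.example.MyJob \
  my-job.jar
```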
This is whatever was in http://people.apache.org/~andrewor14/spark-1.1.1-rc2/
On Thu, Nov 20, 2014 at 11:48 AM, Matei Zaharia
wrote:
> Hector, is this a comment on 1.1.1 or on the 1.2 preview?
>
> Matei
>
> > On Nov 20, 2014, at 11:39 AM, Hector Yee wrote:
> >
> The spark.shuffle.blockTransferService property doesn't
> exist in 1.1 (AFAIK) -- what exactly are you doing to get this problem?
>
> Matei
>
> On Nov 20, 2014, at 11:50 AM, Hector Yee wrote:
>
> This is whatever was in http://people.apache.org/~andrewor14/spark-1.1.1-rc2/
>
> On Thu, Nov 20, 20
Congrats!
On Thu, Mar 5, 2015 at 1:34 PM, shane knapp wrote:
> WOOT!
>
> On Thu, Mar 5, 2015 at 1:26 PM, Reynold Xin wrote:
>
> > We reached a new milestone today.
> >
> > https://github.com/apache/spark
> >
> >
> > 10,001 commits now. Congratulations to Xiangrui for making the 1th
> > comm
I use Thrift, then base64-encode the binary and save it as text-file lines
that are snappy- or gzip-compressed.
It makes it very easy to copy small chunks locally and play with subsets of
the data and not have dependencies on HDFS / hadoop for server stuff for
example.
On Thu, Mar 26, 2015 at 2:5
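The pattern described above can be sketched as follows. This is a minimal stand-in: the payload here is plain bytes, whereas in practice it would be a Thrift-serialized record; the file name is arbitrary.

```python
import base64
import gzip

# Serialize each record to bytes (stand-in payloads here; in practice
# these would come from a Thrift serializer), base64-encode, and write
# one record per line to a gzip-compressed text file.
records = [b"record-one", b"record-two", b"record-three"]

with gzip.open("records.txt.gz", "wt") as f:
    for payload in records:
        f.write(base64.b64encode(payload).decode("ascii") + "\n")

# Reading back: each line decodes independently, so small subsets are
# easy to copy locally and inspect without any HDFS/Hadoop dependency.
with gzip.open("records.txt.gz", "rt") as f:
    decoded = [base64.b64decode(line.strip()) for line in f]

assert decoded == records  # round-trip is lossless
```

Because records are line-delimited text, standard tools (head, grep, split) work on decompressed subsets of the data.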
files instead of file lines?
>
>
>
> *From:* Hector Yee [mailto:hector@gmail.com]
> *Sent:* Wednesday, April 01, 2015 11:36 AM
> *To:* Ulanov, Alexander
> *Cc:* Evan R. Sparks; Stephen Boesch; dev@spark.apache.org
>
> *Subject:* Re: Storing large data for MLlib mach
Speaking as a user of Spark on Mesos:
Yes, each app appears as a separate framework on the Mesos master.
In fine-grained mode the number of executors goes up and down, vs. fixed in
coarse.
I would not run fine grained mode on a large cluster as it can potentially
spin up a lot of execu
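For context, a sketch of how the two modes were selected in Spark 1.x (master URL and jar are placeholders; fine-grained was the default at the time):

```shell
# Coarse-grained mode: Spark holds a fixed set of executors for the
# lifetime of the app instead of launching one Mesos task per Spark task.
spark-submit \
  --master mesos://master:5050 \
  --conf spark.mesos.coarse=true \
  my-job.jar
```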
Hi Spark devs,
Is it possible to add jcenter or bintray support for Spark packages?
I'm trying to add our artifact which is on jcenter
https://bintray.com/airbnb/aerosolve
but I noticed in Spark packages it only accepts Maven coordinates.
--
Yee Yang Li Hector
google.com/+HectorYee
I would say for bigdata applications the most useful would be hierarchical
k-means with back tracking and the ability to support k nearest centroids.
On Tue, Jul 8, 2014 at 10:54 AM, RJ Nowling wrote:
> Hi all,
>
> MLlib currently has one clustering algorithm implementation, KMeans.
> It would
On Tue, Jul 8, 2014 at 1:01 PM, Hector Yee wrote:
>
> > I would say for bigdata applications the most useful would be
> hierarchical
> > k-means with back tracking and the ability to support k nearest
> centroids.
> >
> >
> > On Tue, Jul 8, 20
> sure. more interesting problem here is choosing k at each level. Kernel
> methods seem to be most promising.
>
>
> On Tue, Jul 8, 2014 at 1:31 PM, Hector Yee wrote:
>
> > No idea, never looked it up. Always just implemented it as doing k-means
> > again on each cluster.
> >
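The "k-means again on each cluster" approach reads as a simple recursion. A minimal 1-D sketch (Lloyd's algorithm on plain floats, purely illustrative; a real MLlib implementation would operate on vectors and pick k per level as discussed above):

```python
import random

def kmeans(points, k, iters=20):
    """Plain 1-D Lloyd's algorithm, for illustration only."""
    centroids = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: abs(p - centroids[j]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster went empty.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids, clusters

def hierarchical_kmeans(points, k, depth):
    # "k-means again on each cluster": split, then recurse into each
    # cluster until the depth budget runs out or a cluster is too small.
    if depth == 0 or len(points) <= k:
        return points
    _, clusters = kmeans(points, k)
    return [hierarchical_kmeans(c, k, depth - 1) for c in clusters if c]

random.seed(0)
data = [1.0, 1.1, 1.2, 9.0, 9.1, 9.2, 20.0, 20.1, 20.2]
tree = hierarchical_kmeans(data, k=3, depth=2)
```

Backtracking and k-nearest-centroid assignment would layer on top of this: keep candidate splits at each level and assign points to several nearby centroids rather than just the closest one.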
> Is something like that you were thinking, Hector?
>
> On Tue, Jul 8, 2014 at 4:50 PM, Dmitriy Lyubimov
> wrote:
> > sure. more interesting problem here is choosing k at each level. Kernel
> > methods seem to be most promising.
> >
> >
> > On Tue, Jul 8, 201
> lower
> communication overheads than, say, shuffling data around that belongs to
> one cluster or another. Something like that could work here as well.
>
> I'm not super-familiar with hierarchical K-Means so perhaps there's a more
> efficient way to implement it, though.
>
>