Thread: Stochastic gradient descent performance (dev@spark.apache.org, March-April 2015; messages newest first)

From: Shivaram Venkataraman [mailto:shiva...@eecs.berkeley.edu]
Sent: Sunday, April 05, 2015 7:13 PM
To: Ulanov, Alexander
Cc: shiva...@eecs.berkeley.edu; Joseph Bradley; dev@spark.apache.org
Subject: Re: Stochastic gradient descent performance

Yeah, a simple way to estimate the time [...]
From: Shivaram Venkataraman [mailto:shiva...@eecs.berkeley.edu]
Sent: Thursday, April 02, 2015 1:26 PM
To: Joseph Bradley
Cc: Ulanov, Alexander; dev@spark.apache.org
Subject: Re: Stochastic gradient descent performance

I haven't looked closely at the sampling issues, but regarding the aggregation latency, there are fixed overheads (in local and distributed mode) with the way aggregation is done in Spark. Launching a s[...]
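The fixed-overhead point can be made concrete with a small experiment. The following is an illustrative micro-benchmark, not code from this thread; the object name and data sizes are made up. Even in local mode, a single aggregate() over a tiny cached RDD pays the cost of scheduling a job, serializing closures, and collecting results at the driver, so the measured latency barely reflects the arithmetic itself.

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical micro-benchmark: time one aggregation over a tiny RDD.
    object AggregationOverhead {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("agg-overhead").setMaster("local[4]")
        val sc = new SparkContext(conf)
        val tiny = sc.parallelize(1 to 1000, 4).cache()
        tiny.count() // force caching so the timing below excludes first-touch costs

        val t0 = System.nanoTime()
        val sum = tiny.aggregate(0L)((acc, x) => acc + x, _ + _)
        val ms = (System.nanoTime() - t0) / 1e6
        // Summing ~1000 integers is trivial, so `ms` is essentially the
        // fixed per-job cost of launching tasks and gathering results.
        println(s"sum = $sum, aggregation took $ms ms")
        sc.stop()
      }
    }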
From: Ulanov, Alexander
Subject: RE: Stochastic gradient descent performance

[...] I am wondering why it works so slowly in local mode. Could you elaborate on this? I do understand that in cluster mode the network speed will kick in, and then one can blame it.

Best regards, Alexander
From: Joseph Bradley [mailto:jos...@databricks.com]
Sent: Thursday, April 02, 2015 10:51 AM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Stochastic gradient descent performance

It looks like SPARK-3250 was applied to the sample() which GradientDescent uses, and that should kick in for your miniBatchFraction <= 0.4. Based on your numbers, aggregation seems like the main issue, though I hesitate to optimize aggregation based on local tests for data sizes that small. The f[...]
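To see where both sample() and the aggregation cost enter, here is a simplified sketch of a minibatch SGD loop in the spirit of what GradientDescent does. It is not MLlib's actual source; the least-squares gradient, parameter names, and seed are illustrative assumptions. Each iteration samples a fraction of the data (the path SPARK-3250 speeds up) and then runs one treeAggregate, i.e. one Spark job, to sum gradients across partitions.

    import org.apache.spark.rdd.RDD
    import breeze.linalg.{DenseVector => BDV}

    // Illustrative minibatch SGD (least-squares loss), not MLlib's code.
    def runMiniBatchSGD(
        data: RDD[(Double, BDV[Double])], // (label, features)
        numIterations: Int,
        miniBatchFraction: Double,
        stepSize: Double,
        dim: Int): BDV[Double] = {
      var weights = BDV.zeros[Double](dim)
      for (i <- 1 to numIterations) {
        val w = weights // local copy captured by the closure below
        // Per-iteration sampling: this is where SPARK-3250's faster
        // sample() path applies for miniBatchFraction <= 0.4.
        val batch = data.sample(false, miniBatchFraction, 42L + i)
        // One treeAggregate = one Spark job per iteration, which is the
        // fixed per-iteration overhead discussed elsewhere in the thread.
        val (gradSum, count) = batch.treeAggregate((BDV.zeros[Double](dim), 0L))(
          seqOp = { case ((g, c), (label, x)) =>
            (g + x * ((w dot x) - label), c + 1) // gradient of 0.5*(w.x - y)^2
          },
          combOp = { case ((g1, c1), (g2, c2)) => (g1 + g2, c1 + c2) })
        if (count > 0) weights = weights - (gradSum / count.toDouble) * stepSize
      }
      weights
    }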
From: Ulanov, Alexander
To: dev@spark.apache.org
Subject: RE: Stochastic gradient descent performance

Sorry for bothering you again, but I think this is an important issue for the applicability of SGD in Spark MLlib. Could the Spark developers please comment on it?

-----Original Message-----
From: Ulanov, Alexander
Sent: Monday, March 30, 2015 5:00 PM
To: dev@spark.apache.org
Subject: Stochastic gradient descent performance

[...]