For linear regression, the main tasks are computing the covariance
matrix and X * y, which can both be parallelized well, and then you
need to solve a linear system whose dimension is the number of
features. So if the number of features is small, it actually makes
sense to do the setup in Flink.
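A minimal sketch of that direct (normal equations) approach, assuming X is an
n x d matrix stored row-wise and using Breeze for the local solve (the two
matrix products are the part that parallelizes; the final d x d solve is cheap
when d is small):

import breeze.linalg._

// Sketch only: ordinary least squares via the normal equations.
// x: n x d data matrix, y: length-n target vector.
def solveNormalEquations(x: DenseMatrix[Double], y: DenseVector[Double]): DenseVector[Double] = {
  // One pass over the data, parallelizes well:
  // X^T X is d x d (O(n d^2)), X^T y has length d (O(n d)).
  val gram = x.t * x
  val rhs  = x.t * y
  // Local dense solve of the d x d system, O(d^3) -- cheap for small d.
  gram \ rhs
}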
I agree that given a small data set it's probably better to solve the
linear regression problem directly. However, I'm not so sure how well this
performs if the data gets really big (more in terms of the number of data
points). But maybe we can find something like a sweet spot for when to switch
between both approaches.
> >> On Wed, Jun 3, 2015 at 8:05 PM, Mikio Braun wrote:
> >>
> >> > We should probably look into this nevertheless. Requiring full grid
> >> > search for a simple algorithm like mlr sounds like overkill.
On Thu, Jun 4, 2015 at 1:26 PM, Till Rohrmann wrote:
> Maybe also the default learning rate of 0.1 is set too high.
>
Could be.
But grid search on learning rate is pretty standard practice. Running
multiple learning engines at the same time with different learning rates is
pretty plausible.
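A toy sketch of such a sweep (plain Scala on made-up data, not Flink ML's API;
trainAndScore is just an illustrative stand-in): each candidate step size is
trained independently, so the candidates could just as well run in parallel as
separate jobs.

// Toy grid search over the SGD step size on a synthetic 1D regression task.
val data = Seq.tabulate(100)(i => (i / 100.0, 2.0 * i / 100.0))  // y = 2x

def trainAndScore(stepSize: Double): Double = {
  var w = 0.0
  for (_ <- 1 to 50; (x, y) <- data) {
    w -= stepSize * (w * x - y) * x           // gradient of 0.5 * (w*x - y)^2
  }
  data.map { case (x, y) => math.pow(w * x - y, 2) }.sum / data.size  // training MSE
}

val stepSizes = Seq(0.0001, 0.001, 0.01, 0.1, 1.0)
// The candidates do not depend on each other, hence trivially parallelizable.
val (bestStep, bestError) = stepSizes.map(s => (s, trainAndScore(s))).minBy(_._2)
println(s"best step size: $bestStep (training MSE $bestError)")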
We should probably look into this nevertheless. Requiring full grid search
for a simple algorithm like mlr sounds like overkill.

Do you have written down the math of your implementation somewhere?

-M

----- Original Message -----
From: "Till Rohrmann"
Sent: 02.06.2015 11:31
To: "dev@flink.apache.org"
Subject: Re: MultipleLinearRegression - Strange results
Great to hear. This should no longer be a pain point once we support proper
cross validation.
On Tue, Jun 2, 2015 at 11:11 AM, Felix Neutatz wrote:
Yes, grid search solved the problem :)
2015-06-02 11:07 GMT+02:00 Till Rohrmann :
The SGD algorithm adapts the learning rate accordingly. However, this does
not help if you choose the initial learning rate too large because then you
calculate a weight vector in the first iterations from which it takes
really long to recover.
Cheers,
Till
On Mon, Jun 1, 2015 at 7:15 PM, Sachin Goel wrote:
You can set the learning rate to be 1/sqrt(iteration number). This usually
works.
Regards
Sachin Goel
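A compact sketch of that schedule on a toy least-squares problem (illustration
only; this is not necessarily the exact decay rule used by Flink ML's SGD):

// Toy SGD with a decaying step size eta_t = eta0 / sqrt(t) on y = 3x + 1.
val samples = Seq.tabulate(200)(i => (i / 200.0, 3.0 * i / 200.0 + 1.0))

var w = 0.0
var b = 0.0
val eta0 = 0.5

for (t <- 1 to 2000) {
  val (x, y) = samples((t - 1) % samples.size)
  val eta = eta0 / math.sqrt(t)   // decaying learning rate
  val err = w * x + b - y         // prediction error on one sample
  w -= eta * err * x              // gradient step for the weight
  b -= eta * err                  // gradient step for the bias
}
println(s"learned w = $w, b = $b (data was generated with w = 3, b = 1)")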
On Mon, Jun 1, 2015 at 9:09 PM, Alexander Alexandrov <alexander.s.alexand...@gmail.com> wrote:
I've seen some work on adaptive learning rates in the past days.
Maybe we can think about extending the base algorithm and comparing the use
case setting for the IMPRO-3 project.
@Felix you can discuss this with the others on Wednesday. Manu will also be
there and can give some feedback; I'll try
Since MLR uses stochastic gradient descent, you probably have to configure
the step size right. SGD is very sensitive to the right step size choice.
If the step size is too high, then the SGD algorithm does not converge. You
can find the parameter description here [1].
Cheers,
Till
[1]
http://ci.
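For intuition on why a too-large step size makes the iteration blow up rather
than just converge slowly (a standard gradient-descent fact, nothing
Flink-specific): on a quadratic loss with curvature lambda, each update scales
the error by a factor of (1 - eta * lambda), so the iteration only contracts
for 0 < eta < 2 / lambda; beyond that, every step amplifies the error, which is
exactly the "does not converge" behaviour described above.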