We should probably look into this nevertheless. Requiring a full grid search for a simple algorithm like MLR sounds like overkill.
Have you written down the math of your implementation somewhere?

-M

----- Original Message -----
From: "Till Rohrmann" <till.rohrm...@gmail.com>
Sent: 02.06.2015 11:31
To: "dev@flink.apache.org" <dev@flink.apache.org>
Subject: Re: MultipleLinearRegression - Strange results

Great to hear. This should no longer be a pain point once we support proper cross validation.

On Tue, Jun 2, 2015 at 11:11 AM, Felix Neutatz <neut...@googlemail.com> wrote:

> Yes, grid search solved the problem :)
>
> 2015-06-02 11:07 GMT+02:00 Till Rohrmann <till.rohrm...@gmail.com>:
>
> > The SGD algorithm adapts the learning rate accordingly. However, this does
> > not help if you choose the initial learning rate too large, because then you
> > calculate a weight vector in the first iterations from which it takes a
> > really long time to recover.
> >
> > Cheers,
> > Till
> >
> > On Mon, Jun 1, 2015 at 7:15 PM, Sachin Goel <sachingoel0...@gmail.com> wrote:
> >
> > > You can set the learning rate to be 1/sqrt(iteration number). This usually
> > > works.
> > >
> > > Regards
> > > Sachin Goel
> > >
> > > On Mon, Jun 1, 2015 at 9:09 PM, Alexander Alexandrov <alexander.s.alexand...@gmail.com> wrote:
> > >
> > > > I've seen some work on adaptive learning rates in the past days.
> > > >
> > > > Maybe we can think about extending the base algorithm and comparing the
> > > > use case setting for the IMPRO-3 project.
> > > >
> > > > @Felix you can discuss this with the others on Wednesday. Manu will also
> > > > be there and can give some feedback. I'll try to send a link tomorrow
> > > > morning...
> > > >
> > > > 2015-06-01 20:33 GMT+10:00 Till Rohrmann <trohrm...@apache.org>:
> > > >
> > > > > Since MLR uses stochastic gradient descent, you probably have to
> > > > > configure the step size correctly. SGD is very sensitive to the choice
> > > > > of step size. If the step size is too high, then the SGD algorithm
> > > > > does not converge. You can find the parameter description here [1].
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > [1] http://ci.apache.org/projects/flink/flink-docs-master/libs/ml/multiple_linear_regression.html
> > > > >
> > > > > On Mon, Jun 1, 2015 at 11:48 AM, Felix Neutatz <neut...@googlemail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I want to use MultipleLinearRegression, but I got really strange
> > > > > > results. So I tested it with the housing price dataset:
> > > > > >
> > > > > > http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data
> > > > > >
> > > > > > And here I get negative house prices, even when I use the training
> > > > > > set as the dataset:
> > > > > >
> > > > > > LabeledVector(-1.1901998613214253E78, DenseVector(1500.0, 2197.0, 2978.0, 1369.0, 1451.0))
> > > > > > LabeledVector(-2.7411218018254747E78, DenseVector(4445.0, 4522.0, 4038.0, 4223.0, 4868.0))
> > > > > > LabeledVector(-2.688526857613956E78, DenseVector(4522.0, 4038.0, 4351.0, 4129.0, 4617.0))
> > > > > > LabeledVector(-1.3075960386971714E78, DenseVector(2001.0, 2059.0, 1992.0, 2008.0, 2504.0))
> > > > > > LabeledVector(-1.476238770814297E78, DenseVector(1992.0, 1965.0, 1983.0, 2300.0, 3811.0))
> > > > > > LabeledVector(-1.4298128754759792E78, DenseVector(2059.0, 1992.0, 1965.0, 2425.0, 3178.0))
> > > > > > ...
> > > > > >
> > > > > > and a huge squared error:
> > > > > >
> > > > > > Squared error: 4.799184832395361E159
> > > > > >
> > > > > > You can find my code here:
> > > > > >
> > > > > > https://github.com/FelixNeutatz/wikiTrends/blob/master/extraction/src/test/io/sanfran/wikiTrends/extraction/flink/Regression.scala
> > > > > >
> > > > > > Can you help me? What did I do wrong?
> > > > > >
> > > > > > Thank you for your help,
> > > > > > Felix
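
For illustration, the step-size sensitivity discussed in this thread can be reproduced with a minimal sketch in plain Python. This is not the Flink ML implementation or its API. It assumes a least-squares model without an intercept, a step size of initial_step / sqrt(t) as suggested above, and a tiny synthetic dataset (y = 3x with features in the thousands) chosen only to mimic the magnitude of the feature values in the report. The names sgd_least_squares, squared_error and grid_search are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sgd_least_squares(data, initial_step, passes):
    """SGD on 0.5 * (w.x - y)^2, no intercept, step = initial_step / sqrt(t)."""
    weights = [0.0] * len(data[0][0])
    t = 0
    for _ in range(passes):
        for features, label in data:  # samples are visited in a fixed order
            t += 1
            step = initial_step / math.sqrt(t)
            error = dot(weights, features) - label
            # The gradient of the per-sample loss with respect to w is error * x.
            weights = [w - step * error * x for w, x in zip(weights, features)]
    return weights


def squared_error(data, weights):
    total = 0.0
    for features, label in data:
        e = dot(weights, features) - label
        total += e * e  # e * e overflows to inf instead of raising an exception
    return total


def grid_search(data, candidates, passes):
    """Return the initial step size with the lowest finite squared error."""
    best_step, best_error = None, math.inf
    for initial_step in candidates:
        err = squared_error(data, sgd_least_squares(data, initial_step, passes))
        if math.isfinite(err) and err < best_error:
            best_step, best_error = initial_step, err
    return best_step, best_error


# Synthetic data (an assumption for this sketch): y = 3 * x, unscaled features.
data = [([1000.0], 3000.0), ([2000.0], 6000.0), ([3000.0], 9000.0)]

print("initial step 0.1:", squared_error(data, sgd_least_squares(data, 0.1, 30)))
print("grid search:", grid_search(data, [1e-1, 1e-3, 1e-5, 1e-7, 1e-9], 30))
```

With feature values this large, the 1/sqrt(t) decay alone cannot rescue an initial step size that is several orders of magnitude too big: the weight overshoots in the first updates and the error grows rather than shrinks, which is consistent with the explanation given above. Scanning the initial step size over a coarse logarithmic grid is what finds a usable value in this sketch.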