The SGD algorithm already adapts the learning rate accordingly. However, this does not help if you choose an initial learning rate that is too large, because the first iterations then produce a weight vector from which it takes a very long time to recover.
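To illustrate the point, here is a minimal toy sketch in plain Python. It is not the Flink implementation: the data, the function name `decayed_gd`, and both initial rates are made up for illustration, and it uses full-batch gradient descent so that the result is deterministic. The step size is assumed to decay as initial rate / sqrt(iteration number), as suggested in this thread.

```python
import math

def decayed_gd(xs, ys, initial_rate, iterations):
    """Fit y = w * x by gradient descent with step size initial_rate / sqrt(t).

    Hypothetical helper for illustration only, not part of any library.
    """
    w = 0.0
    for t in range(1, iterations + 1):
        # Gradient of the loss 0.5 * mean((w * x - y)^2) with respect to w.
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # Decaying step size: the initial rate divided by sqrt(iteration number).
        w -= initial_rate / math.sqrt(t) * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x for x in xs]  # the true weight is 2.0

w_small = decayed_gd(xs, ys, initial_rate=0.1, iterations=100)
w_large = decayed_gd(xs, ys, initial_rate=10.0, iterations=100)

print("initial rate 0.1 :", w_small)
print("initial rate 10.0:", w_large)
```

With the small initial rate the weight ends up close to 2.0. With the large one, the first updates overshoot so badly that after 100 iterations the weight is still many orders of magnitude away from 2.0, even though the step size has been shrinking the whole time.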
Cheers,
Till

On Mon, Jun 1, 2015 at 7:15 PM, Sachin Goel <sachingoel0...@gmail.com> wrote:

> You can set the learning rate to be 1/sqrt(iteration number). This usually
> works.
>
> Regards
> Sachin Goel
>
> On Mon, Jun 1, 2015 at 9:09 PM, Alexander Alexandrov <
> alexander.s.alexand...@gmail.com> wrote:
>
> > I've seen some work on adaptive learning rates in the past days.
> >
> > Maybe we can think about extending the base algorithm and comparing the use
> > case setting for the IMPRO-3 project.
> >
> > @Felix you can discuss this with the others on Wednesday, Manu will be also
> > there and can give some feedback, I'll try to send a link tomorrow
> > morning...
> >
> >
> > 2015-06-01 20:33 GMT+10:00 Till Rohrmann <trohrm...@apache.org>:
> >
> > > Since MLR uses stochastic gradient descent, you probably have to configure
> > > the step size right. SGD is very sensitive to the right step size choice.
> > > If the step size is too high, then the SGD algorithm does not converge. You
> > > can find the parameter description here [1].
> > >
> > > Cheers,
> > > Till
> > >
> > > [1]
> > > http://ci.apache.org/projects/flink/flink-docs-master/libs/ml/multiple_linear_regression.html
> > >
> > > On Mon, Jun 1, 2015 at 11:48 AM, Felix Neutatz <neut...@googlemail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I want to use MultipleLinearRegression, but I got really strange results.
> > > > So I tested it with the housing price dataset:
> > > >
> > > > http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data
> > > >
> > > > And here I get negative house prices - even when I use the training set as
> > > > dataset:
> > > > LabeledVector(-1.1901998613214253E78, DenseVector(1500.0, 2197.0, 2978.0, 1369.0, 1451.0))
> > > > LabeledVector(-2.7411218018254747E78, DenseVector(4445.0, 4522.0, 4038.0, 4223.0, 4868.0))
> > > > LabeledVector(-2.688526857613956E78, DenseVector(4522.0, 4038.0, 4351.0, 4129.0, 4617.0))
> > > > LabeledVector(-1.3075960386971714E78, DenseVector(2001.0, 2059.0, 1992.0, 2008.0, 2504.0))
> > > > LabeledVector(-1.476238770814297E78, DenseVector(1992.0, 1965.0, 1983.0, 2300.0, 3811.0))
> > > > LabeledVector(-1.4298128754759792E78, DenseVector(2059.0, 1992.0, 1965.0, 2425.0, 3178.0))
> > > > ...
> > > >
> > > > and a huge squared error:
> > > > Squared error: 4.799184832395361E159
> > > >
> > > > You can find my code here:
> > > >
> > > > https://github.com/FelixNeutatz/wikiTrends/blob/master/extraction/src/test/io/sanfran/wikiTrends/extraction/flink/Regression.scala
> > > >
> > > > Can you help me? What did I do wrong?
> > > >
> > > > Thank you for your help,
> > > > Felix