Hi all, I too am having some issues with *RegressionWithSGD algorithms.
Concerning your issue, Eustache, this could be due to the fact that these regression algorithms use a fixed step size (which is divided by sqrt(iteration)). During my tests, the algorithm quite often diverged to an infinite cost, I guess because the step was too big. I reduced it and managed to get good results on a very simple generated dataset.

But I was wondering if anyone here had some advice concerning the use of these regression algorithms, for example how to choose a good step size and number of iterations? I wonder if I'm using those right...

Thanks,

--
*Thomas ROBERT*
www.creativedata.fr

2014-07-03 16:16 GMT+02:00 Eustache DIEMERT <eusta...@diemert.fr>:

> Printing the model shows the intercept is always 0 :(
>
> Should I open a bug for that?
>
>
> 2014-07-02 16:11 GMT+02:00 Eustache DIEMERT <eusta...@diemert.fr>:
>
>> Hi list,
>>
>> I'm benchmarking MLlib for a regression task [1] and get strange results.
>>
>> Namely, using RidgeRegressionWithSGD it seems the predicted points miss
>> the intercept:
>>
>> {code}
>> val trainedModel = RidgeRegressionWithSGD.train(trainingData, 1000)
>> ...
>> valuesAndPreds.take(10).map(t => println(t))
>> {code}
>>
>> output:
>>
>> (2007.0,-3.784588726958493E75)
>> (2003.0,-1.9562390324037716E75)
>> (2005.0,-4.147413202985629E75)
>> (2003.0,-1.524938024096847E75)
>> ...
>>
>> If I change the parameters (step size, regularization and iterations) I
>> get NaNs more often than not:
>> (2007.0,NaN)
>> (2003.0,NaN)
>> (2005.0,NaN)
>> ...
>>
>> On the other hand, the DecisionTree model gives sensible results.
>>
>> I see there is a `setIntercept()` method in abstract class
>> GeneralizedLinearAlgorithm that seems to trigger the use of the intercept
>> but I'm unable to use it from the public interface :(
>>
>> Any help appreciated :)
>>
>> Eustache
>>
>> [1] https://archive.ics.uci.edu/ml/datasets/YearPredictionMSD
>>
>
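
A minimal sketch of the step-size effect described in the reply above, in plain Scala without Spark. It is not MLlib's implementation: the object name `StepSizeSketch`, the toy dataset (y = 3x, no intercept, no regularization) and the `fit` helper are all assumptions made for illustration. It only reproduces the update rule "step size divided by sqrt(iteration)" on a one-weight least-squares problem, to show how a step that is too large can blow up while a smaller one converges.

```scala
object StepSizeSketch {
  // Hypothetical toy data: y = 3 * x for x = 1..10 (no noise, no intercept).
  val xs: Array[Double] = (1 to 10).map(_.toDouble).toArray
  val ys: Array[Double] = xs.map(3.0 * _)

  // Gradient of the mean squared error (1/n) * sum((w * x - y)^2) w.r.t. w.
  def gradient(w: Double): Double = {
    val n = xs.length
    xs.indices.map(i => 2.0 * (w * xs(i) - ys(i)) * xs(i)).sum / n
  }

  // Full-batch gradient descent where the effective step at iteration i
  // is stepSize / sqrt(i), starting from w = 0.
  def fit(stepSize: Double, numIterations: Int): Double = {
    var w = 0.0
    for (i <- 1 to numIterations) {
      val step = stepSize / math.sqrt(i.toDouble)
      w -= step * gradient(w)
    }
    w
  }

  def main(args: Array[String]): Unit = {
    // A small step should approach the true weight 3.0;
    // a large step should move far away from it.
    println(s"stepSize = 0.01 -> w = ${fit(0.01, 100)}")
    println(s"stepSize = 1.0  -> w = ${fit(1.0, 100)}")
  }
}
```

On this toy problem the gradient scales with the mean of x squared, so larger feature values require a smaller step size to stay stable. That suggests rescaling the features or lowering the step size before increasing the number of iterations, though how well this carries over to a real dataset is not something the sketch can show.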