Hi all,

I too am having some issues with *RegressionWithSGD algorithms.

Concerning your issue, Eustache, this could be because these regression
algorithms use a fixed initial step size (which is divided by
sqrt(iteration) at each iteration). In my tests the algorithm quite often
diverged to an infinite cost, I guess because the step was too big. I reduced
it and managed to get good results on a very simple generated dataset.
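
To illustrate the step-size effect, here is a minimal plain-Scala sketch. It
is not MLlib code: the object name StepSizeDemo, the fit function and the toy
data (y = 3 * x with unscaled x = 1..10) are all made up for the example, and
it uses full-batch gradient descent rather than mini-batches. It only assumes
the update rule described above, i.e. a step equal to stepSize / sqrt(iteration).

```scala
object StepSizeDemo {
  // Toy data (an assumption for illustration): y = 3 * x exactly, x = 1..10, unscaled.
  val xs: Seq[Double] = (1 to 10).map(_.toDouble)
  val ys: Seq[Double] = xs.map(_ * 3.0)

  /** Batch gradient descent on mean squared error for the model y = w * x. */
  def fit(stepSize: Double, numIterations: Int): Double = {
    var w = 0.0
    for (t <- 1 to numIterations) {
      // Gradient of 0.5 * mean((w * x - y)^2) with respect to w.
      val grad = xs.zip(ys).map { case (x, y) => (w * x - y) * x }.sum / xs.size
      // Decaying step: the initial step size divided by sqrt(iteration).
      w -= stepSize / math.sqrt(t) * grad
    }
    w
  }

  def main(args: Array[String]): Unit = {
    // Compare a few initial step sizes for the same number of iterations.
    for (step <- Seq(1.0, 0.1, 0.01)) {
      println(s"stepSize=$step -> w=${fit(step, 100)}")
    }
  }
}
```

On this toy problem an initial step of 1.0 makes the weight blow up within
100 iterations, while 0.01 brings it close to the true value 3. Since the
largest stable step shrinks as the features get larger, rescaling the
features or lowering the step size by orders of magnitude seems like a
reasonable first thing to try.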

But I was wondering if anyone here had some advice concerning the use of
these regression algorithms, for example how to choose a good step size and
number of iterations? I wonder if I'm using them right...

Thanks,

-- 

*Thomas ROBERT*
www.creativedata.fr


2014-07-03 16:16 GMT+02:00 Eustache DIEMERT <eusta...@diemert.fr>:

> Printing the model shows the intercept is always 0 :(
>
> Should I open a bug for that?
>
>
> 2014-07-02 16:11 GMT+02:00 Eustache DIEMERT <eusta...@diemert.fr>:
>
>> Hi list,
>>
>> I'm benchmarking MLlib for a regression task [1] and get strange results.
>>
>> Namely, using RidgeRegressionWithSGD it seems the predicted points miss
>> the intercept:
>>
>> {code}
>> val trainedModel = RidgeRegressionWithSGD.train(trainingData, 1000)
>> ...
>> valuesAndPreds.take(10).foreach(println)
>> {code}
>>
>> output:
>>
>> (2007.0,-3.784588726958493E75)
>> (2003.0,-1.9562390324037716E75)
>> (2005.0,-4.147413202985629E75)
>> (2003.0,-1.524938024096847E75)
>> ...
>>
>> If I change the parameters (step size, regularization and iterations) I
>> get NaNs more often than not:
>> (2007.0,NaN)
>> (2003.0,NaN)
>> (2005.0,NaN)
>> ...
>>
>> On the other hand, the DecisionTree model gives sensible results.
>>
>> I see there is a `setIntercept()` method in abstract class
>> GeneralizedLinearAlgorithm that seems to trigger the use of the intercept
>> but I'm unable to use it from the public interface :(
>>
>> Any help appreciated :)
>>
>> Eustache
>>
>> [1] https://archive.ics.uci.edu/ml/datasets/YearPredictionMSD
>>
>
