Re: LinearRegressionWithSGD accuracy

2015-01-28 Thread DB Tsai
Hi Robin, You can try this PR out. It has built-in feature scaling and ElasticNet regularization (an L1/L2 mix). This implementation converges stably to the same model as R's glmnet package. https://github.com/apache/spark/pull/4259 Sincerely, DB Tsai

Re: LinearRegressionWithSGD accuracy

2015-01-17 Thread DB Tsai
I'm working on LinearRegressionWithElasticNet using OWLQN now. It will standardize the data internally, so this is transparent to users. With OWLQN, you don't have to choose stepSize manually. I will send out a PR next week. Sincerely, DB Tsai

Re: LinearRegressionWithSGD accuracy

2015-01-15 Thread Devl Devel
It was a bug in the code; however, adding the step parameter got the results to work. Mean Squared Error = 2.610379825794694E-5 I've also opened a JIRA to put the step parameter in the examples so that people new to MLlib have a way to improve the MSE. https://issues.apache.org/jira/browse/SPARK-

Re: LinearRegressionWithSGD accuracy

2015-01-15 Thread Joseph Bradley
It looks like you're training on the non-scaled data but testing on the scaled data. Have you tried training and testing on only the scaled data? On Thu, Jan 15, 2015 at 10:42 AM, Devl Devel wrote: > Thanks, that helps a bit at least with the NaN but the MSE is still very > high even with th
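The point about using the same scaling for training and testing can be sketched in plain Scala, without Spark. This is only an illustration under stated assumptions: `ScalingSketch`, `Scaler`, `fitScaler`, and the toy rows are hypothetical names and data, not anything from the thread or from MLlib, and the sketch uses the population standard deviation (dividing by n), which may differ from what a library scaler does. The idea is to compute the means and standard deviations once, from the training rows, and apply that same transform to both sets.

```scala
object ScalingSketch {
  /** Column means and standard deviations, computed from the training rows only. */
  final case class Scaler(mean: Array[Double], std: Array[Double]) {
    // Standardize one row; constant columns (std == 0) map to 0.0.
    def transform(row: Array[Double]): Array[Double] =
      row.indices.map { j =>
        if (std(j) == 0.0) 0.0 else (row(j) - mean(j)) / std(j)
      }.toArray
  }

  /** Fit the scaler on training rows (population std: divide by n). */
  def fitScaler(rows: Seq[Array[Double]]): Scaler = {
    val n = rows.length.toDouble
    val dims = rows.head.indices
    val mean = dims.map(j => rows.map(_(j)).sum / n).toArray
    val std = dims.map { j =>
      math.sqrt(rows.map(r => math.pow(r(j) - mean(j), 2)).sum / n)
    }.toArray
    Scaler(mean, std)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical toy data with features on very different scales.
    val train = Seq(Array(1000.0, 1.0), Array(2000.0, 3.0), Array(3000.0, 5.0))
    val test = Seq(Array(2500.0, 4.0))

    val scaler = fitScaler(train)                // statistics from training data only
    val trainScaled = train.map(scaler.transform)
    val testScaled = test.map(scaler.transform)  // same transform, not re-fitted

    println(trainScaled.map(_.mkString("[", ", ", "]")).mkString(" "))
    println(testScaled.map(_.mkString("[", ", ", "]")).mkString(" "))
  }
}
```

A model trained on `trainScaled` should then be evaluated on `testScaled`; mixing scaled and non-scaled inputs between the two steps is the mismatch described above.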

Re: LinearRegressionWithSGD accuracy

2015-01-15 Thread Devl Devel
Thanks, that helps a bit at least with the NaN but the MSE is still very high even with that step size and 10k iterations: training Mean Squared Error = 3.3322561285919316E7 Does this method need, say, 100k iterations? On Thu, Jan 15, 2015 at 5:42 PM, Robin East wrote: > -dev, +user > > You

Re: LinearRegressionWithSGD accuracy

2015-01-15 Thread Robin East
-dev, +user You’ll need to set the gradient descent step size to something small - a bit of trial and error shows that 0.0001 works. You’ll need to create a LinearRegressionWithSGD instance and set the step size explicitly: val lr = new LinearRegressionWithSGD() lr.optimizer.setStepSize(0.
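Why a small step size matters on unscaled features can be illustrated with a minimal plain-Scala sketch that does not use Spark. Everything here is an assumption made for illustration: `StepSizeSketch`, `fit`, and the toy data are hypothetical, and the loop is ordinary full-batch gradient descent on mean squared error, not MLlib's own optimizer (which, among other differences, works on mini-batches). With features of magnitude 10 to 30, a step size of 1.0 overshoots on every update until the weight becomes non-finite, while 0.0001 converges to the true slope.

```scala
object StepSizeSketch {
  // Hypothetical toy data: y = 2x with unscaled features.
  val xs: Array[Double] = Array(10.0, 20.0, 30.0)
  val ys: Array[Double] = xs.map(_ * 2.0)

  /** Full-batch gradient descent on mean squared error for the model y = w * x. */
  def fit(stepSize: Double, iterations: Int): Double = {
    var w = 0.0
    for (_ <- 0 until iterations) {
      // d/dw of mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
      val grad = xs.zip(ys).map { case (x, y) => 2.0 * (w * x - y) * x }.sum / xs.length
      w -= stepSize * grad
    }
    w
  }

  def main(args: Array[String]): Unit = {
    // Large step: each update overshoots, so the weight blows up.
    println(s"step 1.0    -> w = ${fit(1.0, 1000)}")
    // Small step: the weight settles near the true slope of 2.
    println(s"step 0.0001 -> w = ${fit(0.0001, 1000)}")
  }
}
```

For this toy problem the update is stable only when the step size is below 1 / mean(x²), roughly 0.0021, so the larger the feature values, the smaller the step has to be. Standardizing the features first relaxes that constraint.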