Re: Problem with using Spark ML

2015-04-23 Thread Staffan
So I got the tip to try reducing the step size, and that finally gave more decent results. I had hoped the default params would give at least OK results, so I thought the problem must be somewhere else in the code. Problem solved! -- View this message in context: http://apache-spark-user-
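
Why the step size can matter this much is easy to see on a toy problem. The sketch below is plain Python gradient descent on a one-weight least-squares fit; it is not Spark's API, and the data, the function names `fit_linear` and `mse`, and both step sizes are made up for illustration. With unscaled features, a step size of 1.0 overshoots on every update and the weight diverges, while 0.01 converges to the true weight.

```python
# Plain-Python gradient descent on least squares; NOT Spark's API.
# The data, function names, and both step sizes are hypothetical.

def fit_linear(xs, ys, step_size, iterations=50):
    """Fit y = w * x by batch gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(iterations):
        # Gradient of (1/n) * sum((w*x - y)^2) with respect to w.
        grad = (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= step_size * grad
    return w


def mse(w, xs, ys):
    """Mean squared error of the prediction w * x against y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Unscaled features: the mean of x^2 is 11, so each update multiplies
# the error (w - 3) by (1 - 22 * step_size).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 * x for x in xs]  # true weight is 3.0

w_large = fit_linear(xs, ys, step_size=1.0)   # factor -21: diverges
w_small = fit_linear(xs, ys, step_size=0.01)  # factor 0.78: converges

print("step 1.0  -> w =", w_large, "mse =", mse(w_large, xs, ys))
print("step 0.01 -> w =", w_small, "mse =", mse(w_small, xs, ys))
```

Scaling the features would be another way to make a larger step size workable here; the sketch only isolates the effect of the step size itself.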

Problem with using Spark ML

2015-04-21 Thread Staffan
Hi, I've written an application that performs some machine learning on some data. I've validated that the data _should_ give a good output with a decent RMSE by using LIBSVM: Mean squared error = 0.00922063 (regression); Squared correlation coefficient = 0.9987 (regression). When I try to use Spark

Pipelines for controlling workflow

2015-04-07 Thread Staffan
Pcamp-pipeline which relies on type-safety, but I'm confused about how to create a branching pipeline using only type-declarations. Thanks, Staffan -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Pipelines-for-controlling-workflow-tp22403.html Sent from t

How to efficiently control concurrent Spark jobs

2015-02-25 Thread Staffan
So, does anyone have a suggestion for how to do this in a better way, or is there perhaps a higher-level workflow tool that I can use on top of Spark? (The cool solution would have been to use nested RDDs and just map over them in a high-level way, but this is not supported afaik.)

Re: Issues when combining Spark and a third party java library

2015-01-27 Thread Staffan
Okay, I finally tried changing the hadoop-client version from 2.4.0 to 2.5.2, and that mysteriously fixed everything. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Issues-when-combining-Spark-and-a-third-party-java-library-tp21367p21387.html Sent from th

Re: Issues when combining Spark and a third party java library

2015-01-27 Thread Staffan
To clarify: I'm currently working on this locally, running on a laptop, and I do not use spark-submit (I currently use Eclipse to run my applications). I've tried running both on Mac OS X and in a VM running Ubuntu. Furthermore, I got the VM from a coworker who has no issues running his Sp

Issues when combining Spark and a third party java library

2015-01-26 Thread Staffan
I'm using Maven and Eclipse to build my project. I'm letting Maven download all the things I need for running everything, which has worked fine up until now. I need to use the CDK library (https://github.com/egonw/cdk, http://sourceforge.net/projects/cdk/) and once I add the dependencies to my pom.