Re: Spark LIBLINEAR

2014-10-27 Thread Debasish Das
Hi Professor Lin, It would be great if you could review the TRON code in breeze and check whether it is similar to the original TRON implementation. Breeze is already integrated in mllib (we are using BFGS, and OWLQN is in the works for mllib LogisticRegression) and comparing with TRON should be qu

Re: Spark LIBLINEAR

2014-10-26 Thread Chih-Jen Lin
Debasish Das writes:
> If the SVM is not already migrated to BFGS, that's the first thing you should try...Basically following LBFGS Logistic Regression come up with LBFGS based linear SVM...
>
> About integrating TRON in mllib, David already has a version of TRON in breeze but som

Re: Spark LIBLINEAR

2014-10-24 Thread DB Tsai
Yeah, column normalization. For some of the datasets, training will not converge without it. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai On Fri, Oct 24, 2014 at 3:46 PM, Debasish D

Re: Spark LIBLINEAR

2014-10-24 Thread Debasish Das
Do you mean row/column normalization of the data? How much performance gain did you see from that?
On Fri, Oct 24, 2014 at 3:14 PM, DB Tsai wrote:
> oh, we just train the model in the standardized space which will help the convergence of LBFGS. Then we convert the weights to original space so the w

Re: Spark LIBLINEAR

2014-10-24 Thread DB Tsai
oh, we just train the model in the standardized space which will help the convergence of LBFGS. Then we convert the weights to original space so the whole thing is transparent to users. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com Link
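The idea of training in a standardized space and then converting the weights back can be sketched as follows. This is a minimal illustration, not the MLlib code: it assumes each column is scaled by its standard deviation only (no centering), and the helper names `standardize_columns`, `to_original_space`, and `margin` are hypothetical. Under that assumption, dividing each learned weight by its column's standard deviation gives a model that produces the same margins on the raw features.

```python
import math

def standardize_columns(rows):
    """Scale each column to unit standard deviation (assumption: no centering)."""
    n = len(rows)
    dim = len(rows[0])
    stds = []
    for j in range(dim):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        # A constant column has zero variance; leave it unscaled.
        stds.append(math.sqrt(var) if var > 0 else 1.0)
    scaled = [[r[j] / stds[j] for j in range(dim)] for r in rows]
    return scaled, stds

def to_original_space(weights_scaled, stds):
    """Map weights learned on scaled features back to the raw features."""
    return [w / s for w, s in zip(weights_scaled, stds)]

def margin(weights, row):
    """Linear score w . x for one example."""
    return sum(w * x for w, x in zip(weights, row))

# Two columns on very different scales.
rows = [[1.0, 200.0], [2.0, 400.0], [3.0, 900.0]]
scaled, stds = standardize_columns(rows)

# Hypothetical stand-in for what an optimizer returned in the scaled space.
w_scaled = [0.5, -1.5]
w_orig = to_original_space(w_scaled, stds)
```

Because `margin(w_scaled, scaled[i])` equals `margin(w_orig, rows[i])` up to rounding, the scaling only changes the conditioning seen by the optimizer, which is why it can be hidden from users.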

Re: Spark LIBLINEAR

2014-10-24 Thread Debasish Das
@dbtsai for the condition number, what did you use? Diagonal preconditioning of the inverse of the B matrix? But then the B matrix keeps on changing... did you condition it after every few iterations? Would it be possible to put that code in Breeze, since it would be very useful for conditioning other solvers as well

Re: Spark LIBLINEAR

2014-10-24 Thread DB Tsai
We don't have SVMWithLBFGS, but you can check out how we implement LogisticRegressionWithLBFGS, and we also deal with some condition number improving stuff in LogisticRegressionWithLBFGS which improves the performance dramatically. Sincerely, DB Tsai --

Re: Spark LIBLINEAR

2014-10-24 Thread k.tham
Oh, I've only seen SVMWithSGD, hadn't realized LBFGS was implemented. I'll try it out when I have time. Thanks!

Re: Spark LIBLINEAR

2014-10-24 Thread Debasish Das
If the SVM has not already been migrated to BFGS, that's the first thing you should try. Basically, following the LBFGS Logistic Regression, come up with an LBFGS-based linear SVM. About integrating TRON in mllib: David already has a version of TRON in breeze, but someone needs to validate it for linear SVM an
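One way an LBFGS-based linear SVM could be set up is sketched below. The message does not say which loss to use. This sketch assumes the squared hinge loss (the L2-loss SVM), chosen because it is differentiable and so gives a quasi-Newton solver a well-defined gradient. The function name and the regularization form are illustrative assumptions, not MLlib or breeze API.

```python
def squared_hinge_loss_and_grad(weights, rows, labels, reg):
    """L2-regularized squared hinge loss and its gradient.

    Objective: 0.5 * reg * ||w||^2 + sum_i max(0, 1 - y_i * w.x_i)^2
    Labels are assumed to be -1 or +1.
    """
    dim = len(weights)
    loss = 0.5 * reg * sum(w * w for w in weights)
    grad = [reg * w for w in weights]
    for row, y in zip(rows, labels):
        slack = 1.0 - y * sum(w * x for w, x in zip(weights, row))
        if slack > 0:
            # Only examples inside the margin contribute.
            loss += slack * slack
            for j in range(dim):
                grad[j] -= 2.0 * slack * y * row[j]
    return loss, grad
```

A function of this shape (returning the loss and gradient together) is what a generic LBFGS routine typically expects as its objective.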

Re: Spark LIBLINEAR

2014-10-24 Thread k.tham
Just wondering, any update on this? Is there a plan to integrate CJ's work with mllib? I'm asking since SVM impl in MLLib did not give us good results and we have to resort to training our svm classifier in a serial manner on the driver node with liblinear. Also, it looks like CJ Lin is coming to

Re: Spark LIBLINEAR

2014-05-16 Thread Tom Vacek
I've done some comparisons with my own implementation of TRON on Spark. From a distributed computing perspective, it does 2x more local work per iteration than LBFGS, so the parallel isoefficiency is improved slightly. I think the truncated Newton solver holds some potential because there have be
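As general background on truncated Newton solvers such as TRON (not the implementation described in this message): the inner conjugate-gradient loop needs only Hessian-vector products, and each product is one more pass over the local data. A minimal sketch, assuming L2-regularized logistic regression with dense rows. The function name is hypothetical.

```python
import math

def logistic_hessian_vec(weights, rows, vec, reg):
    """Return (reg * I + X^T D X) @ vec without forming the Hessian.

    D is diagonal with D_ii = p_i * (1 - p_i), where p_i = sigmoid(w . x_i).
    """
    result = [reg * v for v in vec]
    for row in rows:
        z = sum(w * x for w, x in zip(weights, row))
        p = 1.0 / (1.0 + math.exp(-z))
        d = p * (1.0 - p)  # curvature of the logistic loss at this row
        xv = sum(x * v for x, v in zip(row, vec))
        for j, x in enumerate(row):
            result[j] += d * xv * x
    return result
```

In a distributed setting each worker would compute this sum over its own rows and the partial results would be added together, which is the same communication pattern as a gradient evaluation.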

Re: Spark LIBLINEAR

2014-05-16 Thread DB Tsai
Hi Deb, My co-worker fixed an OWLQN bug in breeze, and this fix is needed to converge to the correct result. https://github.com/scalanlp/breeze/pull/247 You may want to use a snapshot of breeze to pick up the fix. Sincerely, DB Tsai -

Re: Spark LIBLINEAR

2014-05-14 Thread Debasish Das
Hi Professor Lin, On our internal datasets, I am getting accuracy on par with glmnet-R for sparse feature selection from liblinear. The default mllib-based gradient descent was way off. I did not tune the learning rate, but I ran with varying lambda. The feature selection was weak. I used liblinear c

Re: Spark LIBLINEAR

2014-05-12 Thread DB Tsai
It seems that the code isn't managed on github. It can be downloaded from http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/distributed-liblinear/spark/spark-liblinear-1.94.zip It would be easier to track the changes on github. Sincerely, DB Tsai ---

Re: Spark LIBLINEAR

2014-05-12 Thread Xiangrui Meng
Hi Chieh-Yen, Great to see the Spark implementation of LIBLINEAR! We will definitely consider adding a wrapper in MLlib to support it. Is the source code on github? Deb, Spark LIBLINEAR uses BSD license, which is compatible with Apache. Best, Xiangrui On Sun, May 11, 2014 at 10:29 AM, Debasish

Re: Spark LIBLINEAR

2014-05-11 Thread Debasish Das
Hello Prof. Lin, Awesome news! I am curious if you have any benchmarks comparing the C++ MPI and Scala Spark liblinear implementations... Is Spark Liblinear Apache licensed, or are there any specific restrictions on using it? Except using native blas libraries (which each user has to manage by pul

Re: Spark LIBLINEAR

2014-05-11 Thread DB Tsai
Dear Prof. Lin, Interesting! We had an implementation of L-BFGS in Spark, and it has already been merged upstream. We read your paper comparing TRON and OWL-QN for logistic regression with L1 (http://www.csie.ntu.edu.tw/~cjlin/papers/l1.pdf), but it seems that it's not in the distributed setup. Wi