Hey Aseem,

If you are looking for a full-featured library to execute Spark ML pipelines outside of Spark, take a look at MLeap: https://github.com/combust/mleap
It not only supports transforming single instances of a feature vector, but can also execute your entire ML pipeline, including feature extraction.

Cheers,
Hollin

On Wed, Feb 1, 2017 at 8:49 AM, Seth Hendrickson <seth.hendrickso...@gmail.com> wrote:

> In Spark.ML the coefficients are not "pivoted", meaning that they do not
> set one of the coefficient sets equal to zero. You can read more about it
> here: https://en.wikipedia.org/wiki/Multinomial_logistic_regression#As_a_set_of_independent_binary_regressions
>
> You can translate your set of coefficients to a pivoted version by simply
> subtracting one of the sets of coefficients from all the others. That
> leaves the one you selected, the "pivot", as all zeros. You can then pass
> this into the mllib model, disregarding the "pivot" coefficients. The
> coefficients should be laid out like:
>
> [feature0_class0, feature1_class0, feature2_class0, intercept0,
> feature0_class1, ..., intercept1]
>
> So you have 9 coefficients and 3 intercepts, but you are going to get rid
> of one class's coefficients, leaving you with 6 coefficients and 2
> intercepts - so a vector of length 8 for mllib's model.
>
> Note: if you use regularization then it is not exactly correct to convert
> from the non-pivoted version to the pivoted one, since the algorithms will
> give different results in those cases, though it is still possible to do it.
>
> On Wed, Feb 1, 2017 at 3:42 AM, Aseem Bansal <asmbans...@gmail.com> wrote:
>
>> *What I want to do*
>> I have trained an ml.classification.LogisticRegressionModel using the
>> spark ml package.
>>
>> It has 3 features and 3 classes. So the generated model has coefficients
>> in a (3, 3) matrix and intercepts in a Vector of length 3, as expected.
>>
>> Now, I want to take these coefficients and convert this
>> ml.classification.LogisticRegressionModel to an instance of
>> mllib.classification.LogisticRegressionModel.
>>
>> *Why I want to do this*
>> Computational speed, as SPARK-10413 is still in progress and scheduled for
>> Spark 2.2, which is not yet released.
>>
>> *Why I think this is possible*
>> I checked https://spark.apache.org/docs/latest/mllib-linear-methods.html#logistic-regression
>> and in that example a multinomial Logistic Regression is trained. So as
>> per this the class mllib.classification.LogisticRegressionModel can
>> encapsulate these parameters.
>>
>> *Problem faced*
>> The only constructor in mllib.classification.LogisticRegressionModel
>> takes a single vector as coefficients and a single double as intercept,
>> but I have a Matrix of coefficients and a Vector of intercepts.
>>
>> I tried converting the matrix to a vector by just taking the values
>> (guesswork) but got
>>
>> requirement failed: LogisticRegressionModel.load with numClasses = 3 and
>> numFeatures = 3 expected weights of length 6 (without intercept) or 8 (with
>> intercept), but was given weights of length 9
>>
>> So any ideas?
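
The pivoting step Seth describes can be sketched in plain Python, without any Spark dependency. This is only an illustration under stated assumptions: the function name `pivot_to_mllib_weights` is hypothetical, the coefficients are passed as a list of per-class rows (one row per class, one entry per feature), and class 0 is chosen as the pivot on the assumption that mllib's multinomial model treats class 0 as the reference class, which is worth verifying against your Spark version. As noted in the thread, the conversion is exact only for models trained without regularization.

```python
from typing import List, Sequence


def pivot_to_mllib_weights(
    coefficients: Sequence[Sequence[float]],
    intercepts: Sequence[float],
    pivot: int = 0,
) -> List[float]:
    """Flatten non-pivoted multinomial coefficients into a pivoted vector.

    coefficients: one row per class, one entry per feature.
    intercepts:   one intercept per class.
    pivot:        class whose parameters are subtracted from all others
                  and then dropped (assumed to be class 0 for mllib).

    Returns [f0, f1, ..., intercept] for each remaining class, concatenated,
    giving (num_classes - 1) * (num_features + 1) values.
    """
    num_classes = len(coefficients)
    if num_classes < 2 or len(intercepts) != num_classes:
        raise ValueError("need one intercept per class and at least 2 classes")

    pivot_row = coefficients[pivot]
    pivot_intercept = intercepts[pivot]

    weights: List[float] = []
    for k in range(num_classes):
        if k == pivot:
            continue  # the pivot becomes all zeros, so it is left out
        weights.extend(c - p for c, p in zip(coefficients[k], pivot_row))
        weights.append(intercepts[k] - pivot_intercept)
    return weights


# Made-up numbers for a 3-class, 3-feature model (not from the thread).
example_coefficients = [
    [1.0, 2.0, 3.0],  # class 0
    [4.0, 6.0, 8.0],  # class 1
    [0.0, 1.0, 5.0],  # class 2
]
example_intercepts = [0.5, 1.5, -0.5]

example_weights = pivot_to_mllib_weights(example_coefficients, example_intercepts)
print(len(example_weights))  # 8, the length the mllib error message asks for
```

The resulting list would then be wrapped in an mllib vector and passed to the mllib model together with numFeatures = 3 and numClasses = 3; that last step is not shown because it needs a running Spark installation.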