That may not be an issue if the app using the models runs by itself (not
bundled into an existing app), which may actually be the right way to
design it considering separation of concerns.

Regards
Sab

On Fri, Nov 13, 2015 at 9:59 AM, DB Tsai <dbt...@dbtsai.com> wrote:

> This will bring in all of Spark's dependencies, which may break the web app.
>
>
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Web: https://www.dbtsai.com
> PGP Key ID: 0xAF08DF8D
>
> On Thu, Nov 12, 2015 at 8:15 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>>
>>
>> On Fri, Nov 13, 2015 at 2:04 AM, darren <dar...@ontrenet.com> wrote:
>>
>>> I agree 100%. Making the model requires large data and many CPUs.
>>>
>>> Using it does not.
>>>
>>> This is a very useful side effect of ML models.
>>>
>>> If MLlib can't use models outside Spark, that's a real shame.
>>>
>>
>> Well, you can, as mentioned earlier. You don't need the Spark runtime for
>> predictions: save the serialized model and deserialize it to use it. (You
>> do need the Spark JARs on the classpath, though.)
>>
>>>
>>>
>>>
>>>
>>> -------- Original message --------
>>> From: "Kothuvatiparambil, Viju" <
>>> viju.kothuvatiparam...@bankofamerica.com>
>>> Date: 11/12/2015 3:09 PM (GMT-05:00)
>>> To: DB Tsai <dbt...@dbtsai.com>, Sean Owen <so...@cloudera.com>
>>> Cc: Felix Cheung <felixcheun...@hotmail.com>, Nirmal Fernando <
>>> nir...@wso2.com>, Andy Davidson <a...@santacruzintegration.com>, Adrian
>>> Tanase <atan...@adobe.com>, "user @spark" <user@spark.apache.org>,
>>> Xiangrui Meng <men...@gmail.com>, hol...@pigscanfly.ca
>>> Subject: RE: thought experiment: use spark ML to real time prediction
>>>
>>> I am glad to see DB’s comments; they make me feel I am not the only one
>>> facing these issues. If we were able to use MLlib to load the model in web
>>> applications (outside the Spark cluster), that would solve the
>>> issue. I understand Spark is mainly for processing big data in a
>>> distributed mode. But there is no purpose in training a model using MLlib
>>> if we are not able to use it in the applications that need to access the
>>> model.
>>>
>>>
>>>
>>> Thanks
>>>
>>> Viju
>>>
>>>
>>>
>>> *From:* DB Tsai [mailto:dbt...@dbtsai.com]
>>> *Sent:* Thursday, November 12, 2015 11:04 AM
>>> *To:* Sean Owen
>>> *Cc:* Felix Cheung; Nirmal Fernando; Andy Davidson; Adrian Tanase; user
>>> @spark; Xiangrui Meng; hol...@pigscanfly.ca
>>> *Subject:* Re: thought experiment: use spark ML to real time prediction
>>>
>>>
>>>
>>> I think the use-case can be quite different from PMML.
>>>
>>>
>>>
>>> Having a Spark-platform-independent ML jar can empower users to
>>> do the following:
>>>
>>>
>>>
>>> 1) PMML doesn't contain all the models we have in mllib. Also, for an ML
>>> pipeline trained by Spark, most of the time PMML is not expressive enough
>>> to do all the transformations we have in Spark ML. As a result, if we are
>>> able to serialize the entire Spark ML pipeline after training, and then
>>> load it back in an app without any Spark platform for production scoring,
>>> this will be very useful for production deployment of Spark ML models. The
>>> only issue will be if a transformer involves a shuffle; we need to figure
>>> out a way to handle it. When I chatted with Xiangrui about this, he
>>> suggested that we may tag whether a transformer is shuffle ready.
>>> Currently, at Netflix, we are not able to use ML pipelines because of
>>> those issues, and we have to write our own scorers in production, which is
>>> quite a duplication of work.
>>>
>>>
>>>
>>> 2) If users can use Spark's linear algebra code, such as vector or matrix,
>>> in their applications, this will be very useful. It can help share code
>>> between the Spark training pipeline and production deployment. Also, lots
>>> of good stuff in Spark's mllib doesn't depend on the Spark platform, and
>>> people could use it in their applications without pulling in lots of
>>> dependencies. In fact, in my project, I have to copy & paste code from
>>> mllib into my project to use those goodies in apps.
>>>
>>>
>>>
>>> 3) Currently, mllib depends on graphx, which means that in graphx there is
>>> no way to use mllib's vector or matrix. And
>>>
>>
>>
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Team Lead - WSO2 Machine Learner
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>>
>>
>>
>


-- 

Architect - Big Data
Ph: +91 99805 99458

Manthan Systems | *Company of the year - Analytics (2014 Frost and Sullivan
India ICT)*