Hi,

We faced a similar issue. Our solution was to load the model once, convert it to its mllib equivalent, cache that, and score against it instead of the ml model.
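For example, with a binary logistic regression the conversion can be done by hand from the fitted coefficients. A minimal sketch in Scala, assuming Spark 2.0 and a hypothetical model path (other model types need their own mapping):

import org.apache.spark.ml.classification.{LogisticRegressionModel => MlLR}
import org.apache.spark.mllib.classification.{LogisticRegressionModel => MllibLR}
import org.apache.spark.mllib.linalg.Vectors

// Load the spark.ml model once at startup. This still needs a
// SparkSession, but only once, not per request.
val mlModel = MlLR.load("s3://my-bucket/path/to/model")

// Rebuild an equivalent mllib model from the fitted coefficients.
val localModel = new MllibLR(Vectors.fromML(mlModel.coefficients),
                             mlModel.intercept)

// Per-request scoring is now a plain local method call: no job is
// submitted, so no task deserialization and no scheduler in the path.
def score(features: Array[Double]): Double =
  localModel.predict(Vectors.dense(features))

That should also explain the numbers in your UI: the 3 s deserialization and 2 s GC are per-task costs of shipping the work through Spark, and they go away once scoring is a local call.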
On Tue, Oct 11, 2016 at 10:22 PM, Sean Owen <so...@cloudera.com> wrote:

> I don't believe it will ever scale to spin up a whole distributed job to
> serve one request. You can possibly look at the bits in mllib-local. You
> might do well to export as something like PMML, either with Spark's export
> or JPMML, and then load it into a web container and score it without Spark
> (possibly also with JPMML, OpenScoring).
>
> On Tue, Oct 11, 2016, 17:53 Nicolas Long <nicolasl...@gmail.com> wrote:
>
>> Hi all,
>>
>> so I have a model which has been stored in S3, and a Scala webapp which,
>> for certain requests, loads the model and transforms submitted data
>> against it.
>>
>> I'm not sure how to run this quickly on a single instance, though. At the
>> moment Spark is being bundled up with the web app in an uberjar (sbt
>> assembly).
>>
>> But the process is quite slow. I'm aiming for responses < 1 sec so that
>> the webapp can respond quickly to requests. When I look at the Spark UI
>> I see:
>>
>> Summary Metrics for 1 Completed Tasks
>>
>> Metric                     Min    25th percentile  Median  75th percentile  Max
>> Duration                   94 ms  94 ms            94 ms   94 ms            94 ms
>> Scheduler Delay            0 ms   0 ms             0 ms    0 ms             0 ms
>> Task Deserialization Time  3 s    3 s              3 s     3 s              3 s
>> GC Time                    2 s    2 s              2 s     2 s              2 s
>> Result Serialization Time  0 ms   0 ms             0 ms    0 ms             0 ms
>> Getting Result Time        0 ms   0 ms             0 ms    0 ms             0 ms
>> Peak Execution Memory      0.0 B  0.0 B            0.0 B   0.0 B            0.0 B
>>
>> I don't really understand why deserialization and GC should take so long
>> when the models are already loaded. Is this evidence that I am doing
>> something wrong? And where can I get a better understanding of how Spark
>> works under the hood here, and how best to do a standalone/bundled jar
>> deployment?
>>
>> Thanks!
>>
>> Nic
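P.S. To illustrate Sean's PMML route: mllib models that mix in PMMLExportable (the localModel above qualifies, for binary logistic regression) can write the XML directly, and the file can then be scored in a plain web container with no Spark on the classpath. A rough sketch; the JPMML-Evaluator calls are from memory and may differ between versions:

// Export the model as PMML XML.
localModel.toPMML("/tmp/model.pmml")

// Score it in the webapp with JPMML-Evaluator instead of Spark:
import java.io.FileInputStream
import org.jpmml.model.PMMLUtil
import org.jpmml.evaluator.ModelEvaluatorFactory

val pmml = PMMLUtil.unmarshal(new FileInputStream("/tmp/model.pmml"))
val evaluator = ModelEvaluatorFactory.newInstance().newModelEvaluator(pmml)
evaluator.verify()
// evaluator.evaluate(arguments) then scores one record per call.

OpenScoring wraps the same evaluator behind a REST API if you would rather keep scoring out of the webapp entirely.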