Hi,

We faced a similar issue. Our solution was to load the model once, convert it to its mllib equivalent, cache that, and score against it instead of the ml model.
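For example, with a binary logistic regression the conversion can be done by hand from the fitted coefficients. A minimal sketch in Scala, assuming Spark 2.0 and a hypothetical model path (other model types need their own mapping):

import org.apache.spark.ml.classification.{LogisticRegressionModel => MlLR}
import org.apache.spark.mllib.classification.{LogisticRegressionModel => MllibLR}
import org.apache.spark.mllib.linalg.Vectors

// Load the spark.ml model once at startup. This still needs a
// SparkSession, but only once, not per request.
val mlModel = MlLR.load("s3://my-bucket/path/to/model")

// Rebuild an equivalent mllib model from the fitted coefficients.
val localModel = new MllibLR(Vectors.fromML(mlModel.coefficients),
                             mlModel.intercept)

// Per-request scoring is now a plain local method call: no job is
// submitted, so no task deserialization and no scheduler in the path.
def score(features: Array[Double]): Double =
  localModel.predict(Vectors.dense(features))

That should also explain the numbers in your UI: the 3 s deserialization and 2 s GC are per-task costs of shipping the work through Spark, and they go away once scoring is a local call.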
On Tue, Oct 11, 2016 at 10:22 PM, Sean Owen <so...@cloudera.com> wrote:

> I don't believe it will ever scale to spin up a whole distributed job to
> serve one request. You can possibly look at the bits in mllib-local. You
> might do well to export as something like PMML, either with Spark's export
> or JPMML, and then load it into a web container and score it without Spark
> (possibly also with JPMML, OpenScoring).
>
> On Tue, Oct 11, 2016, 17:53 Nicolas Long <nicolasl...@gmail.com> wrote:
>
>> Hi all,
>>
>> so I have a model which has been stored in S3, and a Scala webapp which,
>> for certain requests, loads the model and transforms submitted data
>> against it.
>>
>> I'm not sure how to run this quickly on a single instance, though. At the
>> moment Spark is being bundled up with the web app in an uberjar (sbt
>> assembly).
>>
>> But the process is quite slow. I'm aiming for responses < 1 sec so that
>> the webapp can respond quickly to requests. When I look at the Spark UI
>> I see:
>>
>> Summary Metrics for 1 Completed Tasks
>>
>> Metric                     Min    25th percentile  Median  75th percentile  Max
>> Duration                   94 ms  94 ms            94 ms   94 ms            94 ms
>> Scheduler Delay            0 ms   0 ms             0 ms    0 ms             0 ms
>> Task Deserialization Time  3 s    3 s              3 s     3 s              3 s
>> GC Time                    2 s    2 s              2 s     2 s              2 s
>> Result Serialization Time  0 ms   0 ms             0 ms    0 ms             0 ms
>> Getting Result Time        0 ms   0 ms             0 ms    0 ms             0 ms
>> Peak Execution Memory      0.0 B  0.0 B            0.0 B   0.0 B            0.0 B
>>
>> I don't really understand why deserialization and GC should take so long
>> when the models are already loaded. Is this evidence that I am doing
>> something wrong? And where can I get a better understanding of how Spark
>> works under the hood here, and how best to do a standalone/bundled jar
>> deployment?
>>
>> Thanks!
>>
>> Nic
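P.S. To illustrate Sean's PMML route: mllib models that mix in PMMLExportable (the localModel above qualifies, for binary logistic regression) can write the XML directly, and the file can then be scored in a plain web container with no Spark on the classpath. A rough sketch; the JPMML-Evaluator calls are from memory and may differ between versions:

// Export the model as PMML XML.
localModel.toPMML("/tmp/model.pmml")

// Score it in the webapp with JPMML-Evaluator instead of Spark:
import java.io.FileInputStream
import org.jpmml.model.PMMLUtil
import org.jpmml.evaluator.ModelEvaluatorFactory

val pmml = PMMLUtil.unmarshal(new FileInputStream("/tmp/model.pmml"))
val evaluator = ModelEvaluatorFactory.newInstance().newModelEvaluator(pmml)
evaluator.verify()
// evaluator.evaluate(arguments) then scores one record per call.

OpenScoring wraps the same evaluator behind a REST API if you would rather keep scoring out of the webapp entirely.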