Are you using the scheduler in FAIR mode instead of FIFO mode?
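In case it helps, here is a minimal sketch of what that looks like. This assumes a Spark 2.x `SparkSession`; the app name and pool name are placeholders, not anything from your setup. `spark.scheduler.mode` and `spark.scheduler.pool` are standard Spark settings:

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: with the default FIFO scheduler, many small transform()
// jobs from concurrent web requests queue behind each other. FAIR mode
// lets them share executor resources instead.
object FairModeExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("low-latency-scoring")          // placeholder name
      .config("spark.scheduler.mode", "FAIR")  // default is FIFO
      .getOrCreate()

    // Each request-handling thread can then submit its job to a pool;
    // "scoring" is a hypothetical pool name (define it in an
    // allocation file via spark.scheduler.allocation.file if needed).
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", "scoring")
  }
}
```

Note this only reduces queueing between concurrent jobs; it does not remove the per-job scheduling overhead of calling Spark for a single row.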
> On Sep 22, 2018, at 12:58 AM, Jatin Puri <purija...@gmail.com> wrote:
>
> Hi.
>
> What tactics can I apply in such a scenario?
>
> I have a pipeline of 10 stages doing simple text processing. I train on
> the data with the pipeline, and for the fitted data, do some modelling
> and store the results.
>
> I also have a web server where I receive requests. For each request (a
> DataFrame of a single row), I transform it against the same pipeline
> created above and take the respective action. The problem is that
> calling Spark for a single row takes less than 1 second, but under
> higher load Spark becomes a major bottleneck.
>
> One solution I can think of is a Scala re-implementation of the same
> pipeline that uses the model generated above to process the requests.
> But this results in code duplication and hence a maintenance burden.
>
> Is there any way I can call the same pipeline (transform) in a very
> lightweight manner, just for a single row, so that it runs concurrently
> and Spark does not remain a bottleneck?
>
> Thanks
> Jatin