Are you using the scheduler in FAIR mode instead of FIFO mode?
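
If not, switching it on is a one-line config change. A minimal sketch (the config key and local-property name are from the Spark scheduling docs; the app name and the pool name "requests" are illustrative):

```scala
import org.apache.spark.sql.SparkSession

object FairSchedulerSketch {
  def main(args: Array[String]): Unit = {
    // Enable the FAIR scheduler so concurrent jobs share resources
    // instead of queueing one behind another (the default is FIFO).
    val spark = SparkSession.builder()
      .appName("fair-scheduler-sketch")
      .master("local[*]")
      .config("spark.scheduler.mode", "FAIR")
      .getOrCreate()

    // Each request-handling thread can submit its jobs into a named pool,
    // so one slow request does not block the others.
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", "requests")

    // ... per-request transforms would run here ...

    spark.stop()
  }
}
```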


> On Sep 22, 2018, at 12:58 AM, Jatin Puri <purija...@gmail.com> wrote:
> 
> Hi.
> 
> What tactics can I apply in such a scenario?
> 
> I have a pipeline of 10 stages of simple text processing. I fit the pipeline 
> on the training data, do some modelling on the fitted output, and store the 
> results.
> 
> I also have a web server, where I receive requests. For each request 
> (a DataFrame of a single row), I transform it with the same pipeline created 
> above and take the respective action. The problem is: calling Spark for a 
> single row takes less than 1 second, but under higher load, Spark becomes a 
> major bottleneck.
> 
> One solution that I can think of is a Scala re-implementation of the same 
> pipeline, which processes the requests with the help of the model generated 
> above. But this results in duplicated code and hence a maintenance burden.
> 
> Is there any way I can call the same pipeline (transform) in a very 
> lightweight manner, just for a single row, so that it runs concurrently and 
> Spark does not remain a bottleneck?
> 
> Thanks
> Jatin
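
The pattern described in the quoted mail can be sketched as follows (the stages and column names are illustrative, not from the original mail): the Pipeline is fit once on the training data, and each incoming request reuses the already-fitted PipelineModel on a one-row DataFrame.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.sql.SparkSession

object SingleRowTransformSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("single-row-transform-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Fit the pipeline once, up front, on the training data.
    val training = Seq("spark ml pipelines", "simple text processing").toDF("text")
    val pipeline = new Pipeline().setStages(Array(
      new Tokenizer().setInputCol("text").setOutputCol("words"),
      new HashingTF().setInputCol("words").setOutputCol("features")))
    val model = pipeline.fit(training)

    // Per request: build a one-row DataFrame and transform it with the
    // already-fitted model. No refitting happens per request, but each
    // transform still launches Spark jobs, which is the contention point
    // under load that the mail describes.
    val request = Seq("a single incoming request").toDF("text")
    model.transform(request).show()

    spark.stop()
  }
}
```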
