It could depend on the nature of your application, but Spark Streaming uses Spark internally, so some concurrency should already be there within each batch. What is your use case?
Are you sure that your configuration is good?

On Fri, Feb 13, 2015 at 1:17 AM, Matus Faro <matus.f...@kik.com> wrote:

> Hi,
>
> Please correct me if I'm wrong: in Spark Streaming, the next batch will
> not start processing until the previous batch has completed. Is there
> any way to start processing the next batch if the previous batch is
> taking longer to process than the batch interval?
>
> The problem I am facing is that I don't see a hardware bottleneck in
> my Spark cluster, but Spark is not able to handle the amount of data I
> am pumping through (batch processing time is longer than the batch
> interval). What I'm seeing is spikes of CPU, network, and disk IO usage,
> which I assume are due to different stages of a job, but on average
> the hardware is underutilized. Concurrency in batch processing would
> allow the average batch processing time to be greater than the batch
> interval while fully utilizing the hardware.
>
> Any ideas on what can be done? One option I can think of is to split
> the application into multiple applications running concurrently and
> divide the initial stream of data between those applications. However,
> I would then lose the benefits of having a single application.
>
> Thank you,
> Matus

--
Arush Kharbanda || Technical Teamlead
ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
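[Editor's note: one setting often mentioned for this situation is `spark.streaming.concurrentJobs`, an undocumented and experimental Spark property that lets the streaming scheduler run more than one batch's jobs at a time. A minimal sketch of how it could be passed at submit time follows; the class and jar names are hypothetical placeholders, and the setting should be used with care since batches may then complete out of order.]

```shell
# Sketch only: spark.streaming.concurrentJobs is experimental and
# undocumented. A value of 2 lets jobs from two batches run
# concurrently, which can raise cluster utilization when the batch
# processing time exceeds the batch interval, at the cost of
# inter-batch ordering guarantees.
spark-submit \
  --class com.example.MyStreamingApp \
  --conf spark.streaming.concurrentJobs=2 \
  my-streaming-app.jar
```

An alternative, as suggested in the quoted message, is to partition the input stream across several independent applications, trading away shared state for isolation.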