operation on the data. I am trying to optimize the
> running time of the program because it looks slow to me. It seems the stage
> named:
>
> * combineByKey at ShuffledDStream.scala:42 *
>
> always takes the longest running time. If I open this stage, I only
> see two executors on th
Hi all,
I am currently running a Spark Streaming program that consumes data from
Kafka and performs a group-by operation on the data. I am trying to optimize the
running time of the program because it looks slow to me. It seems the stage
named:
* combineByKey at ShuffledDStream.scala:42 *
always