Original message
From: Cody Koeninger
Date: 19/07/2016 20:49 (GMT+05:30)
To: Diwakar Dhanuskodi
Cc: Martin Eden, user
Subject: Re: Spark streaming takes longer time to read json into dataframes
Yes, if you need more parallelism, you need to either add more
> as you set in Kafka.
>
> Have you seen this?
> http://spark.apache.org/docs/latest/streaming-kafka-integration.html
>
> M
>
> On Sat, Jul 16, 2016 at 5:26 AM, Diwakar Dhanuskodi
> wrote:
>>
>>
>> -- Forwarded message ------
>> From: Diwa
Original message
From: Martin Eden
Date: 16/07/2016 14:01 (GMT+05:30)
To: Diwakar Dhanuskodi
Cc: user
Subject: Re: Spark streaming takes longer time to read json into dataframes
Hi,
I would just do a repartition on the initial direct DStream since otherwise
each RDD in the stream
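
To illustrate the repartition idea, here is a rough plain-Python sketch. It does not use Spark. The `repartition` helper and the `kafka_batch` data are hypothetical, and the sketch assumes a batch arriving with one partition per Kafka partition, as the direct approach is documented to produce. It only shows how spreading the same records over more partitions raises the number of tasks that can run in parallel.

```python
from itertools import chain


def repartition(partitions, num_partitions):
    """Redistribute records round-robin into num_partitions lists.

    Plain-Python stand-in for the idea behind a repartition step;
    not Spark's implementation.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    out = [[] for _ in range(num_partitions)]
    for i, record in enumerate(chain.from_iterable(partitions)):
        out[i % num_partitions].append(record)
    return out


# Hypothetical batch: 2 Kafka partitions, hence 2 partitions in the batch.
kafka_batch = [
    ['{"id": %d}' % i for i in range(0, 6)],
    ['{"id": %d}' % i for i in range(6, 10)],
]

# Spread the same 10 records over 4 partitions so 4 tasks could parse them.
spread = repartition(kafka_batch, 4)
print([len(p) for p in spread])
```

In Spark itself the corresponding step would be a call to `repartition` on the DStream before the JSON parsing. It triggers a shuffle, so whether it pays off depends on how expensive the per-record work is.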
at 5:26 AM, Diwakar Dhanuskodi <
diwakar.dhanusk...@gmail.com> wrote:
>
> -- Forwarded message --
> From: Diwakar Dhanuskodi
> Date: Sat, Jul 16, 2016 at 9:30 AM
> Subject: Re: Spark streaming takes longer time to read json into dataframes
> To: Jean Geo
Do you need it on disk, or can you just push it to memory? Can you try
increasing memory or the number of cores? (I know it sounds basic.)
> On Jul 15, 2016, at 11:43 PM, Diwakar Dhanuskodi
> wrote:
>
> Hello,
>
> I have 400K JSON messages pulled from Kafka into Spark Streaming using
> the DirectStream approach.