From: Cody Koeninger [mailto:c...@koeninger.org]
Sent: Monday, March 14, 2016 9:39 PM
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
So what's happening here is that print() uses take(). Take() will try to
satisfy the request using on[...]

> Following is the link to repository:
> https://github.com/guptamukul/sparktest.git
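
The behaviour described above — print() delegating to take(), which evaluates partitions in order starting with the first and stops as soon as it has enough elements — can be sketched in plain Java. This is only an illustration of the scan order, not Spark's actual implementation; the `TakeDemo` class and its `take()` helper are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;

// Illustration (not Spark code): why print() can look "serial".
// print(n) is built on take(n), and a take-style scan pulls elements
// from partition 0 first, only touching later partitions if it still
// needs more elements.
public class TakeDemo {

    // take(n): scan partitions in order, stop as soon as n elements are found.
    static List<Integer> take(List<List<Integer>> partitions, int n) {
        List<Integer> out = new ArrayList<>();
        for (List<Integer> part : partitions) {      // partition 0 first
            for (Integer x : part) {
                if (out.size() == n) return out;     // later partitions never evaluated
                out.add(x);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = List.of(
                List.of(1, 2, 3),    // partition 0
                List.of(4, 5, 6),    // partition 1
                List.of(7, 8, 9));   // partition 2
        // Asking for only 2 elements never looks past partition 0, so any
        // expensive per-element work runs on one partition at a time.
        System.out.println(take(partitions, 2));   // [1, 2]
    }
}
```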
>
>
> From: Cody Koeninger
> Sent: 11 March 2016 23:04
> To: Mukul Gupta
> Cc: user@spark.apache.org
> Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
>
> Why are [...]
>
> JavaDStream<String> processed = messages.map(new Function<Tuple2<String, String>, String>() {
>
> @Override
> public String call(Tuple2<String, String> arg0) throws Exception {
>
> Thread.sleep(7000);
> return arg0._2;
> }
> });
>
> processed.print(90);
>
> try {
> jssc.start();
> jssc[...]
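
Because processed.print(90) above goes through take(), the Thread.sleep(7000) in the map() runs over one partition at a time until 90 elements have been found. An output action that computes every partition (for example, counting the RDD inside foreachRDD in Spark) lets that work run on all partitions concurrently. Here is a plain-Java sketch of that per-partition parallelism, using a thread per partition; `ParallelPartitions` and `processAll` are hypothetical names, not Spark API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration (plain Java, not Spark): an action that touches every
// partition runs the per-element work (a short sleep here, standing in
// for Thread.sleep(7000)) on all partitions at once.
public class ParallelPartitions {

    // Process every partition concurrently and return the total element count.
    static int processAll(List<List<Integer>> partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<Integer> part : partitions) {
                futures.add(pool.submit(() -> {
                    Thread.sleep(50);        // stand-in for the expensive map()
                    return part.size();      // stand-in for counting the partition
                }));
            }
            int total = 0;
            for (Future<Integer> f : futures) total += f.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions =
                List.of(List.of(1, 2, 3), List.of(4, 5, 6), List.of(7, 8, 9));
        // All three partitions sleep concurrently instead of one after another.
        System.out.println(processAll(partitions));  // 9
    }
}
```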
___
From: Cody Koeninger
Sent: 11 March 2016 20:42
To: Mukul Gupta
Cc: user@spark.apache.org
Subject: Re: Kafka + Spark streaming, RDD partitions not processed in parallel
Can you post your actual code?
On Thu, Mar 10, 2016 at 9:55 PM, Mukul Gupta wrote:
> Hi All, I was running the following test: Setup: 9 VMs running spark workers
> with 1 spark executor each, and 1 VM running kafka and the spark master. Spark
> version is 1.6.0, Kafka version is 0.9.0.1. Spark is using its ow[...]