Yes, look at KafkaUtils.createRDD
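A minimal sketch of what that looks like in spark 1.3 (broker address, topic
name, and offset values are made-up placeholders; sc is an existing
SparkContext):

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

    // one OffsetRange per (topic, partition, fromOffset, untilOffset)
    // you recorded for the failed messages
    val offsetRanges = Array(
      OffsetRange("original-topic", 0, 110L, 111L),
      OffsetRange("original-topic", 1, 220L, 221L))

    // returns an RDD[(key, message)] covering exactly those offsets
    val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
      sc, kafkaParams, offsetRanges)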
On Wed, Jul 22, 2015 at 11:17 AM, Shushant Arora wrote:
Thanks!
I am using spark streaming 1.3, and if a post fails for any reason, I will
store the offset of that message in another kafka topic. I want to read
these offsets in another spark job and then read the original kafka topic's
messages based on those offsets. So is it possible in spark 1.3?
Yes, you could unroll from the iterator in batches of 100-200 and then post
them in multiple rounds.
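Roughly like this (postBatch is a stand-in for whatever bulk call you are
making):

    rdd.foreachPartition { iter =>
      // iter is lazy; grouped() pulls at most 200 records at a time,
      // so the whole partition never has to be held in memory at once
      iter.grouped(200).foreach { batch =>
        postBatch(batch)  // hypothetical bulk-post call
      }
    }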
If you are using the Kafka receiver-based approach (not Direct), then the
raw Kafka data is stored in executor memory. If you are using Direct
Kafka, then it is read from Kafka directly at the time of processing.
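For comparison, a sketch of how the two streams are set up in 1.3 (the
zookeeper/broker addresses and topic name are placeholders; ssc is an
existing StreamingContext):

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    // receiver-based: a receiver consumes continuously and stores the
    // raw data in executor memory
    val receiverStream = KafkaUtils.createStream(
      ssc, "zk1:2181", "consumer-group", Map("some-topic" -> 1))

    // direct: no receiver; each batch reads its offset range from Kafka
    // at the time the RDD is computed
    val directStream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](
      ssc, Map("metadata.broker.list" -> "broker1:9092"), Set("some-topic"))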
I can post multiple items at a time.
Data is being read from kafka and filtered; after that it's posted.
Does foreachPartition load the complete partition in memory or use an
iterator of batches under the hood? If the complete batch is not loaded,
will using a custom size of 100-200 requests in one batch and posting them
work?
If you can post multiple items at a time, then use foreachPartition to post
the whole partition in a single request.
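i.e. something like this (postBatch again being your own bulk call):

    rdd.foreachPartition { iter =>
      // materialize the partition and send it in one request
      val items = iter.toSeq
      if (items.nonEmpty) postBatch(items)
    }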
On Tue, Jul 21, 2015 at 9:35 AM, Richard Marscher wrote:
You can certainly create threads in a map transformation. We do this to do
concurrent DB lookups during one stage, for example. I would recommend,
however, that you switch from map to mapPartitions, as this allows you to
create a fixed-size thread pool to share across items on a partition, as
opposed to creating one per item.
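A rough sketch of that pattern (the pool size and the dbLookup call are
placeholders for your own DB client code):

    import java.util.concurrent.Executors
    import scala.concurrent.duration._
    import scala.concurrent.{Await, ExecutionContext, Future}

    rdd.mapPartitions { iter =>
      // one fixed-size pool shared by all records in this partition
      val pool = Executors.newFixedThreadPool(8)
      implicit val ec = ExecutionContext.fromExecutorService(pool)
      val results = iter
        .map(record => Future(dbLookup(record)))  // hypothetical blocking lookup
        .toList                                   // force all futures to be submitted
        .map(f => Await.result(f, 30.seconds))
      pool.shutdown()
      results.iterator
    }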