Hi,
While using Kafka streaming with the direct API, does the checkpoint
consist of more than the Kafka offsets if no 'window' operations are being
used?
Is there any facility to inspect the contents of the checkpoint files?
Thanks.
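For context, the Spark Streaming docs describe metadata checkpointing as saving the configuration, the DStream operation graph, and the queue of incomplete batches; with the direct approach, each batch in that queue carries the Kafka offset ranges it covers. The files themselves are serialized objects rather than human-readable text, so there is no supported dump tool. Conceptually, though, the per-batch offset bookkeeping looks like this sketch (the names and numbers are illustrative, not Spark's actual classes):

```python
from collections import namedtuple

# Illustrative only: mirrors the per-batch bookkeeping a direct-stream
# checkpoint persists alongside the configuration and DStream graph.
OffsetRange = namedtuple(
    "OffsetRange", ["topic", "partition", "from_offset", "until_offset"]
)

# One entry per not-yet-completed batch; each range says exactly what
# the consumer should (re)read from Kafka after recovery.
incomplete_batches = {
    1455264000000: [  # batch time in ms (hypothetical)
        OffsetRange("events", 0, 100, 250),
        OffsetRange("events", 1, 90, 180),
    ],
}

for rng in incomplete_batches[1455264000000]:
    print(rng.topic, rng.partition, rng.from_offset, rng.until_offset)
```

Since recovery replays exactly these ranges, no windowing is needed for the checkpoint to contain more than bare offsets: the operation graph and pending batches are always part of it.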
Hi,
I am trying to use Fair Scheduler pools with Kafka streaming. I am
assigning each Kafka partition to its own pool. The goal is to give each
partition an equal share of compute time irrespective of the number of
messages in each time window for each partition.
However, I do not see fair sharing across the pools.
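For reference, fair scheduler pools are declared in an XML allocation file (pointed to by `spark.scheduler.allocation.file`, with `spark.scheduler.mode` set to `FAIR`), and a job is routed to a pool by setting the `spark.scheduler.pool` local property on the thread that submits it. A minimal sketch with one pool per partition (pool names are made up):

```xml
<?xml version="1.0"?>
<allocations>
  <pool name="partition-0">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>1</minShare>
  </pool>
  <pool name="partition-1">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>1</minShare>
  </pool>
</allocations>
```

One caveat that may explain the behaviour: pools separate *jobs*, not partitions. The tasks for all partitions of a single RDD belong to one job, so they cannot be placed in different pools unless each partition's work is submitted as its own job from its own thread.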
n Piu" wrote:
> Have you tried using the fair scheduler and queues?
> On 12 Feb 2016 4:24 a.m., "p pathiyil" wrote:
>
>> With this setting, I can see that the next job is being executed before
>> the previous one is finished. However, the processing of the 'hot
ned-to-executors-in-spark-streaming
>
>
> On Thu, Feb 11, 2016 at 9:33 AM, p pathiyil wrote:
>
>> Thanks for the response Cody.
>>
>> The producers are out of my control, so can't really balance the incoming
>> content across the various topics and partitions.
Hi,
I am looking at a way to isolate the processing of messages from each Kafka
partition within the same driver.
Scenario: A DStream is created with the createDirectStream call by passing
in a few partitions. Let us say that the streaming context is defined with
a batch duration of 2 seconds.
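With the direct approach, each Kafka partition maps to exactly one RDD partition in every 2-second batch, so isolating per-partition processing usually amounts to routing each partition's records to its own handler. A Spark-free toy of that routing (function and variable names are made up for illustration):

```python
from collections import defaultdict

def split_by_partition(records):
    """Group (partition, message) pairs so each Kafka partition's
    messages can be handed to an independent handler."""
    groups = defaultdict(list)
    for partition, message in records:
        groups[partition].append(message)
    return dict(groups)

# One 2-second batch's worth of records, tagged with their partition.
batch = [(0, "a"), (1, "b"), (0, "c")]
per_partition = split_by_partition(batch)
assert per_partition == {0: ["a", "c"], 1: ["b"]}
```

In Spark itself this separation already exists at the task level (one task per Kafka partition); the harder part, as the thread discusses, is giving those tasks independent scheduling treatment.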
Hi,
Are there any ways to store DStreams / RDDs read from Kafka in memory to be
processed at a later time? What we need to do is to read data from Kafka,
process it to be keyed by some attribute that is present in the Kafka
messages, and write out the data related to each key when we have
accumulated enough of it.
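Independent of Spark (where stateful operations such as updateStateByKey typically play this role, or the buffered data goes to external storage), the accumulate-then-flush-per-key pattern being described can be sketched as follows; the class name and threshold are illustrative:

```python
from collections import defaultdict

class KeyedBuffer:
    """Accumulate messages per key; emit a key's batch once it holds
    `threshold` messages. A toy of the buffer-then-write pattern."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.pending = defaultdict(list)

    def add(self, key, message):
        """Buffer one message; return the flushed batch if the key's
        buffer just reached the threshold, else None."""
        self.pending[key].append(message)
        if len(self.pending[key]) >= self.threshold:
            return self.pending.pop(key)
        return None

buf = KeyedBuffer(threshold=2)
assert buf.add("user-1", "m1") is None          # still accumulating
assert buf.add("user-1", "m2") == ["m1", "m2"]  # flushed for this key
```

Note that holding this state purely in executor memory is fragile across failures; in Spark the checkpointed stateful operators exist precisely so such per-key accumulations survive recovery.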