>> database. That would probably mean using a real database instead of
>> zookeeper though.
>>
>> On Tue, Oct 27, 2015 at 4:13 AM, Krot Viacheslav <
>> krot.vyaches...@gmail.com> wrote:
>>
>>> Any ideas? This is so important because we use kafka direct
>>> zookeeper.
>>> This way I lose data in offsets 10-20.
>>> How should this be handled correctly?
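A minimal sketch of one common way to handle this with the direct stream: track the offsets in the application and record them only after the batch's results have been written, so a failed batch is reprocessed instead of silently skipped. Here directStream stands for the DStream returned by KafkaUtils.createDirectStream, and saveOffsets is a hypothetical helper writing to whatever store is chosen (ZooKeeper, a database, etc.):

import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

directStream.foreachRDD { rdd =>
  // the direct stream exposes the exact Kafka offsets backing this batch
  val offsetRanges: Array[OffsetRange] =
    rdd.asInstanceOf[HasOffsetRanges].offsetRanges

  // 1. process and persist the batch results first
  // ...

  // 2. only then record the offsets; if processing fails they are never
  //    advanced, so data such as offsets 10-20 is read again on restart
  saveOffsets(offsetRanges) // hypothetical helper for the chosen store
}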
On Mon, Oct 26, 2015 at 18:37, varun sharma wrote:
> +1, wanted to do the same.
>
> On Mon, Oct 26, 2015 at 8:58 PM, Krot Viacheslav <
> krot.vyaches...@gmail.com> wrote:
>
Hi all,
I wonder what is the correct way to stop a streaming application if some
job fails?
What I have now:
import org.apache.spark.streaming.StreamingContext

val ssc = new StreamingContext(conf, batchInterval) // conf and batchInterval are placeholders defined elsewhere
ssc.start()
try {
  ssc.awaitTermination() // a job failure surfaces here as an exception
} catch {
  case e: Exception =>
    ssc.stop(stopSparkContext = true, stopGracefully = false)
}
It works, but there is one problem
Hi all,
Is there a way to stop the streaming context when some batch processing fails?
I want to set a reasonable retry count, say 10, and if it still fails, stop the
context completely.
Is that possible?
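A minimal sketch of one way to get that behaviour, assuming the retry budget can be expressed through Spark's task-level spark.task.maxFailures setting; the application name and batch interval below are placeholders:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("streaming-job")           // placeholder name
  .set("spark.task.maxFailures", "10")   // give up on a task after 10 failed attempts

val ssc = new StreamingContext(conf, Seconds(10)) // placeholder batch interval
// ... build the DStream pipeline here ...
ssc.start()
try {
  ssc.awaitTermination() // the final failure is rethrown here
} catch {
  case e: Exception =>
    ssc.stop(stopSparkContext = true, stopGracefully = false)
}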
e a single instance of a partitioner in advance, and all it
> needs to know is the number of partitions (which is just the count of all
> the kafka topic/partitions).
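A minimal sketch of what such a pre-built partitioner might look like, assuming keys of the form (topic, partition) and a fixed (topic, partition) -> spark partition index map built up front; the class and parameter names are illustrative, not from the thread:

import org.apache.spark.Partitioner

// created once, in advance; only needs the full set of kafka topic/partitions
class KafkaAlignedPartitioner(indexOf: Map[(String, Int), Int]) extends Partitioner {
  override def numPartitions: Int = indexOf.size
  override def getPartition(key: Any): Int = key match {
    case (topic: String, partition: Int) => indexOf((topic, partition))
    case _                               => 0 // fallback for unexpected keys
  }
}

The same instance could then be passed to updateStateByKey(updateFunc, partitioner) so the state RDD keeps the partitioning of the KafkaRDD.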
>
> On Tue, Jun 2, 2015 at 12:40 PM, Krot Viacheslav <
> krot.vyaches...@gmail.com> wrote:
>
>> Cody,
>
> index into the offset range array matches the (spark) partition id. That will
> also tell you what the value of numPartitions should be.
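For illustration, a rough sketch of how that relationship could be used; kafkaStream and the variable names are assumptions, not code from the thread. The position of each OffsetRange in the array is its spark partition id, and the array length is numPartitions:

import org.apache.spark.streaming.kafka.HasOffsetRanges

kafkaStream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // index i in offsetRanges <-> spark partition id i
  val indexOf: Map[(String, Int), Int] =
    offsetRanges.zipWithIndex.map { case (osr, i) =>
      (osr.topic, osr.partition) -> i
    }.toMap
  val numPartitions = offsetRanges.length
  // ... indexOf and numPartitions feed the pre-built partitioner sketched above ...
}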
>
> On Tue, Jun 2, 2015 at 11:50 AM, Krot Viacheslav <
> krot.vyaches...@gmail.com> wrote:
>
Hi all,
In my streaming job I'm using the kafka direct streaming approach and want to
maintain state with updateStateByKey. My PairRDD has the message's topic name +
partition id as its key. So, I assume that updateStateByKey could work within the
same partitions as the KafkaRDD and not lead to shuffles. Actually this i