Re: Maintaining message ordering using KafkaSpout/Bolt

2016-06-05 Thread Matthias J. Sax
Hi Kanagha, For reading, KafkaSpout's internally used KafkaConsumer ensures that data is received in-order per partition. Because the spout might read multiple partitions, and emit only a single (logical) output stream, within this output stream, data from multiple partitions interleave (the relat

Re: Maintaining message ordering using KafkaSpout/Bolt

2016-06-05 Thread Matthias J. Sax
Some addition: Actually, I have some doubt that you need the order of a partitions to be preserved, you usually want to preserve the order per key -- and a partition contains multiple keys. Thus, in Storm it is also sufficient to preserve the order by key (and not per partition). Thus, you can jus

Re: Maintaining message ordering using KafkaSpout/Bolt

2016-06-05 Thread Matthias J. Sax
> Does the parallelism_hint set when a KafkaSpout is added to a topology, > need to match the number of partitions in a topic? No. On 06/05/2016 11:26 AM, Matthias J. Sax wrote: > Hi Kanagha, > > For reading, KafkaSpout's internally used KafkaConsumer ensures that > data is received in-order per

Re: Maintaining message ordering using KafkaSpout/Bolt

2016-06-05 Thread Kanagha
Hi Matthias, Thanks a lot for the clarifications. I agree, that using fieldGrouping would be sufficient to maintain ordering per key. If messages are keyed and stored in kafka, fieldGrouping would be sufficient. Only if kafka uses round robin partitioning in absence of a key, then custom code nee

?????? [Kafka Streams] java.lang.IllegalArgumentException:Invalidtimestamp -1

2016-06-05 Thread ????
Hi Guozhang, YES. the console producer, broker, and kafka streams are all 0.10.0.0 version. is it because of this bug ? https://issues.apache.org/jira/browse/KAFKA-3716?jql=project%20%3D%20KAFKA -- -- ??: "Guozhang Wang";; : 2016??6??

Re: Resetting the Offset of a Kafka Sink Connector

2016-06-05 Thread Jack Lund
Ah, cool. Thanks! On Sat, Jun 4, 2016 at 7:42 PM Ewen Cheslack-Postava wrote: > Connectors don't perform any data copying and don't rewind offsets -- > that's the job of Tasks. In your SinkTask implementation you have access to > the SinkTaskContext via its context field. > > -Ewen > > On Tue, M

Consumer offsets not committed in zookeeper during graceful shutdown

2016-06-05 Thread Gagan Agrawal
Hi, I am trying to shutdown kafka consumer (version 0.8.2) gracefully by calling consumer.shutdown (ConsumerConnector.shutdown) and then waiting for executor threads to finish. However what I have noticed is that during next start, some of the messages are replayed. I have auto commit enabled. I l