Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread James Cheng
Jeff, Your analysis is correct. I would say that it is known but unintuitive behavior. As an example of a problem caused by this behavior, it's possible for MirrorMaker to miss messages on newly created topics, even though it was subscribed to them before the topics were created. See the following

Re: Lost message with Kafka configuration

2017-01-05 Thread James Cheng
kafka-console-producer.sh defaults to acks=0, which means that the producer essentially throws messages at the broker and doesn't wait/retry to make sure they are properly received. In the kafka-console-producer.sh usage text: --request-required-acksrequests (default: 0) Try
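The difference between fire-and-forget and acknowledged sends can be illustrated with a toy model. This is a plain-Python simulation, not Kafka client code: `FlakyBroker`, `send_fire_and_forget`, and `send_with_ack` are hypothetical names, and the rule that every third request is dropped is an assumption chosen only to make the run deterministic.

```python
class FlakyBroker:
    """Hypothetical in-memory broker that drops every third request."""

    def __init__(self):
        self.log = []        # messages actually stored
        self._requests = 0   # total requests seen, including dropped ones

    def receive(self, message):
        """Store the message and return True, or drop it and return False."""
        self._requests += 1
        if self._requests % 3 == 0:
            return False
        self.log.append(message)
        return True


def send_fire_and_forget(broker, messages):
    """Model of acks=0: send once and never look at the outcome."""
    for message in messages:
        broker.receive(message)


def send_with_ack(broker, messages, max_retries=3):
    """Model of an acknowledged send: retry until the broker confirms."""
    for message in messages:
        for _ in range(1 + max_retries):
            if broker.receive(message):
                break


messages = [f"row-{i}" for i in range(9)]

lossy = FlakyBroker()
send_fire_and_forget(lossy, messages)

safe = FlakyBroker()
send_with_ack(safe, messages)

print(len(lossy.log), len(safe.log))
```

In this model the fire-and-forget sender silently loses whatever the broker drops, while the acknowledged sender notices the failure and retries. The sketch does not model lost acknowledgements, which in a real system could lead to duplicates.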

Re: Lost message with Kafka configuration

2017-01-05 Thread Hoang Bao Thien
Hi James et al., Thanks for your help. It works well with that parameter, but only for one CSV file. If I run >=5 CSV files, each of size 110MB, data is lost again (when I compare the number of received messages with the number of messages in the original files). I get lots of errors after re-runni

Problem with processor API partition assignments

2017-01-05 Thread Brian Krahmer
Hey guys, I'm fighting an issue where I can currently only run one instance of my streams application because when other instances come up, the partition reassignment looks (to me) to be incorrect. I'm testing with docker-compose at the moment. When I scale my application to 3 instances a

Query on MirrorMaker Replication - Bi-directional/Failover replication

2017-01-05 Thread Greenhorn Techie
Hi, We are planning to set up MirrorMaker-based Kafka replication for DR purposes. The base requirement is to have DR replication from the primary site (site1) to the DR site (site2) using MirrorMaker. However, we need the solution to work in case of failover as well, i.e. where in the event of the site1 kafk

Re: MirrorMaker - Topics Identification and Replication

2017-01-05 Thread Greenhorn Techie
Thanks Ewen for your response. Just to summarise, here is my understanding. Apologies if something is misunderstood. I am new to Kafka and hence still short in knowledge. - The MirrorMaker process automatically picks up new topics added on the source cluster and hence no restart of the proce

Re: Problem with processor API partition assignments

2017-01-05 Thread Damian Guy
Hi Brian, It might be helpful if you provide some code showing your Topology. Thanks, Damian On Thu, 5 Jan 2017 at 10:59 Brian Krahmer wrote: > Hey guys, > >I'm fighting an issue where I can currently only run one instance of > my streams application because when other instances come up, t

Metric meaning

2017-01-05 Thread Robert Quinlivan
Hello, Are there more detailed descriptions available for the metrics exposed by Kafka via JMX? The current documentation provides some information but a few metrics are not listed in detail – for example, "Log flush rate and time." -- Robert Quinlivan Software Engineer, Signal

Re: Lost message with Kafka configuration

2017-01-05 Thread Protoss Hu
You mean the messages were lost on the way to the broker, before the broker actually received them? Protoss Hu Blog: http://hbprotoss.github.io/ Weibo: http://weibo.com/hbprotoss On January 5, 2017 at 4:53 PM +0800, James Cheng wrote: > kafka-console-producer.sh defaults to acks=0, which means that the producer > essent

Does offsetsForTimes use createtime of logsegment file?

2017-01-05 Thread Vignesh
Hi, the offsetsForTimes function returns the offset for a given timestamp. Does it use the message's timestamp (which could be LogAppendTime or set by the user) or the creation time of the logsegmen

Under-replicated Partitions while rolling Kafka nodes in AWS

2017-01-05 Thread Jack Lund
Hello, all. We're running multiple Kafka clusters in AWS, and thus multiple Zookeeper clusters as well. When we roll out changes to our zookeeper nodes (which involves changes to the AMI, which means terminating the zookeeper instance and bringing up a new one in its place) we have to restart our

Re: Aggregated windowed counts

2017-01-05 Thread Benjamin Black
I understand now. The commit triggers the output of the window data, whether or not the window is complete. For example, if I use .print() as you suggest: [KSTREAM-AGGREGATE-03]: [kafka@148363192] , (9<-null) [KSTREAM-AGGREGATE-03]: [kafka@1483631925000] , (5<-null) [KSTREAM-AG

Re: Lost message with Kafka configuration

2017-01-05 Thread Hoang Bao Thien
Yes, the problem is from producer configuration. And James Cheng has told me how to fix it. However I still get another problem with a large file: org.apache.kafka.common.errors.TimeoutException: Batch containing 36 record(s) expired due to timeout while requesting metadata from brokers for MyTopic-0

Re: Problem with processor API partition assignments

2017-01-05 Thread Matthias J. Sax
It would also be helpful to know the number of partitions for each topic. -Matthias On 1/5/17 4:37 AM, Damian Guy wrote: > Hi Brian, > > It might be helpful if you provide some code showing your Topology. > > Thanks, > Damian > > On Thu, 5 Jan 2017 at 10:59 Brian Krahmer wrote: > >> Hey guys

Re: Aggregated windowed counts

2017-01-05 Thread Matthias J. Sax
On a clean restart on the same machine, the local RocksDB will just be reused as it contains the complete state. Thus there is no need to read the changelog topic at all. The changelog topic is only read when a state is moved from one node to another, or the state got corrupted due to a failure (

Re: Connect: SourceTask poll & commit interaction

2017-01-05 Thread Shikhar Bhushan
I have created https://issues.apache.org/jira/browse/KAFKA-4598 for this. On Wed, Dec 14, 2016 at 2:58 PM Shikhar Bhushan wrote: > Hi Mathieu, > > I think you are right, there is currently no mutual exclusion between > `task.commit()` and `task.poll()`. The solution you are thinking of with > ma

kafka CN domain and keyword

2017-01-05 Thread Thomas Liu
(Please forward this to your CEO, because this is urgent. Thanks) This is a formal email. We are the Domain Registration Service company in China. Here I have something to confirm with you. On Jan 3, 2017, we received an application from Baoda Ltd requested "kafka" as their internet keyword and

Re: [VOTE] Vote for KIP-101 - Leader Epochs

2017-01-05 Thread Ben Stopford
Hi Jun Thanks for raising these points. Thorough as ever! 1) Changes made as requested. 2) Done. 3) My plan for handling returning leaders is simply to force the Leader Epoch to increment if a leader returns. I don't plan to fix KAFKA-1120 as part of this KIP. It is really a separate issue with

Kafka Connect offset.storage.topic not receiving messages (i.e. how to access Kafka Connect offset metadata?)

2017-01-05 Thread Phillip Mann
I am working on setting up a Kafka Connect Distributed Mode application which will be a Kafka to S3 pipeline. I am using Kafka 0.10.1.0-1 and Kafka Connect 3.1.1-1. So far things are going smoothly but one aspect that is important to the larger system I am working with requires knowing offset in

Re: What makes a KStream app exit?

2017-01-05 Thread Guozhang Wang
Re: "UnsupportedOperationException: null org.apache.kafka.streams.processor.internals.StandbyContextImpl.recordCollector(StandyContextImpl.java:81)": I think this is a known issue that has been fixed in trunk: https://github.com/apache/kafka/commit/a4592a18641f84a1983c5fe7e697a8b0ab43eb25 Guo

Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread Jeff Widman
Thanks James and Hans. Will this also happen when we expand the number of partitions in a topic? That also will trigger a rebalance, the consumer won't subscribe to the partition until the rebalance finishes, etc. So it'd seem that any messages published to the new partition in between the parti

Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread James Cheng
> On Jan 5, 2017, at 12:57 PM, Jeff Widman wrote: > > Thanks James and Hans. > > Will this also happen when we expand the number of partitions in a topic? > > That also will trigger a rebalance, the consumer won't subscribe to the > partition until the rebalance finishes, etc. > > So it'd see

Re: Lost message with Kafka configuration

2017-01-05 Thread James Cheng
> On Jan 5, 2017, at 8:23 AM, Hoang Bao Thien wrote: > > Yes, the problem is from producer configuration. And James Cheng has told > me how to fix it. > However I still get another problem with a large file: > > org.apache.kafka.common.errors.TimeoutException: Batch containing 36 > record(s) expir

Re: Under-replicated Partitions while rolling Kafka nodes in AWS

2017-01-05 Thread James Cheng
> On Jan 5, 2017, at 7:55 AM, Jack Lund wrote: > > Hello, all. > > We're running multiple Kafka clusters in AWS, and thus multiple Zookeeper > clusters as well. When we roll out changes to our zookeeper nodes (which > involves changes to the AMI, which means terminating the zookeeper instance >

Re: Consumer Rebalancing Question

2017-01-05 Thread Pradeep Gollakota
I see... doesn't that cause flapping though? On Wed, Jan 4, 2017 at 8:22 PM, Ewen Cheslack-Postava wrote: > The coordinator will immediately move the group into a rebalance if it > needs it. The reason LeaveGroupRequest was added was to avoid having to > wait for the session timeout before compl

One big kafka connect cluster or many small ones?

2017-01-05 Thread Stephane Maarek
Hi, We like to operate in micro-services (dockerize and ship everything on ecs) and I was wondering which approach was preferred. We have one kafka cluster, one zookeeper cluster, etc, but when it comes to kafka connect I have some doubts. Is it better to have one big kafka connect with multiple

Re: [VOTE] Vote for KIP-101 - Leader Epochs

2017-01-05 Thread Jun Rao
Hi, Ben, Thanks for the updated KIP. +1 1) In OffsetForLeaderEpochResponse, start_offset probably should be end_offset since it's the end offset of that epoch. 3) That's fine. We can fix KAFKA-1120 separately. Jun On Thu, Jan 5, 2017 at 11:11 AM, Ben Stopford wrote: > Hi Jun > > Thanks for r

Re: [VOTE] Vote for KIP-101 - Leader Epochs

2017-01-05 Thread Joel Koshy
+1 This is a very well-written KIP! Minor: there is still a mix of terms in the doc that references the earlier LeaderGenerationRequest (which is what I'm assuming it was called in previous versions of the wiki). Same for the diagrams which I'm guessing are a little harder to make consistent

Re: [VOTE] Vote for KIP-101 - Leader Epochs

2017-01-05 Thread Joel Koshy
(adding the dev list back - as it seems to have gotten dropped earlier in this thread) On Thu, Jan 5, 2017 at 6:36 PM, Joel Koshy wrote: > +1 > > This is a very well-written KIP! > Minor: there is still a mix of terms in the doc that references the > earlier LeaderGenerationRequest (which is wha

Re: One big kafka connect cluster or many small ones?

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 3:12 PM, Stephane Maarek < steph...@simplemachines.com.au> wrote: > Hi, > > We like to operate in micro-services (dockerize and ship everything on ecs) > and I was wondering which approach was preferred. > We have one kafka cluster, one zookeeper cluster, etc, but when it co

Re: Kafka Connect offset.storage.topic not receiving messages (i.e. how to access Kafka Connect offset metadata?)

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 11:30 AM, Phillip Mann wrote: > I am working on setting up a Kafka Connect Distributed Mode application > which will be a Kafka to S3 pipeline. I am using Kafka 0.10.1.0-1 and Kafka > Connect 3.1.1-1. So far things are going smoothly but one aspect that is > important to th

Re: One big kafka connect cluster or many small ones?

2017-01-05 Thread Stephane Maarek
Thanks a lot for the guidance, I think we’ll go ahead with one cluster. I just need to figure out how our CD pipeline can talk to our connect cluster securely (because it’ll need direct access to perform API calls). Lastly, a question or maybe a piece of feedback… is it not possible to specify the

Re: Query on MirrorMaker Replication - Bi-directional/Failover replication

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 3:07 AM, Greenhorn Techie wrote: > Hi, > > We are planning to set up MirrorMaker based Kafka replication for DR > purposes. The base requirement is to have a DR replication from primary > (site1) to DR site (site2) using MirrorMaker. > > However, we need the solution to work

Re: Metric meaning

2017-01-05 Thread Ewen Cheslack-Postava
There's not currently anything more detailed than what is included in http://kafka.apache.org/documentation/#monitoring. There's some work trying to automate the generation of that documentation (https://issues.apache.org/jira/browse/KAFKA-3480). That combined with some addition to give longer descr

Apache Kafka integration using Apache Camel

2017-01-05 Thread Gupta, Swati
Hello All, I am trying to create a Consumer using Apache Camel for a topic in Apache Kafka. I am using Camel 2.17.0 and Kafka 0.10 and JDK 1.8. I have attached a file, KafkaCamelTestConsumer.java, which is a standalone application trying to read from a topic "test1" created in Apache Kafka. I am p

Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread Ewen Cheslack-Postava
The basic issue here is just that the auto.offset.reset defaults to latest, right? That's not a very good setting for a mirroring tool and this seems like something we might just want to change the default for. It's debatable whether it would even need a KIP. We have other settings in MM where we
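The effect of the reset policy on a newly created topic can be sketched in a few lines. This is a simplified model rather than consumer code: `starting_offset` is a hypothetical helper, the log is a plain list, and the `none` policy (which raises an error in the real consumer) is not modelled.

```python
def starting_offset(log_end_offset, committed_offset, auto_offset_reset):
    """Pick where a consumer starts reading a partition.

    A committed offset always wins; the reset policy only applies when
    the group has no committed offset for the partition.
    """
    if committed_offset is not None:
        return committed_offset
    if auto_offset_reset == "earliest":
        return 0  # assumes nothing has been deleted from the log yet
    if auto_offset_reset == "latest":
        return log_end_offset
    raise ValueError(f"unsupported policy: {auto_offset_reset}")


# Three messages were written to a new topic before the mirror was assigned it.
partition_log = ["m0", "m1", "m2"]

for policy in ("latest", "earliest"):
    start = starting_offset(len(partition_log), None, policy)
    print(policy, partition_log[start:])
```

With `latest`, the three messages written before the first assignment are never read; with `earliest`, they are all picked up. Once an offset has been committed, the policy no longer matters for that partition.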

Re: Does offsetsForTimes use createtime of logsegment file?

2017-01-05 Thread Ewen Cheslack-Postava
On Wed, Jan 4, 2017 at 11:54 PM, Vignesh wrote: > Hi, > > offsetsForTimes > (KafkaConsumer.html#offsetsForTimes(java.util.Map)) > function > returns offset for a given timestamp. Does it use message's timestamp > (which co

Re: MirrorMaker - Topics Identification and Replication

2017-01-05 Thread Ewen Cheslack-Postava
That all sounds right! Usually the delay for picking up the metadata update and starting to replicate the topic won't be an issue since it's a one-time issue (around topic creation) and the time window is pretty small for that. In steady state, none of the mentioned delays would apply. -Ewen On

Re: Consumer Rebalancing Question

2017-01-05 Thread Ewen Cheslack-Postava
Not sure I understand your question about flapping. The LeaveGroupRequest is only sent on a graceful shutdown. If a consumer knows it is going to shut down, it is good to proactively make sure the group knows it needs to rebalance work because some of the partitions that were handled by the consumer

Re: Apache Kafka integration using Apache Camel

2017-01-05 Thread UMESH CHAUDHARY
Did you test that kafka console consumer is displaying the produced message? On Fri, Jan 6, 2017 at 9:18 AM, Gupta, Swati wrote: > Hello All, > > > > I am trying to create a Consumer using Apache Camel for a topic in Apache > Kafka. > I am using Camel 2.17.0 and Kafka 0.10 and JDK 1.8. > I have

Re: One big kafka connect cluster or many small ones?

2017-01-05 Thread Ewen Cheslack-Postava
On Thu, Jan 5, 2017 at 7:19 PM, Stephane Maarek < steph...@simplemachines.com.au> wrote: > Thanks a lot for the guidance, I think we’ll go ahead with one cluster. I > just need to figure out how our CD pipeline can talk to our connect cluster > securely (because it’ll need direct access to perform

Re: Apache Kafka integration using Apache Camel

2017-01-05 Thread Ewen Cheslack-Postava
More generally, do you have any log errors/messages or additional info? It's tough to debug issues like this from 3rd party libraries if they don't provide logs/exception info that indicates why processing a specific message failed. -Ewen On Thu, Jan 5, 2017 at 8:29 PM, UMESH CHAUDHARY wrote: >

RE: Apache Kafka integration using Apache Camel

2017-01-05 Thread Gupta, Swati
Yes, the kafka console consumer displays the message correctly. I also tested the same with a Java application, it works fine. There seems to be an issue with Camel route trying to consume. There is no error in the console. But, the logs show as below: kafka.KafkaCamelTestConsumer Connected to th

Re: Kafka Streams window retention period question

2017-01-05 Thread Alexander Demidko
Hi Matthias, Thanks for such a thorough response! I guess there are cases when determinism might be preferred over computing "more correct results" (e.g. in unit tests, where one manually lays out an order of incoming events and wants to get an exact output), but from now on I can simply assume

Re: One big kafka connect cluster or many small ones?

2017-01-05 Thread Stephane Maarek
Thanks! So I just override the conf while doing the API call? It’d be great to have this documented somewhere on the confluent website. I couldn’t find it. On 6 January 2017 at 3:42:45 pm, Ewen Cheslack-Postava (e...@confluent.io) wrote: On Thu, Jan 5, 2017 at 7:19 PM, Stephane Maarek < steph...@

Re: Kafka Streams window retention period question

2017-01-05 Thread Sachin Mittal
What should happen when the window 00:00..01:00 will finally get purged (and the internal stream time will get bumped to say time 10:00) but then I receive an event ? Will it create the 00:00..01:00 window again or the event will be dropped because it's way older than the internal stream time? Onc

Re: Kafka Streams window retention period question

2017-01-05 Thread Matthias J. Sax
Hi Alex, if a window was purged because its retention time passed it will not accept any records anymore -- thus, if a very late record arrives, it will get dropped without any further notice. About stream time and partition: yes. And how time is advanced/tracked is independent of the window typ
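The drop-after-purge behavior described in this reply can be sketched with a toy tumbling-window counter. Everything here is an assumption made for illustration: `WindowStore` is a hypothetical class, not part of Kafka Streams, and the exact retention boundary (`start + retention <= stream_time`) is a guess at a reasonable rule rather than the library's actual one.

```python
class WindowStore:
    """Hypothetical tumbling-window counter with a retention period."""

    def __init__(self, window_size, retention):
        self.window_size = window_size
        self.retention = retention
        self.stream_time = 0   # highest timestamp seen so far
        self.counts = {}       # window start -> record count
        self.dropped = 0       # late records discarded silently

    def add(self, timestamp):
        """Count a record; return False if its window has already expired."""
        self.stream_time = max(self.stream_time, timestamp)
        # Purge windows whose retention has passed relative to stream time.
        for start in list(self.counts):
            if start + self.retention <= self.stream_time:
                del self.counts[start]
        window_start = timestamp - timestamp % self.window_size
        if window_start + self.retention <= self.stream_time:
            self.dropped += 1  # the window is not re-created
            return False
        self.counts[window_start] = self.counts.get(window_start, 0) + 1
        return True


store = WindowStore(window_size=60, retention=120)
store.add(30)              # lands in window [0, 60)
store.add(600)             # stream time jumps ahead; window [0, 60) is purged
accepted = store.add(45)   # very late record for the purged window

print(accepted, store.counts, store.dropped)
```

In this model the late record at time 45 does not bring the purged window back; it is counted as dropped and the store keeps only the window starting at 600.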

Re: Does offsetsForTimes use createtime of logsegment file?

2017-01-05 Thread Vignesh
Thanks. I didn't realize ListOffsetRequestV1 is only available in 0.10.1 (which has KIP-33, time index). When the timestamp is set by the user (CreationTime), and it is not always increasing, would this method still return the offset of the first message with a timestamp greater than or equal to the provided timestamp
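The semantics being asked about ("first message with a timestamp greater than or equal to the provided timestamp") can be written down as a linear scan. This is only a statement of that contract in Python: `offset_for_time` is a hypothetical helper, it does not model the broker's time index, and so it does not settle how the real lookup behaves when timestamps are out of order.

```python
def offset_for_time(timestamps, target):
    """Return the earliest offset whose timestamp is >= target, else None.

    `timestamps` is indexed by offset and need not be increasing.
    """
    for offset, ts in enumerate(timestamps):
        if ts >= target:
            return offset
    return None


# User-supplied timestamps that are not always increasing.
timestamps = [100, 300, 200, 400]

print(offset_for_time(timestamps, 150))
print(offset_for_time(timestamps, 250))
print(offset_for_time(timestamps, 500))
```

Under this reading, a lookup for 250 returns offset 1 even though offset 2 carries the smaller timestamp 200, so messages after the returned offset are not guaranteed to have timestamps at or above the target.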

Re: Kafka Streams window retention period question

2017-01-05 Thread Alexander Demidko
Great, thanks for the info guys! On Thu, Jan 5, 2017 at 10:09 PM, Matthias J. Sax wrote: > Hi Alex, > > if a window was purged because its retention time passed it will not > accept any records anymore -- thus, if a very late record arrives, it > will get dropped without any further notice. > >