Can I tell mirrormaker2 to start (re)consuming from a specific offset?

2024-09-12 Thread James Cheng
Hi everyone, I am using Mirrormaker 2. From what I understand, mirrormaker2 does not store its offsets in the source cluster's __consumer_offsets topic. Rather, it stores its offsets in the destination cluster in the value defined by offset.storage.topic In the __consumer_offsets world, I coul

Re: [ANNOUNCE] Apache Kafka 3.2.0

2022-05-19 Thread James Cheng
Bruno, Congrats on the release! There is a small typo on the page. > KIP-791 > > adds method recordMetada() to the StateStoreContext, Should be > KIP-791 >

Re: [ANNOUNCE] Apache Kafka 2.6.1

2021-01-11 Thread James Cheng
Thank you Mickael for running the release. Good job everyone! -James Sent from my iPhone > On Jan 11, 2021, at 2:17 PM, Mickael Maison wrote: > > The Apache Kafka community is pleased to announce the release for > Apache Kafka 2.6.1. > > This is a bug fix release and it includes fixes and im

Re: [ANNOUNCE] New committer: David Jacot

2020-10-19 Thread James Cheng
Congratulations, David! -James > On Oct 16, 2020, at 9:01 AM, Gwen Shapira wrote: > > The PMC for Apache Kafka has invited David Jacot as a committer, and > we are excited to say that he accepted! > > David Jacot has been contributing to Apache Kafka since July 2015 (!) > and has been very act

Re: [ANNOUNCE] New committer: Chia-Ping Tsai

2020-10-19 Thread James Cheng
Congratulations Chia-Ping! -James > On Oct 19, 2020, at 10:24 AM, Guozhang Wang wrote: > > Hello all, > > I'm happy to announce that Chia-Ping Tsai has accepted his invitation to > become an Apache Kafka committer. > > Chia-Ping has been contributing to Kafka since March 2018 and has made 74

Re: [ANNOUNCE] New committer: A. Sophie Blee-Goldman

2020-10-19 Thread James Cheng
Congratulations, Sophie! -James > On Oct 19, 2020, at 9:40 AM, Matthias J. Sax wrote: > > Hi all, > > I am excited to announce that A. Sophie Blee-Goldman has accepted her > invitation to become an Apache Kafka committer. > > Sophie is actively contributing to Kafka since Feb 2019 and has > a

Re: [ANNOUNCE] New Kafka PMC member: Matthias J. Sax

2019-04-19 Thread James Cheng
Congrats!! -James Sent from my iPhone > On Apr 18, 2019, at 2:35 PM, Guozhang Wang wrote: > > Hello Everyone, > > I'm glad to announce that Matthias J. Sax is now a member of Kafka PMC. > > Matthias has been a committer since Jan. 2018, and since then he continued > to be active in the commu

Re: [ANNOUNCE] New Committer: Randall Hauch

2019-02-14 Thread James Cheng
Congrats, Randall! Well deserved! -James Sent from my iPhone > On Feb 14, 2019, at 6:16 PM, Guozhang Wang wrote: > > Hello all, > > The PMC of Apache Kafka is happy to announce another new committer joining > the project today: we have invited Randall Hauch as a project committer and > he has

Re: [ANNOUNCE] New Committer: Vahid Hashemian

2019-01-15 Thread James Cheng
Congrats, Vahid!! -James > On Jan 15, 2019, at 2:44 PM, Jason Gustafson wrote: > > Hi All, > > The PMC for Apache Kafka has invited Vahid Hashemian as a project committer > and > we are > pleased to announce that he has accepted! > > Vahid has made numerous contributions to the Kafka communi

Re: [ANNOUNCE] Apache Kafka 2.1.0

2018-11-21 Thread James Cheng
Thanks Dong for running the release, and congrats to everyone in the community! -James Sent from my iPhone > On Nov 21, 2018, at 10:09 AM, Dong Lin wrote: > > The Apache Kafka community is pleased to announce the release for Apache > Kafka 2.1.0 > > > This is a major release and includes sig

Re: [ANNOUNCE] New committer: Colin McCabe

2018-10-02 Thread James Cheng
Congrats, Colin! -James > On Sep 25, 2018, at 1:39 AM, Ismael Juma wrote: > > Hi all, > > The PMC for Apache Kafka has invited Colin McCabe as a committer and we are > pleased to announce that he has accepted! > > Colin has contributed 101 commits and 8 KIPs including significant > improvemen

Re: [ANNOUNCE] New Kafka PMC member: Dong Lin

2018-08-21 Thread James Cheng
Congrats Dong! -James > On Aug 20, 2018, at 3:54 AM, Ismael Juma wrote: > > Hi everyone, > > Dong Lin became a committer in March 2018. Since then, he has remained > active in the community and contributed a number of patches, reviewed > several pull requests and participated in numerous KIP d

Re: [ANNOUNCE] Apache Kafka 2.0.0 Released

2018-07-30 Thread James Cheng
Congrats and great job, everyone! Thanks Rajini for driving the release! -James Sent from my iPhone > On Jul 30, 2018, at 3:25 AM, Rajini Sivaram wrote: > > The Apache Kafka community is pleased to announce the release for > > Apache Kafka 2.0.0. > > > > > > This is a major release and i

Re: [ANNOUNCE] Apache Kafka 1.1.0 Released

2018-03-29 Thread James Cheng
Thanks Damian and Rajini for running the release! Congrats and good job everyone! -James Sent from my iPhone > On Mar 29, 2018, at 2:27 AM, Rajini Sivaram wrote: > > The Apache Kafka community is pleased to announce the release for > > Apache Kafka 1.1.0. > > > Kafka 1.1.0 includes a numbe

Re: [ANNOUNCE] New Committer: Dong Lin

2018-03-28 Thread James Cheng
Congrats, Dong! -James > On Mar 28, 2018, at 10:58 AM, Becket Qin wrote: > > Hello everyone, > > The PMC of Apache Kafka is pleased to announce that Dong Lin has accepted > our invitation to be a new Kafka committer. > > Dong started working on Kafka about four years ago, since which he has >

Re: [ANNOUNCE] Apache Kafka 1.0.1 Released

2018-03-06 Thread James Cheng
Congrats, everyone! Thanks for driving the release, Ewen! -James > On Mar 6, 2018, at 1:22 PM, Guozhang Wang wrote: > > Ewen, thanks for driving the release!! > > > Guozhang > > On Tue, Mar 6, 2018 at 1:14 PM, Ewen Cheslack-Postava wrote: > >> The Apache Kafka community is pleased to annou

Re: [ANNOUNCE] New Kafka PMC Member: Rajini Sivaram

2018-01-18 Thread James Cheng
Congrats Rajini! -James Sent from my iPhone > On Jan 17, 2018, at 10:48 AM, Gwen Shapira wrote: > > Dear Kafka Developers, Users and Fans, > > Rajini Sivaram became a committer in April 2017. Since then, she remained > active in the community and contributed major patches, reviews and KIP >

Re: [VOTE] KIP-247: Add public test utils for Kafka Streams

2018-01-18 Thread James Cheng
+1 (non-binding) -James Sent from my iPhone > On Jan 17, 2018, at 6:09 PM, Matthias J. Sax wrote: > > Hi, > > I would like to start the vote for KIP-247: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-247%3A+Add+public+test+utils+for+Kafka+Streams > > > -Matthias >

Re: [ANNOUNCE] New committer: Matthias J. Sax

2018-01-12 Thread James Cheng
Congrats, Matthias!! Well deserved! -James > On Jan 12, 2018, at 2:59 PM, Guozhang Wang wrote: > > Hello everyone, > > The PMC of Apache Kafka is pleased to announce Matthias J. Sax as our > newest Kafka committer. > > Matthias has made tremendous contributions to Kafka Streams API since earl

Re: Insanely long recovery time with Kafka 0.11.0.2

2018-01-11 Thread James Cheng
We saw this as well, when updating from 0.10.1.1 to 0.11.0.1. Have you restarted your brokers since then? Did it take 8h to start up again, or did it take its normal 45 minutes? I don't think it's related to the crash/recovery. Rather, I think it's due to the upgrade from 0.10.1.1 to 0.11.0.1

Re: [ANNOUNCE] New committer: Onur Karaman

2017-11-06 Thread James Cheng
Congrats Onur! Well deserved! -James > On Nov 6, 2017, at 9:24 AM, Jun Rao wrote: > > Hi, everyone, > > The PMC of Apache Kafka is pleased to announce a new Kafka committer Onur > Karaman. > > Onur's most significant work is the improvement of Kafka controller, which > is the brain of a Kafka

Re: [ANNOUNCE] Apache Kafka 1.0.0 Released

2017-11-01 Thread James Cheng
dy Chambers, > Apurva Mehta, Armin Braun, Attila Kreiner, Balint Molnar, Bart De Vylder, > Ben Stopford, Bharat Viswanadham, Bill Bejeck, Boyang Chen, Bryan Baugher, > Colin P. Mccabe, Koen De Groote, Dale Peakall, Damian Guy, Dana Powers, > Dejan Stojadinović, Derrick Or, Dong Lin, Zhen

How do I instantiate a metrics reporter in Kafka Streams, with custom config?

2017-11-01 Thread James Cheng
Hi, we have a KafkaStreams app. We specify a custom metric reporter by doing: Properties config = new Properties(); config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-broker1:9092"); config.put(StreamsConfig.METRIC_REPORTER_CLASSES_CONFIG, "com.mycompany.MetricRepo

Re: In which scenarios would "INVALID_REQUEST" be returned for "Offset Request"

2017-09-24 Thread James Cheng
Your client library might be sending a message that is too old or too new for your broker to understand. What version is your Kafka client library, and what version is your broker? -James Sent from my iPhone > On Sep 22, 2017, at 4:09 PM, Vignesh wrote: > > Hi, > > In which scenarios would

Re: Metrics: committed offset, client version

2017-09-20 Thread James Cheng
KIP-188 is expected to be in the upcoming 1.0.0 release. It will add client-side JMX metrics that show the client version number. https://cwiki.apache.org/confluence/display/KAFKA/KIP-188+-+Add+new+metrics+to+support+health+checks

Re: Kafka Internals Video/Blog

2017-09-20 Thread James Cheng
This recent meetup had a presentation of the internals of the Kafka Controller. https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/events/242656767/ The video is not yet available, but hopefully will be soon. -J

Re: Improving Kafka State Store performance

2017-09-16 Thread James Cheng
In addition to the measurements that you are doing yourself, Kafka Streams also has its own metrics. They are exposed via JMX, if you have that enabled: http://kafka.apache.org/documentation/#monitoring If you set metrics.recording.level="debu

Re: Consumer group metadata retention

2017-07-26 Thread James Cheng
The offsets.retention.minutes value (1440 = 24 hours = 1 day) is a broker level configuration, and can't be changed dynamically during runtime. You would have to modify the broker configurations, and restart the brokers. -James > On Jul 25, 2017, at 9:43 PM, Raghu Angadi wrote: > > I am writi

Re: Tuning up mirror maker for high thruput

2017-07-24 Thread James Cheng
t’s called receive.buffer.bytes. Again, you can set this to -1 to use > the OS configuration. Make sure to restart the applications after making > all these changes, of course. > > -Todd > > > On Sat, Jul 22, 2017 at 1:27 AM, James Cheng wrote: > >> Becket Qin from Linke

Re: Tuning up mirror maker for high thruput

2017-07-22 Thread James Cheng
Becket Qin from LinkedIn spoke at a meetup about how to tune the Kafka producer. One scenario that he described was tuning for situations where you had high network latency. See slides at https://www.slideshare.net/mobile/JiangjieQin/producer-performance-tuning-for-apache-kafka-63147600 and vid

Re: Consumer offsets partitions size much bigger than others

2017-07-18 Thread James Cheng
It's possible that the log-cleaning thread has crashed. That is the thread that implements log compaction. Look in the log-cleaner.log file in your kafka debuglog directory to see if there is any indication that it has crashed (error messages, stack traces, etc). What version of kafka are you u

Re: Mirroring multiple clusters into one

2017-07-06 Thread James Cheng
art with "mirror." This prevents us from creating mirroring loops. > Thanks. > --Vahid > > > > From: James Cheng > To: users@kafka.apache.org > Cc: dev > Date: 07/06/2017 12:37 PM > Subject:Re: Mirroring multiple clusters into one > >

Re: Mirroring multiple clusters into one

2017-07-06 Thread James Cheng
I'm not sure what the "official" recommendation is. At TiVo, we *do* run all our mirrormakers near the target cluster. It works fine for us, but we're still fairly inexperienced, so I'm not sure how strong of a data point we should be. I think the thought process is, if you are mirroring from a

Re: mirroring Kafka while preserving the order

2017-06-29 Thread James Cheng
MirrorMaker acts as a consumer+producer. So it will consume from the source topic and produce to the destination topic. That means that the destination partition is chosen using the same technique as the normal producer: * if the source record has a key, the key will be hashed and the hash will

Re: Slow Consumer Group Startup

2017-06-13 Thread James Cheng
Bryan, This sounds related to https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance and https://issues.apache.org/jira/browse/KAFKA-4925. -James > On Jun 13, 2017, at 7:02 AM, Bryan Baugher wrote: > > The topics already exist prior to starting an

Re: [ANNOUNCE] New committer: Damian Guy

2017-06-09 Thread James Cheng
Congrats Damian! -James > On Jun 9, 2017, at 1:34 PM, Guozhang Wang wrote: > > Hello all, > > > The PMC of Apache Kafka is pleased to announce that we have invited Damian > Guy as a committer to the project. > > Damian has made tremendous contributions to Kafka. He has not only > contributed

Re: stunning error - Request of length 1550939497 is not valid, it is larger than the maximum size of 104857600 bytes

2017-04-26 Thread James Cheng
Ramya, Todd, Jiefu, David, Sorry to drag up an ancient thread. I was looking for something in my email archives, and ran across this, and I might have solved part of these mysteries. I ran across this post that talked about seeing weirdly large allocations when incorrect requests are accidental

Re: ISR churn

2017-03-22 Thread James Cheng
Marcos, Radu, Are either of you seeing messages saying "Cached zkVersion [...] not equal to that in zookeeper"? If so, you may be hitting https://issues.apache.org/jira/browse/KAFKA-3042 Not sure if that helps you, since I haven't been able i

Re: Offset commit request failing

2017-03-17 Thread James Cheng
I think it's due to the high number of partitions and the high number of consumers in the group. The group coordination info to keep track of the assignments actually happens via a message that travels through the __consumer_offsets topic. So with so many partitions and consumers, the message g

Re: How does offsets.retention.minutes work

2017-03-16 Thread James Cheng
Yes, that is correct. I filed a JIRA about that issue here: https://issues.apache.org/jira/browse/KAFKA-4682 -James > On Mar 15, 2017, at 8:51 PM, tao xiao wrote: > > Hi team, > > I know that Kafka deletes offset for a consumer group after a period of > time (configured by offsets.retention.m

Re: Clarification on min.insync.replicas​

2017-03-07 Thread James Cheng
> On Mar 7, 2017, at 12:18 PM, James Cheng wrote: > > >> On Mar 7, 2017, at 7:44 AM, Shrikant Patel wrote: >> >> Thanks for clarification. I am seeing strange behavior in that case, >> >> When I set min.insync.replicas=2 in my server.properties (resta

Re: Clarification on min.insync.replicas​

2017-03-07 Thread James Cheng
> On Mar 7, 2017, at 7:44 AM, Shrikant Patel wrote: > > Thanks for clarification. I am seeing strange behavior in that case, > > When I set min.insync.replicas=2 in my server.properties (restart the server) > and set the acks=all on producer, I am still able to publish to topic even > when on

Re: Question about messages in __consumer_offsets topic

2017-02-23 Thread James Cheng
Yup, this got fixed in 0.10.2 https://issues.apache.org/jira/browse/KAFKA-2000 -James > On Feb 23, 2017, at 11:10 AM, Jeff Widman wrote: > > The topic deletion only triggers tombstone on brokers >= 0.10.2, correct? I > thought there was an ou

Re: [ANNOUNCE] Apache Kafka 0.10.2.0 Released

2017-02-22 Thread James Cheng
Woohoo! Thanks for running the release, Ewen! -James > On Feb 22, 2017, at 12:33 AM, Ewen Cheslack-Postava wrote: > > The Apache Kafka community is pleased to announce the release for Apache > Kafka 0.10.2.0. This is a feature release which includes the completion > of 15 KIPs, over 200 bug fix

Re: KIP-122: Add a tool to Reset Consumer Group Offsets

2017-02-16 Thread James Cheng
Yeah, that's a good point. Some of the operations might make sense on multiple partitions at once. Moving to a timestamp might apply to all partitions, moving backwards and forwards by N offsets might apply to all partitions. However, moving to a specific offset ("set to offset 43") would most

Is anyone running Kafka on CoreOS?

2017-02-10 Thread James Cheng
Hi, (This question is kinda Kafka related, but mostly CoreOS related, so sorry if this is the wrong place to ask this.) Is anyone running Kafka on CoreOS? We run Kafka in docker containers on CoreOS. CoreOS has an OS-update policy where they will automatically install new OS updates, during wh

Re: Kafka docs for current trunk

2017-02-01 Thread James Cheng
+1 In particular, this would help when there is a doc change submitted to trunk which is applicable to the currently released version. It would help the change get out there faster. -James > On Feb 1, 2017, at 9:03 AM, Guozhang Wang wrote: > > +1 > > Guozhang > > > On Wed, Feb 1, 2017 at

Re: [ANNOUNCE] New committer: Grant Henke

2017-01-11 Thread James Cheng
Congrats, Grant!! -James > On Jan 11, 2017, at 11:51 AM, Gwen Shapira wrote: > > The PMC for Apache Kafka has invited Grant Henke to join as a > committer and we are pleased to announce that he has accepted! > > Grant contributed 88 patches, 90 code reviews, countless great > comments on discu

Re: Under-replicated Partitions while rolling Kafka nodes in AWS

2017-01-05 Thread James Cheng
> On Jan 5, 2017, at 7:55 AM, Jack Lund wrote: > > Hello, all. > > We're running multiple Kafka clusters in AWS, and thus multiple Zookeeper > clusters as well. When we roll out changes to our zookeeper nodes (which > involves changes to the AMI, which means terminating the zookeeper instance >

Re: Lost message with Kafka configuration

2017-01-05 Thread James Cheng
> On Jan 5, 2017, at 8:23 AM, Hoang Bao Thien wrote: > > Yes, the problem is from producer configuration. And James Cheng has told > me how to fix it. > However I still get other poblem with a large file: > > org.apache.kafka.common.errors.TimeoutException: Batch con

Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread James Cheng
ting consumer group via the join API " I'm guessing that this would affect any of those scenarios. -James > > > > On Thu, Jan 5, 2017 at 12:40 AM, James Cheng wrote: > >> Jeff, >> >> Your analysis is correct. I would say that it is known but unintu

Re: Lost message with Kafka configuration

2017-01-05 Thread James Cheng
kafka-console-producer.sh defaults to acks=0, which means that the producer essentially throws messages at the broker and doesn't wait/retry to make sure they are properly received. In the kafka-console-producer.sh usage text: --request-required-acksrequests (default: 0) Try

Re: Is this a bug or just unintuitive behavior?

2017-01-05 Thread James Cheng
Jeff, Your analysis is correct. I would say that it is known but unintuitive behavior. As an example of a problem caused by this behavior, it's possible for mirrormaker to miss messages on newly created topics, even thought it was subscribed to them before topics were creted. See the following

Why does consumer.subscribe(Pattern) require a ConsumerRebalanceListener?

2017-01-03 Thread James Cheng
Hi, I was looking at the docs for the consumer, and noticed that when calling subscribe() with a regex Pattern, that you are required to pass in a ConsumerRebalanceListener. On the other hand, when you use a fixed set of topic names (Collection), the ConsumerRebalanceListener is optional (that

When using mirrormaker, how are people creating topics?

2016-12-05 Thread James Cheng
Hi, We are using mirrormaker to mirror topics from one cluster to another, and I wanted to get some advice from the community on how people are doing mirroring. In particular, how are people dealing with topic creation? Do you turn on auto-topic creation in your destination clusters (auto.crea

Re: [ANNOUNCE] New committer: Jiangjie (Becket) Qin

2016-10-31 Thread James Cheng
Congrats, Becket! -James > On Oct 31, 2016, at 10:35 AM, Joel Koshy wrote: > > The PMC for Apache Kafka has invited Jiangjie (Becket) Qin to join as a > committer and we are pleased to announce that he has accepted! > > Becket has made significant contributions to Kafka over the last two years

Re: Read all record from a Topic.

2016-07-13 Thread James Cheng
Jean-Baptiste, I wrote a blog post recently on this exact subject. https://logallthethings.com/2016/06/28/how-to-read-to-the-end-of-a-kafka-topic/ Let me know if you find it useful. -James Sent from my iPhone > On Jul 13, 2016, at 7:16 AM, g...@netcourrier.com wrote: > > Hi, > > > I'm usin

Re: kafka + autoscaling groups fuckery

2016-07-03 Thread James Cheng
Charity, I'm not sure about the specific problem you are having, but about Kafka on AWS, Netflix did a talk at a meetup about their Kafka installation on AWS. There might be some useful information in there. There is a video stream as well as slides, and maybe you can get in touch with the spea

Re: Halting because log truncation is not allowed for topic __consumer_offsets

2016-06-26 Thread James Cheng
Peter, can you add some of your observations to those JIRAs? You seem to have a good understanding of the problem. Maybe there is something that can be improved in the codebase to prevent this from happening, or reduce the impact of it. Wanny, you might want to add a "me too" to the JIRAs as we

Re: 10MB message

2016-06-15 Thread James Cheng
Igor, This article talks about what to think about if putting large messages into Kafka: http://ingest.tips/2015/01/21/handling-large-messages-kafka/ The summary is that Kafka is not optimized for handling large messages, but if you really want to, it's possible to do it. That website is havin

Re: Will segments on no-traffic topics get deleted/compacted?

2016-05-24 Thread James Cheng
gt; Tom Crayford > Heroku Kafka > > On Fri, May 20, 2016 at 12:49 AM, James Cheng wrote: > >> Time-based log retention only happens on old log segments. And log >> compaction only happens on old segments as well. >> >> Currently, I believe segments only roll whene

Will segments on no-traffic topics get deleted/compacted?

2016-05-19 Thread James Cheng
Time-based log retention only happens on old log segments. And log compaction only happens on old segments as well. Currently, I believe segments only roll whenever a new record is written to the log. That is, during the write of the new record is when the current segment is evaluated to see if

Do consumer offsets stored in zookeeper ever get cleaned up?

2016-05-19 Thread James Cheng
I know that when offsets get stored in Kafka, they get cleaned up based on the offsets.retention.minutes config setting. This happens when using the new consumer, or when using the old consumer but offsets.storage=kafka. If using the old consumer where offsets are stored in Zookeeper, do old off

Re: unknown (kafka) offsets after restart

2016-05-06 Thread James Cheng
Is the log compaction thread correctly working? The offsets are stored in a log compacted topic, and we have seen issues where the log cleaner thread dies and therefore the offsets topic just grows forever, which means it will take a long time to read in the topic. You can look in the log-clean

Re: Consumers disappearing form __consumer_offsets

2016-04-11 Thread James Cheng
This may be related to offsets.retention.minutes. offsets.retention.minutes Log retention window in minutes for offsets topic It defaults to 1440 minutes = 24 hours. -James > On Apr 11, 2016, at 1:36 PM, Morellato, Wanny > wrote: > > Hi, > > I am trying to figure out why some of my consumers

Re: Reg. Partition Rebalancing

2016-03-29 Thread James Cheng
> On Mar 29, 2016, at 10:33 AM, Todd Palino wrote: > > There’s two things that people usually mean when they talk about > rebalancing. > > One is leader reelection, or preferred replica election, which is sometimes > confusingly referred to as “leader rebalance”. This is when we ask the > control

Re: kafka 0.9.0.1: FATAL exception on startup

2016-03-22 Thread James Cheng
Hi, we ran into this problem too. The only way we were able to bypass this was by stopping Kafka and deleting the log directory of the affected partition. Which means, we lost data for that partition on this broker. -James > On Mar 8, 2016, at 1:07 AM, Anatoly Deyneka wrote: > > Hi, > > I need

Re: Multi-threaded consumer?

2016-03-22 Thread James Cheng
Here's a good introductory blog post on the 0.9.0 consumer: http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client It shows the basics of using the consumer, as well as a section where they launch 3 threads, each with one consumer, to consume a single

What happens if controlled shutdown can't complete within controlled.shutdown.max.retries attempts?

2016-03-20 Thread James Cheng
The broker has the following parameters related to controlled shutdown: controlled.shutdown.enable Enable controlled shutdown of the server boolean truemedium controlled.shutdown.max.retries Controlled shutdown can fail for multiple reasons. This determines the number of

Re: Questions about unclean leader election and "Halting because log truncation is not allowed"

2016-03-15 Thread James Cheng
: > https://issues.apache.org/jira/browse/KAFKA-2143 > > Thank you, > > Tony > > On Thu, Feb 25, 2016 at 3:46 PM, James Cheng wrote: > >> Hi, >> >> I ran into a scenario where one of my brokers would continually shutdown, >> with the error messa

Re: Uneven GC behavior between nodes

2016-03-05 Thread James Cheng
Your partitions are balanced, but is your data being evenly written across all the partitions? How are you producing data? Are you producing them with keys? Is it possible that the majority of the messages being written to just a few partitions, and so the brokers for those partitions are seeing

Re: Writing a Producer from Scratch

2016-03-03 Thread James Cheng
Stephen, There is a mailing list for kafka client developers that you may find useful: https://groups.google.com/forum/#!forum/kafka-clients The d...@kafka.apache.org mailing list might also be a good resource: http://kafka.apache.org/contact.html Lastly, do you h

Re: About the number of partitions

2016-03-02 Thread James Cheng
Kim, Here's a good blog post from Confluent with advice on how to choose the number of partitions. http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/ -James > On Mar 1, 2016, at 4:11 PM, BYEONG-GI KIM wrote: > > Hello. > > I have questions about how

Unavailable partitions (Leader: -1 and ISR is empty) and we can't figure out how to get them back online

2016-03-01 Thread James Cheng
Hi, We have 44 partitions in our cluster that are unavailable. kafka-topics.sh is reporting them with Leader: -1, and with no brokers in the ISR. Zookeeper says that broker 5 should be the partition leader for this topic partition. These are topics with replication-factor 1. Most of the topics

Re: Kafka Rest Proxy

2016-03-01 Thread James Cheng
Jan, I don't use the rest proxy, but Confluent has a mailing list where you can probably get more info: Here's the direct link: https://groups.google.com/forum/#!forum/confluent-platform And it is linked off of here: http://www.confluent.io/developer#documentation -James > On Mar 1, 2016, at

Questions about unclean leader election and "Halting because log truncation is not allowed"

2016-02-25 Thread James Cheng
Hi, I ran into a scenario where one of my brokers would continually shutdown, with the error message: [2016-02-25 00:29:39,236] FATAL [ReplicaFetcherThread-0-1], Halting because log truncation is not allowed for topic test, Current leader 1's latest offset 0 is less than replica 2's latest offs

Re: Discrepancy between JMX OfflinePartitionCount and kafka-topics.sh?

2016-02-09 Thread James Cheng
I ran into kind of a similar discrepancy, but about UnderReplicatedPartitions. kafka-topics.sh and zookeeper were saying that we had underreplicated partitions. But JMX said that there were none. I took one of the partitions that ZK was saying was under-replicated and I ran DumpLogSegments on t

Re: Detecting broker version programmatically

2016-02-04 Thread James Cheng
> On Feb 4, 2016, at 8:28 PM, Manikumar Reddy wrote: > > Currently it is available through JMX Mbean. It is not available on wire > protocol/requests. > The name of the JMX Mbean is kafka.server:type=app-info,id=4 Not sure what the id=4 means. -James > Pending JIRAs related to this: > https:/

Re: at-least-once delivery

2016-01-30 Thread James Cheng
> On Jan 30, 2016, at 4:21 AM, Franco Giacosa wrote: > > Sorry, this solved my questions: "Setting a value greater than zero will > cause the client to resend any record whose send fails with a potentially > transient error. Note that this retry is no different than if the client > resent the rec

Re: MongoDB Kafka Connect driver

2016-01-29 Thread James Cheng
Not sure if this will help anything, but just throwing it out there. The Maxwell and mypipe projects both do CDC from MySQL and support bootstrapping. The way they do it is kind of "eventually consistent". 1) At time T1, record coordinates of the end of the binlog as of T1. 2) At time T2, do a f

Re: Accumulating data in Kafka Connect source tasks

2016-01-29 Thread James Cheng
> On Jan 29, 2016, at 7:06 AM, Randall Hauch wrote: > > On January 28, 2016 at 7:07:02 PM, Ewen Cheslack-Postava (e...@confluent.io) > wrote: > Randall, > > Great question. Ideally you wouldn't need this type of state since it > should really be available in the source system. In your case, it m

Re: Accumulating data in Kafka Connect source tasks

2016-01-28 Thread James Cheng
> On Jan 28, 2016, at 5:06 PM, Ewen Cheslack-Postava wrote: > > Randall, > > Great question. Ideally you wouldn't need this type of state since it > should really be available in the source system. In your case, it might > actually make sense to be able to grab that information from the DB itself

Re: Offset storage issue with kafka(0.8.2.1)

2016-01-27 Thread James Cheng
> On Jan 27, 2016, at 8:25 PM, Sivananda Reddys Thummala Abbigari > wrote: > > Hi, > > # *Kafka Version*: 0.8.2.1 > > # *My consumer.propeties have the following properties*: >exclude.internal.topics=false >offsets.storage=kafka >dual.commit.enabled=false > > # With the above configu

Re: Kafka + ZooKeeper on the same hardware?

2016-01-18 Thread James Cheng
> On Jan 18, 2016, at 12:21 PM, Dick Davies wrote: > > Started an Ansible playbook using the Confluent platform RPM distro, > and it seems that co-locates zookeepers > on the brokers. > > So I'm assuming it's fine (at least on 0.9.x for the reasons Todd mentioned). > > Does anyone know if the Con

Re: how to reset kafka offset in zookeeper

2015-12-19 Thread James Cheng
This page describes what Kafka stores in Zookeeper: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper It looks like the info for a particular consumer groupId is stored at: /consumers// According to https://community.cloudera.com/t5/Cloudera-Labs/Kafka-Parcels

Re: kafka-connect-jdbc: ids, timestamps, and transactions

2015-12-18 Thread James Cheng
Mark, what database are you using? If you are using MySQL... There is a not-yet-finished Kafka MySQL Connector at https://github.com/wushujames/kafka-mysql-connector. It tails the MySQL binlog, and so will handle the situation you describe. But, as I mentioned, I haven't finished it yet. If

0.8.2 high level consumer with one-time-use group.id's?

2015-12-15 Thread James Cheng
When using the 0.8.2 high level consumer, what is the impact of creating many one-time use groupIds and checkpointing offsets using those? I have a use case where upon every boot, I want to consume an entire topic from the very beginning, all partitions. We are using the high level consumer for

Re: Where is replication factor stored?

2015-10-16 Thread James Cheng
mtime = Wed Aug 05 22:48:12 UTC 2015 pZxid = 0xc017a cversion = 0 dataVersion = 0 aclVersion = 0 ephemeralOwner = 0x0 dataLength = 79 numChildren = 0 I tried that for a number of different topics, and none of them have it. -James > Guozhang > > On Fri, Oct 16, 2015 at 12:33 PM, James Ch

Where is replication factor stored?

2015-10-16 Thread James Cheng
Hi, Where is the replication factor for a topic stored? It isn't listed at https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper. But the kafka-topics --describe command returns something. Where is it finding that? Thanks, -James ___

Re: New consumer client compatible with old broker

2015-10-15 Thread James Cheng
> On Oct 15, 2015, at 11:29 AM, tao xiao wrote: > > Hi team, > > Does new consumer client (the one in trunk) work with 0.8.2.x broker? I am > planning to use the new consumer in our development but don't want to > upgrade the broker to the latest. is it possible to do that? Tao, I recently trie

Re: Dealing with large messages

2015-10-05 Thread James Cheng
Here’s an article that Gwen wrote earlier this year on handling large messages in Kafka. http://ingest.tips/2015/01/21/handling-large-messages-kafka/ -James > On Oct 5, 2015, at 11:20 AM, Pradeep Gollakota wrote: > > Fellow Kafkaers, > > We have a pretty heavyweight legacy event logging system

Re: custom message handlers?

2015-09-28 Thread James Cheng
> On Sep 28, 2015, at 12:47 PM, Doug Tomm wrote: > > hello, > > i've noticed the addition of the custom message handler feature in the latest > code; a very useful feature. in what release will it be available, and when > might that be? at present i am building kafka from source to get this

Re: Log Cleaner Thread Stops

2015-09-24 Thread James Cheng
doesn’t it? I guess that means Burrow is only being used to monitor your mirror makers and auditor application, then? -James > -Todd > > > On Wed, Sep 23, 2015 at 3:21 PM, James Cheng wrote: > >> >> On Sep 18, 2015, at 10:25 AM, Todd Palino wrote: >> >>> I

Re: Log Cleaner Thread Stops

2015-09-23 Thread James Cheng
On Sep 18, 2015, at 10:25 AM, Todd Palino wrote: > I think the last major issue with log compaction (that it couldn't handle > compressed messages) was committed as part of > https://issues.apache.org/jira/browse/KAFKA-1374 in August, but I'm not > certain what version this will end up in. It ma

Documentation typo for offsets.topic.replication.factor ?

2015-08-05 Thread James Cheng
Hi, My kafka cluster has a __consumer_offsets topic with 50 partitions (the default for offsets.topic.num.partitions) but with a replication factor of just 1 (the default for offsets.topic.replication.factor should be 3). From the docs http://kafka.apache.org/documentation.html: offsets.topic.

Re: Checkpointing with custom metadata

2015-08-03 Thread James Cheng
of A to sorted-topic, that I would store "finished doing initial copy of A" into my checkpoint, and that upon restart, I would check that and know to start doing the merge sort of A B C. I have a couple other designs that seem cleaner, tho, so I might not actually need it. -James

Checkpointing with custom metadata

2015-08-03 Thread James Cheng
According to https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest, we can store custom metadata with our checkpoints. It looks like the high level consumer does not support committing offsets with metadata, and that in orde

Re: New consumer - poll/seek javadoc confusing, need clarification

2015-07-21 Thread James Cheng
> On Jul 21, 2015, at 9:15 AM, Ewen Cheslack-Postava wrote: > > On Tue, Jul 21, 2015 at 2:38 AM, Stevo Slavić wrote: > >> Hello Apache Kafka community, >> >> I find new consumer poll/seek javadoc a bit confusing. Just by reading docs >> I'm not sure what the outcome will be, what is expected

Consuming from Kafka but don't need to save offsets

2015-07-20 Thread James Cheng
Hi, I have a web service that serves up some data that it obtains from a kafka topic. When the process starts up, it wants to load the entire kafka topic into memory, and serve the data up from an in-memory hashtable. The data in the topic has primary keys and is log compacted, and so the total

Re: New producer in production

2015-07-17 Thread James Cheng
found it here: > http://kafka.apache.org/documentation.html#newproducerconfigs, the same > link is reported by James. > > @Joel: Thanks a lot for the info, I will use new producer > > Regards, > Siva. > > On Fri, Jul 17, 2015 at 12:02 PM, James Cheng wrote: > >&g

  1   2   >