Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-17 Thread Xiao
LinkedIn's Gobblin compaction tool uses Hive to perform the compaction. Does that mean Lumos has been replaced? Confused… On Mar 17, 2015, at 10:00 PM, Xiao wrote: > Hi, all, > > Do you know whether Linkedin plans to open source Lumos in the near future? > > I found the answer from Qiao Lin’s po

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-17 Thread Xiao
Hi, all, Do you know whether Linkedin plans to open source Lumos in the near future? I found the answer from Qiao Lin’s post about replication from Oracle/mySQL to Hadoop. - https://engineering.linkedin.com/data-ingestion/gobblin-big-data-ease At the source side, it can be DataBus-ba

Re: No topic owner when using different assignment strategies

2015-03-17 Thread Jiangjie Qin
Yes, storing the info in ZooKeeper would work. In the new consumer, since the coordinator will reside on the server side, this would be easily detected. I’m not sure if it is still worth making this change on the old consumer, though. Especially since this is a backward-incompatible change in the sense that all the

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-17 Thread Arya Ketan
AFAIK, LinkedIn uses Databus to do the same. Aesop is built on top of Databus, extending its beautiful capabilities to MySQL and HBase. On Mar 18, 2015 7:37 AM, "Xiao" wrote: > Hi, all, > > Do you know how Linkedin team publishes changed rows in Oracle to Kafka? I > believe they already knew the w

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-17 Thread Xiao
Hi, all, Do you know how Linkedin team publishes changed rows in Oracle to Kafka? I believe they already knew the whole problem very well. Using triggers? or directly parsing the log? or using any Oracle GoldenGate interfaces? Any lesson or any standard message format? Could the Linkedin peo

Re: No topic owner when using different assignment strategies

2015-03-17 Thread tao xiao
The intention of this test is to check how Kafka behaves if two different assignment strategies are set in the same consumer group. In reality this can happen, as we never know what configurations downstream consumers will use. What if we store the assignment strategy in zk and send out

Re: No topic owner when using different assignment strategies

2015-03-17 Thread Xiao
I think this is a usability issue. It might need an extra admin tool to verify if all configuration settings are correct, even if the broker can return an error message to the consumers. Thanks, Xiao Li On Mar 17, 2015, at 5:18 PM, Jiangjie Qin wrote: > The problem is the consumers are ind

Re: No topic owner when using different assignment strategies

2015-03-17 Thread Jiangjie Qin
The problem is that the consumers are independent of each other. We depend purely on the same algorithm running on different consumers to achieve agreement on partition assignment. Breaking this assumption violates the design in the first place. On 3/17/15, 4:13 PM, "Mayuresh Gharat" wrote: >Probably
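
A minimal sketch of why mixed strategies break agreement, assuming simplified versions of the two strategies. These are illustrative functions, not Kafka's actual assignor code, and the names `range_assign`, `round_robin_assign`, and `claimed` are hypothetical. Each consumer computes only its own share, independently, from the same list of consumers and partitions:

```python
def range_assign(consumers, partitions):
    """Simplified range strategy: contiguous chunks of the sorted partition list."""
    consumers = sorted(consumers)
    partitions = sorted(partitions)
    per, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        count = per + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


def round_robin_assign(consumers, partitions):
    """Simplified round-robin strategy: deal sorted partitions out one at a time."""
    consumers = sorted(consumers)
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


def claimed(strategy_by_consumer, partitions):
    """Each consumer runs its own strategy and keeps only its own share."""
    consumers = list(strategy_by_consumer)
    return {
        consumer: strategy(consumers, partitions)[consumer]
        for consumer, strategy in strategy_by_consumer.items()
    }


partitions = [0, 1, 2]
same = claimed({"c1": range_assign, "c2": range_assign}, partitions)
mixed = claimed({"c1": range_assign, "c2": round_robin_assign}, partitions)
print(same)   # {'c1': [0, 1], 'c2': [2]}
print(mixed)  # {'c1': [0, 1], 'c2': [1]}
```

With the same strategy on both consumers every partition has exactly one owner. In the mixed case partition 1 is claimed twice and partition 2 has no owner at all, which is the kind of gap described in this thread.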

Re: No topic owner when using different assignment strategies

2015-03-17 Thread Mayuresh Gharat
We should probably return an error response if a group already has a partition assignment strategy in place and you try to use another strategy. Thanks, Mayuresh On Tue, Mar 17, 2015 at 2:10 PM, Jiangjie Qin wrote: > Yeah, using different partition assignment algorithms in the same consu

Re: Broker Exceptions

2015-03-17 Thread Mayuresh Gharat
We are trying to see what might have caused it. We had some questions: 1) Is this reproducible? That way we can dig deeper. This looks like an interesting problem to solve and you might have caught a bug, but we need to verify the root cause before filing a ticket. Thanks, Mayuresh On Tue, Mar 17, 201

Re: Monitoring of consumer group lag

2015-03-17 Thread Otis Gospodnetic
Mathias, SPM for Kafka will give you Consumer Offsets by Host, Consumer Id, Topic, and Partition, and you can alert (thresholds and/or anomalies) on any combination of these, and of course on any of the other 100+ Kafka metrics there. See http://blog.sematext.com/2015/02/10/kafka-0-8-2-monitoring/

Fw: How to measure performance of Mirror Maker

2015-03-17 Thread Saladi Naidu
Any suggestions on how to measure throughput of the Mirror Maker? Naidu Saladi - Forwarded Message - From: Saladi Naidu To: "users@kafka.apache.org" Sent: Monday, March 16, 2015 10:31 PM Subject: How to measure performance of Mirror Maker We have three Kafka clusters deploy

Kafka deployment across DC

2015-03-17 Thread Shrikant Patel
When I sent this email last time the message formatting was messed up. Hopefully that does not happen this time. We have a rather unusual problem. We have an application deployed on a WebLogic cluster that is spread across 2 datacenters (active-active), DC1 and DC2 (different LAN but same WAN). This pr

Re: Broker Exceptions

2015-03-17 Thread Zakee
> What version are you running ? Version 0.8.2.0 > Your case is 2). But the only thing weird is your replica (broker 3) is > requesting for offset which is greater than the leaders log end offset. So what could be the cause? Thanks Zakee > On Mar 17, 2015, at 11:45 AM, Mayuresh Gharat > w

Re: No topic owner when using different assignment strategies

2015-03-17 Thread Jiangjie Qin
Yeah, using different partition assignment algorithms in the same consumer group won't work. Is there a particular reason you want to do this? On 3/17/15, 8:32 AM, "tao xiao" wrote: >This is the corrected zk result > >Here is the result from zk >[zk: localhost:2181(CONNECTED) 0] get >/consumers/

Re: Monitoring of consumer group lag

2015-03-17 Thread Robin Yamaguchi
Hi Mathias, We call bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker via NRPE, and alert through Nagios. -Robin On Tue, Mar 17, 2015 at 2:46 AM, Kasper Mackenhauer Jacobsen < kas...@falconsocial.com> wrote: > Hi Mathias, > > We're currently using a custom solution that queries kafka and

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-17 Thread James Cheng
This is a great set of projects! We should put this list of projects on a site somewhere so people can more easily see and refer to it. These aren't Kafka-specific, but most seem to be "MySQL CDC." Does anyone have a place where they can host a page? Preferably a wiki, so we can keep it up to d

Re: schema.registry.url = null

2015-03-17 Thread Ewen Cheslack-Postava
Clint, Your code looks fine and the output doesn't actually have any errors, but you're also not waiting for the messages to be published. Try changing producer.send(data); to producer.send(data).get(); to block until the message has been acked. If it runs and exits cleanly, then you shou

Re: Broker Exceptions

2015-03-17 Thread Mayuresh Gharat
What version are you running? The code for the latest version says that: 1) if the log end offset of the replica is greater than the leader's log end offset, the replica's offset will be reset to the logEndOffset of the leader. 2) Else if the log end offset of the replica is smaller than the leader's log

Re: Support for Java 1.8?

2015-03-17 Thread Roger Hoover
Thanks, Jon. That helps. On Tue, Mar 17, 2015 at 11:34 AM, Jon Bringhurst < jbringhu...@linkedin.com.invalid> wrote: > At LinkedIn, we're running on 1.8.0u5. YMMV depending on hardware and > load, but this is what we typically run with: > > -server > -Xms4g > -Xmx4g > -XX:PermSize=96m > -XX:MaxP

Re: Support for Java 1.8?

2015-03-17 Thread Jon Bringhurst
At LinkedIn, we're running on 1.8.0u5. YMMV depending on hardware and load, but this is what we typically run with: -server -Xms4g -Xmx4g -XX:PermSize=96m -XX:MaxPermSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTim

Re: Broker Exceptions

2015-03-17 Thread Mayuresh Gharat
cool. On Tue, Mar 17, 2015 at 10:15 AM, Zakee wrote: > Hi Mayuresh, > > The logs are already attached and are in reverse order starting backwards > from [2015-03-14 07:46:52,517] to the time when brokers were started. > > Thanks > Zakee > > > > > On Mar 17, 2015, at 12:07 AM, Mayuresh Gharat < >

Re: Support for Java 1.8?

2015-03-17 Thread Roger Hoover
Resurrecting an old thread. Are people running Kafka on Java 8 now? On Sun, Aug 10, 2014 at 11:44 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Just curious if you saw any issues with Java 1.8 or if everything went > smoothly? > > Otis > -- > Performance Monitoring * Log Analytics

Re: Broker Exceptions

2015-03-17 Thread Zakee
Hi Mayuresh, The logs are already attached and are in reverse order starting backwards from [2015-03-14 07:46:52,517] to the time when brokers were started. Thanks Zakee > On Mar 17, 2015, at 12:07 AM, Mayuresh Gharat > wrote: > > Hi Zakee, > > Thanks for the logs. Can you paste earlier l

Re: consumer groups in python

2015-03-17 Thread Kasper Mackenhauer Jacobsen
We set the partitions the Python consumers need manually for now; I'm looking into a solution using zookeeper (possibly) to balance them out automatically though. On Tue, Mar 17, 2015 at 2:51 PM, Todd Palino wrote: > Yeah, this is exactly correct. The python client does not implement the > Zook

Re: No topic owner when using different assignment strategies

2015-03-17 Thread tao xiao
This is the corrected zk result Here is the result from zk [zk: localhost:2181(CONNECTED) 0] get /consumers/test/owners/mm-benchmark-test/0 Node does not exist: /consumers/test/owners/mm-benchmark-test/0 [zk: localhost:2181(CONNECTED) 1] get /consumers/test/owners/mm-benchmark-test1/0 test-loca

No topic owner when using different assignment strategies

2015-03-17 Thread tao xiao
Hi team, I have two consumer instances with the same group id connecting to two different topics with 1 partition created for each. One consumer uses partition.assignment.strategy=roundrobin and the other one uses default assignment strategy. Both consumers have 1 thread spawned internally and con

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-17 Thread Hisham Mardam-Bey
Pretty much a hijack / plug as well (= https://github.com/mardambey/mypipe "MySQL binary log consumer with the ability to act on changed rows and publish changes to different systems with emphasis on Apache Kafka." Mypipe currently encodes events using Avro before pushing them into Kafka and is

Re: Kafka High Level Consumer OOME

2015-03-17 Thread Guozhang Wang
Hello Dima, The current consumer does not have an explicit memory control mechanism, but you can try to indirectly bound the memory usage via the following configs: fetch.message.max.bytes and queued.max.message.chunks. Details can be found at http://kafka.apache.org/documentation.html#consumerconfig
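
As a rough back-of-the-envelope sketch of how these two configs bound buffering, assuming each queued chunk can be as large as fetch.message.max.bytes and each consumer stream has its own queue. The function name and the sample values are hypothetical, and the estimate ignores in-flight fetch responses and per-object overhead:

```python
def max_buffered_bytes(fetch_message_max_bytes, queued_max_message_chunks, num_streams):
    """Upper-bound estimate of bytes held in consumer queues (assumption: one queue per stream)."""
    return fetch_message_max_bytes * queued_max_message_chunks * num_streams


# Hypothetical values for illustration only, not defaults or measured settings.
estimate = max_buffered_bytes(
    fetch_message_max_bytes=1024 * 1024,  # 1 MiB per chunk
    queued_max_message_chunks=10,
    num_streams=5,
)
print(estimate // (1024 * 1024), "MiB")  # 50 MiB
```

If the estimate approaches the configured heap, lowering either config shrinks the bound proportionally.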

schema.registry.url = null

2015-03-17 Thread Clint Mcneil
Hi I can't get the Kafka/Avro serializer producer example to work. import org.apache.avro.Schema; import org.apache.avro.generic.GenericData; import org.apache.avro.generic.GenericRecord; import org.apache.kafka.clients.producer.KafkaProducer; import org.apache.kafka.clients.producer.ProducerConf

Kafka High Level Consumer OOME

2015-03-17 Thread Dima Dimas
Hi, I am hitting an OOME while trying to consume from one topic with 10 partitions (100 000 messages per partition) using 5 consumers (consumer groups), consumer.timeout=10ms. The OOME occurs 1-2 minutes after start. Java heap - Xms=1024M LAN about 10Gbit This is a standalone application. Kafka version 0.8.2

Re: consumer groups in python

2015-03-17 Thread Todd Palino
Yeah, this is exactly correct. The python client does not implement the Zookeeper logic that would be needed to do a balanced consumer. While it's certainly possible to do it (for example, Joe implemented it in Go), the logic is non-trivial and nobody has bothered to this point. I don't think anyon

RE: consumer groups in python

2015-03-17 Thread Sloot, Hans-Peter
Thanks I just came across this https://github.com/mumrah/kafka-python/issues/112 It says: That contract of one message per consumer group only works for the coordinated consumers which are implemented for the JVM only (i.e., Scala and Java clients). -Original Message- From: Ste

Re: consumer groups in python

2015-03-17 Thread Steve Miller
It's possible that I just haven't used it but I am reasonably sure that the python API doesn't have a way to store offsets in ZK. You would need to implement something more or less compatible with what the Scala/Java API does, presumably. On the plus side the python API -- possibly just becaus

consumer groups in python

2015-03-17 Thread Sloot, Hans-Peter
Hi, I wrote a small python script to consume messages from kafka. The consumer is defined as follows: kafka = KafkaConsumer('my-replicated-topic', metadata_broker_list=['localhost:9092'], group_id='my_consumer_group', auto_commi

Re: Monitoring of consumer group lag

2015-03-17 Thread Kasper Mackenhauer Jacobsen
Hi Mathias, We're currently using a custom solution that queries kafka and zookeeper (2 different processes) for topic size and consumer offset and submits the information to a collectd/statsd instance that ships it on to graphite, so we can track it in grafana. There's no alerting built in, but
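
The lag calculation behind this kind of monitoring can be sketched as follows, assuming the log end offsets and committed consumer offsets have already been fetched from Kafka and ZooKeeper into plain dictionaries. The function names, topic name, and offsets are hypothetical sample data, not output from any real tool:

```python
def partition_lag(log_end_offsets, committed_offsets):
    """Lag per (topic, partition) = log end offset minus committed offset.

    Assumption: a partition with no committed offset is treated as committed at 0.
    """
    return {
        tp: max(end - committed_offsets.get(tp, 0), 0)
        for tp, end in log_end_offsets.items()
    }


def over_threshold(lag, threshold):
    """Return the partitions whose lag exceeds the alert threshold, sorted."""
    return sorted(tp for tp, value in lag.items() if value > threshold)


log_end = {("events", 0): 1500, ("events", 1): 900}
committed = {("events", 0): 1450, ("events", 1): 100}

lag = partition_lag(log_end, committed)
print(lag)                       # {('events', 0): 50, ('events', 1): 800}
print(over_threshold(lag, 500))  # [('events', 1)]
```

The per-partition values can be shipped to a metrics system as-is, while the thresholded list is the piece an alerting check would act on.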

Re: Monitoring of consumer group lag

2015-03-17 Thread Mathias Söderberg
Hi Lance, I tried Kafka Offset Monitor a while back, but it didn't play especially nice with a lot of topics / partitions (we currently have around 1400 topics and 4000 partitions in total). Might be possible to make it work a bit better, but not sure it would be the best way to do alerting. Than

Re: Broker Exceptions

2015-03-17 Thread Mayuresh Gharat
Hi Zakee, Thanks for the logs. Can you paste earlier logs from broker-3 up to : [2015-03-14 07:46:52,517] ERROR [ReplicaFetcherThread-2-4], Current offset 1754769769 for partition [Topic22kv,5] out of range; reset offset to 1400864851 (kafka.server.ReplicaFetcherThread) That would help us figure