Re: Database Replication Question

2015-03-16 Thread Josh Rader
Thanks Guozhang. Are there JIRAs created for tracking idempotent producer or transactional messaging features? Maybe we can pick up some of the tasks to expedite the release? On Fri, Mar 6, 2015 at 1:53 AM, Guozhang Wang wrote: > Josh, > > Dedupping on the consumer side may be tricky as it req

Documentation is not proper for providing custom partitioner new Producer in http://kafka.apache.org/documentation.html

2015-03-16 Thread ankit tyagi
Hi, I have been going through http://kafka.apache.org/documentation.html and read below for providing custom partitioner. - provides software load balancing through an optionally user-specified Partitioner - The routing decision is influenced by the kafka.producer.Partitioner. inte

Re: Log cleaner patch (KAFKA-1641) on 0.8.2.1

2015-03-16 Thread Marc Labbe
So far no. It happens periodically and seems to resolve when we restart the broker (only to eventually reappear). There are a couple of things I need to get out of the way in our setup first but I'll see what I can do to reproduce it. We had a couple of bad shutdowns performed and brokers restarte

[ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-16 Thread Pierre-Yves Ritschard
Hi kafka, I just wanted to mention I published a very simple project which can connect as MySQL replication client and stream replication events to kafka: https://github.com/pyr/sqlstream When you don't have control over an application, it can provide a simple way of consolidating SQL data in kaf

Monitoring of consumer group lag

2015-03-16 Thread Mathias Söderberg
Good day, I'm looking into using SimpleConsumer#getOffsetsBefore and offsets committed in ZooKeeper for monitoring the lag of a consumer group. Our current use case is that we have a service that is continuously consuming messages of a large number of topics and persisting the messages to S3 at s
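For readers following the lag question, here is a minimal sketch of the arithmetic involved: lag per partition is the latest offset on the broker minus the offset the group has committed. It does not call Kafka or ZooKeeper; the two dictionaries stand in for whatever SimpleConsumer#getOffsetsBefore and the ZooKeeper offset nodes return. The function names and the choice to treat a missing committed offset as 0 are assumptions made for illustration.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Return per-partition lag: latest broker offset minus committed offset.

    Both arguments map partition id -> offset. A partition with no committed
    offset is treated as committed at 0 (an assumption for this sketch).
    """
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero: the two readings are taken at different times,
        # so the committed offset can briefly appear ahead of the log end.
        lag[partition] = max(end_offset - committed, 0)
    return lag


def partitions_over_threshold(lag, threshold):
    """Return the sorted partition ids whose lag exceeds the threshold."""
    return sorted(p for p, value in lag.items() if value > threshold)
```

An alerting script could call `partitions_over_threshold` once per topic and notify when the result is non-empty.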

High level consumer group

2015-03-16 Thread sunil kalva
How do i deploy high level consumer group for one topic in multiple instances of consumer applications, i mean i have two consumer applications deployed in two different boxes reading same topic which contains multiple partitions. Can i achieve this ? -- SunilKalva

Re: High level consumer group

2015-03-16 Thread Gwen Shapira
If you want each application to handle half of the partitions for the topic, you need to configure the same group.id for both applications. In this case, Kafka will just see each app as a consumer in the group and it won't care that they are on different boxes. If you want each application to hand

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-16 Thread James Cheng
Super cool, and super simple. I like how it is pretty much a pure translation of the binlog into Kafka, with no interpretation of the events. That means people can layer whatever they want on top of it. They would have to understand what the mysql binary events mean, but they would just have to

KafkaException: Wrong request type 17260

2015-03-16 Thread Zakee
Does anyone know what this error means? [2015-03-16 10:07:37,513] ERROR Closing socket for /176.*.*.244 because of error (kafka.network.Processor) kafka.common.KafkaException: Wrong request type 17260 at kafka.api.RequestKeys$.deserializerForKey(RequestKeys.scala:64) at kafka.ne

Re: Broker Exceptions

2015-03-16 Thread Mayuresh Gharat
Can you provide more logs (complete) on Broker 3 till time : *[2015-03-14 07:46:52,517*] WARN [ReplicaFetcherThread-2-4], Replica 3 for partition [Topic22kv,5] reset its fetch offset from 1400864851 to current leader 4's start offset 1400864851 (kafka.server.ReplicaFetcherThread) I would like to

Re: KafkaException: Wrong request type 17260

2015-03-16 Thread Gwen Shapira
Kafka currently has request types 0-12. If the bytes Kafka got were parsed to request type 17260, it looks like someone is sending malformed or even random data. Do you recognize the sending IP? do you know about client errors from the same point in time? Perhaps someone is running a port scanner?
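To illustrate how arbitrary bytes end up as a "request type", here is a small sketch. It assumes the request framing is a 4-byte size prefix followed by a 2-byte big-endian request type; the helper names are hypothetical and this is not the broker's actual parsing code.

```python
import struct


def parse_request_type(frame: bytes) -> int:
    """Read the 2-byte big-endian request type that follows the 4-byte size prefix."""
    if len(frame) < 6:
        raise ValueError("need at least 6 bytes: 4 for size, 2 for request type")
    (request_type,) = struct.unpack_from(">h", frame, 4)
    return request_type


def as_ascii(request_type: int) -> str:
    """Show the two bytes of a request type as ASCII text, for eyeballing."""
    return struct.pack(">h", request_type).decode("ascii", errors="replace")
```

Under that assumption, `as_ascii(17260)` gives `"Cl"`, since 17260 is 0x436C. Printable text in that position would be consistent with a non-Kafka client or scanner writing to the broker port.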

Re: High level consumer group

2015-03-16 Thread sunil kalva
Say for example i have a topic A with 6 partitions and two consumer applications with 3 consumer threads each and with the same groupid. If i start one application does it start consuming only 3 partitions, and if start later the other one, does it start picking the rest of the partitions ?. or If

Re: High level consumer group

2015-03-16 Thread Gwen Shapira
When you start one application it will consume all 6 partitions - 2 partitions by each thread. When you later start the second app, the consumer will rebalance and allocate 3 partitions to the new application. On Mon, Mar 16, 2015 at 10:59 AM, sunil kalva wrote: > > Say for example i have a top
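The numbers in this reply can be reproduced with a short simulation. This is a sketch of range-style assignment (sort the consumer threads, hand out contiguous blocks of partitions), not the consumer's actual rebalance code; the thread ids such as `app1-0` are made up for the example.

```python
def range_assign(num_partitions, thread_ids):
    """Split partitions 0..num_partitions-1 into contiguous ranges, one per thread.

    Threads are sorted by id; when the split is uneven, the first threads
    in sorted order each get one extra partition.
    """
    threads = sorted(thread_ids)
    per_thread, extra = divmod(num_partitions, len(threads))
    assignment = {}
    for i, thread in enumerate(threads):
        start = per_thread * i + min(i, extra)
        count = per_thread + (1 if i < extra else 0)
        assignment[thread] = list(range(start, start + count))
    return assignment


# One application with 3 threads: each thread owns 2 of the 6 partitions.
one_app = range_assign(6, ["app1-0", "app1-1", "app1-2"])

# After the second application joins the same group: 1 partition per thread,
# so each application ends up with 3 partitions.
two_apps = range_assign(
    6, ["app1-0", "app1-1", "app1-2", "app2-0", "app2-1", "app2-2"]
)
```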

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-16 Thread Pierre-Yves Ritschard
Hi James, Thanks for the kind words. I will definitely work on the persistence of binlog position, with a couple of persistence options. The trickier part is figuring out the way to correctly figure out a key for the topic. Not all events indicate which database/table/entity they operate on. Getti

Re: KafkaException: Wrong request type 17260

2015-03-16 Thread Zakee
Yes, I don’t have any clients associated with the sending IP. "a port scanner” is a good clue. Thanks for pointing it out. Thanks Zakee > On Mar 16, 2015, at 10:52 AM, Gwen Shapira wrote: > > Kafka currently has request types 0-12. > If the bytes Kafka got were parsed to request type 17260

Assigning partitions to specific nodes

2015-03-16 Thread Daniel Haviv
Hi, Is it possible to assign specific partitions to specific nodes? I want to upload files to HDFS, find out on which nodes the file resides and then push their path into a topic and partition it by nodes. This way I can ensure that the consumer (Spark Streaming) will consume both the message and f

Re: Monitoring of consumer group lag

2015-03-16 Thread Lance Laursen
Hey Mathias, Kafka Offset Monitor will give you a general idea of where your consumer group(s) are at: http://quantifind.com/KafkaOffsetMonitor/ However, I'm not sure how useful it will be with "a large number of topics" / turning its output into a script that alerts upon a threshold. Could take

Re: Assigning partitions to specific nodes

2015-03-16 Thread Gwen Shapira
Any reason not to use SparkStreaming directly with HDFS files, so you'll get locality guarantees from the Hadoop framework? StreamContext has textFileStream() method you could use for this. On Mon, Mar 16, 2015 at 12:46 PM, Daniel Haviv wrote: > Hi, > Is it possible to assign specific partitions

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-16 Thread Gwen Shapira
Really really nice! Thank you. On Mon, Mar 16, 2015 at 7:18 AM, Pierre-Yves Ritschard wrote: > Hi kafka, > > I just wanted to mention I published a very simple project which can > connect as MySQL replication client and stream replication events to > kafka: https://github.com/pyr/sqlstream > >

Re: Assigning partitions to specific nodes

2015-03-16 Thread Daniel Haviv
Hi, The reason we want to use this method is that this way a file can be consumed by different streaming apps simultaneously (they just consume its path from kafka and open it locally). With fileStream to parallelize the processing of a specific file I will have to make several copies of it,

Random failure testing

2015-03-16 Thread John Lonergan
Re kafka-1539 Is the community executing random failure testing for Kafka? It would seem that such testing would have found 1539 and some other bugs that were recently fixed. Is the community considering such testing? Thanks John

Re: Assigning partitions to specific nodes

2015-03-16 Thread Gwen Shapira
Probably off-topic for Kafka list, but why do you think you need multiple copies of the file to parallelize access? You'll have parallel access based on how many containers you have on the machine (if you are using YARN-Spark). On Mon, Mar 16, 2015 at 1:20 PM, Daniel Haviv wrote: > Hi, > The reas

Re: Assigning partitions to specific nodes

2015-03-16 Thread Daniel Haviv
Let's say I have 3 different types of algorithms that are implemented by three streaming apps (on Yarn). They are completely separate, meaning that they can run in parallel on the same data and not sequentially. *Using Kafka: *File X is loaded into HDFS and I want Algorithms A and B to process it

Kafka deployment across DC.

2015-03-16 Thread shrikant patel
We have a very unique problem. We have an application deployed on a WebLogic cluster that is spread across 2 datacenters (active-active), DC1 and DC2 (different LAN but same WAN). This producer app generates different user events, which other apps (consumer apps) are interested in. Right now we use JMS a

Problem starting Kafka Broker

2015-03-16 Thread Zakee
Running into a problem starting Kafka Brokers after migrating to a new env, it keeps shutting down with this error. I checked access is fine for the log directories as listed in log.dirs properties. Any clues? [2015-03-16 13:53:19,800] ERROR There was an error in one of the threads during logs

Re: Assigning partitions to specific nodes

2015-03-16 Thread Gwen Shapira
Ah, got it. To actually answer your question: Replica assignment tool allows you to assign partitions to specific brokers. Since you always read from the lead replica, you can mark specific replicas as "preferred replica" and they will be the leader if they are available. You'll need to get each
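As a rough illustration of the replica assignment approach, the sketch below builds a JSON document in the shape the partition reassignment tool reads (a `version` field plus a list of topic/partition/replicas entries). The first broker id in each `replicas` list is the preferred replica. The topic name and broker ids are hypothetical, and the helper function is only for this example.

```python
import json


def build_reassignment(topic, partition_to_replicas):
    """Build a reassignment JSON string.

    partition_to_replicas maps partition id -> ordered list of broker ids;
    the first broker in each list is the preferred replica for that partition.
    """
    partitions = [
        {"topic": topic, "partition": partition, "replicas": list(replicas)}
        for partition, replicas in sorted(partition_to_replicas.items())
    ]
    return json.dumps({"version": 1, "partitions": partitions}, indent=2)


# Hypothetical layout: partition 0 prefers broker 1, partition 1 prefers broker 2.
plan = build_reassignment("file-paths", {1: [2, 1], 0: [1, 2]})
```

The resulting string can be saved to a file and passed to the reassignment tool; the producer would then need to send each path to the partition whose preferred broker sits on the node that holds the file.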

Re: Problem starting Kafka Broker

2015-03-16 Thread Gwen Shapira
Your kafka log directory (in config file, under log.dir) contains directories that are not KafkaTopics. Possibly hidden directory. Check what "ls -la" shows in that directory. Gwen On Mon, Mar 16, 2015 at 1:58 PM, Zakee wrote: > Running into problem starting Kafka Brokes after migrating to a ne
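A quick way to do the same check from a script is sketched below. It assumes partition directories are named `<topic>-<partition number>` and flags any other subdirectory, including hidden ones. The function name is made up, and the pattern is a simplification rather than the broker's own validation.

```python
import os
import re

# Assumed naming convention for partition directories: "<topic>-<number>".
TOPIC_DIR = re.compile(r"^.+-\d+$")


def unexpected_dirs(log_dir):
    """Return the sorted subdirectories of log_dir that do not look like topic partitions."""
    return sorted(
        name
        for name in os.listdir(log_dir)
        if os.path.isdir(os.path.join(log_dir, name)) and not TOPIC_DIR.match(name)
    )
```

Plain files in the directory are ignored on purpose; only stray directories are reported.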

Re: Random failure testing

2015-03-16 Thread Jiangjie Qin
We are planning to develop a Chaos Monkey test in LinkedIn and will open source it. You can check out KAFKA-2014. Jiangjie (Becket) Qin On 3/16/15, 1:24 PM, "John Lonergan" wrote: >Re kafka-1539 > >Is the community executing random failure testing for Kafka? > >It would seem that sick testing w

Re: Problem starting Kafka Broker

2015-03-16 Thread Zakee
Ah, you are right. Thanks Zakee > On Mar 16, 2015, at 2:05 PM, Gwen Shapira wrote: > > Your kafka log directory (in config file, under log.dir) contains > directories that are not KafkaTopics. Possibly hidden directory. > > Check what "ls -la" shows in that directory. > > Gwen > > On Mon,

Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream

2015-03-16 Thread Arya Ketan
Great work. Sorry for kinda hijacking this thread, but I thought that we had built something on mysql bin log event propagator and wanted to share it. You guys can also look into Aesop (https://github.com/Flipkart/aesop). It's a change propagation framework. It has relays which listen to bin log

How to measure performance of Mirror Maker

2015-03-16 Thread Saladi Naidu
We have three Kafka clusters deployed in 3 DCs, each one having its own topics. We are using Mirror Maker to keep all three clusters up to date with continuous replication. We used Perf producer and Perf consumer to conduct basic testing, all seems to work great. Our 2

Re: Broker Exceptions

2015-03-16 Thread Kazim Zakee
Hi Mayuresh, Here are the logs. Broker-4 [2015-03-13 17:49:40,514] INFO Partition [Topic22kv,5] on broker 4: Shrinking ISR for partition [Topic22kv,5] from 2,4,3 to 2,4 (kafka.cluster.Partition) [2015-03-13 17:49:40,514] INFO Partition [Topic22kv,5] on broker 4: Shrinking ISR for partition [To

Re: Broker Exceptions

2015-03-16 Thread Zakee
Hi Mayuresh, Here are the logs. Broker-4 [2015-03-13 17:49:40,514] IN