Re: Database Replication Question

2015-03-09 Thread Xiao
Hi, Jay, Thank you! The Kafka document shows "Kafka should run well on any unix system". I assume it includes the two major Unix versions, IBM AIX and HP-UX. Right? 1. Unfortunately, we aim at supporting all the platforms, Linux, Unix, Windows and especially z/OS. I know z/OS is not easy to

Re: How replicas catch up the leader

2015-03-09 Thread sy.pan
Hi, tao xiao and Jiangjie Qin, I encountered the same issue. My node had recovered from a high load problem (caused by another application). This is what kafka-topic shows: Topic:ad_click_sts PartitionCount:6 ReplicationFactor:2 Configs: Topic: ad_click_sts Partition: 0

Re: Working DR patterns for Kafka

2015-03-09 Thread John
There was a typo in the question - should have been ... I can tolerate the [replicant]

Re: integrate Camus and Hive?

2015-03-09 Thread Pradeep Gollakota
If I understood your question correctly, you want to be able to read the output of Camus in Hive and be able to know partition values. If my understanding is right, you can do so by using the following. Hive provides the ability to provide custom patterns for partitions. You can use this in combin

Re: Kafka Questions

2015-03-09 Thread Jiangjie Qin
Hi Mark, For global Pub/Sub between clusters, I think you might need another layer of service to direct users to the right Kafka cluster. Yes, mirror maker could be used for cross colo replication. Currently mirror maker cannot be used for bi-directional mirroring; after KAFKA-1997 gets checked i

Kafka Questions

2015-03-09 Thread Mark Flores
One of our development teams is considering implementing a Kafka solution. If the development team were to assume implementing 6 separate regional Kafka clusters: *How could we implement global Pub/Sub between clusters? *Can we do bi-directional replication with MirrorMaker betwe

Re: Kafka Mailing List for General Questions

2015-03-09 Thread Jiangjie Qin
Hi Mark, You’ve already asked a question in the right place – sending email to users@kafka.apache.org is the right way. If it is a development question, you can send to d...@kafka.apache.org. Jiangjie (Becket) Qin From: Mark Flores <mark.flo...@expeditors.com> Reply-To: "users@kafka.apa

Re: Kafka Mailing List for General Questions

2015-03-09 Thread Otis Gospodnetic
Looks like you subscribed. Just start a new thread and ask away. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Mon, Mar 9, 2015 at 4:27 PM, Mark Flores wrote: > Hi, > > > > I would like to subscribe to th

kafka log ERROR Closing socket for IP -- Connection reset by peer

2015-03-09 Thread Stuart Reynolds
I'm calling ConsumerConnector.shutdown to close a consumer connection and kafka's log reports an error? I don't see a similar error when using SimpleConsumer. Is there a way to close ConsumerConnector so that the errors aren't reported in the kafka log (this is making it very difficult to sift th

Kafka Mailing List for General Questions

2015-03-09 Thread Mark Flores
Hi, I would like to subscribe to the Kafka mailing list for general questions. Please let me know what I need to do in order to submit questions to the Kafka general mailing list. Thanks. Regards, Mark Flores Project Manager, Enterprise Technology Direct206-576-2675 Email mark.flo.

integrate Camus and Hive?

2015-03-09 Thread Yang
I believe many users like us would export the output from camus as a hive external table. but the dir structure of camus is like //MM/DD/xx while hive generally expects /year=/month=MM/day=DD/xx if you define that table to be partitioned by (year, month, day). otherwise you'd have
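The layout mismatch described in this message can be illustrated with a small path-mapping sketch. The date portion of the Camus path is partly lost in the snippet above, so the code assumes it is a plain `YYYY/MM/DD` triple; the class and method names are hypothetical and nothing here is part of Camus or Hive.

```java
// Sketch only: maps an assumed Camus-style date path (YYYY/MM/DD) to the
// key=value directory layout that Hive expects for a table partitioned by
// (year, month, day). Names are hypothetical, not a Camus or Hive API.
final class HivePartitionPath {
    /** Converts "2015/03/09" into "year=2015/month=03/day=09". */
    static String toHiveStyle(String datePath) {
        String[] parts = datePath.split("/");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected YYYY/MM/DD but got: " + datePath);
        }
        return "year=" + parts[0] + "/month=" + parts[1] + "/day=" + parts[2];
    }

    public static void main(String[] args) {
        System.out.println(toHiveStyle("2015/03/09"));
    }
}
```

Renaming directories this way is only one option; registering each existing directory as a partition location in Hive avoids moving any data.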

Re: Is it actually a bad idea to set 'consumer.id' explicitly?

2015-03-09 Thread Jiangjie Qin
Hi Kevin, You can use partition.assignment.strategy=roundrobin. This will balance all the partitions of all the topics across consumer threads. I think the rationale behind using the default consumer id is that you will have better information to identify a consumer. But if you want to have some specif

Re: Broker Exceptions

2015-03-09 Thread Kazim Zakee
No broker restarts. Created a kafka issue: https://issues.apache.org/jira/browse/KAFKA-2011 >> Logs for rebalance: >> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica >> election for partitions: (kafka.controller.Kafka

Re: Broker Exceptions

2015-03-09 Thread Zakee
No broker restarts. Created a kafka issue: https://issues.apache.org/jira/browse/KAFKA-2011 >> Logs for rebalance: >> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica >> election for partitions: (kafka.controller.Kafka

Working DR patterns for Kafka

2015-03-09 Thread John Lonergan
There are various prior questions including.. http://search-hadoop.com/m/4TaT4ts2oz1/disaster+recovery/v=threaded Is there a clear document on disaster recovery patterns for K and their respective trade-offs? How are actual prod deployments dealing with this? For instance I want my topics replicat

Is it actually a bad idea to set 'consumer.id' explicitly?

2015-03-09 Thread Kevin Scaldeferri
https://github.com/apache/kafka/blob/0.8.2/core/src/main/scala/kafka/consumer/ConsumerConfig.scala#L101 suggests that 'consumer.id' should only be set explicitly for testing purposes. Is there a reason that it would be a bad idea to set it ourselves for production use? The reason I am asking is t

Re: Multiple consumer groups with same group id on a single topic

2015-03-09 Thread Jiangjie Qin
Yes, Kevin is right. It does not matter whether you run the consumers from the same JVM or not; as long as the consumers have the same group id, they are in the same group. So in your case, you have 6 consumers in the same consumer group. Since you have 6 partitions in the topic, assuming you have only o

Re: Broker Exceptions

2015-03-09 Thread Jiangjie Qin
Is there anything wrong with brokers around that time? E.g. Broker restart? The logs you pasted are actually from replica fetchers. Could you paste the related logs in controller.log? Thanks. Jiangjie (Becket) Qin On 3/9/15, 10:32 AM, "Zakee" wrote: >Correction: Actually the rebalance happened

Re: Batching at the socket layer

2015-03-09 Thread Jiangjie Qin
The stickiness of partition only applies to the old producer. In the new producer we round robin for each message. The batching in the new producer is per topic partition; the batch size is controlled by both the max batch size and linger time configs. Jiangjie (Becket) Qin On 3/9/15, 10:10 AM, "Corey
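A rough sketch of the two settings mentioned in this reply: in the new producer they are the `batch.size` and `linger.ms` configs. The values used below are arbitrary, the class is hypothetical, and `readyToSend` is a deliberately simplified model of the "batch is full or has lingered long enough" rule rather than the producer's actual logic.

```java
import java.util.Properties;

// Sketch only: shows the two batching-related producer configs and a
// simplified model of when a per-partition batch becomes sendable.
// Class and method names are hypothetical.
final class BatchingSketch {
    /** Builds producer properties carrying only the two batching settings. */
    static Properties batchingProps(int batchSizeBytes, long lingerMs) {
        Properties props = new Properties();
        props.setProperty("batch.size", Integer.toString(batchSizeBytes)); // max bytes per partition batch
        props.setProperty("linger.ms", Long.toString(lingerMs));           // max wait before sending a partial batch
        return props;
    }

    /** Simplified rule: send when the batch is full or has waited at least lingerMs. */
    static boolean readyToSend(int batchedBytes, long waitedMs, int batchSizeBytes, long lingerMs) {
        return batchedBytes >= batchSizeBytes || waitedMs >= lingerMs;
    }

    public static void main(String[] args) {
        System.out.println(batchingProps(16384, 5));
        System.out.println(readyToSend(100, 2, 16384, 5));
    }
}
```

With `linger.ms` left at 0 this model marks every batch as sendable immediately, which matches the intuition that batching then only happens when records arrive faster than they can be sent.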

kafka Issue#2011 https://issues.apache.org/jira/browse/KAFKA-2011

2015-03-09 Thread Zakee
Opened a kafka issue for rebalance happening with auto.rebalance set to false. https://issues.apache.org/jira/browse/KAFKA-2011 >> Logs for rebalance: >> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica >> election for partitions: (kafka.controller.KafkaController) >> [2

Group name while consuming in 0.8.2

2015-03-09 Thread Mhaskar, Tushar
Hi, How do I specify the group name when using kafka-console-consumer.sh in 0.8.2? Kafka 0.8.1 had a --group option when running the above script. I need the group name to run the offset checker after running the consumer. Thanks, Tushar

Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-09 Thread Jun Rao
The following are the results of the votes. +1 binding = 3 votes +1 non-binding = 2 votes -1 = 0 votes 0 = 0 votes The vote passes. I will release artifacts to maven central, update the dist svn and download site. Will send out an announcement after that. Thanks to everyone who contributed to the wor

Re: Multiple consumer groups with same group id on a single topic

2015-03-09 Thread Kevin Scaldeferri
On Mon, Mar 9, 2015 at 10:38 AM, Phill Tomlinson wrote: > Hi, > > I have two separate consumer groups on different JVM processes, but both > have the same "group.id". You've said this twice, and I think it's creating some confusion, because the group.id is exactly what determines the members o

Re: [kafka-clients] Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-09 Thread Jun Rao
I was trying to see if kafka-2010 is a blocker to the 0.8.2.1 release. It doesn't seem to be since it won't affect the common usage when the controlled shutdown is enabled (by default). I will wrap up the 0.8.2.1 release. Thanks, Jun On Mon, Mar 9, 2015 at 8:25 AM, Solon Gordon wrote: > Any ti

RE: Multiple consumer groups with same group id on a single topic

2015-03-09 Thread Phill Tomlinson
Hi, I have two separate consumer groups on different JVM processes, but both have the same "group.id". They are high level consumer groups with each group containing 3 consumers. Only one group consumes at a given time - and I would like both groups, with the same id to share the load and comm

Re: Broker Exceptions

2015-03-09 Thread Zakee
Correction: Actually the rebalance happened quite until 24 hours after the start, and thats where below errors were found. Ideally rebalance should not have happened at all. Thanks Zakee > On Mar 9, 2015, at 10:28 AM, Zakee wrote: > >> Hmm, that sounds like a bug. Can you paste the log of

Re: Broker Exceptions

2015-03-09 Thread Zakee
> Hmm, that sounds like a bug. Can you paste the log of leader rebalance > here? Thanks for your suggestions. It looks like the rebalance actually happened only once, soon after I started with a clean cluster and data was pushed; it didn’t happen again so far, and I see the partitions leader counts

Re: Does Kafka 0.8.2 producer has a lower throughput in sync-mode, comparing with 0.8.1.x?

2015-03-09 Thread Jiangjie Qin
Hi Yang, In the code suggested by Manikumar, yes, it is possible that message 3 still gets sent even if message 2 failed. There is no single line of code for sending a batch of messages synchronously now, but after KAFKA-1660 is checked in, you may be able to achieve this by doing the following: Set a callback fo

Fwd: Versioning

2015-03-09 Thread Corey Nolet
I'm new to Kafka and I'm trying to understand the version semantics. We want to use Kafka w/ Spark but our version of Spark is tied to 0.8.0. We were wondering what guarantees are made about backwards compatibility across 0.8.x.x. At first glance, given the 3 digits used for versions, I figured 0.8.

Fwd: Batching at the socket layer

2015-03-09 Thread Corey Nolet
I'm curious what type of batching Kafka producers do at the socket layer. For instance, if I have a partitioner that round robin's n messages to a different partition, am I guaranteed to get n different messages sent over the socket or is there some micro-batching going on underneath? I am trying

Re: kafka mirroring ...!

2015-03-09 Thread Jiangjie Qin
Hi Sunilkalva, We are rewriting mirror maker in KAFKA-1997 with a handful of enhancements. With that new mirror maker, you will be able to mirror to a different topic by using the message handler. Jiangjie (Becket) Qin On 3/9/15, 4:41 AM, "sunil kalva" wrote: >I think it will be very useful if

Re: Multiple consumer groups with same group id on a single topic

2015-03-09 Thread Mayuresh Gharat
If you have 2 consumer groups, each group will read from all partitions automatically if you are using the HighLevel consumer (in your case, each consumer would get 2 partitions). You don't have to specify the partitions it should read from. Thanks, Mayuresh On Mon, Mar 9, 2015 at 9:59 AM, Ji

Re: Topics are not evenly distributed to streams using Range partition assignment

2015-03-09 Thread Jiangjie Qin
Hi Tao, That is expected behavior. You can set partition.assignment.strategy=roundrobin in the consumer config. It will take all the partitions from all topics and do a round robin assignment, whereas range only takes the partitions of each individual topic for assignment. Jiangjie (Becket) Qin On 3
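The difference described in this reply can be seen in a simplified model of the two strategies. The sketch below is not Kafka's assignor code: the class, the method names and the `mm-benchmark-test1..5` topic names are hypothetical, and the real round-robin assignor orders partitions differently. It only shows why five single-partition topics all land on one stream under range but spread out under round robin.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model of range vs. round-robin assignment; not Kafka's actual code.
final class AssignmentSketch {
    /** Range: each topic's partitions are divided among the streams on their own. */
    static Map<Integer, List<String>> range(Map<String, Integer> partitionsPerTopic, int streams) {
        Map<Integer, List<String>> out = emptyAssignment(streams);
        for (Map.Entry<String, Integer> topic : new TreeMap<>(partitionsPerTopic).entrySet()) {
            int base = topic.getValue() / streams;   // partitions every stream gets
            int extra = topic.getValue() % streams;  // first `extra` streams get one more
            int next = 0;
            for (int s = 0; s < streams; s++) {
                int count = base + (s < extra ? 1 : 0);
                for (int i = 0; i < count; i++) {
                    out.get(s).add(topic.getKey() + "-" + next++);
                }
            }
        }
        return out;
    }

    /** Round robin: partitions of all topics are pooled and dealt out in turn. */
    static Map<Integer, List<String>> roundRobin(Map<String, Integer> partitionsPerTopic, int streams) {
        Map<Integer, List<String>> out = emptyAssignment(streams);
        int slot = 0;
        for (Map.Entry<String, Integer> topic : new TreeMap<>(partitionsPerTopic).entrySet()) {
            for (int p = 0; p < topic.getValue(); p++) {
                out.get(slot++ % streams).add(topic.getKey() + "-" + p);
            }
        }
        return out;
    }

    /** Hypothetical test data: n topics with a single partition each. */
    static Map<String, Integer> singlePartitionTopics(int n) {
        Map<String, Integer> topics = new TreeMap<>();
        for (int i = 1; i <= n; i++) {
            topics.put("mm-benchmark-test" + i, 1);
        }
        return topics;
    }

    private static Map<Integer, List<String>> emptyAssignment(int streams) {
        Map<Integer, List<String>> out = new LinkedHashMap<>();
        for (int s = 0; s < streams; s++) {
            out.put(s, new ArrayList<>());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("range:      " + range(singlePartitionTopics(5), 5));
        System.out.println("roundrobin: " + roundRobin(singlePartitionTopics(5), 5));
    }
}
```

Under range, every topic is split independently, so a topic with one partition always gives it to the first stream; pooling the partitions first is what lets round robin spread them.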

Re: Multiple consumer groups with same group id on a single topic

2015-03-09 Thread Jiangjie Qin
Hi Phill, Do you mean you are using 6 consumers with the same group id? Or do you have 3 consumers using one group id, and another 3 using a different group id? For the example you mentioned, what you can do is run several consumers on different physical machines with the same group id, they

Re: kafka topic information

2015-03-09 Thread Yuheng Du
Thanks, got it! best, Yuheng On Mon, Mar 9, 2015 at 11:52 AM, Harsha wrote: > In general users are expected to run zookeeper cluster of 3 or 5 nodes. > Zookeeper requires quorum of servers running which means at least ceil(n/2) > servers need to be up. For 3 zookeeper nodes there needs to be at

Re: kafka topic information

2015-03-09 Thread Harsha
In general users are expected to run a zookeeper cluster of 3 or 5 nodes. Zookeeper requires a quorum of servers running, which means at least ceil(n/2) servers need to be up. For 3 zookeeper nodes there need to be at least 2 zk nodes up at any time, i.e. your cluster can function fine in case of 1 m
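The quorum arithmetic in this reply can be checked with a few lines of code. ZooKeeper needs a strict majority of the ensemble; the ceil(n/2) rule quoted above gives the same number for odd sizes such as 3 and 5, while for even sizes the majority is floor(n/2) + 1. The class and method names below are hypothetical and not part of ZooKeeper or Kafka.

```java
// Sketch only: majority-quorum arithmetic for a ZooKeeper-style ensemble.
// Names are hypothetical, not a ZooKeeper or Kafka API.
final class QuorumMath {
    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int majority(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble size must be positive");
        }
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can be down while a majority is still up. */
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 4, 5}) {
            System.out.println(n + " nodes: quorum=" + majority(n)
                    + ", tolerates " + tolerableFailures(n) + " failure(s)");
        }
    }
}
```

A 4-node ensemble tolerates the same single failure as a 3-node one, which is one reason odd sizes are the usual recommendation.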

Re: Does Kafka 0.8.2 producer has a lower throughput in sync-mode, comparing with 0.8.1.x?

2015-03-09 Thread Yu Yang
If a send request in the middle of the list fails, will all send requests that follow it fail? Or will only the messages that are put in the same batch by the underlying transport layer fail? On Mon, Mar 9, 2015 at 1:31 AM, Manikumar Reddy wrote: > 1. We can send list of messages and wait on

Re: kafka topic information

2015-03-09 Thread Yuheng Du
Harsha, Thanks for reply. So what if the zookeeper cluster fails? Will the topics information be lost? What fault-tolerant mechanism does zookeeper offer? best, On Mon, Mar 9, 2015 at 11:36 AM, Harsha wrote: > Yuheng, > kafka keeps cluster metadata in zookeeper along with topic > met

Re: kafka topic information

2015-03-09 Thread Harsha
Yuheng, kafka keeps cluster metadata in zookeeper along with topic metadata as well. You can use zookeeper-shell.sh or zkCli.sh to check zk nodes, /brokers/topics will give you the list of topics. -- Harsha On March 9, 2015 at 8:20:59 AM, Yuheng Du (yuheng.du.h...@gmail.com) wrote:

Re: [kafka-clients] Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-09 Thread Solon Gordon
Any timeline on an official 0.8.2.1 release? Were there any issues found with rc2? Just checking in because we are anxious to update our brokers but waiting for the patch release. Thanks. On Thu, Mar 5, 2015 at 12:01 AM, Neha Narkhede wrote: > +1. Verified quick start, unit tests. > > On Tue, Ma

kafka topic information

2015-03-09 Thread Yuheng Du
I am wondering where does kafka cluster keep the topic metadata (name, partition, replication, etc)? How does a server recover the topic's metadata and messages after restart and what data will be lost? Thanks for anyone to answer my questions. best, Yuheng

Re: kafka mirroring ...!

2015-03-09 Thread sunil kalva
I think it will be very useful if we can mirror to a different topic name on destination side. We have a use case to merge data from multiple colos to one central colo. SunilKalva On Mon, Mar 9, 2015 at 4:29 PM, tao xiao wrote: > I don't think you can mirror messages to a different topic name

Topics are not evenly distributed to streams using Range partition assignment

2015-03-09 Thread tao xiao
Hi, I created a message stream in my consumer using connector .createMessageStreamsByFilter(new Whitelist("mm-benchmark-test\\w*"), 5); I have 5 topics in my cluster and each of the topic has only one partition. My understanding of wildcard stream is that multiple streams are shared between selec

Re: kafka mirroring ...!

2015-03-09 Thread tao xiao
I don't think you can mirror messages to a different topic name in the current mirror maker implementation. Mirror maker sends the message to destination topic based on the topic name it reads from source On Mon, Mar 9, 2015 at 5:00 PM, sunil kalva wrote: > Can i configure different topic name i

Re: kafka mirroring ...!

2015-03-09 Thread sunil kalva
Can I configure a different topic name in the destination cluster? I mean, can I have different topic names for the source and destination clusters for mirroring? If yes, how can I map the source topic to the destination topic name? SunilKalva On Mon, Mar 9, 2015 at 6:41 AM, tao xiao wrote: > Ctrl+c is clean shu

Re: Does Kafka 0.8.2 producer has a lower throughput in sync-mode, comparing with 0.8.1.x?

2015-03-09 Thread Manikumar Reddy
1. We can send a list of messages and wait on the returned futures: List responses = new ArrayList(); for(input: recordBatch) responses.add(producer.send(input)); for(response: responses) response.get 2. Messages will be sent in the submission order. On Mon, Mar 9, 2015 at 1:56 PM, Manik
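The send-then-wait pattern in point 1 can be sketched as compilable Java. To keep the example free of a Kafka dependency, the send call is passed in as a function; with the new producer one would presumably pass `producer::send`, which returns a `Future<RecordMetadata>`. The class, the `sendAll` and `demo` helpers and the fake "ack-" results are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Function;

// Sketch only: submit every record first, then block on the futures.
// The send function stands in for a producer's asynchronous send call.
final class SendAndWait {
    /** Submits all records without blocking, then waits for each result in submission order. */
    static <R, M> List<M> sendAll(List<R> batch, Function<R, Future<M>> send)
            throws InterruptedException, ExecutionException {
        List<Future<M>> pending = new ArrayList<>();
        for (R record : batch) {
            pending.add(send.apply(record)); // asynchronous: returns immediately
        }
        List<M> results = new ArrayList<>();
        for (Future<M> response : pending) {
            results.add(response.get());     // blocks; throws ExecutionException if that send failed
        }
        return results;
    }

    /** Demo with a fake sender that completes immediately with an "ack-" string. */
    static List<String> demo() {
        try {
            return sendAll(Arrays.asList("a", "b", "c"),
                    r -> CompletableFuture.completedFuture("ack-" + r));
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because all records are submitted before the first `get()`, a failure of one record does not stop the later ones from being sent, which is the behaviour discussed elsewhere in this thread.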

Re: Does Kafka 0.8.2 producer has a lower throughput in sync-mode, comparing with 0.8.1.x?

2015-03-09 Thread Manikumar Reddy
1 . On Mon, Mar 9, 2015 at 1:03 PM, Yu Yang wrote: > The confluent blog > > mentions > that the batching is done whenever possible now. "The sync producer, > under load, can get performance as good as the async produ

Multiple consumer groups with same group id on a single topic

2015-03-09 Thread Phill Tomlinson
Hi, I have a topic with 6 partitions. I have two consumer groups with 3 consumers each, both with the same group.id. However only one group appears to consume from the topic. Is this expected behaviour? I would expect to be able to concurrently use two consumer groups on the same topic to prov

Re: Does Kafka 0.8.2 producer has a lower throughput in sync-mode, comparing with 0.8.1.x?

2015-03-09 Thread Yu Yang
The confluent blog mentions that the batching is done whenever possible now. "The sync producer, under load, can get performance as good as the async producer." Does it mean that kafka 0.8.2 guarantees that the sequenc

Does Kafka 0.8.2 producer has a lower throughput in sync-mode, comparing with 0.8.1.x?

2015-03-09 Thread Yu Yang
Hi, Kafka 0.8.1.1 allows us to send a list of messages in sync mode: public void send(List messages); I did not find a counterpart of this api in the new producer that is introduced in kafka 0.8.2. It seems that we can use the following method to do sync send in kafka 0.8.2: producer.