RE: batching causes replica out of sync

2015-03-05 Thread Aditya Auradkar
Xiaoyu, Just FYI - Here's a discussion on this issue if you are interested. https://issues.apache.org/jira/browse/KAFKA-1546 Aditya From: Mayuresh Gharat [gharatmayures...@gmail.com] Sent: Thursday, March 05, 2015 4:41 PM To: users@kafka.apache.org Subject

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
Thanks a lot, really appreciate you guys help!!! On Thursday, March 5, 2015 9:17 PM, tao xiao wrote: The reason you need to use "a".getBytes is because the default serializer.class is kafka.serializer.DefaultEncoder which takes byte[] as input. The way the array returns hash code is n

JMS to Kafka: Inbuilt JMSAdaptor/JMSProxy/JMSBridge (Client can speak JMS but hit Kafka)

2015-03-05 Thread Joshi, Rekha
Hi, Kafka is a great alternative to JMS, providing high performance and throughput as a scalable, distributed pub-sub/commit log service. However, there are always traditional systems running on JMS. Rather than rewriting, it would be great if we just had an inbuilt JMSAdaptor/JMSProxy/JMSBridge by

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread tao xiao
The reason you need to use "a".getBytes is because the default serializer.class is kafka.serializer.DefaultEncoder which takes byte[] as input. The way the array returns hash code is not based on equality of the elements hence every time a new byte array is created which is the case in your sample
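
The point about array hash codes can be checked in isolation. The sketch below is only an illustration, not Kafka's partitioner: `partitionFor` and `contentHash` are hypothetical helpers, and the hash-mod scheme is an assumption made for the example. It shows that `byte[]` inherits the identity-based `Object.hashCode()`, while `java.util.Arrays.hashCode` is computed from the elements.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ByteArrayKeyHash {
    // Hypothetical hash-mod partition choice; a stand-in, not Kafka's actual partitioner code.
    static int partitionFor(int hash, int numPartitions) {
        // Mask the sign bit so the result is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Content-based hash: arrays holding the same bytes always hash the same.
    static int contentHash(byte[] key) {
        return Arrays.hashCode(key);
    }

    public static void main(String[] args) {
        byte[] first = "a".getBytes(StandardCharsets.UTF_8);
        byte[] second = "a".getBytes(StandardCharsets.UTF_8);

        // byte[] inherits Object.hashCode(), which ignores the elements, so these
        // two values usually differ even though the bytes are identical.
        System.out.println("identity hashes: " + first.hashCode() + " vs " + second.hashCode());

        // Arrays.hashCode() is derived from the elements, so these always match.
        System.out.println("content hashes:  " + contentHash(first) + " vs " + contentHash(second));

        System.out.println("partition from content hash: " + partitionFor(contentHash(first), 2));
    }
}
```

If a partitioner hashed a freshly created byte array by identity, the same logical key could land on different partitions from one send to the next; hashing by content avoids that.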

Re: Mirror maker end to end latency metric

2015-03-05 Thread tao xiao
Thanks Jon and Guangzhou for the info On Fri, Mar 6, 2015 at 1:10 AM, Jon Bringhurst < jbringhu...@linkedin.com.invalid> wrote: > Hey Tao, > > Slides 27-30 on > http://www.slideshare.net/JonBringhurst/kafka-audit-kafka-meetup-january-27th-2015 > has > a diagram to visually show that Guozhang is

Re: batching causes replica out of sync

2015-03-05 Thread Mayuresh Gharat
What's the size of the batch that you are sending? If more than 4000 messages are produced at any point in time, there is a chance of the replica falling behind by more than 4000 messages and being kicked out of the ISR. This happens because the thread that checks for in sync replicas is asynchronou
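
A minimal sketch of the lag check described here, under loud assumptions: it is a simplified model rather than the broker's implementation, `exceedsLag` is a hypothetical helper, and the 5000-message batch size is made up for illustration. Only the 4000-message threshold comes from the thread.

```java
final class IsrLagSketch {
    // Simplified model: a follower counts as lagging once it trails the leader
    // by more than maxLagMessages. This is not the broker's actual code.
    static boolean exceedsLag(long leaderEndOffset, long followerEndOffset, long maxLagMessages) {
        return leaderEndOffset - followerEndOffset > maxLagMessages;
    }

    public static void main(String[] args) {
        long maxLag = 4000;      // threshold quoted in the thread
        long follower = 10_000;  // follower is fully caught up at this offset

        // One message at a time: the leader is only 1 message ahead after an append.
        System.out.println("single message lagging? " + exceedsLag(follower + 1, follower, maxLag));

        // One batch of 5000 messages (hypothetical size): the leader jumps 5000
        // ahead at once, before the follower has had a chance to fetch.
        System.out.println("large batch lagging?    " + exceedsLag(follower + 5000, follower, maxLag));
    }
}
```

In this model a caught-up follower can briefly look out of sync right after a large batch lands, which matches the concern raised in the message.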

topics still showing up using list command after deletion

2015-03-05 Thread max square
Hi All, I am using Kafka version 0.8.2.1. Like mentioned in the documentation, I enabled the delete.kafka.topic property in the config, restarted the brokers and issued the delete command. Then, I tried listing the topics and the topics that I deleted still shows up in the list. However, if I try

batching causes replica out of sync

2015-03-05 Thread xiaoyu wang
Hi all, We previously had replica.max.lag.message set to 4000 and used the sync producer to send data to Kafka, one message at a time. With this, we didn't see many unclean leader elections. Recently, we switched to using the sync producer with batched messages. After that, we see unclean leader election more o

Re: Increasing the throughput of Kafka Publisher

2015-03-05 Thread Otis Gospodnetic
Roger, Consider using rsyslog with omkafka. rsyslog rocks! And it's pretty popular, too - http://blog.sematext.com/2014/10/06/top-5-most-popular-log-shippers/ Oh, and it's FAST - some numbers and charts with an older version from 1 year ago: http://blog.sematext.com/2014/01/20/rsyslog-8-1-elasti

Re: Database Replication Question

2015-03-05 Thread Roger Hoover
Hi Jonathan, TCP will take care of re-ordering the packets. On Wed, Mar 4, 2015 at 6:05 PM, Jonathan Hodges wrote: > Thanks James. This is really helpful. Another extreme edge case might be > that the single producer is sending the database log changes and the > network causes them to reach K

Re: Set up kafka cluster

2015-03-05 Thread Yuheng Du
will do, Thanks! On Thu, Mar 5, 2015 at 3:35 PM, Gwen Shapira wrote: > Did you take a look at the quick-start guide? > https://kafka.apache.org/082/quickstart.html > > It shows how to set up a single node, how to validate that its working > and then how to set up multi-node cluster. > > Good luc

Re: Set up kafka cluster

2015-03-05 Thread Gwen Shapira
Did you take a look at the quick-start guide? https://kafka.apache.org/082/quickstart.html It shows how to set up a single node, how to validate that its working and then how to set up multi-node cluster. Good luck! On Thu, Mar 5, 2015 at 12:30 PM, Yuheng Du wrote: > Thank you Gwen, > > I also

Re: Set up kafka cluster

2015-03-05 Thread Yuheng Du
Thank you Gwen, I also need the kafka cluster continue to provide message brokering service to a Storm cluster after the benchmarking. I am fairly new to cluster setups. So is there an instruction telling me how to set up the three-node kafka cluster before running benchmarking? That would be real

mapping between disk and partition

2015-03-05 Thread sunil kalva
Hi, Can I map a specific partition to a different disk in a broker? And what are the general recommendations for disk-to-partition mapping, both for partitions that the broker leads and for replicas that the broker handles? -- SunilKalva

Re: Set up kafka cluster

2015-03-05 Thread Gwen Shapira
Jay Kreps has a gist with step by step instructions for reproducing the benchmarks used by LinkedIn: https://gist.github.com/jkreps/c7ddb4041ef62a900e6c And the blog with the results: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines Gwe

Set up kafka cluster

2015-03-05 Thread Yuheng Du
Hi everyone, I am trying to set up a kafka cluster consisting of three machines. I wanna run a benchmarking program in them. Can anyone recommend a step by step tutorial/instruction of how I can do it? Thanks. best, Yuheng

Re: Database Replication Question

2015-03-05 Thread Jay Kreps
Hey Xiao, That's not quite right. Fsync is controlled by either a time based criteria (flush every 30 seconds) or a number of messages criteria. So if you set the number of messages to 1 the flush is synchronous with the write, which I think is what you are looking for. -Jay On Thu, Mar 5, 2015

Re: REST/Proxy Consumer access

2015-03-05 Thread Andrew Otto
BTW, Wikimedia uses varnishkafka to produce http requests to Kafka, and we are pretty happy with it. https://github.com/wikimedia/varnishkafka > On Mar 5, 2015, at 13:09, Ewen Cheslack-Postava wrote: > > Yes, Confluent built a REST proxy that gives access to cluster metadata > (e.g. list to

Re: Database Replication Question

2015-03-05 Thread James Cheng
On Mar 5, 2015, at 12:59 AM, Xiao wrote: > Hi, James, > > This design regarding the restart point has a few potential issues, I think. > > - The restart point is based on the messages that you last published. The > message could be pruned. How large is your log.retention.hours? That's a go

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
There is also something that I think is worth mentioning: when I call prod.send(KeyedMessage("foo", "a", "test message")), the data can't be delivered to the brokers; the only way to make it work is through prod.send(KeyedMessage("foo", "a".getBytes, "test message".getBytes)). When I convert the d

Re: REST/Proxy Consumer access

2015-03-05 Thread Ewen Cheslack-Postava
Yes, Confluent built a REST proxy that gives access to cluster metadata (e.g. list topics, leaders for partitions, etc), producer (send binary or Avro messages to any topic), and consumer (run a consumer instance and consume messages from a topic). And you are correct, internally it uses Jetty and

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
Hi Guozhang, I'm using Kafka 0.8.2.0. Thanks On Thursday, March 5, 2015 12:57 PM, Guozhang Wang wrote: Zijing, Which version of Kafka client are you using? On Thu, Mar 5, 2015 at 8:50 AM, Zijing Guo wrote: > Hi community,I have a 2 nodes test cluster with 2 zk instance and 2 broke

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Guozhang Wang
Just got the previous emails. Mayuresh is right, it seems your keys are not "a". On Thu, Mar 5, 2015 at 9:57 AM, Guozhang Wang wrote: > Zijing, > > Which version of Kafka client are you using? > > On Thu, Mar 5, 2015 at 8:50 AM, Zijing Guo > wrote: > >> Hi community,I have a 2 nodes test clust

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Guozhang Wang
Zijing, Which version of Kafka client are you using? On Thu, Mar 5, 2015 at 8:50 AM, Zijing Guo wrote: > Hi community,I have a 2 nodes test cluster with 2 zk instance and 2 broker > instance running and I'm experimenting kafka producer in a cluster > environment. So I create a topic "foo" with

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
Hi, Thanks for your response. That's just my typo, I meant to say KeyedMessage("foo","a", "test message" + e). On Thursday, March 5, 2015 12:49 PM, Mayuresh Gharat wrote: I suppose the keyedMessage constructor is KeyedMessage(topic, key, message), so in your case key is "test me

REST/Proxy Consumer access

2015-03-05 Thread Julio Castillo
I read the description of the new Confluent Platform and it briefly describes some REST access to a producer and a consumer. Does this mean there is a new process(es) running (Jetty based)? This process integrates both the consumer and producer libraries? Thanks Julio Castillo NOTICE: This e-mai

Re: Topicmetadata response miss some partitions information sometimes

2015-03-05 Thread Mayuresh Gharat
Yeah, but that gives them all the partitions and does not differentiate between available vs unavailable right. Thanks, Mayuresh On Thu, Mar 5, 2015 at 9:14 AM, Guozhang Wang wrote: > I think today people can get the available partitions by calling > partitionsFor() API, and iterate the partit

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Mayuresh Gharat
I suppose the keyedMessage constructor is KeyedMessage(topic, key, message), so in your case key is "test message" + e. Thanks, Mayuresh On Thu, Mar 5, 2015 at 9:25 AM, Zijing Guo wrote: > And I'm using kafka version 0.8.2.0 > > On Thursday, March 5, 2015 11:51 AM, Zijing Guo > wrote: >

Re: Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
And I'm using kafka version 0.8.2.0 On Thursday, March 5, 2015 11:51 AM, Zijing Guo wrote: Hi community,I have a 2 nodes test cluster with 2 zk instance and 2 broker instance running and I'm experimenting kafka producer in a cluster environment. So I create a topic "foo" with 2 par

Re: Topicmetadata response miss some partitions information sometimes

2015-03-05 Thread Guozhang Wang
I think today people can get the available partitions by calling partitionsFor() API, and iterate the partitions and filter those whose leader is null, right? On Wed, Mar 4, 2015 at 4:40 PM, Mayuresh Gharat wrote: > Cool. So then this is a non issue then. To make things better we can expose > th
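
The filtering step suggested here can be sketched without the client library. In the example below, `PartitionMeta` is a hypothetical stand-in for the partition metadata the client returns, and `availablePartitions` is an illustrative helper, not part of any Kafka API. The only assumption carried over from the message is that a partition with no current leader has a null leader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AvailablePartitionsSketch {
    // Hypothetical stand-in for partition metadata: a partition id plus the
    // leader's broker id, or null when no leader is currently known.
    static final class PartitionMeta {
        final int partition;
        final Integer leader;

        PartitionMeta(int partition, Integer leader) {
            this.partition = partition;
            this.leader = leader;
        }
    }

    // Keep only the partitions that currently have a leader.
    static List<Integer> availablePartitions(List<PartitionMeta> all) {
        List<Integer> available = new ArrayList<>();
        for (PartitionMeta p : all) {
            if (p.leader != null) {
                available.add(p.partition);
            }
        }
        return available;
    }

    public static void main(String[] args) {
        List<PartitionMeta> all = Arrays.asList(
                new PartitionMeta(0, 1),
                new PartitionMeta(1, null),  // leaderless, so treated as unavailable
                new PartitionMeta(2, 2));
        System.out.println("available partitions: " + availablePartitions(all));
    }
}
```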

Re: Mirror maker end to end latency metric

2015-03-05 Thread Jon Bringhurst
Hey Tao, Slides 27-30 on http://www.slideshare.net/JonBringhurst/kafka-audit-kafka-meetup-january-27th-2015 have a diagram that visually shows what Guozhang is talking about. -Jon On Mar 5, 2015, at 9:03 AM, Guozhang Wang wrote: > There is no end2end latency metric in MM, since such a metric req

Re: Mirror maker end to end latency metric

2015-03-05 Thread Guozhang Wang
There is no end2end latency metric in MM, since such a metric requires timestamp info on the source / dest Kafka clusters. For example, at LinkedIn we add a timestamp in the message header, and let a separate consumer fetch the message on both ends to measure the latency. Guozhang On Wed, Mar

Re: Database Replication Question

2015-03-05 Thread Guozhang Wang
Josh, Deduping on the consumer side may be tricky as it requires some sequence number on the messages in order to achieve idempotency. On the other hand, we are planning to add idempotent producer or transactional messaging https://cwiki.apache.org/confluence/display/KAFKA/Idempotent+Producer h

Re: TopicFilters and 0.9 Consumer

2015-03-05 Thread Guozhang Wang
Vinoth, Yes we do have plans to continue supporting topic filters in 0.9 consumers, the APIs are not there yet though. Guozhang On Thu, Mar 5, 2015 at 8:32 AM, Vinoth Chandar wrote: > Hi guys, > > I was wondering what the plan in 0.9, was for the topic filters that are > today in the High leve

Kafka DefaultPartitioner is not behaved as expected.

2015-03-05 Thread Zijing Guo
Hi community, I have a 2-node test cluster with 2 zk instances and 2 broker instances running, and I'm experimenting with the Kafka producer in a cluster environment. So I create a topic "foo" with 2 partitions and replication 1. I create an async Producer without defining partition.class (so the partitioner

TopicFilters and 0.9 Consumer

2015-03-05 Thread Vinoth Chandar
Hi guys, I was wondering what the plan in 0.9 was for the topic filters that are today in the high-level consumer. The new API's subscribe methods seem to be working with

Re: Database Replication Question

2015-03-05 Thread Xiao
Hey, Jay, Thank you for your answer! Based on my understanding, Kafka fsync is regularly issued by a dedicated helper thread. It is not issued based on the semantics. The producers are unable to issue a COMMIT to trigger fsync. Not sure if this requirement is highly desirable to the others

Re: kafka monitoring

2015-03-05 Thread Vladimir Tretyakov
Hi Sa Li, For the monitoring piece there is SPM - see http://blog.sematext.com/2015/02/10/kafka-0-8-2-monitoring/ . Demo: https://apps.sematext.com/demo (just select 'SPM.Prod.Kafka' system after you login as DEMO user) It will monitor

Re: Database Replication Question

2015-03-05 Thread Xiao
Hi, James, This design regarding the restart point has a few potential issues, I think. - The restart point is based on the messages that you last published. The message could be pruned. How large is your log.retention.hours? - If the Kafka message order is different from your log sequence, yo

Re: Increasing the throughput of Kafka Publisher

2015-03-05 Thread Roger Hoover
I think my test include some grok filters and file input so it's not necessarily bottlenecked on Kafka producer. On Thu, Mar 5, 2015 at 12:37 AM, Vineet Mishra wrote: > Hey Roger, > > As per your stats you have around 5k msg/s of size 42 bytes > > 5000msgs * 42 byte = 21 = ~ 205kbps > > whil

Re: Increasing the throughput of Kafka Publisher

2015-03-05 Thread Vineet Mishra
Hey Roger, As per your stats you have around 5k msg/s of size 42 bytes: 5000 msgs * 42 bytes = 210000 bytes/s = ~205 KB/s, while I am getting around 500 msgs/s of around 350 bytes: 500 msgs * 350 bytes = 175000 bytes/s = ~170 KB/s, which even collectively is very poor write throughput. It seems this rate of publishi
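
The arithmetic in this message is easy to check with a few lines. This is a minimal sketch, assuming the rates are per second and that 1 KB is taken as 1024 bytes; the class and method names are made up for the example.

```java
final class ThroughputEstimate {
    // Bytes per second for a given message rate and message size.
    static long bytesPerSecond(long msgsPerSecond, long bytesPerMsg) {
        return msgsPerSecond * bytesPerMsg;
    }

    // The same figure in KB/s, taking 1 KB = 1024 bytes (an assumption).
    static double kbPerSecond(long msgsPerSecond, long bytesPerMsg) {
        return bytesPerSecond(msgsPerSecond, bytesPerMsg) / 1024.0;
    }

    public static void main(String[] args) {
        // Roger's figures from the thread: about 5000 msg/s of 42 bytes each.
        System.out.println(bytesPerSecond(5000, 42) + " bytes/s, "
                + String.format("%.1f", kbPerSecond(5000, 42)) + " KB/s");

        // Vineet's figures from the thread: about 500 msg/s of 350 bytes each.
        System.out.println(bytesPerSecond(500, 350) + " bytes/s, "
                + String.format("%.1f", kbPerSecond(500, 350)) + " KB/s");
    }
}
```

Both rates come out at a couple of hundred KB/s, which is the comparison the message is drawing.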