load balancing

2015-03-02 Thread sunil kalva
Is Kafka load balancing based on the number of partitions of a topic or the number of partitions of all topics in a cluster? -- SunilKalva

Using 0.8.2 jars in consumer with producer of version 0.8.1.1

2015-03-02 Thread Jianshi Huang
Hi, I'd like to use Scala 2.11 with Kafka, which is only supported from 0.8.2. Can I use 0.8.2 jars for my consumer with a producer of an older version (mine is 0.8.1.1), which I have no control over? Thanks, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github & Blog: http://huangjs.gith

Re: Kafka 0.8.2 log cleaner

2015-03-02 Thread Ivan Balashov
Guozhang, Thanks for the suggestion, however, I'm afraid cardinality of keys will grow indefinitely and AFAIU keys are permanent with log compaction. Any chance keys could also be removed during compaction? Thanks, 2015-03-02 5:27 GMT+03:00 Guozhang Wang : > > From your description it seems Kafk

Kafka 0.8.2.0 and JDK8

2015-03-02 Thread Fabio Oliveira
Hello Everyone, Trust you’re all well. We have been successfully using Kafka 0.8.1.1 and JDK7u51 as per recommendations in the official documentation. Such an amazing tool! As we are now considering to adopt 0.8.2.0 and, whilst we’ve noticed that the Java version recommendation remains the sam

Got negative offset lag after restarting brokers

2015-03-02 Thread tao xiao
Hi team, I have 2 brokers (0 and 1) serving a topic mm-benchmark-test. I did some tests on the two brokers to verify how the leader got elected. Here are the steps: 1. started 2 brokers 2. created a topic with partition=1 and replication-factor=2. Now broker 1 was elected as leader 3. sent 1000 mess

Re: 0.7 design doc?

2015-03-02 Thread Philip O'Toole
Thanks Guozhang -- no this isn't quite it. The doc I read before contained the rationale for using physical offsets in the file, not logical offsets. I know the current version of Kafka now uses logical offsets again.  It's not a big deal though, I generally remember the contents of the page, an

Kafka broker failed to work when migrating the broker to another machine

2015-03-02 Thread 俞忠静
Hi, I set up a Kafka cluster with 5 brokers, with the broker ids 0, 1, 2, 3, 4. I stopped the node whose broker id is 4, and deployed a new broker on a new machine with the broker id 4. But I find the new broker fails to work; it never appears in the ISR. [zk: localhost:2181(CONNECTED) 3] ls /kafka/test

moving replications

2015-03-02 Thread sunil kalva
Hi, how do I move replicas from one broker to another broker? -- SunilKalva

Re: cross-colo writing/reading?

2015-03-02 Thread Todd Palino
Latencies like this are one of the big reasons that we run our Kafka clusters local to the producers and consumers. Another is network partition. As Jeff noted, mirror maker is the way to connect them together. Our architecture uses a local cluster in each datacenter, and then an aggregate cluster

Re: 0.7 design doc?

2015-03-02 Thread Guozhang Wang
Kafka uses write() calls to append data to log files; note that these are sequential writes. In 0.8 we include an index file to improve searching for physical positions given the offsets, which uses mmapping. On Sun, Mar 1, 2015 at 9:42 PM, Philip O'Toole < philip.oto...@yahoo.com.invalid> wrote: > Thanks G

Re: 0.7 design doc?

2015-03-02 Thread Harsha
These docs might help https://kafka.apache.org/08/design.html http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf -Harsha On Sun, Mar 1, 2015, at 09:42 PM, Philip O'Toole wrote: > Thanks Guozhang -- no this isn't quite it. The doc I read before > contai

Re: Kafka 0.8.2 log cleaner

2015-03-02 Thread Guozhang Wang
Currently Kafka log compaction does not support removing keys, but as long as you also have log cleaning done at the app level #.keys will not increase indefinitely. On Mon, Mar 2, 2015 at 2:08 AM, Ivan Balashov wrote: > Guozhang, > > Thanks for the suggestion, however, I'm afraid cardinality of

Re: Kafka 0.8.2 log cleaner

2015-03-02 Thread Ivan Balashov
Guozhang, I agree, but upon restart the application still needs to init KV-storage. And even though values are empty, keys will generate traffic (delaying app startup time). Besides, the idea of keeping needless data in kafka forever, even keys only, sounds rather unsettling. I guess we could try

Re: Anyone interested in speaking at Bay Area Kafka meetup @ LinkedIn on March 24?

2015-03-02 Thread Jon Bringhurst
The meetups are recorded. For example, here's a link to the January meetup: http://www.ustream.tv/recorded/58109076 The links to the recordings are usually posted to the comments for each meetup on http://www.meetup.com/http-kafka-apache-org/ -Jon On Feb 23, 2015, at 3:24 PM, Ruslan Khafizov

Re: Kafka 0.8.2 log cleaner

2015-03-02 Thread svante karlsson
Wouldn't it be rather simple to add a retention time on "deleted" items ie keys with null value for topics that are compacted? The retention time would then be set to some "large" time to allow all consumers to understand that a previous k/v is being deleted. 2015-03-02 17:30 GMT+01:00 Ivan Bal

Re: Kafka 0.8.2 log cleaner

2015-03-02 Thread Ivan Balashov
Svante, Not sure if I understand your suggestion correctly, but I do think that enabling retention for deleted values would make a useful addition to the "compact" policy. Otherwise some data is bound to be hanging around not used. Guozhang, could this potentially deserve a feature request? Than

Add repository URL to git repo page

2015-03-02 Thread Andrew Pennebaker
When newbies visit https://git1-us-west.apache.org/repos/asf?p=kafka.git, they may have trouble finding the correct git URL to clone. Could we please add a repository URL (git://git.apache.org/kafka.git) to make it easier for newbies to find this?

Re: moving replications

2015-03-02 Thread Gwen Shapira
Take a look at the Reassign Partition Tool. It lets you specify which replica exists on which broker: https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-6.ReassignPartitionsTool It's a bit tricky to use, so feel free to follow up with more questions :) Gwen On Mo

Re: Got negative offset lag after restarting brokers

2015-03-02 Thread Mayuresh Gharat
This is an interesting test. I suppose this is because while broker 0 was the leader, broker 1 was completely down and broker 1's log end offset never increased. So when it came back up, and since broker 0 was down, you got a lag of -1000. Thanks, Mayuresh On Mon, Mar 2, 2015 at 3:15 AM, tao xiao wr

Re: Got negative offset lag after restarting brokers

2015-03-02 Thread Stuart Reynolds
Each topic has earliest and latest offsets (per partition). Each consumer group has a current offset (per topic, partition pair). I see -1 for the current offsets of new consumer groups that haven't yet committed an offset. I think it means that the offsets for that consumer group are undefined. Is

Apache Samza Meetup - March 4 @6PM hosted at LinkedIn's campus in Mountain View CA

2015-03-02 Thread Ed Yakabosky
Hi all - I would like to announce the first Bay Area Apache Samza Meetup hosted at LinkedIn in Mountain View, CA on March 4, 2015 @6PM. We plan to host the event every 2-months to encourage knowledge sharing & collaboration in Samz

Re: moving replications

2015-03-02 Thread sunil kalva
Shapira tx for quick reply, but this tool elects a new leader for a given partition from the existing replicas of that partition. But my problem is basically move one replica completely from old broker to new broker and eventually move leader also to new broker (with out incrementing replica count

Re: Using 0.8.2 jars in consumer with producer of version 0.8.1.1

2015-03-02 Thread Jiangjie Qin
Which server version are you running? On 3/2/15, 2:05 AM, "Jianshi Huang" wrote: >Hi, > >I'd like to use Scala 2.11 with Kafka, which is only supported from 0.8.2. > >Can I use 0.8.2 jars for my consumer with producer of older version (mine >is 0.8.1.1), which I have no control over it. > >Thank

Re: load balancing

2015-03-02 Thread Jiangjie Qin
There are two algorithms: range and round robin. The range algorithm balances each topic independently. Round robin balances across all the topics the consumer is consuming from. Jiangjie (Becket) Qin On 3/2/15, 2:05 AM, "sunil kalva" wrote: >Is kafka load balancing based on number of partit
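The difference between the two strategies named in this reply can be sketched with a small, self-contained model. This is a simplified illustration, not Kafka's assignor code: the class name `AssignmentSketch`, the topics `A` and `B`, and the consumers `c1` and `c2` are all hypothetical, and the real round-robin assignor orders partitions differently and has extra preconditions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of the two strategies; not Kafka's actual code.
class AssignmentSketch {

    // Range: every topic is balanced on its own, in contiguous chunks,
    // so the first consumers collect the "extra" partition of every topic.
    static Map<String, List<String>> range(Map<String, Integer> partitionsPerTopic,
                                           List<String> consumers) {
        Map<String, List<String>> owned = emptyAssignment(consumers);
        for (Map.Entry<String, Integer> topic
                : new TreeMap<String, Integer>(partitionsPerTopic).entrySet()) {
            int perConsumer = topic.getValue() / consumers.size();
            int extra = topic.getValue() % consumers.size();
            int next = 0;
            for (int i = 0; i < consumers.size(); i++) {
                int count = perConsumer + (i < extra ? 1 : 0);
                for (int k = 0; k < count; k++) {
                    owned.get(consumers.get(i)).add(topic.getKey() + "-" + next++);
                }
            }
        }
        return owned;
    }

    // Round robin: the partitions of all topics are dealt out one at a time.
    static Map<String, List<String>> roundRobin(Map<String, Integer> partitionsPerTopic,
                                                List<String> consumers) {
        Map<String, List<String>> owned = emptyAssignment(consumers);
        int turn = 0;
        for (Map.Entry<String, Integer> topic
                : new TreeMap<String, Integer>(partitionsPerTopic).entrySet()) {
            for (int p = 0; p < topic.getValue(); p++) {
                String consumer = consumers.get(turn++ % consumers.size());
                owned.get(consumer).add(topic.getKey() + "-" + p);
            }
        }
        return owned;
    }

    private static Map<String, List<String>> emptyAssignment(List<String> consumers) {
        Map<String, List<String>> owned = new TreeMap<String, List<String>>();
        for (String c : consumers) {
            owned.put(c, new ArrayList<String>());
        }
        return owned;
    }

    public static void main(String[] args) {
        Map<String, Integer> topics = new TreeMap<String, Integer>();
        topics.put("A", 3); // hypothetical topic with 3 partitions
        topics.put("B", 3); // hypothetical topic with 3 partitions
        List<String> consumers = Arrays.asList("c1", "c2");
        System.out.println("range:       " + range(topics, consumers));
        System.out.println("round robin: " + roundRobin(topics, consumers));
    }
}
```

With two topics of three partitions each and two consumers, the range model gives `c1` four partitions and `c2` two, while the round-robin model gives each consumer three.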

Re: Got negative offset lag after restarting brokers

2015-03-02 Thread Jiangjie Qin
In this case you have data loss. In step 6, when broker 1 comes up, it becomes the leader and has log end offset 1000. When broker 0 comes up, it becomes follower and will truncate its log to 1000, i.e. 1000 messages were lost. Next time when the consumer starts, its offset will be reset to either
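The negative lag reported in this thread follows from simple arithmetic if lag is taken as log end offset minus the consumer's committed offset. A minimal sketch: the class `OffsetLagSketch` is hypothetical, and the committed offset of 2000 is an assumption that is consistent with the numbers quoted in the thread (log end offset 1000, lag -1000) but is not stated in the excerpts.

```java
// Hypothetical helper: lag as log end offset minus the consumer's committed offset.
class OffsetLagSketch {

    static long lag(long logEndOffset, long consumerOffset) {
        return logEndOffset - consumerOffset;
    }

    public static void main(String[] args) {
        long committedOffset = 2000L;       // assumed: committed before the brokers restarted
        long logEndAfterTruncation = 1000L; // from the thread: the new leader's log end offset
        // The consumer is "ahead" of the log, so the lag is negative.
        System.out.println("lag = " + lag(logEndAfterTruncation, committedOffset));
    }
}
```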

Re: Kafka 0.8.2 log cleaner

2015-03-02 Thread Mayuresh Gharat
This would be a good feature to add to log Cleaner. Thanks, Mayuresh On Mon, Mar 2, 2015 at 8:57 AM, Ivan Balashov wrote: > Svante, > > Not sure if I understand your suggestion correctly, but I do think > that enabling retention for deleted values would make a useful > addition to the "compact

Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-02 Thread Solon Gordon
+1

Re: Kafka 0.8.2 log cleaner

2015-03-02 Thread James Cheng
Ivan, I think log.cleaner.delete.retention.ms does just that? "The amount of time to retain delete tombstone markers for log compacted topics. This setting also gives a bound on the time in which a consumer must complete a read if they begin from offset 0 to ensure that they get a valid snapsh
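The effect of `log.cleaner.delete.retention.ms` described here can be illustrated with a toy model of compaction: keep only the latest record per key, and drop a tombstone (null value) once it is older than the retention window. This is a sketch under stated assumptions, not the broker's log cleaner: `CompactionSketch`, `Record`, and the per-record timestamps are hypothetical, and the real cleaner works on log segments rather than individual timestamps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of compaction with tombstone retention; not the broker's log cleaner.
class CompactionSketch {

    static final class Record {
        final String key;
        final String value;     // null marks a delete tombstone
        final long timestampMs; // hypothetical per-record timestamp
        Record(String key, String value, long timestampMs) {
            this.key = key;
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    static List<Record> compact(List<Record> log, long nowMs, long deleteRetentionMs) {
        // Keep only the most recent record for each key.
        Map<String, Record> latest = new LinkedHashMap<String, Record>();
        for (Record r : log) {
            latest.remove(r.key);
            latest.put(r.key, r);
        }
        // Drop tombstones that have outlived the retention window.
        List<Record> out = new ArrayList<Record>();
        for (Record r : latest.values()) {
            boolean expiredTombstone = r.value == null && nowMs - r.timestampMs > deleteRetentionMs;
            if (!expiredTombstone) {
                out.add(r);
            }
        }
        return out;
    }

    // Small sample log: k1 is written, then deleted at t=1000; k2 stays live.
    static List<Record> sampleLog() {
        List<Record> log = new ArrayList<Record>();
        log.add(new Record("k1", "v1", 0L));
        log.add(new Record("k2", "v2", 0L));
        log.add(new Record("k1", null, 1000L));
        return log;
    }

    public static void main(String[] args) {
        System.out.println("inside window:  " + compact(sampleLog(), 2000L, 5000L).size() + " records");
        System.out.println("after window:   " + compact(sampleLog(), 10000L, 5000L).size() + " records");
    }
}
```

Inside the window the tombstone for `k1` is still visible to consumers; after the window the key disappears from the log entirely, which is the behavior Ivan was asking for.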

Re: Kafka 0.8.2 log cleaner

2015-03-02 Thread Ivan Balashov
James, Indeed, does exactly what is needed. Thanks for noticing! 2015-03-02 22:34 GMT+03:00 James Cheng : > Ivan, > > I think log.cleaner.delete.retention.ms does just that? > > "The amount of time to retain delete tombstone markers for log compacted > topics. This setting also gives a bound o

Re: NetworkProcessorAvgIdlePercent

2015-03-02 Thread Zakee
Thanks, I have added them for monitoring. -Zakee > On Feb 27, 2015, at 9:21 AM, Jun Rao wrote: > > Zakee, > > It would be useful to get the following. > > kafka.network:name=RequestQueueSize,type=RequestChannel > kafka.network:name=RequestQueueTimeMs,request=Fetch,type=RequestMetrics > kaf

Re: Topicmetadata response miss some partitions information sometimes

2015-03-02 Thread Guozhang Wang
That is a valid point, today the returned metadata response already contains partitions even with error code, so we can expose that in the Cluster / KafkaProducer class. Could you file a JIRA? Guozhang On Sun, Mar 1, 2015 at 7:11 PM, Evan Huus wrote: > Which I think is my point - based on my cu

Re: Topicmetadata response miss some partitions information sometimes

2015-03-02 Thread Evan Huus
Filed as https://issues.apache.org/jira/browse/KAFKA-1998 Evan On Mon, Mar 2, 2015 at 5:19 PM, Guozhang Wang wrote: > That is a valid point, today the returned metadata response already > contains partitions even with error code, so we can expose that in the > Cluster / KafkaProducer class. Cou

kafka producer does not distribute messages to partitions evenly?

2015-03-02 Thread Yang
we have 10 partitions for a topic, and omit the explicit partition param in the message creation: KeyedMessage data = new KeyedMessage (mytopic, myMessageContent); // partition key need to be polished producer.send(data); but on average 3--5 of the partitions are empty. what went wrong?

Re: kafka producer does not distribute messages to partitions evenly?

2015-03-02 Thread Mayuresh Gharat
Probably your keys are getting hashed to only those partitions. I don't think anything is wrong here. You can check how the default hashPartitioner is used in the code and try to do the same for your keys before you send them and check which partitions are those going to. The default hashpartition
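The check suggested in this reply (hash your keys the way the partitioner would and see where they land) can be sketched as follows. This is an approximation under stated assumptions: `HashPartitionSketch` is a hypothetical class, and taking a non-negative `hashCode()` modulo the partition count mirrors the idea of a hash partitioner without reproducing Kafka's exact code.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a hash partitioner: non-negative hash modulo partition count.
class HashPartitionSketch {

    static int partitionFor(Object key, int numPartitions) {
        // Mask the sign bit so the result is never negative.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // Returns the distinct partitions that a set of keys maps to.
    static Set<Integer> partitionsHit(int numPartitions, String... keys) {
        Set<Integer> hit = new TreeSet<Integer>();
        for (String key : keys) {
            hit.add(partitionFor(key, numPartitions));
        }
        return hit;
    }

    public static void main(String[] args) {
        // Hypothetical keys; with few distinct keys, several partitions stay empty.
        Set<Integer> hit = partitionsHit(10, "user-1", "user-2", "user-3", "user-1");
        System.out.println("partitions receiving data: " + hit);
    }
}
```

A topic can never receive data on more partitions than there are distinct keys, so a small key space leaves partitions empty even when the partitioner is working as designed.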

Camus Issue about Output File EOF Issue

2015-03-02 Thread Bhavesh Mistry
Hi Kafka User Team, I have been encountering two issues with the Camus Kafka ETL job: 1) End Of File (unclosed files) 2) Not SequenceFile Error The details of the issues can be found at https://groups.google.com/forum/#!topic/camus_etl/RHS3ASy7Eqc. If you guys have faced similar issue, please let me kn

Re: Camus Issue about Output File EOF Issue

2015-03-02 Thread Gwen Shapira
Do you have the command you used to run Camus? and the config files? Also, I noticed your file is on maprfs - you may want to check with your vendor... I doubt Camus was extensively tested on that particular FS. On Mon, Mar 2, 2015 at 3:59 PM, Bhavesh Mistry wrote: > Hi Kafka User Team, > > I ha

Re: kafka producer does not distribute messages to partitions evenly?

2015-03-02 Thread Yang
thanks. just checked code below. in the code below, the line that calls Random.nextInt() seems to be called only *a few times* , and all the rest of the cases getPartition() is called, the cached sendPartitionPerTopicCache.get(topic) seems to be called, so apparently you won't get an even partition

Re: kafka producer does not distribute messages to partitions evenly?

2015-03-02 Thread Christian Csar
I believe you are seeing the behavior where the random partitioner is sticky. http://mail-archives.apache.org/mod_mbox/kafka-users/201309.mbox/%3ccahwhrrxax5ynimqnacsk7jcggnhjc340y4qbqoqcismm43u...@mail.gmail.com%3E has details. So with the default 10 minute refresh if your test is only an hour or
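The sticky behavior described here can be simulated without a broker. The sketch below assumes, as the thread describes, that a producer sending messages without a key picks a random partition and keeps using it until the next metadata refresh. `StickyPartitionSketch` and its parameters are hypothetical names for illustration.

```java
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Simulation of a "sticky" random partitioner: one random pick per refresh interval.
class StickyPartitionSketch {

    static Set<Integer> partitionsUsed(int numPartitions, long runMs, long refreshMs,
                                       long sendEveryMs, Random rnd) {
        Set<Integer> used = new TreeSet<Integer>();
        int cached = -1;   // partition currently cached for the topic
        long lastPick = 0; // simulated time of the last random pick
        for (long now = 0; now < runMs; now += sendEveryMs) {
            if (cached < 0 || now - lastPick >= refreshMs) {
                cached = rnd.nextInt(numPartitions); // re-pick only on refresh
                lastPick = now;
            }
            used.add(cached);
        }
        return used;
    }

    public static void main(String[] args) {
        long oneHourMs = 60L * 60L * 1000L;
        long tenMinutesMs = 10L * 60L * 1000L; // the default refresh mentioned in the thread
        Set<Integer> used = partitionsUsed(10, oneHourMs, tenMinutesMs, 1000L, new Random());
        System.out.println("partitions that received data: " + used);
    }
}
```

In a one-hour run with a ten-minute refresh, a single producer makes only six picks, so at most six of the ten partitions can receive data and at least four stay empty, which is in line with the 3 to 5 empty partitions reported in the original question.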

Re: [VOTE] 0.8.2.1 Candidate 2

2015-03-02 Thread Jun Rao
+1 from me. Verified quickstart and unit tests. Thanks, Jun On Thu, Feb 26, 2015 at 2:59 PM, Jun Rao wrote: > This is the second candidate for release of Apache Kafka 0.8.2.1. This > fixes 4 critical issue in 0.8.2.0. > > Release Notes for the 0.8.2.1 release > > https://people.apache.org/~jun

Fetch Purgatory Request Size

2015-03-02 Thread Zakee
Looking for ideas from those who have been using kafka for some time. Should I be concerned about the fetch purgatory size increasing to high numbers and consistently remaining there while the producing data rates vary between 210k to 270k per sec (60M to 82M per sec)? There are 5 brokers with

Re: moving replications

2015-03-02 Thread Gwen Shapira
I think the ReassignPartitionsTool does what you need, at least partially. It will move partitions of given topics to a new set of brokers - this includes replicas and leaders from what I can tell. Here's the documentation of one of the options: topics-to-move-json-file: Generate a reassignment

Re: Camus Issue about Output File EOF Issue

2015-03-02 Thread Bhavesh Mistry
Hi Gwen, We are using the MapR (sorry, no Cloudera) distribution. I suspect it is a code issue. I am in the process of reviewing the code of the MultiOutputFormat class. https://github.com/linkedin/camus/blob/master/camus-etl-kafka/src/main/java/com/linkedin/camus/etl/kafka/mapred/EtlMultiOutputFormat.j

Re: Camus Issue about Output File EOF Issue

2015-03-02 Thread Gwen Shapira
Actually, the error you sent shows that it's trying to read a TEXT file as if it was Seq. That's why I suspected a misconfiguration of some sort. Why do you suspect a race condition? On Mon, Mar 2, 2015 at 5:19 PM, Bhavesh Mistry wrote: > Hi Gwen, > > We are using MapR (Sorry no Cloudera) distribu

Re: Camus Issue about Output File EOF Issue

2015-03-02 Thread Bhavesh Mistry
I suspect Camus job has issue because other process ( another separate Map/Reduce Job) also write to same "time" (folders) bucket and it does not have this issue at all (so far) when reading from other dependent Hive job. This dependent Hive job only have issue with files created via camus job (

Re: kafka producer does not distribute messages to partitions evenly?

2015-03-02 Thread Yang
Thanks. This is indeed the reason. On Mar 2, 2015 4:38 PM, "Christian Csar" wrote: > I believe you are seeing the behavior where the random partitioner is > sticky. > > http://mail-archives.apache.org/mod_mbox/kafka-users/201309.mbox/%3ccahwhrrxax5ynimqnacsk7jcggnhjc340y4qbqoqcismm43u...@mail.gma

Re: Got negative offset lag after restarting brokers

2015-03-02 Thread tao xiao
Since I reused the same consumer group to consume the messages after step 6, no data loss occurred. But if I create a new consumer group, the new consumer will surely suffer data loss. I am more concerned about whether this is an acceptable behavior by Kafka that an out of sync broker c

Re: kafka producer does not distribute messages to partitions evenly?

2015-03-02 Thread Jay Kreps
FWIW, this intensely confusing behavior is fixed in the new producer which should give the expected result by default. -Jay On Mon, Mar 2, 2015 at 6:36 PM, Yang wrote: > Thanks. This is indeed the reason. > On Mar 2, 2015 4:38 PM, "Christian Csar" wrote: > > > I believe you are seeing the beha

Re: Got negative offset lag after restarting brokers

2015-03-02 Thread Jiangjie Qin
The scenario you mentioned is equivalent to an unclean leader election. The following settings will make sure there is no data loss: 1. Set the replication factor to 3 and the minimum ISR size to 2. 2. When producing, use acks=-1 or acks=all. 3. Disable unclean leader election. 1) and 2) guarantee committed mes
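The three settings listed in this reply can be written down as plain `java.util.Properties`. This is a sketch, not a complete configuration: the class `NoDataLossConfigSketch` is hypothetical, `unclean.leader.election.enable` is the name given later in this thread, and `min.insync.replicas` and the producer key `acks` are assumed names for "minimum ISR size" and the ack setting that should be checked against the documentation for your Kafka version.

```java
import java.util.Properties;

// Sketch of the three settings from the thread, expressed as plain Properties.
class NoDataLossConfigSketch {

    // Broker-side (or per-topic) settings. The replication factor of 3 is chosen
    // when the topic is created, so it does not appear here.
    static Properties brokerOrTopicSettings() {
        Properties p = new Properties();
        p.setProperty("min.insync.replicas", "2");                // assumed name for "minimum ISR size"
        p.setProperty("unclean.leader.election.enable", "false"); // name given later in the thread
        return p;
    }

    // Producer-side setting: wait for all in-sync replicas to acknowledge.
    static Properties producerSettings() {
        Properties p = new Properties();
        p.setProperty("acks", "all"); // the thread lists acks=-1 or acks=all
        return p;
    }

    public static void main(String[] args) {
        System.out.println("broker/topic: " + brokerOrTopicSettings());
        System.out.println("producer:     " + producerSettings());
    }
}
```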

Re: Got negative offset lag after restarting brokers

2015-03-02 Thread tao xiao
How do I achieve point 3? is there a config that I can set? On Tue, Mar 3, 2015 at 1:02 PM, Jiangjie Qin wrote: > The scenario you mentioned is equivalent to an unclean leader election. > The following settings will make sure there is no data loss: > 1. Set replica factor to 3 and minimum ISR si

Re: .deleted file descriptors

2015-03-02 Thread Guangle Fan
I see DEL file descriptors as well. java2978 root DELREG 202,16 1074192884 /mnt/data1/kafka/flirt_tasks-1/000143874814.index.deleted Using java from Oracle java version "1.7.0_72" Java(TM) SE Runtime Environment (build 1.7.0_72-b14) Java HotSpot(TM) 64-Bit

Re: Using 0.8.2 jars in consumer with producer of version 0.8.1.1

2015-03-02 Thread Jianshi Huang
0.8.1.1 On Tue, Mar 3, 2015 at 2:07 AM, Jiangjie Qin wrote: > Which server version are you running? > > On 3/2/15, 2:05 AM, "Jianshi Huang" wrote: > > >Hi, > > > >I'd like to use Scala 2.11 with Kafka, which is only supported from 0.8.2. > > > >Can I use 0.8.2 jars for my consumer with producer

OffsetCheckpoint.write()

2015-03-02 Thread Xiao
Hi, all, I just started reading the source codes of Kafka. The current OffsetCheckpoint.write() does not look good to me. After the file rename, it still needs to do a fsync. In addition, it should maintain a checksum for each check point. The checksum corruption needs to be checked during t

Re: moving replications

2015-03-02 Thread sunil kalva
Why can't Kafka automatically rebalance partitions onto a new broker and adjust the existing brokers? Why should we run it manually? On Tue, Mar 3, 2015 at 6:41 AM, Gwen Shapira wrote: > I think the ReassignPartitionsTool does what you need, at least partially. > > It will move partitions of give

Re: OffsetCheckpoint.write()

2015-03-02 Thread Xiao
Hi, all, In my previous note, the two check points per partition have to be stored in different files. Otherwise, the files could be corrupted. Thanks, Xiao Li On Mar 2, 2015, at 10:25 PM, Xiao wrote: > Hi, all, > > I just started reading the source codes of Kafka. The current > Offse

Re: Got negative offset lag after restarting brokers

2015-03-02 Thread Gwen Shapira
of course :) unclean.leader.election.enable On Mon, Mar 2, 2015 at 9:10 PM, tao xiao wrote: > How do I achieve point 3? is there a config that I can set? > > On Tue, Mar 3, 2015 at 1:02 PM, Jiangjie Qin > wrote: > >> The scenario you mentioned is equivalent to an unclean leader election. >> The

How to easily get all broker of one topic?

2015-03-02 Thread ChenHongHai
If there are 1000 brokers, and one topic has only 3 partitions and 2 replicas, how can I easily get all of its brokers? If I read them directly from ZooKeeper at /brokers/topics/theTopic/partition, some partition leaders may be offline at that time and I can't get the full list of all leaders and replicas. If t

Re: Kafka broker failed to work when migrating the broker to another machine

2015-03-02 Thread YuanJia Li
Zhongjing, can you show some of the server.log, e.g. from broker 4 or the leader of topic "log_error"? From: 俞忠静 Sent: 2015-03-02 18:37 To: users@kafka.apache.org Cc: 吴红芳; 邵宏亮; 方超超 Subject: Kafka broker failed to work when migrating the broker to another machine Hi, I set up a Kafka cluster with 5 brokers, with