Re: [KafkaStreams 1.1.1] partition assignment broken?

2018-10-09 Thread Bart Vercammen
Hi Bill, Thanks for the reply. We had a look at the patch for KAFKA-7144 and will try it out on Kafka 1.1.1. Currently a full upgrade to 2.0.x is not yet an option. In the meantime I have some unit tests that reproduce this problem, so the backport to v1.1.1 can easily be verified. Greets, Bart

Disk-size aware partitioning

2018-10-09 Thread Vincent Bernardi
Hello everyone, I couldn't find an answer to this on the web, so: if I have a Kafka cluster where nodes have different disk sizes, is there a way to have automatic partitioning that is aware of these disk sizes (i.e. allocating 4 partitions to the 4TB node and 1 partition to the 1TB node)? I know I can pa
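
The arithmetic behind the question (4 partitions to the 4TB node, 1 partition to the 1TB node) can be sketched as a proportional split. This is only an illustration of the idea and does not call any Kafka API; the function name `allocate_partitions` and the broker labels are hypothetical, and leftover partitions are handed out by largest fractional remainder, which is an assumption rather than anything stated in the thread.

```python
def allocate_partitions(disk_tb, total_partitions):
    """Split total_partitions across brokers in proportion to disk size.

    disk_tb maps a (hypothetical) broker label to its disk size in TB.
    """
    total_disk = sum(disk_tb.values())
    # Ideal fractional share of partitions for each broker.
    shares = {b: total_partitions * size / total_disk
              for b, size in disk_tb.items()}
    # Start from the whole-number part of each share.
    counts = {b: int(share) for b, share in shares.items()}
    # Give the remaining partitions to the largest fractional remainders.
    leftover = total_partitions - sum(counts.values())
    by_remainder = sorted(shares, key=lambda b: shares[b] - counts[b],
                          reverse=True)
    for b in by_remainder[:leftover]:
        counts[b] += 1
    return counts


# The example from the question: a 4TB node and a 1TB node, 5 partitions.
print(allocate_partitions({"broker-4tb": 4, "broker-1tb": 1}, 5))
```

The resulting counts would still have to be turned into an actual replica assignment by whatever tooling manages the cluster.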

Why is segment.ms=10m for repartition topics in KafkaStreams?

2018-10-09 Thread Niklas Lönn
Hi, Recently we experienced a problem when resetting a streams application that does quite a lot of operations based on 2 compacted source topics with 20 partitions. We crashed the entire broker cluster with a TooManyOpenFiles exception (we already have a multi-million limit). When inspecting the interna

Re: [KafkaStreams 1.1.1] partition assignment broken?

2018-10-09 Thread Bill Bejeck
Hi Bart, Sounds good. Let me know how it goes. -Bill On Tue, Oct 9, 2018 at 5:08 AM Bart Vercammen wrote: > Hi Bill, > > Thanks for the reply. > We had a look at the patch for KAFKA-7144 and will try it out on Kafka > 1.1.1 > Currently a full upstep to 2.0.x is not yet an option. > > In the m

Re: Why is segment.ms=10m for repartition topics in KafkaStreams?

2018-10-09 Thread Guozhang Wang
Hi Niklas, The default value of segment.ms is set to 10 min as part of this project (introduced in Kafka 1.1.0): https://jira.apache.org/jira/browse/KAFKA-6150 https://cwiki.apache.org/confluence/display/KAFKA/KIP-204+%3A+Adding+records+deletion+operation+to+the+new+Admin+Client+API In KIP-204 (KAFK

Re: Disk-size aware partitioning

2018-10-09 Thread Brett Rann
LinkedIn's cruise-control https://github.com/linkedin/cruise-control has numerous goals, including disk, network, CPU, rack awareness, leadership distribution, etc. You can have separate disk/network limits per broker (ours are all the same, fwiw). We use it and it does a stellar job of keeping a