Re: kafka user group in los angeles

2015-04-24 Thread Jon Bringhurst
Hey Alex, It looks like this group might be appropriate to have a Kafka talk at: http://www.meetup.com/Los-Angeles-Big-Data-Users-Group/ It might be worth showing up at one of their events and asking around. -Jon On Thu, Apr 23, 2015 at 11:40 AM, Alex Toth wrote: > Hi, > Sorry this isn't dire

Re: Post on running Kafka at LinkedIn

2015-03-20 Thread Jon Bringhurst
Keep in mind that these brokers aren't really stressed too much at any given time -- we need to stay ahead of the capacity curve. Your message throughput will really just depend on what hardware you're using. However, in the past, we've benchmarked at 400,000 to more than 800,000 messages / bro

Re: kafka mirrormaker cross datacenter replication

2015-03-20 Thread Jon Bringhurst
Hey Kane, When mirrormakers lose offsets on catastrophic failure, you generally have two options. You can keep auto.offset.reset set to "latest" and handle the loss of messages, or you can have it set to "earliest" and handle the duplication of messages. Although we try to avoid duplicate mes
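
A toy model may help make the trade-off between the two settings concrete. This is only a sketch, not Kafka or MirrorMaker code: the function `resume_position` and its parameters are hypothetical, the policy strings simply follow the "latest"/"earliest" wording used in the message, and it assumes log_start <= next_unmirrored <= log_end.

```python
def resume_position(policy, log_start, log_end, next_unmirrored):
    """Toy model of a consumer restarting after its committed offset is lost.

    log_start:       first offset still available in the source partition
    log_end:         offset the next produced message will receive
    next_unmirrored: first offset that had NOT been mirrored before the
                     failure (the consumer no longer knows this value)

    Returns (start_offset, lost_count, duplicated_count).
    """
    if policy == "latest":
        # Skip to the end of the log: anything not yet mirrored is lost.
        start = log_end
    elif policy == "earliest":
        # Rewind to the beginning: anything already mirrored is sent again.
        start = log_start
    else:
        raise ValueError("unknown policy: %r" % (policy,))

    lost = max(0, start - next_unmirrored)
    duplicated = max(0, next_unmirrored - start)
    return start, lost, duplicated


if __name__ == "__main__":
    for policy in ("latest", "earliest"):
        print(policy, resume_position(policy, 100, 500, 450))
```

In this model, "latest" drops every message between the last mirrored position and the end of the log, while "earliest" re-sends everything that was already mirrored and still sits in the source log.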

Re: Support for Java 1.8?

2015-03-17 Thread Jon Bringhurst
At LinkedIn, we're running on 1.8.0u5. YMMV depending on hardware and load, but this is what we typically run with: -server -Xms4g -Xmx4g -XX:PermSize=96m -XX:MaxPermSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTim

Re: Mirror maker end to end latency metric

2015-03-05 Thread Jon Bringhurst
Hey Tao, Slides 27-30 on http://www.slideshare.net/JonBringhurst/kafka-audit-kafka-meetup-january-27th-2015 have a diagram that shows what Guozhang is talking about. -Jon On Mar 5, 2015, at 9:03 AM, Guozhang Wang wrote: > There is no end2end latency metric in MM, since such a metric req

Re: Anyone interested in speaking at Bay Area Kafka meetup @ LinkedIn on March 24?

2015-03-02 Thread Jon Bringhurst
The meetups are recorded. For example, here's a link to the January meetup: http://www.ustream.tv/recorded/58109076 The links to the recordings are usually posted to the comments for each meetup on http://www.meetup.com/http-kafka-apache-org/ -Jon On Feb 23, 2015, at 3:24 PM, Ruslan Khafizov

Re: question about new consumer offset management in 0.8.2

2015-02-05 Thread Jon Bringhurst
There should probably be a wiki page started for this so we have the details in one place. The same question was asked on Freenode IRC a few minutes ago. :) A summary of the migration procedure is: 1) Upgrade your brokers and set dual.commit.enabled=false and offsets.storage=zookeeper (Commit o

LinkedIn Engineering Blog Post - Current and Future

2015-01-29 Thread Jon Bringhurst
Here's an overview of what LinkedIn plans to concentrate on in the upcoming year. https://engineering.linkedin.com/kafka/kafka-linkedin-%E2%80%93-current-and-future -Jon

Re: One or multiple instances of MM to aggregate kafka data to one hadoop

2015-01-29 Thread Jon Bringhurst
Hey Mingjie, Here's how we have our mirror makers configured. For some context, let me try to describe this using the example datacenter layout as described in: https://engineering.linkedin.com/samza/operating-apache-samza-scale In that example, there are four data centers (A, B, C, and D). How

Re: Production settings for JDK 7 + G1 GC

2015-01-15 Thread Jon Bringhurst
We're currently using JDK 8 update 5 with the following settings: -server -Xms4g -Xmx4g -XX:PermSize=96m -XX:MaxPermSize=96m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintTenuringDi

Re: Kafka 0.8.1.1 Leadership changes are happening very often

2015-01-05 Thread Jon Bringhurst
Several features in Zookeeper depend on server time. I would highly recommend that you properly set up ntpd (or whatever), then try to reproduce. -Jon On Jan 2, 2015, at 2:35 PM, Birla, Lokesh wrote: > We don't see zookeeper expiration. However I noticed that our servers > system time is NOT sy

Re: keyed-messages & de-duplication

2014-05-14 Thread Jon Bringhurst
It looks like the log.cleanup.policy config option was changed from "dedupe" to "compact". https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/log/LogConfig.scala#L68 -Jon On May 13, 2014, at 1:08 PM, Jay Kreps wrote: > Hi, > > The compaction is done to clean-up space. It
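
For readers unfamiliar with what the "compact" policy means for keyed messages, here is a simplified sketch of the idea: only the most recent record for each key is retained. This is an illustration, not how the broker implements it; the function `compact` and the (offset, key, value) tuple layout are hypothetical, and details of the real log cleaner such as delete markers and segment handling are deliberately left out.

```python
def compact(log):
    """Return the log with only the newest record kept for each key.

    log: list of (offset, key, value) tuples in increasing offset order.
    Surviving records keep their original offsets and relative order.
    """
    # First pass: remember the highest offset seen for every key.
    latest_offset = {}
    for offset, key, _value in log:
        latest_offset[key] = offset

    # Second pass: keep a record only if it is the newest one for its key.
    return [rec for rec in log if latest_offset[rec[1]] == rec[0]]


if __name__ == "__main__":
    log = [(0, "a", "1"), (1, "b", "2"), (2, "a", "3")]
    print(compact(log))
```

A consumer that reads the compacted log from the start still sees the latest value for every key, which is why older records for the same key can be discarded to reclaim space.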