Re: log.retention.size

2014-05-23 Thread András Serény
Hi Kafka users, this feature would also be very useful for us. With lots of topics of different volumes (and as they grow in number), it could become tedious to maintain topic-level settings. As a start, I think uniform reduction is a good idea. Logs wouldn't be retained as long as you want,

Adding partitions...

2014-05-23 Thread Niclas Hedhman
Hi, we are trying to figure out how to lay out our topics and partitions. One thing I can't find in the documentation is: what happens to data that is sitting in the 'old partitions' when I add a new one? My gut feeling says that everything remains in the partitions as they are, and if we nee

Re: Adding partitions...

2014-05-23 Thread svante karlsson
No reshuffling will take place. Reading messages and putting them back in again will not remove them from their "old" partition, so the same message will then exist in more than one partition, eventually to be aged out of the oldest partition. If you use partitioning to distribute the load
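
A minimal sketch of the point above, assuming a hash-based partitioner of the form hash(key) mod partition-count. The function partition_for and the use of CRC32 are hypothetical stand-ins for illustration, not Kafka's actual partitioner. It suggests why, after a partition is added, newly produced keyed messages can map to a different partition than older messages with the same key, while the older messages stay where they were written.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Hypothetical hash-mod partitioner (CRC32 is an assumption, not Kafka's hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Example keys, invented for illustration.
keys = [f"user-{i}" for i in range(1000)]

# Partition chosen for each key with 4 partitions, then with 5.
before = {k: partition_for(k, 4) for k in keys}
after = {k: partition_for(k, 5) for k in keys}

# Keys whose target partition changes once the fifth partition exists.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys now map to a different partition")
```

Messages already written are not touched by this change; only the target of future writes differs, which matters if consumers rely on all messages for a key being in one partition.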

Re: Adding partitions...

2014-05-23 Thread Niclas Hedhman
Right... We don't yet know if there is going to be "meaning" to the partition (such as content affinity of consumers has been discussed), I am simply setting the stage of how things really work. Thanks Niclas On Fri, May 23, 2014 at 2:41 PM, svante karlsson wrote: > No reshuffling will take

Re: log.retention.size

2014-05-23 Thread Jun Rao
Yes, that's possible. There is a default log.retention.bytes for every topic. By introducing a global threshold, we may have to delete data from logs whose size is smaller than log.retention.bytes. So, are you saying that the global threshold has precedence? Thanks, Jun On Fri, May 23, 2014 at

kafka-storm-starter released: code examples that integrate Kafka 0.8 and Storm 0.9

2014-05-23 Thread Michael G. Noll
Hi everyone, to sweeten the upcoming long weekend I have released code examples that show how to integrate Kafka 0.8+ with Storm 0.9+, while using Apache Avro as the data serialization format. https://github.com/miguno/kafka-storm-starter Since the integration of the latest Kafka and Storm v

Re: kafka-storm-starter released: code examples that integrate Kafka 0.8 and Storm 0.9

2014-05-23 Thread Neha Narkhede
Cool. Thanks for sharing. I added it to our ecosystem wiki. -Neha On Fri, May 23, 2014 at 8:02 AM, Michael G. Noll < michael+ka...@michael-noll.com> wrote: > Hi everyone, > > to sweeten the upcoming long weekend I have released code examples that > show how to integrate Kafka 0.8+ with Storm 0.

kafka-producer-perf-test.sh help please

2014-05-23 Thread Chris Neal
Hi everyone. Looking for some help running the Kafka performance testing shell script. First got the NoClassDefFound error, and then built from src from these instructions: wget http://archive.apache.org/dist/kafka/0.8.1.1/kafka-0.8.1.1-src.tgz tar -xvf kafka-0.8.1.1-src.tgz cd kafka-0.8.1.1-src

Re: kafka-producer-perf-test.sh help please

2014-05-23 Thread Joel Koshy
The producer performance class is not included in the binary release. You can build it from the source release (as you attempted) - although you need to use gradle. Just run "./gradlew jar" - you can see the README.md file for more information. Joel On Fri, May 23, 2014 at 02:17:21PM -0500, Chris

Topic Partitioning Strategy For Large Data

2014-05-23 Thread Bhavesh Mistry
Hi Kafka Users, We are trying to transport 4 TB of data per day on a single topic. The data is operational application logs. How do we estimate the number of partitions and choose a partitioning strategy? Our goal is to drain (from consumer side) from the Kafka Brokers as soon as messages arrive (keep the lag as min

Re: Topic Partitioning Strategy For Large Data

2014-05-23 Thread Joel Koshy
Take a look at: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIchoosethenumberofpartitionsforatopic? On Fri, May 23, 2014 at 12:49:39PM -0700, Bhavesh Mistry wrote: > Hi Kafka Users, > > > > We are trying to transport 4TB data per day on single topic. It is > operation applica

Re: kafka-producer-perf-test.sh help please

2014-05-23 Thread Chris Neal
Hi Joel, Thanks for the prompt reply. gradlew jar and gradlew perf:jar seem to have gotten me what I needed: Again, thanks for your time. Chris On Fri, May 23, 2014 at 2:46 PM, Joel Koshy wrote: > > The producer performance class is not included in the binary release. > You can build it from

Re: stream logs from remote servers.

2014-05-23 Thread Otis Gospodnetic
Hi, Try: https://github.com/joekiller/logstash-kafka https://github.com/rngadam/sendkafka https://issues.apache.org/jira/browse/FLUME-2242 https://github.com/baniuyao/flume-kafka Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/