Re: partitioning

2013-01-10 Thread Jun Rao
Our current partitioning strategy is to mod the key by the number of partitions, not the number of brokers. To balance partitions better across brokers, one simple strategy is to over-partition, i.e., to have a few times more partitions than brokers. That way, if one adds more brokers over time, you can just move some
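
The modulo scheme and the over-partitioning idea in this preview can be sketched as follows. This is only an illustration under stated assumptions: the hash function (CRC32) and the names partition_for, assign_partitions and add_broker are hypothetical and are not taken from Kafka itself.

```python
import zlib
from collections import Counter


def partition_for(key: bytes, num_partitions: int) -> int:
    # Assumption: CRC32 stands in for whatever hash the producer really uses.
    # The key is reduced modulo the number of partitions, not brokers.
    return zlib.crc32(key) % num_partitions


def assign_partitions(num_partitions: int, brokers: list) -> dict:
    # Spread partitions over brokers round-robin (hypothetical placement).
    return {p: brokers[p % len(brokers)] for p in range(num_partitions)}


def add_broker(assignment: dict, new_broker: str) -> dict:
    # Move just enough partitions to the new broker to even out the load.
    # The partition count is unchanged, so key -> partition is unaffected.
    result = dict(assignment)
    broker_count = len(set(result.values())) + 1
    target = len(result) // broker_count
    counts = Counter(result.values())
    moved = 0
    for p in sorted(result):
        if moved >= target:
            break
        owner = result[p]
        if counts[owner] > target:
            result[p] = new_broker
            counts[owner] -= 1
            moved += 1
    return result


# 12 partitions over 3 brokers: over-partitioned by a factor of 4.
before = assign_partitions(12, ["b0", "b1", "b2"])
after = add_broker(before, "b3")
```

Because the number of partitions stays fixed, a given key keeps mapping to the same partition after the fourth broker joins; only the placement of a few partitions changes.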

Re: Access logs aggregation

2013-01-10 Thread Neha Narkhede
Ron, The best way of doing this would be to use the ConsoleProducer. Basically, it reads data from the console and parses it using the message "reader" which by default is the LineReader. In this case, you can either write your own SquidMessageReader that understands the Squid access format [1] an

Re: Kafka open source projects

2013-01-10 Thread Olivier Pomel
While we're on the topic of community - what's the right place / way to advertise Kafka-related jobs? The standard etiquette I've seen in other lists was limiting to clearly labeled, first-party, specific and relevant postings, but since I haven't seen any guidelines here, I thought I'd ask. I do

MapDB

2013-01-10 Thread Jan Kotek
Hi, I am the author of MapDB, a database engine which provides Maps, Sets, Queues and other collections backed by disk (or in-memory) storage. MapDB is probably the fastest Java DB engine; it can do 2 million inserts per second, etc... I read your design paper and it seems we have a common goal (high p

partitioning

2013-01-10 Thread Stan Rosenberg
Hi, I apologize if this question has been addressed before. We are currently evaluating Kafka for our high-volume data ingestion infrastructure. I would like to understand why consistent hashing was not implemented, given its inherent ability to dynamically balance the load across brokers. The cur

Re: Access logs aggregation

2013-01-10 Thread Jun Rao
The following wiki describes the operational part of Kafka. https://cwiki.apache.org/confluence/display/KAFKA/Operations To get your log into Kafka, if this is log4j data, you may consider adding a KafkaLog4jAppender. Otherwise, you can probably use ConsoleProducer. You will still need to deal with t

Re: Kafka open source projects

2013-01-10 Thread Evan Chan
I added a comment to the Powered By wiki. By the way, the captcha is really annoying. On Wed, Jan 9, 2013 at 9:26 PM, Jay Kreps wrote: > I have been noticing a lot of cool Kafka integrations floating around. I > took some time and went through github and emails and tried to update the > some