While trying to better understand compression, I came across the following:
http://geekmantra.wordpress.com/2013/03/28/compression-in-kafka-gzip-or-snappy/
“In Kafka 0.8, messages for a partition are served by the leader broker. The
leader assigns these unique logical offsets to every message it appends …”
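For anyone trying this out, enabling compression in the 0.8-era producer is a single property. A minimal sketch against the 0.8 Java producer API (the broker address and topic name are placeholders):

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class CompressedProducerDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092");   // placeholder broker
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("compression.codec", "snappy");            // or "gzip"
        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("my-topic", "hello"));
        producer.close();
    }
}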
Thanks for the updated deck; I had not seen that one yet. I noticed in
the preso that you are running RAID10 in prod. Any thoughts on going JBOD? In
our testing we saw significant performance improvements. This of course
comes with the trade-off of manual steps if a broker fails.
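For anyone curious, the broker side of a JBOD layout is mostly a matter of listing one log directory per physical disk; a sketch with hypothetical mount points:

# server.properties -- one log directory per disk instead of one RAID volume
log.dirs=/data1/kafka-logs,/data2/kafka-logs,/data3/kafka-logs,/data4/kafka-logs

The manual steps I mentioned come from the fact that when a disk dies you have to replace it and let the broker re-replicate, rather than letting a RAID controller rebuild underneath a running broker.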
Bert
On Monday, July 7 …
>
> Daniel.
>
> > On 1/07/2014, at 2:07 am, Bert Corderman wrote:
> >
> > Daniel,
> >
> >
> >
Daniel,
We have the same question. We noticed that the compression tests we ran
using the built-in performance tester were not realistic. I think the on-disk
compression was 200:1 (yes, that is two hundred to one). I had planned to
try to edit the producer performance tester source and do the following …
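To illustrate why a synthetic payload can show a ratio like that, here is a standalone sketch (plain Java, not the Kafka perf tester itself) comparing gzip on a highly repetitive message versus random bytes:

import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

public class CompressionRatioDemo {
    static int gzipSize(byte[] data) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.size();
    }

    public static void main(String[] args) throws Exception {
        byte[] repetitive = new byte[100_000];
        Arrays.fill(repetitive, (byte) 'A');   // like a perf tester sending one repeated byte
        byte[] random = new byte[100_000];
        new Random(42).nextBytes(random);      // closer to binary or pre-compressed data

        System.out.printf("repetitive: %d -> %d bytes%n",
                repetitive.length, gzipSize(repetitive));
        System.out.printf("random:     %d -> %d bytes%n",
                random.length, gzipSize(random));
    }
}

The repeated payload collapses to a tiny fraction of its size, which is how a benchmark can report hundreds-to-one ratios that no real-world data would achieve.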
> … overhead is concerned, have you tried running Snappy?
> Snappy's performance is good enough to offset the decompression-compression
> overhead on the server.
>
> Thanks,
> Neha
>
>
> On Thu, Jun 26, 2014 at 12:42 PM, Bert Corderman
> wrote:
>
> > We are in the …
Thanks for the details, Luke.
At what point would you consider a message too big?
Are you using compression?
Bert
On Thursday, June 26, 2014, Luke Forehand <
luke.foreh...@networkedinsights.com> wrote:
> I have used a 50MB message size and it is not a great idea. First of all,
> you need to make …
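For reference, the 0.8-era settings that have to agree before large messages work end to end look roughly like this (a sketch; 52428800 = 50 MB is illustrative, not a recommendation):

# broker (server.properties)
message.max.bytes=52428800          # largest message the broker will accept
replica.fetch.max.bytes=52428800    # must be >= message.max.bytes or replication stalls

# consumer
fetch.message.max.bytes=52428800    # must be >= the largest message or the consumer gets stuck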
We are in the process of engineering a system that will be using Kafka.
The legacy system is using the local file system and a database as the
queue. In terms of scale, we process about 35 billion events per day,
contained in 15 million files.
I am looking for feedback on a design decision we are …
Only a single broker needs to be online for data to be available. In your
example, partitions 2 and 3 had copies of the data on brokers 0 and 1. When
those two brokers went down, your data was unavailable. To withstand two
brokers going offline you would want to change your replication factor to 3.
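For example, a topic that can survive two broker failures would be created like this (assuming 0.8.1's kafka-topics.sh tool; the ZooKeeper address, topic name, and partition count are placeholders):

bin/kafka-topics.sh --create --zookeeper zk1:2181 \
  --topic events --partitions 8 --replication-factor 3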
… support when using a single multi-threaded producer.
Bert
On Sun, Apr 27, 2014 at 11:09 PM, Jun Rao wrote:
> Could you run the tests on the 0.8.1.1 release?
>
> Thanks,
>
> Jun
>
>
> On Sat, Apr 26, 2014 at 8:23 PM, Bert Corderman
> wrote:
>
version 0.8.0
On Sat, Apr 26, 2014 at 12:03 AM, Jun Rao wrote:
> Bert,
>
> Thanks for sharing. Which version of Kafka were you testing?
>
>
> Jun
>
>
> On Fri, Apr 25, 2014 at 3:11 PM, Bert Corderman
> wrote:
>
I have been testing Kafka for the past week or so and figured I would share
my results so far.
I am not sure if the formatting will survive email, but here are the results
in a Google Doc … all 1,100 of them:
https://docs.google.com/spreadsheets/d/1UL-o2MiV0gHZtL4jFWNyqRTQl41LFdM0upjRIwCWNgQ/edit
I had this error before and corrected it by increasing the nofile limit.
Add an entry for the user running the broker to /etc/security/limits.conf:
kafka - nofile 98304
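A quick way to confirm the new limit took effect (run as the broker user after logging back in):

ulimit -n    # should print 98304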
On Thu, Apr 24, 2014 at 1:46 PM, Yashika Gupta
wrote:
> Jun,
>
> The detailed logs are as follows:
>
> 24.04.2014 13:37:31812 I …
> … hour. We run some
> consumers in batch and flush on a time delay. Other consumers flush per
> message processed. It's the flush per message that causes the high volume.
>
> Push back on the devs and software architecture if they want to flush per
> message. Do it where it's on …
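The batch-on-a-time-delay approach is simple to implement; a generic sketch (a hypothetical helper, not anything from the Kafka APIs) that flushes downstream either every 1,000 messages or every 5 seconds, whichever comes first:

import java.util.ArrayList;
import java.util.List;

class BatchFlusher {
    private static final int MAX_BATCH = 1000;      // illustrative thresholds
    private static final long MAX_DELAY_MS = 5000;

    private final List<String> buffer = new ArrayList<String>();
    private long lastFlush = System.currentTimeMillis();

    void onMessage(String msg) {
        buffer.add(msg);
        if (buffer.size() >= MAX_BATCH
                || System.currentTimeMillis() - lastFlush >= MAX_DELAY_MS) {
            flush();
        }
    }

    void flush() {
        // write the buffered batch to downstream storage here, then clear it;
        // one flush is amortized over many messages
        buffer.clear();
        lastFlush = System.currentTimeMillis();
    }
}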
I can't speak to the quick start; however, I found the following very
helpful when I was getting started:
http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/
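The core of that post is just running one broker process per config file; a rough sketch (broker ids, ports, and log directories must be unique per broker):

# config/server-1.properties: broker.id=1, port=9092, log.dirs=/tmp/kafka-logs-1
# config/server-2.properties: broker.id=2, port=9093, log.dirs=/tmp/kafka-logs-2
bin/kafka-server-start.sh config/server-1.properties &
bin/kafka-server-start.sh config/server-2.properties &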
On Thu, Apr 17, 2014 at 9:22 AM, Stephen Boesch wrote:
> I have tried to use kafka …
> … topic. If you replicate 3x, then you will end up with 3x the active
> partitions per broker.
>
> 1024 partitions per topic / 24 brokers =~ 43 leader partitions per broker
> per topic.
>
BERT> Thanks for the example. Good to see others are using larger
partition counts.
>
>
I am wondering what others are doing in terms of cluster separation (if
anything). For example, let's say I need 24 nodes to support a given workload.
What are the trade-offs between a single 24-node cluster and 2 x 12-node
clusters? The application I support can support separation of
data …