In production, you probably want to avoid stacking up the applications like
this. There are a number of reasons:
1) Kafka’s performance is significantly improved by other applications not
polluting the OS page cache
2) Zookeeper has specific performance requirements - among them is a
dedicated disk for its transaction log that it can write to sequentially
(see the zoo.cfg sketch after this list)
3) Mirror maker chews up a lot of CPU and memory with the decompression and
recompression of messages
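
As a minimal sketch of point 2, this is how the split looks in zoo.cfg.
The mount points here are hypothetical; the key is that dataLogDir points
at its own dedicated disk:

  # zoo.cfg - dataDir holds snapshots, dataLogDir holds the transaction log
  # /data and /txlog are hypothetical mounts; give /txlog a dedicated disk
  # so the sequential log writes never compete with other I/O
  dataDir=/data/zookeeper
  dataLogDir=/txlog/zookeeper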

The particular sizing of your systems will depend on the amount of data you
are moving around, but at the very least I would recommend that your Kafka
brokers, Zookeeper ensemble, and mirror makers be on separate systems
(stacking up the mirror makers on a common system is fine, however). The
Kafka brokers will need CPU and memory, and of course decent storage to
meet your retention and performance requirements. ZK needs a bit of memory
and a very fast disk for the transaction logs, but its CPU requirements are
pretty light. Mirror maker needs CPU and memory, but it has no real need of
disk performance at all.

When sizing the brokers, you can probably get away with 3 or 4 GB of heap
(this is based on my experience running really large clusters at LinkedIn -
even at that heap size we were good for a long time), using G1 garbage
collection. The guidelines in the Kafka documentation for this are the ones
that I have developed over the last few years here. Reserve the rest of the
memory for the OS to manage - buffers and cache are your friends.
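
To make that concrete, here is a hedged sketch using the environment
variables the stock start scripts already read (kafka-server-start.sh picks
up KAFKA_HEAP_OPTS, and kafka-run-class.sh picks up
KAFKA_JVM_PERFORMANCE_OPTS). The G1 flags follow the guidance in the Kafka
documentation; treat the exact values as a starting point, not gospel:

  # 4 GB fixed heap with the G1 collector, per the guidance above
  export KAFKA_HEAP_OPTS="-Xms4g -Xmx4g"
  export KAFKA_JVM_PERFORMANCE_OPTS="-server -XX:+UseG1GC \
      -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35"
  bin/kafka-server-start.sh config/server.properties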

-Todd


On Mon, Aug 7, 2017 at 11:06 AM, Gabriel Machado <gmachado....@gmail.com>
wrote:

> Thanks Todd, I will set swappiness to 1.
>
> These machines will be the future production cluster for our main
> datacenter. We have 2 remote datacenters.
> Kafka will buffer the logs and Elasticsearch will index them.
>
> Is it bad practice to have all of these JVMs on the same virtual machine?
> What do you recommend (number of machines, GB of RAM, CPUs...)? For the
> moment, each node has 4 vCPUs.
>
> Gabriel.
>
> 2017-08-07 15:45 GMT+02:00 Todd Palino <tpal...@gmail.com>:
>
> > To avoid swap you should set swappiness to 1, not 0. 1 is a request
> > (don't swap if avoidable) whereas 0 is a demand (processes will be
> > killed as OOM instead of swapping).
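> >
> > For example (a minimal sketch, assuming a Linux box managed with
> > sysctl):
> >
> >   sysctl -w vm.swappiness=1                     # takes effect now
> >   echo 'vm.swappiness = 1' >> /etc/sysctl.conf  # persists after reboot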
> >
> > However, I'm wondering why you are running such large heaps. Most of the
> > ZK heap is used for storage of the data in memory, and it's obvious from
> > your setup that this is a development instance. So if ZK is only being
> > used for that Kafka cluster you are testing, you can go with a smaller
> > heap.
> >
> > Also, for what reason are you running a 12 GB heap for Kafka? Even our
> > largest production clusters at LinkedIn are using a heap size of 6 GB
> > right now. You want to leave memory open for the OS to use for buffers
> > and cache in order to get better performance from consumers. You can see
> > from that output that it's trying to.
> >
> > It really looks like you're just overloading your system. In which case
> > swapping is to be expected.
> >
> > -Todd
> >
> >
> >
> > On Aug 7, 2017 8:34 AM, "Gabriel Machado" <gmachado....@gmail.com> wrote:
> >
> > Hi,
> >
> > I have a 3 nodes cluster with 18 GB RAM and 2 GB swap.
> > Each node have the following JVMs (Xms=Xmx) :
> > - Zookeeper 2GB
> > - Kafka 12 GB
> > - Kafka mirror-maker DCa 1 GB
> > - Kafka mirror-maker DCb 1 GB
> >
> > All the JVMs together consume 16 GB, which leaves 2 GB for the OS
> > (Debian Jessie, 64-bit).
> > Why do I have no free swap on these virtual machines?
> >
> > #free -m
> >              total       used       free     shared    buffers     cached
> > Mem:         18105      17940        164          0         38       6666
> > -/+ buffers/cache:      11235       6869
> > Swap:         2047       2045          2
> >
> >
> > I've read that I should avoid JVM swapping.
> > What is the best way to do that?
> > - modify the swappiness threshold
> > - unmount all swap partitions
> > - force the JVM to stay in memory with mlockall
> > (https://github.com/LucidWorks/mlockall-agent)
> > - another solution?
> >
> > Gabriel.
> >
>



-- 
*Todd Palino*
Senior Staff Engineer, Site Reliability
Data Infrastructure Streaming



linkedin.com/in/toddpalino
