Hi Otto,

We've separated our traffic for a few reasons:
1. We wanted to protect our producer bandwidth to maintain a low-latency
pipeline.
2. We expected consumers to sometimes pick up from an older offset and clog
the pipe, causing latency for other services.
3. When an out-of-sync replica comes back online, we don't want its catch-up
traffic to impact the producers / consumers of the in-sync replicas that are
feeding it data for replication.

This is our set-up (hopefully this helps you design something similar):
*Brokers:*

   - 3 NICs: Broker[XX][eth 0, 1, 2]
   - Ensure that your brokers respond over the same network interface
   that the request came in on.

*Producers:*

   - 1 NIC
   - Host file maps Broker[XX] -> Broker[XX][eth 1]

*Consumers:*

   - 1 NIC
   - Host file maps Broker[XX] -> Broker[XX][eth 2]

*Zookeeper:*

   - 1 NIC
   - No Modifications

*What this accomplishes:*

   - [Replication] Broker A to Broker B and response = Broker[A][eth 0] ->
   Broker[B][eth 0] -> Broker[A][eth 0]
   - [Publish] Producer A to Broker B and response = Producer[A][eth 0] ->
   Broker[B][eth 1] -> Producer[A][eth 0]
   - [Consume] Consumer A to Broker B and response = Consumer[A][eth 0] ->
   Broker[B][eth 2] -> Consumer[A][eth 0]
   - By default (without a host file entry) our hosts talk to each other
   over eth 0.
   - All services still talk to a normal ZooKeeper tier.
   - ZooKeeper publishes the hostname of the broker to talk to.
   - Hostname translation allows us to remap the hostname to the different
   IP associated with the different network interface.
   - This works well when your different services are different physical /
   virtual machines. You need to get fancier with packet rewriting services if
   you are hosting multiple services on the same host.
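
A minimal sketch of the host file mapping described above, in Python. The
broker names, IP addresses, and subnets are hypothetical placeholders (none
of them come from the actual set-up); the only idea taken from the
description is that producers resolve each broker hostname to its eth 1
address and consumers resolve it to its eth 2 address.

```python
# Hypothetical addresses: one IP per broker NIC.
# eth0 = default / replication, eth1 = producers, eth2 = consumers.
BROKER_IPS = {
    "broker01": {"eth0": "10.0.0.11", "eth1": "10.0.1.11", "eth2": "10.0.2.11"},
    "broker02": {"eth0": "10.0.0.12", "eth1": "10.0.1.12", "eth2": "10.0.2.12"},
}

# Which broker NIC each client role should be pointed at.
ROLE_TO_NIC = {"producer": "eth1", "consumer": "eth2"}


def hosts_entries(role):
    """Return hosts-file lines mapping each broker hostname to the NIC for `role`."""
    nic = ROLE_TO_NIC[role]
    return [f"{ips[nic]}\t{name}" for name, ips in sorted(BROKER_IPS.items())]


if __name__ == "__main__":
    for role in ROLE_TO_NIC:
        print(f"# hosts-file lines for a {role} machine")
        print("\n".join(hosts_entries(role)))
```

Brokers themselves get no such entries, so broker-to-broker replication keeps
resolving to the default (eth 0) address.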

Hope this helps!

Joris


On Wed, Mar 26, 2014 at 9:10 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hey Otto,
>
> Yeah this isn't something we've really thought about. Presumably the
> implementation would be that the server accepts connections on two
> interfaces. That is pretty easy. However the harder part is that I think
> this would require updating the metadata to advertise a different ip/host
> to other brokers versus to producers (right now there is just one for
> both). Or maybe there would be another way to do it?
>
> -Jay
>
>
> On Wed, Mar 26, 2014 at 6:44 AM, Otto Mok <otto....@acuityads.com> wrote:
>
> > Hi Jay,
> >
> > We're pushing a lot of data from the producers (n) and have many
> > consumers (3n) reading them.
> >
> > We're configured to have replication factor of 3, so replication traffic
> > is about (2n).
> >
> > Currently all traffic is on a single NIC, so that's about (6n) total.
> >
> > Having the replication traffic on different IP/NIC would reduce the
> > bandwidth usage by 33%, down to (4n).
> > Or 50% more capacity for producers to push before hitting the NIC's cap
> > (1 Gbps)
> >
> > We're not quite at the cap yet, but would like to see if we can make use
> > of the second NIC to give us more room in the primary NIC.
> >
> > Thanks.
> >
> > Otto out!
> >
> > -----Original Message-----
> > From: Jay Kreps [mailto:jay.kr...@gmail.com]
> > Sent: March-25-14 6:22 PM
> > To: users@kafka.apache.org
> > Subject: Re: Separate broker replication traffic from producer/consumer
> > traffic
> >
> > No, not at the moment. Are you seeing a problem that this would resolve?
> >
> > -Jay
> >
> >
> > On Tue, Mar 25, 2014 at 2:55 PM, Otto Mok <otto....@acuityads.com> wrote:
> >
> > > Hi all,
> > >
> > > Is there any way to configure the brokers such that producers &
> > > consumers are talking via IP1, while the brokers are replicating
> > > between themselves using IP2?
> > >
> > > I see there are broker settings for host.name and advertised.host.name,
> > > but it doesn't look like these settings do what I'm looking for.
> > >
> > > Any help or insights will be appreciated.
> > >
> > > Thanks.
> > >
> > > Otto out!
> > >
> > >
> >
>
