+1 on separating the end points for different purposes.

On 2/12/15, 5:47 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote:

>I REALLY like the idea of supporting a separate network for inter-broker
>communication (and probably Zookeeper too).
>I think it's actually a pretty typical configuration in clusters, so I'm
>surprised we didn't think of it before :)
>Servers arrive with multiple cards specifically for "admin nic" vs.
>"clients nic" vs. "storage nic".
>
>That said, I'd like to handle it in a separate patch. First because
>KAFKA-1809 is big enough already, and second because this really deserves
>its own requirement-gathering and design.
>
>Does that make sense?
>
>Gwen
>
>On Thu, Feb 12, 2015 at 12:34 PM, Todd Palino <tpal...@gmail.com> wrote:
>
>> The idea is more about isolating the intra-cluster traffic from the
>> normal clients as much as possible. There are a couple of situations
>> we've seen where this would be useful that I can think of immediately:
>>
>> 1) Normal operation - just having the intra-cluster traffic on a
>> separate network interface would allow it to not get overwhelmed by
>> something like a bootstrapping client that is saturating the network
>> interface. We see this fairly often, where the replication falls
>> behind because of heavy traffic from one application. We can always
>> adjust the network threads, but segregating the traffic is the first
>> step.
>>
>> 2) Isolation in case of an error - We have had situations, more than
>> once, where we need to rebuild a cluster after a catastrophic problem
>> and the clients are causing that process to take too long, or are
>> causing additional failures. This has mostly come into play with file
>> descriptor limits in the past, but it's certainly not the only
>> situation. Constantly reconnecting clients continue to cause the
>> brokers to fall over while we are trying to recover a down cluster.
>> The only solution was to firewall off all the clients temporarily.
>> This is a great deal more complicated if the brokers and the clients
>> are all operating over the same port.
>>
>> Now, that said, quotas can be a partial solution to this. I don't want
>> to jump the gun on that discussion (because it's going to come up
>> separately and in more detail), but it is possible to structure quotas
>> in a way that will allow the intra-cluster replication to continue to
>> function in the case of high load. That would partially address case 1,
>> but it does nothing for case 2. Additionally, I think it is also
>> desirable to segregate the traffic even with quotas, so that regardless
>> of the client load, the cluster itself is able to be healthy.
>>
>> -Todd
>>
>> On Thu, Feb 12, 2015 at 11:38 AM, Jun Rao <j...@confluent.io> wrote:
>>
>> > Todd,
>> >
>> > Could you elaborate on the benefit of having a separate endpoint for
>> > intra-cluster communication? Is it mainly for giving intra-cluster
>> > requests a high priority? At this moment, having a separate endpoint
>> > just means that the socket connection for the intra-cluster
>> > communication is handled by a separate acceptor thread. The
>> > processing of the requests from the network and the handling of the
>> > requests are still shared by a single thread pool. So, if any of the
>> > thread pool is exhausted, the intra-cluster requests will still be
>> > delayed. We can potentially change this model, but this requires
>> > more work.
>> >
>> > An alternative is to just rely on quotas. Intra-cluster requests
>> > will be exempt from any kind of throttling.
>> >
>> > Gwen,
>> >
>> > I agree that defaulting wire.protocol.version to the current version
>> > is probably better. It just means that we need to document the
>> > migration path for previous versions.
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> > On Wed, Feb 11, 2015 at 6:33 PM, Todd Palino <tpal...@gmail.com>
>> > wrote:
>> >
>> > > Thanks, Gwen. This looks good to me as far as the wire protocol
>> > > versioning goes.
>> > > I agree with you on defaulting to the new wire protocol version
>> > > for new installs. I think it will also need to be very clear (to
>> > > general installers of Kafka, and not just developers) in
>> > > documentation when the wire protocol version changes moving
>> > > forward, and what the risk/benefit of changing to the new version
>> > > is.
>> > >
>> > > Since a rolling upgrade of the intra-cluster protocol is supported,
>> > > will a rolling downgrade work as well? Should a flaw (bug,
>> > > security, or otherwise) be discovered after upgrade, is it possible
>> > > to change the wire.protocol.version back to 0.8.2 and do a rolling
>> > > bounce?
>> > >
>> > > On the host/port/protocol specification, specifically the ZK config
>> > > format, is it possible to have an un-advertised endpoint? I would
>> > > see this as potentially useful if you wanted to have an endpoint
>> > > that you are reserving for intra-cluster communication, and you
>> > > would prefer to not have it advertised at all. Perhaps it is
>> > > blocked by a firewall rule or other authentication method. This
>> > > could also allow you to duplicate a security protocol type but
>> > > segregate it on a different port or interface (if it is
>> > > unadvertised, there is no ambiguity to the clients as to which
>> > > endpoint should be selected). I believe I asked about that
>> > > previously, and I didn't track what the final outcome was or even
>> > > if it was discussed further.
>> > >
>> > > -Todd
>> > >
>> > > On Wed, Feb 11, 2015 at 4:38 PM, Gwen Shapira
>> > > <gshap...@cloudera.com> wrote:
>> > >
>> > > > Added Jun's notes to the KIP (Thanks for explaining so clearly,
>> > > > Jun. I was clearly struggling with this...) and removed the
>> > > > reference to use.new.wire.protocol.
>> > > >
>> > > > On Wed, Feb 11, 2015 at 4:19 PM, Joel Koshy <jjkosh...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > The description that Jun gave for (2) was the detail I was
>> > > > > looking for - Gwen can you update the KIP with that for
>> > > > > completeness/clarity?
>> > > > >
>> > > > > I'm +1 as well overall. However, I think it would be good if we
>> > > > > also get an ack from someone who is more experienced on the
>> > > > > operations side (say, Todd) to review especially the upgrade
>> > > > > plan.
>> > > > >
>> > > > > On Wed, Feb 11, 2015 at 09:40:50AM -0800, Jun Rao wrote:
>> > > > > > +1 for proposed changes in 1 and 2.
>> > > > > >
>> > > > > > 1. The impact is that if someone uses SimpleConsumer and
>> > > > > > references Broker explicitly, the application needs a code
>> > > > > > change to compile with 0.8.3. Since SimpleConsumer is not
>> > > > > > widely used, breaking the API in SimpleConsumer but
>> > > > > > maintaining overall code cleanness seems to be a better
>> > > > > > tradeoff.
>> > > > > >
>> > > > > > 2. For clarification, the issue is the following. In 0.8.3,
>> > > > > > we will be evolving the wire protocol of
>> > > > > > UpdateMetadataRequest (to send info about endpoints for
>> > > > > > different security protocols). Since this is used in
>> > > > > > intra-cluster communication, we need to do the upgrade in two
>> > > > > > steps. The idea is that in 0.8.3, we will default
>> > > > > > wire.protocol.version to 0.8.2. When upgrading to 0.8.3, in
>> > > > > > step 1, we do a rolling upgrade to 0.8.3. After step 1, all
>> > > > > > brokers will be capable of processing the new protocol in
>> > > > > > 0.8.3, but without actually using it. In step 2, we configure
>> > > > > > wire.protocol.version to 0.8.3 in each broker and do another
>> > > > > > rolling restart.
>> > > > > > After step 2, all brokers will start using the new protocol
>> > > > > > in 0.8.3. Let's say that in the next release 0.9, we are
>> > > > > > changing the intra-cluster wire protocol again. We will do a
>> > > > > > similar thing: defaulting wire.protocol.version to 0.8.3 in
>> > > > > > 0.9 so that people can upgrade from 0.8.3 to 0.9 in two
>> > > > > > steps. For people who want to upgrade from 0.8.2 to 0.9
>> > > > > > directly, they will have to configure wire.protocol.version
>> > > > > > to 0.8.2 first and then do the two-step upgrade to 0.9.
>> > > > > >
>> > > > > > Gwen,
>> > > > > >
>> > > > > > In KIP2, there is still a reference to use.new.protocol. This
>> > > > > > needs to be removed. Also, would it be better to use
>> > > > > > intra.cluster.wire.protocol.version since this only applies
>> > > > > > to the wire protocol among brokers?
>> > > > > >
>> > > > > > Others,
>> > > > > >
>> > > > > > The patch in KAFKA-1809 is almost ready. It would be good to
>> > > > > > wrap up the discussion on KIP2 soon. So, if you haven't
>> > > > > > looked at this KIP, please take a look and send your
>> > > > > > comments.
>> > > > > >
>> > > > > > Thanks,
>> > > > > >
>> > > > > > Jun
>> > > > > >
>> > > > > > On Mon, Jan 26, 2015 at 8:02 PM, Gwen Shapira
>> > > > > > <gshap...@cloudera.com> wrote:
>> > > > > >
>> > > > > > > Hi Kafka Devs,
>> > > > > > >
>> > > > > > > While reviewing the patch for KAFKA-1809, we came across
>> > > > > > > two questions that we are interested in hearing the
>> > > > > > > community out on.
>> > > > > > >
>> > > > > > > 1. This patch changes the Broker class and adds a new class
>> > > > > > > BrokerEndPoint that behaves like the previous broker.
>> > > > > > >
>> > > > > > > While technically kafka.cluster.Broker is not part of the
>> > > > > > > public API, it is returned by javaapi, used with the
>> > > > > > > SimpleConsumer.
>> > > > > > >
>> > > > > > > Getting replicas from PartitionMetadata will now return
>> > > > > > > BrokerEndPoint instead of Broker. All method calls remain
>> > > > > > > the same, but since we return a new type, we break the API.
>> > > > > > >
>> > > > > > > Note that this breakage does not prevent upgrades -
>> > > > > > > existing SimpleConsumers will continue working (because we
>> > > > > > > are wire-compatible).
>> > > > > > > The only thing that won't work is building SimpleConsumers
>> > > > > > > with dependency on Kafka versions higher than 0.8.2.
>> > > > > > > Arguably, we don't want anyone to do it anyway :)
>> > > > > > >
>> > > > > > > So:
>> > > > > > > Do we state that the highest release on which
>> > > > > > > SimpleConsumers can depend is 0.8.2? Or shall we keep
>> > > > > > > Broker as is and create an UberBroker which will contain
>> > > > > > > multiple brokers as its endpoints?
>> > > > > > >
>> > > > > > > 2.
>> > > > > > > The KIP suggests "use.new.wire.protocol" configuration to
>> > > > > > > decide which protocols the brokers will use to talk to each
>> > > > > > > other. The problem is that after the next upgrade, the wire
>> > > > > > > protocol is no longer new, so we'll have to reset it to
>> > > > > > > false for the following upgrade, then change to true
>> > > > > > > again... and upgrading more than a single version will be
>> > > > > > > impossible.
>> > > > > > > Bad idea :)
>> > > > > > >
>> > > > > > > As an alternative, we can have a property for each version
>> > > > > > > and set one of them to true. Or (simple, I think) have
>> > > > > > > "wire.protocol.version" property and accept version numbers
>> > > > > > > (0.8.2, 0.8.3, 0.9) as values.
>> > > > > > >
>> > > > > > > Please share your thoughts :)
>> > > > > > >
>> > > > > > > Gwen
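
For anyone following the upgrade discussion, here is a minimal Python sketch of the two-step rolling upgrade as Jun described it on Feb 11, under the rule that each release defaults wire.protocol.version to the release before it. The function name plan_upgrade, the RELEASES list, and the wording of the steps are hypothetical and only for illustration; none of this is Kafka code, and the thread had not settled on the default (Gwen and Todd leaned towards defaulting new installs to the current version).

```python
from typing import List

# Release names as they appear in the thread, ordered oldest to newest.
RELEASES = ["0.8.2", "0.8.3", "0.9"]


def plan_upgrade(current: str, target: str) -> List[str]:
    """Hypothetical helper: list operator steps to move a cluster to target.

    Assumes each release ships with wire.protocol.version defaulting to the
    release immediately before it, as in Jun's description.
    """
    cur, tgt = RELEASES.index(current), RELEASES.index(target)
    if tgt <= cur:
        raise ValueError("target must be newer than current")

    steps = []
    default_wire = RELEASES[tgt - 1]  # what the target release would default to
    if default_wire != current:
        # Skipping a release: pin the wire protocol to what the running
        # brokers already speak before any broker is upgraded.
        steps.append(
            f"set wire.protocol.version={current} in the {target} broker config"
        )
    # Step 1: new binaries everywhere, still speaking the old protocol.
    steps.append(f"rolling upgrade of all brokers to {target}")
    # Step 2: switch every broker to the new protocol and bounce again.
    steps.append(f"set wire.protocol.version={target} on each broker")
    steps.append("rolling restart of all brokers")
    return steps


if __name__ == "__main__":
    for step in plan_upgrade("0.8.2", "0.9"):
        print(step)
```

Calling plan_upgrade("0.8.2", "0.8.3") yields only the two-step sequence, while plan_upgrade("0.8.2", "0.9") adds the extra pinning step first, which is the direct-upgrade case Jun mentions.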