Thank you all.

In the past we have exposed Kafka-backed message streams via an HTTP POST
and WebSocket service, which worked very well. We were able to filter
messages based on schema compliance, and it was very simple for the teams
that generate the data to use. It also had no trouble scaling to around
100K messages/sec.

However, not exposing the Kafka protocol has its drawbacks when you try to
bring in other tools and teams who are already familiar with Kafka.

So we looked for something that would provide:
* Native Kafka protocol support
* Single endpoint access to make access between networks easier
* Schema (and possibly other business logic) enforcement

I spent a couple of weeks building a PoC that works, at least, with the
producer and consumer command line tools. It lets me insert a predicate
into the PRODUCE request handler that can reject messages.
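
To illustrate the idea, here is a minimal Python sketch of such a
predicate. It is not the PoC's actual code: the Record type and the
has_required_fields and handle_produce functions are hypothetical names,
and it assumes JSON payloads and all-or-nothing rejection of a batch.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Record:
    """Hypothetical, simplified view of one record in a produce request."""
    topic: str
    key: Optional[bytes]
    value: bytes


# A predicate returns True if the record may be forwarded to the brokers.
Predicate = Callable[[Record], bool]


def has_required_fields(required: Set[str]) -> Predicate:
    """Build a predicate that accepts JSON objects containing every required field."""
    def check(record: Record) -> bool:
        try:
            payload = json.loads(record.value)
        except ValueError:
            # Covers malformed JSON and invalid UTF-8 alike.
            return False
        return isinstance(payload, dict) and required.issubset(payload)
    return check


def handle_produce(records: List[Record],
                   predicate: Predicate,
                   forward: Callable[[List[Record]], None]) -> bool:
    """Forward the batch only if every record passes the predicate.

    Returns True if the batch was forwarded and False if it was rejected.
    A real proxy would answer the client with an error response instead.
    """
    if not all(predicate(r) for r in records):
        return False
    forward(records)
    return True
```

In this sketch a single bad record rejects the whole batch; filtering
record by record would be an equally plausible design.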

We plan to develop this further and take it beyond a PoC. I’d be keen to
understand if you think this kind of component could be a good addition to
the Kafka ecosystem? Are there any other capabilities that might be a good
fit with this proxy layer? And most importantly, does anybody foresee any
fundamental issues with this approach?

James Grant

Developer - Expedia Group


On Tue, 19 Mar 2019 at 16:13, Hans Jespersen <h...@confluent.io> wrote:

>
>
> You might want to take a look at kafka-proxy (see
> https://github.com/grepplabs/kafka-proxy).
> It’s a true Kafka protocol proxy and modifies the metadata, such as
> advertised listeners, so it works when there is no IP routing between the
> client and the brokers.
>
> -hans
>
>
>
>
>
> > On Mar 19, 2019, at 8:19 AM, James Grant <ja...@queeg.org> wrote:
> >
> > Hello,
> >
> > We would like to expose a Kafka cluster running on one network to clients
> > that are running on other networks without having to have full routing
> > between the two networks. In this case these networks are in different
> > AWS accounts but the concept applies more widely. We would like to access
> > Kafka over a single (or very few) host names.
> >
> > In addition we would like to filter incoming messages to enforce some
> > level of data quality and also impose some access control.
> >
> > A solution we are looking into is to provide a Kafka protocol level proxy
> > that presents to clients as a single node Kafka cluster holding all the
> > topics and partitions of the cluster behind it. This proxy would be able
> > to operate in a load balanced cluster behind a single DNS entry and would
> > also be able to intercept and filter/alter messages as they passed
> > through.
> >
> > The advantage we see in this approach over the HTTP proxy is that it
> > presents the Kafka protocol while still letting us use a typical TCP
> > level load balancer that is easy to route connections to. This means
> > that we can continue to use native Kafka clients.
> >
> > Does anything like this already exist? Does anybody think it would be
> > useful? Does anybody know of any reason it would be impossible (or a bad
> > idea) to do?
> >
> > James Grant
> >
> > Developer - Expedia Group
>
>
