Hi Clay,

I agree with Gwen that you might want to take a second look at streaming your protobuf data to Kafka and then having connectors read it from Kafka. To address the issues in order:
1. You say you have hundreds of machines sending data. If you run a connector that is not tied to a single IP address, you would basically need to update all of these senders with the new address whenever the connector moves around in the cluster. The only way around that I can come up with is a load balancer with a single IP pointed at all machines in the cluster, performing regular health checks to find out where the job is currently running (similar to what Mesos does), but that would not be interruption-free.

2. Kafka itself does run in cluster mode, is HA and very scalable; it was arguably built for exactly the job you are describing. The downside is that you would need to change your data senders, which might not be possible, I don't know your setup. Perhaps you could implement a tiny tool that reads from TCP and forwards the messages to Kafka (Logstash might be an option, not sure). To make this HA and scalable, just deploy more than one of these jobs and put a load balancer in front of them to distribute requests across all instances. This is a very similar architecture to what you wanted to do with Connect, but without the jobs moving around in the cluster, which would create unnecessary complexity.

Just my 2 cents, but I hope it helps :)

On Wed, Jul 5, 2017 at 6:09 AM, Clay Teahouse <clayteaho...@gmail.com> wrote:
> Hello Gwen,
>
> Thanks for the reply. My comments/answers inline.
>
> 1. Connectors that listen on sockets typically run in stand-alone mode, so
> they can be tied to a specific machine (in distributed mode, connectors can
> move around).
> [Clay:] Even if the connectors move around, they can still listen on a
> specific port on the node in the cluster, right? The data will be sent to
> the cluster of connectors from hundreds of data sources.
> 2. Why do you need a connector? Why not just use a Kafka producer to send
> protobuf directly to Kafka?
>
> [Clay:] I have hundreds of data sources which push the data to the
> connectors.
> I do need the connectors to run in a cluster mode, for HA and
> scalability.
>
> On Tue, Jul 4, 2017 at 10:45 PM, Gwen Shapira <g...@confluent.io> wrote:
>
> > I don't remember seeing one. There is no reason not to write one (let us
> > know if you do, so we can put it on the connector hub!).
> >
> > Few things:
> > 1. Connectors that listen on sockets typically run in stand-alone mode, so
> > they can be tied to a specific machine (in distributed mode, connectors can
> > move around).
> > 2. Why do you need a connector? Why not just use a Kafka producer to send
> > protobuf directly to Kafka?
> >
> > Gwen
> >
> > On Tue, Jul 4, 2017 at 9:02 AM Clay Teahouse <clayteaho...@gmail.com>
> > wrote:
> >
> > > Hello All,
> > >
> > > I'd appreciate your help with the following questions.
> > >
> > > 1) Is there a Kafka connector for listening on TCP sockets?
> > >
> > > 2) If so, can the messages be in protobuf, with each message prefixed
> > > with the length of the message?
> > >
> > > thanks
> > > Clay

-- 
Sönke Liebau
Partner
Tel. +49 179 7940878
OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - Germany
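
PS: Since the original question mentioned length-prefixed protobuf over TCP, here is a minimal sketch of what the framing part of such a "tiny forwarder tool" could look like. Everything in it is illustrative, not a definitive implementation: the 4-byte big-endian length prefix is an assumption (protobuf itself does not define a wire framing), and the Kafka hand-off is only indicated in a comment.

```python
# Illustrative sketch: split a byte stream into length-prefixed messages.
# Assumes each payload is preceded by a 4-byte big-endian length header;
# adjust the header format to whatever your senders actually use.
import io
import struct

def read_frames(stream):
    """Yield each length-prefixed payload from a binary stream."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # clean end of stream (or truncated header)
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            return  # truncated message, drop it
        # In the forwarder, each payload would be handed to a Kafka
        # producer here, e.g. producer.send(topic, value=payload)
        # with the kafka-python client.
        yield payload

# Example: two framed messages in one buffer
buf = io.BytesIO(
    struct.pack(">I", 5) + b"hello" + struct.pack(">I", 3) + b"abc"
)
frames = list(read_frames(buf))
# frames == [b"hello", b"abc"]
```

Wrapping this around a socket (one such loop per accepted connection) and running several instances behind the load balancer would give you the HA setup described above without needing Connect at all.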