Another way to think about this is that the producer allows you to PUSH data into Kafka and the consumer allows you to PULL data out. This is what you need to write an application.
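To make the push/pull direction concrete, here is a minimal sketch in plain Java. It assumes a toy in-memory log standing in for a single Kafka partition; `ToyLog`, `push`, `pull`, and `PushPullDemo` are hypothetical names invented for illustration and are not part of the Kafka client API. The point is only the direction of data flow: the application pushes records in, and separately pulls them out from an offset that it tracks itself.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a single Kafka partition: an append-only list of records.
// Hypothetical class for illustration only; NOT part of the Kafka client API.
final class ToyLog {
    private final List<String> records = new ArrayList<>();

    // Producer side: the application PUSHes a record and gets back its offset.
    synchronized long push(String value) {
        records.add(value);
        return records.size() - 1;
    }

    // Consumer side: the application PULLs up to maxRecords, starting at an
    // offset that the caller keeps track of.
    synchronized List<String> pull(long fromOffset, int maxRecords) {
        int start = (int) Math.min(Math.max(fromOffset, 0), records.size());
        int end = Math.min(start + maxRecords, records.size());
        return new ArrayList<>(records.subList(start, end));
    }
}

final class PushPullDemo {
    public static void main(String[] args) {
        ToyLog log = new ToyLog();

        // The application decides when to push data in...
        log.push("order-1");
        log.push("order-2");

        // ...and when to pull data out, advancing its own position.
        long position = 0;
        List<String> batch = log.pull(position, 10);
        position += batch.size();
        System.out.println(batch + " next offset " + position);
    }
}
```

In both calls the application is the active party and the log is passive, which is the arrangement that suits application code.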
However, for an existing data system you need the opposite: you need to PULL data into Kafka from the system or PUSH it out of Kafka into the system. Kafka Connect implements this. Technically you could implement this from scratch, but then you'd just be rebuilding what Connect itself does (as Ewen said).

Not sure if that made things more or less clear :-)

-Jay

On Thu, Apr 7, 2016 at 1:41 PM, Ewen Cheslack-Postava <e...@confluent.io> wrote:

> On Wed, Apr 6, 2016 at 8:56 AM, Uber Slacker <cubeslac...@gmail.com>
> wrote:
>
> > Hi folks. I'm pretty new to Kafka. I have spent a fair amount of time
> > so far understanding the Kafka system in general and how producers and
> > consumers work. I'm now trying to get a grasp on how Kafka Connect
> > compares/contrasts to Producers/Consumers written via the Java API.
> >
> > When might someone want to write their own Java Producer/Consumer
> > versus using a connector in Kafka Connect? How does Kafka Connect use
> > producers and consumers behind the scenes? Why wouldn't we simply want
> > a producer/consumer library that contains producers and consumers
> > written to work with various external systems such as HDFS? Why this
> > new framework? Thanks for any clarification!
> >
>
> Internally Connect does use the producer and consumer. However, the
> framework adds a lot of support for functionality you want specifically
> when you are copying data from another system to Kafka or from Kafka to
> another system. Connect handles distribution and fault tolerance for you
> at the framework level. It provides a schema/data API and abstracts away
> details of serialization such that you can write a single connector and
> support multiple formats.
>
> If you're trying to copy data to/from another system, we'd generally
> recommend using the Connect framework since it adds all this extra
> support and allows you to focus only on how you get the data into/out of
> the other system. You'll want to use producers and consumers directly if
> you need more control over details that Connect hides from you, but then
> you'll need to create your own implementation of features Connect
> provides (or simply not support them).
>
> -Ewen
>