On Wed, Apr 6, 2016 at 8:56 AM, Uber Slacker <cubeslac...@gmail.com> wrote:
> Hi folks. I'm pretty new to Kafka. I have spent a fair amount of time so
> far understanding the Kafka system in general and how producers and
> consumers work. I'm now trying to get a grasp on how Kafka Connect
> compares/contrasts to Producers/Consumers written via the Java API.
>
> When might someone want to write their own Java Producer/Consumer versus
> using a connector in Kafka Connect? How does Kafka Connect use producers
> and consumers behind the scenes? Why wouldn't we simply want a
> producer/consumer library that contains producers and consumers written to
> work with various external systems such as HDFS? Why this new framework?
> Thanks for any clarification!
>

Internally, Connect does use the producer and consumer. However, the
framework adds a lot of support for functionality you want specifically
when you are copying data from another system to Kafka or from Kafka to
another system. Connect handles distribution and fault tolerance for you at
the framework level. It provides a schema/data API and abstracts away the
details of serialization, so that you can write a single connector and
support multiple formats.

If you're trying to copy data to/from another system, we'd generally
recommend using the Connect framework, since it adds all this extra support
and allows you to focus only on how you get the data into/out of the other
system. You'll want to use producers and consumers directly if you need
control over details that Connect hides from you, but then you'll need to
create your own implementation of the features Connect provides (or simply
not support them).

-Ewen
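
To make the point about serialization and framework-level support more concrete, here is a rough sketch of what configuring a connector can look like, modeled on the file source example that ships with Kafka. The connector name, file name, topic, and broker address are illustrative values, not anything from this thread, and the exact settings may differ between Kafka versions. The thing to notice is that the data format is chosen through converter settings on the worker rather than in connector code, so the same connector could emit JSON or another format without changes.

```properties
# Worker settings (for example, in a standalone worker properties file).
# Assumption: a single broker reachable at localhost:9092.
bootstrap.servers=localhost:9092

# Serialization is configured here, not in the connector itself.
# Swapping these converters changes the on-the-wire format.
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter

# Standalone mode keeps source offsets in a local file.
offset.storage.file.filename=/tmp/connect.offsets

# Connector settings (normally kept in a separate connector properties file).
# Assumption: illustrative name, input file, and topic.
name=local-file-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=test.txt
topic=connect-test
```

Doing the same thing with a hand-written producer would mean writing the file-reading loop, offset tracking, serializer selection, and restart handling yourself, which is the trade-off described above.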