[ https://issues.apache.org/jira/browse/KAFKA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gwen Shapira resolved KAFKA-1705.
---------------------------------
Resolution: Won't Fix
If I didn't do it by now...
Also, MapReduce is so 2014 :)
> Add MR layer to Kafka
> ---------------------
>
> Key: KAFKA-1705
> URL: https://issues.apache.org/jira/browse/KAFKA-1705
> Project: Kafka
> Issue Type: Improvement
> Reporter: Gwen Shapira
> Assignee: Gwen Shapira
> Priority: Major
>
> Many NoSQL-type storage systems (HBase, Mongo,
> Cassandra) and file formats (Avro, Parquet) provide a MapReduce
> integration layer - usually an InputFormat, OutputFormat and a utility
> class. Sometimes there's also an abstract Job and Mapper that do more
> setup, which can make things even more convenient.
> This is different from the existing Hadoop contrib project or Camus in that
> an MR layer would provide components for use in MR jobs, not an entire
> job that ingests data from Kafka to HDFS.
> The benefits I see for a MapReduce layer are:
> * Developers can create their own jobs, processing the data as it is
> ingested - rather than having to process it in two steps.
> * There are reusable components for developers looking to integrate with
> Kafka, rather than having everyone implement their own solution.
> * Hadoop developers expect projects to have this layer.
> * Spark reuses Hadoop's InputFormat and OutputFormat - so we get Spark
> integration for free.
> * There's a layer to plug the delegation token code into and make it
> invisible to MapReduce developers. Without this, everyone who writes
> MR jobs will need to think about how to implement authentication.