-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51142/
-----------------------------------------------------------
(Updated Sept. 9, 2016, 1:32 a.m.)


Review request for samza, Chris Pettitt, Yi Pan (Data Infrastructure), and Navina Ramesh.


Bugs: SAMZA-967
    https://issues.apache.org/jira/browse/SAMZA-967


Repository: samza


Description (updated)
-------

Add HDFS System Consumer:
1. System admin, partitioner
2. System consumer with metrics

Design doc can be found here: https://issues.apache.org/jira/secure/attachment/12824078/HDFSSystemConsumer.pdf

An overview of the high-level architecture:

[ASCII diagram. Boxes: HDFS; HDFSAvroFileReader; IFileReader; HDFSSystemConsumer; Directory Partitioner; HDFSSystemAdmin; HDFSAvroWriter; HDFSSystemProducer; HDFSSystemFactory. Edge labels: "Obtain Partition Description", "Filtering/Grouping", "Persist Partition Description".]


Diffs
-----

  build.gradle 1d4eb74b1294318db8454631ddd0901596121ab2
  gradle/dependency-versions.gradle 47c71bfde027835682889407261d4798b629d214
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemAdmin.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemConsumer.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/PartitionDescriptionUtil.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/FileSystemAdapter.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/HdfsFileSystemAdapter.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/AvroFileHdfsReader.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/HdfsReaderFactory.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/MultiFileHdfsReader.java PRE-CREATION
  samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/SingleFileHdfsReader.java PRE-CREATION
  samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala 61b7570afae3219b618c8830905035063941bdd7
  samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemAdmin.scala 92eb4472533db67dca01f075cb460581b4bdac0d
  samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala ef3c20a097ddf2feecaf8b0ad4587ea4bf6570b7
  samza-hdfs/src/test/java/org/apache/samza/system/hdfs/TestHdfsSystemConsumer.java PRE-CREATION
  samza-hdfs/src/test/java/org/apache/samza/system/hdfs/TestPartitionDesctiptionUtil.java PRE-CREATION
  samza-hdfs/src/test/java/org/apache/samza/system/hdfs/partitioner/TestDirectoryPartitioner.java PRE-CREATION
  samza-hdfs/src/test/java/org/apache/samza/system/hdfs/partitioner/TestHdfsFileSystemAdapter.java PRE-CREATION
  samza-hdfs/src/test/java/org/apache/samza/system/hdfs/reader/TestAvroFileHdfsReader.java PRE-CREATION
  samza-hdfs/src/test/java/org/apache/samza/system/hdfs/reader/TestMultiFileHdfsReader.java PRE-CREATION
  samza-hdfs/src/test/resources/integTest/emptyTestFile PRE-CREATION
  samza-hdfs/src/test/resources/partitioner/testfile01 PRE-CREATION
  samza-hdfs/src/test/resources/partitioner/testfile02 PRE-CREATION
  samza-hdfs/src/test/resources/reader/TestEvent.avsc PRE-CREATION
  samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.scala 261310d03de204718621f601117f016da14841df
  samza-yarn/src/main/scala/org/apache/samza/job/yarn/YarnJobFactory.scala 4e328a5f8c2b496a71e36c106339b7af263c96c7

Diff: https://reviews.apache.org/r/51142/diff/


Testing
-------

Unit tests pass. Manually tested by writing a real HDFS Samza job and deploying it to a YARN cluster.


Thanks,

Hai Lu