> On Sept. 1, 2016, 9:21 p.m., Navina Ramesh wrote: > > @lhaiesp: Your patch looks awesome. Happy to review again once you have > > addressed the comments. It will be great if you can add some unit test for > > HdfsSystemConsumer. Some of the documentation that you will have to include > > for this feature will be: > > * Add newly introduced configs to configuration-table.html > > * Add newly introduced metrics to metrics-table.html (Pending SAMZA-702 > > commit) > > * Add a webpage for describing the behavior of HDFS systemconsumer (or more > > generically, consuming from Bounded input sources) and how to use the HDFS > > consumer > > > > You can choose to keep the documentation as a part of this RB or create a > > follow-up JIRA for documentation and assign it to your self. Ideally, we > > don't want to have a lot of gap between code and documentation. > > > > Thanks for such a thorough work!
I do have a work item for documentation. I will take a look at the existing documentations. > On Sept. 1, 2016, 9:21 p.m., Navina Ramesh wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemConsumer.java, > > line 149 > > <https://reviews.apache.org/r/51142/diff/4/?file=1487650#file1487650line149> > > > > Isn't numTotalEventsCounter a sum of all counters in numEventsCounter > > in the map ? Do we want to maintain a running sum? It is. I think I was thinking about adding a config to enable/disable per partition metrics eventually. In that case a total metrics would be necessary. > On Sept. 1, 2016, 9:21 p.m., Navina Ramesh wrote: > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java, > > line 171 > > <https://reviews.apache.org/r/51142/diff/4/?file=1487652#file1487652line171> > > > > Question: Is the generateOldestOffset simply returning a string of "0" > > delimited by a comma? The number of "0" matches the number of files in the > > group? Yes. - Hai ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/51142/#review147596 ----------------------------------------------------------- On Sept. 7, 2016, 11:44 p.m., Hai Lu wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/51142/ > ----------------------------------------------------------- > > (Updated Sept. 7, 2016, 11:44 p.m.) > > > Review request for samza, Chris Pettitt, Yi Pan (Data Infrastructure), and > Navina Ramesh. > > > Bugs: SAMZA-967 > https://issues.apache.org/jira/browse/SAMZA-967 > > > Repository: samza > > > Description > ------- > > Add HDFS System Consumer: > > 1. System admin, partitioner > 2. System consumer with metrics > > > Diffs > ----- > > build.gradle 1d4eb74b1294318db8454631ddd0901596121ab2 > gradle/dependency-versions.gradle 47c71bfde027835682889407261d4798b629d214 > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemAdmin.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/HdfsSystemConsumer.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/PartitionDescriptionUtil.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/DirectoryPartitioner.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/FileSystemAdapter.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/partitioner/HdfsFileSystemAdapter.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/AvroFileHdfsReader.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/HdfsReaderFactory.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/MultiFileHdfsReader.java > PRE-CREATION > > samza-hdfs/src/main/java/org/apache/samza/system/hdfs/reader/SingleFileHdfsReader.java > PRE-CREATION > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala > 61b7570afae3219b618c8830905035063941bdd7 > > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemAdmin.scala > 92eb4472533db67dca01f075cb460581b4bdac0d > > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemFactory.scala > ef3c20a097ddf2feecaf8b0ad4587ea4bf6570b7 > > samza-hdfs/src/test/java/org/apache/samza/system/hdfs/TestHdfsSystemConsumer.java > PRE-CREATION > > samza-hdfs/src/test/java/org/apache/samza/system/hdfs/TestPartitionDesctiptionUtil.java > PRE-CREATION > > samza-hdfs/src/test/java/org/apache/samza/system/hdfs/partitioner/TestDirectoryPartitioner.java > PRE-CREATION > > samza-hdfs/src/test/java/org/apache/samza/system/hdfs/partitioner/TestHdfsFileSystemAdapter.java > PRE-CREATION > > samza-hdfs/src/test/java/org/apache/samza/system/hdfs/reader/TestAvroFileHdfsReader.java > PRE-CREATION > > samza-hdfs/src/test/java/org/apache/samza/system/hdfs/reader/TestMultiFileHdfsReader.java > PRE-CREATION > samza-hdfs/src/test/resources/integTest/emptyTestFile PRE-CREATION > samza-hdfs/src/test/resources/partitioner/testfile01 PRE-CREATION > samza-hdfs/src/test/resources/partitioner/testfile02 PRE-CREATION > samza-hdfs/src/test/resources/reader/TestEvent.avsc PRE-CREATION > > samza-hdfs/src/test/scala/org/apache/samza/system/hdfs/TestHdfsSystemProducerTestSuite.scala > 261310d03de204718621f601117f016da14841df > samza-yarn/src/main/scala/org/apache/samza/job/yarn/YarnJobFactory.scala > 4e328a5f8c2b496a71e36c106339b7af263c96c7 > > Diff: https://reviews.apache.org/r/51142/diff/ > > > Testing > ------- > > unit tests pass. > > tested by writing a real hdfs samza job and deploying to hadoop cluster. > > > Thanks, > > Hai Lu > >