> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > Thanks for the contribution! I added a few comments below. Thanks

You're welcome. Thanks for Samza.


> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > gradle.properties, line 18
> > <https://reviews.apache.org/r/43732/diff/1/?file=1254930#file1254930line18>
> >
> >     Please keep the -SNAPSHOT suffix here.

Fixed in new patch.


> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > gradle/dependency-versions.gradle, line 32
> > <https://reviews.apache.org/r/43732/diff/1/?file=1254931#file1254931line32>
> >
> >     Please remove -MINE.

I had (Snappy linking) issues with the standard RocksDb and had to compile my 
own. See SAMZA-870. Fixed in new patch.


> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > gradle/dependency-versions.gradle, line 23
> > <https://reviews.apache.org/r/43732/diff/1/?file=1254931#file1254931line23>
> >
> >     There is SAMZA-878 opened for this jackson library upgrade. You can 
> > associate the patch w/ that ticket as well.

Attached the whole patch submitted here as I wasn't sure. Seems overkill.


> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > gradle/dependency-versions.gradle, line 33
> > <https://reviews.apache.org/r/43732/diff/1/?file=1254931#file1254931line33>
> >
> >     The minimum supported YARN version is still 2.6.1 in Samza and we have 
> > not deprecated the support yet. Hence, I am curious why you need 2.7.1 here?

Had trouble getting jobs to work on my 2.7.1 cluster but I'm no longer certain 
that changing this was what cured it. Dropping.


> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala, 
> > line 50
> > <https://reviews.apache.org/r/43732/diff/1/?file=1254932#file1254932line50>
> >
> >     Adding configuration variables also requires adding documentation to 
> > docs/learn/documentation/versioned/jobs/configuration-table.html.

Fixed in new patch.


> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala,
> >  line 30
> > <https://reviews.apache.org/r/43732/diff/1/?file=1254933#file1254933line30>
> >
> >     Please remove the auto-generated author info. Apache does not allow 
> > this.

Fixed in new patch.


> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala,
> >  line 33
> > <https://reviews.apache.org/r/43732/diff/1/?file=1254933#file1254933line33>
> >
> >     How is this class used? It would be nice to create some javadoc here to 
> > explain it.
> >     
> >     A set of unit tests for the new class is also needed here.

Fixed in new patch.


> On Feb. 23, 2016, 6:51 p.m., Yi Pan (Data Infrastructure) wrote:
> > samza-kv-rocksdb/src/main/java/org/apache/samza/storage/kv/RocksDbOptionsHelper.java,
> >  line 19
> > <https://reviews.apache.org/r/43732/diff/1/?file=1254934#file1254934line19>
> >
> >     This does not seem to be relevant to the AvroDataFileWriter? Can we 
> > move it to another patch?

Dropped from new patch.


- Edi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43732/#review120357
-----------------------------------------------------------


On Feb. 18, 2016, 7:46 p.m., Edi Bice wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43732/
> -----------------------------------------------------------
> 
> (Updated Feb. 18, 2016, 7:46 p.m.)
> 
> 
> Review request for samza.
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/SAMZA-876 
> 
> Implemented AvroDataFileHdfsWriter fashioned loosely after 
> BinarySequenceFileHDFSWriter.
> 
> Exposed several RocksDb configuration options (recommended in RocksDb tuning 
> guide):
> 
> rocksdb.log.level
> rocksdb.log.keepfilenum
> rocksdb.log.timetoroll
> rocksdb.log.maxfilesize
> 
> rocksdb.bloomfilter.bits
> rocksdb.max.background.compactions
> rocksdb.max.background.flushes
> rocksdb.num.write.buffers
> rocksdb.target.file.size.base
> rocksdb.max.bytes.level.base
> 
> 
> Diffs
> -----
> 
>   gradle.properties 16e1f5d 
>   gradle/dependency-versions.gradle 52e25aa 
>   samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsConfig.scala 
> 7993119 
>   
> samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/AvroDataFileHdfsWriter.scala
>  PRE-CREATION 
>   
> samza-kv-rocksdb/src/main/java/org/apache/samza/storage/kv/RocksDbOptionsHelper.java
>  d4f765c 
> 
> Diff: https://reviews.apache.org/r/43732/diff/
> 
> 
> Testing
> -------
> 
> I am using AvroDataFileHdfsWriter at the end of my pipeline. I feed the 
> generated avro files to Apache Samoa. Have processed millions of records 
> successfully. The RocksDb config changes are older and were used and verified 
> to be working when originally implemented.
> 
> 
> Thanks,
> 
> Edi Bice
> 
>

Reply via email to