----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/56290/#review170404 -----------------------------------------------------------
This is looking very nice, Mike! I'm going to pull down what you have now and test it locally. There are a couple of minor things (byte<->string encoding, text/byte[] performance, etc) to look at. I think the only major thing that stood out at me is the `NO_MATCH` `Range`. Your unit test coverage is quite nice, too! Good work. accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java Lines 135 (patched) <https://reviews.apache.org/r/56290/#comment243230> I think this is still an issue here. 1. Semantically, a lack of an index record just means that there are no records in the data table. As such, an empty `List<Range>` would make sense to me. 2. I don't see use of `NO_MATCH` as the "special" match in the query execution path to short-circuit out and return no results (maybe I missed it?) Best as I can see, this is only used by tests. I think checking for an empty list returned by `getIndexRowRanges(..)` would be better. accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java Lines 165 (patched) <https://reviews.apache.org/r/56290/#comment243231> A null check on `indexColumns` would be good (here or in the constructor). A quick glance looks like we pull it out a `Properties` instance which could be `null` (intentionally or not). accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexLexicoder.java Lines 65 (patched) <https://reviews.apache.org/r/56290/#comment243232> Change the `getBytes()` calls to `getBytes(StandardCharsets.UTF_8)`, please. `java.nio.charset.StandardCharsets` was a nice addition in 1.7. Treating those bytes as utf-8 is reasonable enough. accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexedOutputFormat.java Lines 90 (patched) <https://reviews.apache.org/r/56290/#comment243235> Why `var6` and not `e`? :) accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexedOutputFormat.java Lines 164 (patched) <https://reviews.apache.org/r/56290/#comment243236> Use the Logger to print the stack trace instead, please. accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexedOutputFormat.java Lines 223 (patched) <https://reviews.apache.org/r/56290/#comment243237> Nit: the escaped single-quote is unnecessary. You just strike the '\' accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexedOutputFormat.java Lines 225 (patched) <https://reviews.apache.org/r/56290/#comment243238> The catch on `AccumuloException` and `AccumuloSecurityException` can just be dropped since this method can throw them. Don't need to catch+rethrow. accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexedOutputFormat.java Lines 268 (patched) <https://reviews.apache.org/r/56290/#comment243243> You can make this a bit better (avoiding String manipulation) by doing the following: ``` Text cf = new Text(); Text cq = new Text(); for (ColumnUpdate cu : ..) { cu.getColumnFamily(cf); cu.getColumnQualifier(cq); ... byte[] colType = indexDef.getColType(cf, cq); ``` For every column update in the mutation, this would save you a couple of String creations and byte-array copies. accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexedOutputFormat.java Lines 332 (patched) <https://reviews.apache.org/r/56290/#comment243244> I think you're missing a `throw new IOException(var7)` here. Users wouldn't know if data wasn't actually written unless they read the log (which they wouldn't do if the job succeeded). accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/AccumuloRangeGenerator.java Lines 346 (patched) <https://reviews.apache.org/r/56290/#comment243246> `getBytes(UTF_8)` here too, please. accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloIndexParameters.java Lines 70 (patched) <https://reviews.apache.org/r/56290/#comment243248> I'd just drop this `catch` and have the method `throw AccumuloIndexScannerException`. You'll get the same effect out of junit. - Josh Elser On March 6, 2017, 4:24 p.m., Mike Fagan wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/56290/ > ----------------------------------------------------------- > > (Updated March 6, 2017, 4:24 p.m.) > > > Review request for hive and Josh Elser. > > > Repository: hive-git > > > Description > ------- > > HIVE-15795: Add Accumulo Index Table Support > > > Diffs > ----- > > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexLexicoder.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexScanner.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexScannerException.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloStorageHandler.java > cdbc7f2 > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexDefinition.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexedOutputFormat.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/HiveAccumuloTableOutputFormat.java > 3ae5431 > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/IndexOutputConfigurator.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/package-info.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/AccumuloPredicateHandler.java > a7ec7c5 > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/AccumuloRangeGenerator.java > 21392d1 > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java > 17d5529 > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/serde/AccumuloIndexParameters.java > PRE-CREATION > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/serde/AccumuloSerDeParameters.java > 09c5f24 > > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/serde/package-info.java > PRE-CREATION > > accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloDefaultIndexScanner.java > PRE-CREATION > > accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloIndexLexicoder.java > PRE-CREATION > > accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloIndexParameters.java > PRE-CREATION > > accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloStorageHandler.java > 0aaa782 > > accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/predicate/TestAccumuloPredicateHandler.java > 88e4530 > > accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/predicate/TestAccumuloRangeGenerator.java > 339da07 > accumulo-handler/src/test/queries/positive/accumulo_queries.q 279b661 > > > Diff: https://reviews.apache.org/r/56290/diff/2/ > > > Testing > ------- > > > Thanks, > > Mike Fagan > >