----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68099/#review206929 -----------------------------------------------------------
Please note that while I have not commented on every occurrence of the following issues, I would still like them all to be fixed: * Unnecessary 'else' clauses * Unnecessary uses of 'this' * RuntimeExceptions which should be replaced with checked exceptions. contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryFileInputFormat.java Lines 1 (patched) <https://reviews.apache.org/r/68099/#comment290075> I think this code and the other files in this patch belong in the serde module. Please move the code to serde/src/java/org/apache/hadoop/hive/serde2/teradata contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryFileOutputFormat.java Lines 71 (patched) <https://reviews.apache.org/r/68099/#comment290076> This should be a static final class variable, i.e: static final byte RECORD_END_BYTE = ... contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryRecordReader.java Lines 74 (patched) <https://reviews.apache.org/r/68099/#comment290081> Please avoid unnecessary uses of "this", both in this file and others in the patch. contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryRecordReader.java Lines 89 (patched) <https://reviews.apache.org/r/68099/#comment290083> Change message to "Input file is compressed. Using compression code %s" contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryRecordReader.java Lines 92 (patched) <https://reviews.apache.org/r/68099/#comment290082> Please remove or change message to "The input file is not compressed". contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryRecordReader.java Lines 116 (patched) <https://reviews.apache.org/r/68099/#comment290077> static import String.format() in order to avoid constantly using the "String." prefix. contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryRecordReader.java Lines 155 (patched) <https://reviews.apache.org/r/68099/#comment290078> Magic constants (e.g. "0x0a") should be defined in one place (e.g. TeradataConstants.java) as a static final variable. contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryRecordReader.java Lines 197 (patched) <https://reviews.apache.org/r/68099/#comment290079> Remove contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryRecordReader.java Lines 211 (patched) <https://reviews.apache.org/r/68099/#comment290080> Please make this more readable by replace the ?: operator with equivalent if(){} else {} code. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataInputStream.java Lines 101 (patched) <https://reviews.apache.org/r/68099/#comment290085> Unnecessary else clause. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataInputStream.java Lines 122 (patched) <https://reviews.apache.org/r/68099/#comment290084> This else clause is unnecessary. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataOutputStream.java Lines 105 (patched) <https://reviews.apache.org/r/68099/#comment290086> This else clause is unnecessary if you explicitly return from the previous block. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataOutputStream.java Lines 148 (patched) <https://reviews.apache.org/r/68099/#comment290087> Consider breaking this into multiple lines for improved readability: int toWrite = date.get().getYear() * 10000 + date.get().getMonth() * 100 + ... contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataOutputStream.java Lines 178 (patched) <https://reviews.apache.org/r/68099/#comment290089> Add explicit return and remove unnecessary else clause. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataOutputStream.java Lines 184 (patched) <https://reviews.apache.org/r/68099/#comment290088> Instead of logging this info separately, I think it would make more sense to include this in the exception message. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataOutputStream.java Lines 191 (patched) <https://reviews.apache.org/r/68099/#comment290093> Unnecessary else clause. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinarySerde.java Lines 190 (patched) <https://reviews.apache.org/r/68099/#comment290090> Is this worth logging? If so, consider changing the log level to DEBUG. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinarySerde.java Lines 217 (patched) <https://reviews.apache.org/r/68099/#comment290092> remove line contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinarySerde.java Lines 363 (patched) <https://reviews.apache.org/r/68099/#comment290091> Throwing a SerdeException contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinarySerde.java Lines 363 (patched) <https://reviews.apache.org/r/68099/#comment290094> Replace RuntimeExceptions with SerdeExceptions. contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinarySerde.java Lines 409 (patched) <https://reviews.apache.org/r/68099/#comment290095> Combine this into one statement using warn(String message, Throwable ex) contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinarySerde.java Lines 578 (patched) <https://reviews.apache.org/r/68099/#comment290096> convert these into if statements and return the value instead of savintg it to byteNum contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestGeneralFunctions.java Lines 31 (patched) <https://reviews.apache.org/r/68099/#comment290097> This belongs in the common module since it's testing functionality that is defined in the common module. - Carl Steinbach On July 31, 2018, 12:22 a.m., Lu Li wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68099/ > ----------------------------------------------------------- > > (Updated July 31, 2018, 12:22 a.m.) > > > Review request for hive, Carl Steinbach and Daniel Dai. > > > Bugs: HIVE-20225 > https://issues.apache.org/jira/browse/HIVE-20225 > > > Repository: hive-git > > > Description > ------- > > When using TPT/BTEQ to export Data from Teradata, Teradata will export binary > files based on the schema. > > A Customized SerDe is needed in order to directly read these files from Hive. > > CREATE EXTERNAL TABLE `TABLE1`( > ...) > PARTITIONED BY ( > ...) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde' > STORED AS INPUTFORMAT > > 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat' > OUTPUTFORMAT > > 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat' > LOCATION ...; > > SELECT * FROM `TABLE1`; > Problem Statement: > > Right now the fast way to export data from Teradata is using TPT. However, > the Hive could not directly utilize these exported binary format because it > doesn't have a SerDe for these files. > > Result: > > Provided with the SerDe, Hive can operate upon the exported Teradata Binary > Format file transparently. > > > Diffs > ----- > > > contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryFileInputFormat.java > PRE-CREATION > > contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryFileOutputFormat.java > PRE-CREATION > > contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/teradata/TeradataBinaryRecordReader.java > PRE-CREATION > > contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataInputStream.java > PRE-CREATION > > contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinaryDataOutputStream.java > PRE-CREATION > > contrib/src/java/org/apache/hadoop/hive/contrib/serde2/teradata/TeradataBinarySerde.java > PRE-CREATION > > contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestGeneralFunctions.java > PRE-CREATION > > contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestTeradataBinarySerdeForDate.java > PRE-CREATION > > contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestTeradataBinarySerdeForDecimal.java > PRE-CREATION > > contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestTeradataBinarySerdeForTimeStamp.java > PRE-CREATION > > contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestTeradataBinarySerdeGeneral.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/68099/diff/2/ > > > Testing > ------- > > Junit tests have been added for Serialization and Deserialization functions > > > Thanks, > > Lu Li > >