> On Oct. 24, 2018, 2:31 p.m., Szabolcs Vasas wrote: > > Hi Feró, > > > > Thank you for submitting this improvement! > > I have left some comments, see them below. > > Apart from that I think we need to test explicitly that if the > > sqoop.parquet.logical_types.decimal.enable flag is true then the Parquet > > file contains a decimal value and otherwise it contains a string value. > > > > NumericTypesImportTest asserts on string values so it is not able to verify > > this, most of the tests passed even if I commented out the content of the > > addEnableParquetDecimal method. > > Fero Szabo wrote: > I'll look into this one. > > I'm thinking that using the org.apache.sqoop.util.ParquetReader#readAll > method could help (since it returns GenericRecords), though I'm not sure. > I'll somehow need to actually turn off the conversion and check for the > bytes. Any suggestions?
Yes, I would run a Sqoop job with and without conversion switched on and check the schema of the GenericRecord returned by org.apache.sqoop.util.ParquetReader#readAll > On Oct. 24, 2018, 2:31 p.m., Szabolcs Vasas wrote: > > src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java > > Lines 56 (patched) > > <https://reviews.apache.org/r/69060/diff/2/?file=2099870#file2099870line56> > > > > Are we sure that adding the logical type conversion only here is enough? > > In case of Avro it is also added in > > org.apache.sqoop.mapreduce.AvroOutputFormat#getRecordWriter which gets > > invoked in every mapper so I assume that we have to add the conversion in > > every mapper in case of Parquet files too. > > Fero Szabo wrote: > My understanding is that this method is invoked in every mapper. (it's > doc suggest this as well: "Called once at the beginning of the task.") > Where else would put this statement? Sorry, you are right, this should be enough here. - Szabolcs ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69060/#review209954 ----------------------------------------------------------- On Oct. 24, 2018, 12:25 p.m., Fero Szabo wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69060/ > ----------------------------------------------------------- > > (Updated Oct. 24, 2018, 12:25 p.m.) > > > Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas. > > > Bugs: SQOOP-3382 > https://issues.apache.org/jira/browse/SQOOP-3382 > > > Repository: sqoop-trunk > > > Description > ------- > > This patch is about adding support for fixed point decimal types in parquet > import. > > The implementation is simple after the fact that parquet was upgraded to > 1.9.0 in SQOOP-3381: we just need to register the GenericDataSupplier with > AvroParquetOutputFormat. > > For testing, we can reuse the existing Avro tests, because Sqoop uses Avro > under the hood to write parquet. > > I also moved around and renamed the classes involved in this change so their > name and package reflect their purpose. > > ** Note: A key design decision can be seen in the ImportJobTestConfiguration > interface ** > - I decided to create a new function to get the expected results for each > file format, since we seldom add new fileformats. > - However this also enforces future configurations to always define their > expected result for every file forma or throw a NotImplementedException > should they lack the support for one. > - The alternative for this is to define the fileLayout as an input parameter > instead. This would allow for better extendability. > _Please share your thoughts on this!_ > > > Diffs > ----- > > src/java/org/apache/sqoop/config/ConfigurationConstants.java 3724f250 > src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 80c06988 > src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java 62334f8a > src/java/org/apache/sqoop/orm/AvroSchemaGenerator.java 7a2a5f9c > src/test/org/apache/sqoop/importjob/ImportJobTestConfiguration.java > 14de910b > src/test/org/apache/sqoop/importjob/avro/AvroImportForNumericTypesTest.java > ff13dc3b > > src/test/org/apache/sqoop/importjob/avro/configuration/MSSQLServerImportJobTestConfiguration.java > 182d2967 > > src/test/org/apache/sqoop/importjob/avro/configuration/MySQLImportJobTestConfiguration.java > e9bf9912 > > src/test/org/apache/sqoop/importjob/avro/configuration/OracleImportJobTestConfiguration.java > b7bad08c > > src/test/org/apache/sqoop/importjob/avro/configuration/OracleImportJobTestConfigurationForNumber.java > 465e61f4 > > src/test/org/apache/sqoop/importjob/avro/configuration/PostgresqlImportJobTestConfigurationForNumeric.java > 66715c17 > > src/test/org/apache/sqoop/importjob/avro/configuration/PostgresqlImportJobTestConfigurationPaddingShouldSucceed.java > ec4db41b > src/test/org/apache/sqoop/util/ParquetReader.java 908ce566 > > > Diff: https://reviews.apache.org/r/69060/diff/2/ > > > Testing > ------- > > 3rd party tests and unit tests, both gradle and ant > > > Thanks, > > Fero Szabo > >