GitHub user jamescao opened a pull request: https://github.com/apache/flink/pull/1079
[FLINK-1919] add HCatOutputFormat

[FLINK-1919] Add `HCatOutputFormat` for Tuple data types for the Java and Scala APIs. Also fix a bug in the Scala API's `HCatInputFormat` for Hive complex types.

The Java API checks whether the schema of the HCatalog table and the Flink tuples match if the user provides a `TypeInformation` in the constructor. For data types other than tuples, the `OutputFormat` requires a preceding Map function that converts the data to `HCatRecord`s.

The Scala API checks whether the schema of the HCatalog table and the Scala tuples match. For data types other than Scala tuples, the `OutputFormat` requires a preceding Map function that converts the data to `HCatRecord`s. The Scala API requires the user to import `org.apache.flink.api.scala._` so that the type can be captured by the Scala macro.

The HCatalog jar in Maven Central is compiled against hadoop1, which is not compatible with the Hive jars used for testing, so a Cloudera HCatalog jar is pulled into the pom for testing purposes. It can be removed if not required.

Java `List` and `Map` cannot be cast to Scala `List` and `Map`, so `JavaConverters` is used to fix this bug in the Scala API's `HCatInputFormat`.

@chiwanpark @rmetzger I have changed the HCatalog jar to the Apache version. That requires moving the hcatalog module to the hadoop1 profile.

@chiwanpark I have made changes for most of your comments, except for the one regarding the verification of exceptions in the tests. I feel that it's better to verify the exception at the point where it is expected to be thrown. If we use a method-wide annotation, we cannot be sure where in the test method the exception is thrown. This is not safe, especially for common exception types such as `IOException`. I did remove the tests' dependency on exception error messages.
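The point about verifying an exception where it is expected, rather than with a method-wide annotation such as JUnit 4's `@Test(expected = IOException.class)`, can be sketched as follows. This is a minimal illustration and not the test code from this pull request: the class `ExceptionAtCallSite`, the interface `ThrowingAction`, and the methods `throwsHere`, `openTable`, and `writeRecord` are all hypothetical names, and plain Java is used instead of JUnit to keep the example self-contained.

```java
import java.io.IOException;

public class ExceptionAtCallSite {

    /** Hypothetical functional interface for a call that may throw. */
    public interface ThrowingAction {
        void run() throws Exception;
    }

    /** Returns true only if this specific action throws the expected type. */
    public static boolean throwsHere(Class<? extends Exception> expected, ThrowingAction action) {
        try {
            action.run();
        } catch (Exception e) {
            return expected.isInstance(e);
        }
        return false; // the action completed without throwing
    }

    /** Hypothetical setup step that could also throw an IOException. */
    public static void openTable(String name) throws IOException {
        if (name == null || name.isEmpty()) {
            throw new IOException("table name is missing");
        }
    }

    /** Hypothetical write step: fails when the record arity does not match the table. */
    public static void writeRecord(int recordArity, int tableArity) throws IOException {
        if (recordArity != tableArity) {
            throw new IOException("schema mismatch");
        }
    }

    public static void main(String[] args) throws Exception {
        // Setup is outside the checked call: an IOException here propagates
        // and fails the run instead of being mistaken for the expected one.
        openTable("demo_table");

        // Only this call is allowed to throw the expected exception.
        if (!throwsHere(IOException.class, () -> writeRecord(2, 3))) {
            throw new AssertionError("expected IOException from writeRecord");
        }
        System.out.println("IOException verified at the expected call");
    }
}
```

With a method-wide annotation, an `IOException` thrown by the setup call would also make the test pass. Wrapping only the call under test avoids that, and the check does not depend on the exception's error message.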
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jamescao/flink hcatbranch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1079.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1079

----
commit b226ff06aa37a84b3203fe012be321f4161f2b03
Author: James Cao <james...@outlook.com>
Date:   2015-08-06T01:52:45Z

    add HCatOutputFormat java api and scala api
    fix scala HCatInputFormat bug for complex type

    moved hcatalog module to hadoop1 profile. Modify the surefile configuration for hcatalog tests. Addressed review comments from the first PR.

    remove unused import

----