I'm working on defining a custom InputFormat and OutputFormat for use with Hive. I'd like tables that use these formats to be native tables, so that I can run LOAD DATA and INSERT INTO against them. However, I'm finding that with the default CombineHiveInputFormat, the getSplits method of my InputFormat is never called. If I run "set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;" first, then getSplits is called.
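
To make the setup concrete, here is a rough sketch of the kind of session described above. The table name, column, file path, and the com.example.* class names are placeholders for illustration, not the actual classes; only the hive.input.format setting and the observed getSplits behavior come from the description.

```sql
-- Hypothetical native table backed by a custom InputFormat/OutputFormat.
-- com.example.MyInputFormat and com.example.MyOutputFormat are placeholder names.
CREATE TABLE my_table (line STRING)
STORED AS
  INPUTFORMAT 'com.example.MyInputFormat'
  OUTPUTFORMAT 'com.example.MyOutputFormat';

-- Native tables accept LOAD DATA (the path is a placeholder).
LOAD DATA LOCAL INPATH '/tmp/sample.dat' INTO TABLE my_table;

-- Under the default CombineHiveInputFormat, the custom getSplits
-- is reportedly not called when this query runs.
SELECT count(*) FROM my_table;

-- Workaround for the current session: switch to HiveInputFormat,
-- after which the custom getSplits is called.
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
SELECT count(*) FROM my_table;
```

The SET statement only affects the current session, so it would have to be repeated (or placed in the site configuration) wherever the table is queried.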

What I want to know is:
- Is this difference in behavior between CombineHiveInputFormat and HiveInputFormat intentional?
- Is there any way of forcing CombineHiveInputFormat to call getSplits on my own InputFormat? I was reading through the code for CombineHiveInputFormat, and it looks like it might only call my own InputFormat's getSplits method if the table is non-native. I'm not sure if I'm interpreting this correctly.
- Is it better to set "hive.input.format" to work around this, or to create a StorageHandler and make non-native tables?

Thanks for any advice.
