I'm working on defining a custom InputFormat and OutputFormat for use
with Hive. I'd like tables using these IF/OF to be native tables, so
that I can LOAD DATA and INSERT INTO them. However, I'm finding that
with the default CombineHiveInputFormat, the getSplits method of my
InputFormat is not being called. If I "set
hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;", then
getSplits is called.
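
The setup described above might look roughly like the following sketch. The table name, columns, and the classes com.example.MyInputFormat and com.example.MyOutputFormat are hypothetical placeholders, not taken from the actual project.

```sql
-- Native table backed by a custom InputFormat/OutputFormat.
-- Class names, table name, and columns are placeholders.
CREATE TABLE my_table (id INT, payload STRING)
STORED AS
  INPUTFORMAT 'com.example.MyInputFormat'
  OUTPUTFORMAT 'com.example.MyOutputFormat';

-- With the default CombineHiveInputFormat, getSplits on the custom
-- InputFormat is reportedly not called. Switching the session to
-- HiveInputFormat is the workaround mentioned above:
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

SELECT COUNT(*) FROM my_table;
```

The SET statement only affects the current session, so each client that reads the table would need to apply it.
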
What I want to know is:
- Is this difference in behavior between CombineHiveInputFormat and
HiveInputFormat intentional?
- Is there any way of forcing CombineHiveInputFormat to call getSplits
on my own InputFormat? I was reading through the code for
CombineHiveInputFormat, and it looks like it might only call my own
InputFormat's getSplits method if the table is non-native. I'm not sure
if I'm interpreting this correctly.
- Is it better to set "hive.input.format" to work around this, or to
create a StorageHandler and make non-native tables?
Thanks for any advice.
- CombineHiveInputFormat does not call getSplits on cus... Luke Lovett