Brian Bloniarz created HIVE-3198:
------------------------------------
Summary: StorageHandler properties not passed to InputFormat (?)
Key: HIVE-3198
URL: https://issues.apache.org/jira/browse/HIVE-3198
Project: Hive
Issue Type: Bug
Environment: trunk r1352973
Reporter: Brian Bloniarz
I'm working on a custom StorageHandler implementation. I use
configureTableJobProperties to pass properties on to a SerDe and an InputFormat,
but it looks to me like the properties aren't present inside the InputFormat.
I found the following code, which looks like it's supposed to propagate
JobProperties:
{code}
public class HiveInputFormat<K extends WritableComparable, V extends Writable>
    ...
  public RecordReader getRecordReader(InputSplit split, JobConf job,
      Reporter reporter) throws IOException {
    HiveInputSplit hsplit = (HiveInputSplit) split;
    ...
    boolean nonNative = false;
    PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
    if ((part != null) && (part.getTableDesc() != null)) {
      Utilities.copyTableJobPropertiesToConf(part.getTableDesc(), cloneJobConf);
      nonNative = part.getTableDesc().isNonNative();
    }
{code}
In the debugger, I see that part == null, so copyTableJobPropertiesToConf never
gets called. I see this for the following table:
{code}
create external table test3 () STORED BY 'foo' location '/data/bar';
{code}
The InputSplit path is the *file* (i.e. "/data/bar/part-00000") but
pathToPartitionInfo has an entry for the *dir* (i.e. "/data/bar"), so the
exact-match lookup on the split path returns null.
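To illustrate the mismatch outside of Hive, here is a minimal sketch. It assumes pathToPartitionInfo behaves like a plain map keyed by path string; the class PathLookupDemo, its helper methods, and the string used in place of a PartitionDesc are hypothetical and only there for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup in getRecordReader; plain strings
// replace PartitionDesc so the example runs without Hive on the classpath.
class PathLookupDemo {
    // Exact-match lookup keyed by the split's path string.
    static String exactLookup(Map<String, String> pathToPartitionInfo, String splitPath) {
        return pathToPartitionInfo.get(splitPath);
    }

    // Parent directory of a slash-separated path, or null for a top-level entry.
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash > 0 ? path.substring(0, slash) : null;
    }

    public static void main(String[] args) {
        Map<String, String> pathToPartitionInfo = new HashMap<>();
        pathToPartitionInfo.put("/data/bar", "partition info for test3");

        String splitPath = "/data/bar/part-00000";
        // The file path misses, because only the directory is a key.
        System.out.println(exactLookup(pathToPartitionInfo, splitPath));
        // The directory containing the file hits.
        System.out.println(exactLookup(pathToPartitionInfo, parentOf(splitPath)));
    }
}
```

The first lookup misses and the second one hits, which matches what the debugger shows for part.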
I attached a patch which fixes the problem for me; it makes things explicit by
passing along the directory name inside the HiveInputSplit; this means we don't
have to figure out which files are part of which partition.
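The idea can be sketched roughly as follows. This is a simplified model, not the attached patch: DirAwareSplit, copyJobProperties, and the property name "foo.option" are hypothetical, and plain maps stand in for JobConf and the table's job properties.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a split that remembers its directory.
class DirAwareSplitDemo {
    // Stand-in for HiveInputSplit: carries the file path plus the directory
    // (the pathToPartitionInfo key) the split was generated from.
    static final class DirAwareSplit {
        final String filePath;
        final String partitionDir;

        DirAwareSplit(String filePath, String partitionDir) {
            this.filePath = filePath;
            this.partitionDir = partitionDir;
        }
    }

    // Copies the job properties registered for the split's directory into conf.
    // Returns false when no entry exists for that directory.
    static boolean copyJobProperties(DirAwareSplit split,
                                     Map<String, Map<String, String>> dirToJobProperties,
                                     Map<String, String> conf) {
        Map<String, String> props = dirToJobProperties.get(split.partitionDir);
        if (props == null) {
            return false;
        }
        conf.putAll(props);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> tableProps = new HashMap<>();
        tableProps.put("foo.option", "value"); // hypothetical handler property

        Map<String, Map<String, String>> dirToJobProperties = new HashMap<>();
        dirToJobProperties.put("/data/bar", tableProps);

        DirAwareSplit split = new DirAwareSplit("/data/bar/part-00000", "/data/bar");
        Map<String, String> conf = new HashMap<>();

        System.out.println(copyJobProperties(split, dirToJobProperties, conf));
        System.out.println(conf.get("foo.option"));
    }
}
```

Because the directory is recorded when the split is created, the lookup no longer depends on working out which partition a file path belongs to.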