[ https://issues.apache.org/jira/browse/PIG-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-5106:
------------------------------------
    Fix Version/s: 0.19.0
                       (was: 0.18.0)

> Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
> -----------------------------------------------------------------------------
>
>                 Key: PIG-5106
>                 URL: https://issues.apache.org/jira/browse/PIG-5106
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Artem Ervits
>            Priority: Major
>              Labels: newbie
>             Fix For: 0.19.0
>
>         Attachments: PIG-5106-0.patch, PIG-5106-1.patch
>
>
> Many of our classes extending InputFormat have
> {code}
>     /*
>      * This is to support multi-level/recursive directory listing until
>      * MAPREDUCE-1577 is fixed.
>      */
>     @Override
>     protected List<FileStatus> listStatus(JobContext job) throws IOException {
>         return MapRedUtil.getAllFileRecursively(super.listStatus(job),
>                 job.getConfiguration());
>     }
> {code}
> Now that we have dropped Hadoop 1.x, it can be optimized to
> {code}
>     if (getInputDirRecursive(job)) {
>         return super.listStatus(job);
>     } else {
>         /*
>          * mapreduce.input.fileinputformat.input.dir.recursive is not true
>          * by default for backward compatibility reasons.
>          */
>         return MapRedUtil.getAllFileRecursively(super.listStatus(job),
>                 job.getConfiguration());
>     }
> {code}
> That would avoid one extra iteration when
> mapreduce.input.fileinputformat.input.dir.recursive is set to true by users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
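
The effect of the proposed branch can be illustrated without Hadoop on the classpath. What follows is only a toy sketch, not Pig or Hadoop code: Node, baseListStatus, expandRecursively, paths and the expansionPasses counter are hypothetical stand-ins for FileStatus, super.listStatus(job) and MapRedUtil.getAllFileRecursively. It assumes, as the issue description implies, that the base listing already descends into subdirectories when the recursive flag is true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for FileStatus (hypothetical, not a Hadoop class). */
final class Node {
    final String path;
    final List<Node> children; // null means "plain file"

    private Node(String path, List<Node> children) {
        this.path = path;
        this.children = children;
    }

    static Node file(String path) {
        return new Node(path, null);
    }

    static Node dir(String path, Node... children) {
        return new Node(path, Arrays.asList(children));
    }

    boolean isDirectory() {
        return children != null;
    }
}

final class RecursiveListingSketch {
    /** Illustration-only counter: how often the extra expansion pass ran. */
    static int expansionPasses = 0;

    /** Stand-in for super.listStatus(job): one level only, or the whole tree if recursive. */
    static List<Node> baseListStatus(Node inputDir, boolean recursive) {
        List<Node> result = new ArrayList<>();
        for (Node child : inputDir.children) {
            if (recursive && child.isDirectory()) {
                result.addAll(baseListStatus(child, true));
            } else {
                result.add(child);
            }
        }
        return result;
    }

    /** Stand-in for MapRedUtil.getAllFileRecursively: replaces directories by the files below them. */
    static List<Node> expandRecursively(List<Node> statuses) {
        expansionPasses++;
        List<Node> files = new ArrayList<>();
        for (Node status : statuses) {
            collectFiles(status, files);
        }
        return files;
    }

    private static void collectFiles(Node node, List<Node> out) {
        if (!node.isDirectory()) {
            out.add(node);
            return;
        }
        for (Node child : node.children) {
            collectFiles(child, out);
        }
    }

    /** The branch proposed in the issue, applied to the toy model. */
    static List<Node> listStatus(Node inputDir, boolean recursive) {
        if (recursive) {
            // The base listing already returned every file; skip the second pass.
            return baseListStatus(inputDir, true);
        }
        // Flag not set: keep the old behaviour and expand directories ourselves.
        return expandRecursively(baseListStatus(inputDir, false));
    }

    /** Helper for inspecting results: the path of each node, in order. */
    static List<String> paths(List<Node> nodes) {
        List<String> result = new ArrayList<>();
        for (Node node : nodes) {
            result.add(node.path);
        }
        return result;
    }
}
```

In this model both settings of the flag yield the same list of files for a nested input directory. The difference is that expansionPasses stays at 0 when the flag is true and goes to 1 when it is false, which corresponds to the one extra iteration the issue wants to avoid.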