Something along the lines of
... class MyOutputFormat extends MultipleTextOutputFormat<Text, Text> {
    @Override
    protected String generateFileNameForKeyValue(Text key,
                                                 Text value, String name) {
        // Use the key as the parent directory of the default leaf name.
        Path outpath = new Path(key.toString(), name);
        return outpath.toString();
    }
}
would create a directory per key.
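
To make the effect concrete, here is a small Hadoop-free sketch of the name mapping. It assumes that new Path(parent, child).toString() simply joins the two parts with "/", and that the framework passes in a leaf name such as part-00000. The class KeyDirNames, the method fileNameFor and the sample keys are made up for illustration; they are not part of Hadoop.

```java
class KeyDirNames {
    // Stand-in for generateFileNameForKeyValue above. It mimics
    // new Path(key.toString(), name).toString() without needing Hadoop
    // (assumption: the two parts are joined with a single "/").
    public static String fileNameFor(String key, String name) {
        return key + "/" + name;
    }

    public static void main(String[] args) {
        String leaf = "part-00000"; // assumed leaf name handed in by the framework
        // Records with the same key map to the same relative file name.
        for (String key : new String[] {"type1", "type2", "type1"}) {
            System.out.println(key + " -> " + fileNameFor(key, leaf));
        }
    }
}
```

Each distinct key ends up as its own subdirectory under the job's output directory, which is the "directory per key" layout.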
If you just want to keep your side-effect files separate, get the
task's working directory from
FileOutputFormat.getWorkOutputPath(...)
or from $mapred_work_output_dir,
then run dfs -mkdir <workdir>/NewDir and put the secondary files there.
This is explained in
http://hadoop.apache.org/core/docs/r0.18.3/api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf)
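
As a rough sketch of the second route, the snippet below only builds the two dfs commands for one side-effect file; it does not run them. The class SideFileCommands, the method commandsFor and the file name side.txt are hypothetical, and it assumes the task sees the working directory in the mapred_work_output_dir environment variable.

```java
import java.util.Arrays;
import java.util.List;

class SideFileCommands {
    // Hypothetical helper: returns the mkdir and put commands for one
    // side-effect file. localFile is assumed to be a bare file name.
    public static List<String> commandsFor(String workDir, String localFile) {
        String newDir = workDir + "/NewDir";
        return Arrays.asList(
            "hadoop dfs -mkdir " + newDir,
            "hadoop dfs -put " + localFile + " " + newDir + "/" + localFile);
    }

    public static void main(String[] args) {
        // Assumption: the variable is only set inside a running task.
        String workDir = System.getenv("mapred_work_output_dir");
        if (workDir == null) {
            System.err.println("mapred_work_output_dir is not set");
            return;
        }
        for (String cmd : commandsFor(workDir, "side.txt")) {
            System.out.println(cmd);
        }
    }
}
```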
Koji
-----Original Message-----
From: Stuart White [mailto:[email protected]]
Sent: Tuesday, April 21, 2009 11:46 AM
To: [email protected]
Subject: Re: Multiple outputs and getmerge?
On Tue, Apr 21, 2009 at 1:00 PM, Koji Noguchi <[email protected]> wrote:
>
> I once used MultipleOutputFormat and created
> (mapred.work.output.dir)/type1/part-_____
> (mapred.work.output.dir)/type2/part-_____
> ...
>
> And JobTracker took care of the renaming to
> (mapred.output.dir)/type{1,2}/part-______
>
> Would that work for you?
Can you please explain this in more detail? It looks like you're
using MultipleOutputFormat for *both* of your outputs? So you simply
don't use the OutputCollector passed as a parameter to Mapper#map()?