[
https://issues.apache.org/jira/browse/AVRO-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958587#comment-13958587
]
kobefeng commented on AVRO-1356:
--------------------------------
AvroKeyValueOutputFormat still uses context.getOutputKeyClass() and
context.getOutputValueClass() without considering isMapOnly.
Also, the value schema lookup in
AvroDatumConverterFactory.create(Class<IN> inputClass) does not use
mapOutputValueSchema. The result of getMapOutputValueSchema() is never
assigned, so schema is still null at the null check:
    Schema schema = null;
    if (isMapOnly) {
      AvroJob.getMapOutputValueSchema(getConf());
      if (null == schema) {
        schema = AvroJob.getOutputValueSchema(getConf());
      }
    }
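
For illustration, here is a minimal sketch of what the lookup presumably intends: assign the map output value schema first, and fall back to the output value schema only when it is missing. This is not the Avro source. A plain Map stands in for the Hadoop Configuration, schemas are plain strings, and the class name, method name, and configuration keys are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ValueSchemaLookup {
    // Hypothetical keys, not the real AvroJob configuration property names.
    public static final String MAP_OUTPUT_VALUE_SCHEMA = "hypothetical.map.output.value.schema";
    public static final String OUTPUT_VALUE_SCHEMA = "hypothetical.output.value.schema";

    // Resolve the value schema, preferring the map output schema for map-only jobs.
    public static String resolveValueSchema(Map<String, String> conf, boolean isMapOnly) {
        String schema = null;
        if (isMapOnly) {
            // Keep the result of the lookup; discarding it makes the fallback always win.
            schema = conf.get(MAP_OUTPUT_VALUE_SCHEMA);
        }
        if (schema == null) {
            // Fall back to the regular output value schema.
            schema = conf.get(OUTPUT_VALUE_SCHEMA);
        }
        return schema;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(MAP_OUTPUT_VALUE_SCHEMA, "map-value-schema");
        conf.put(OUTPUT_VALUE_SCHEMA, "output-value-schema");

        System.out.println(resolveValueSchema(conf, true));  // prints "map-value-schema"
        System.out.println(resolveValueSchema(conf, false)); // prints "output-value-schema"
    }
}
```

With the assignment in place, the fallback is only reached when no map output value schema was configured.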
> AvroMultipleOutputs map only jobs do not use NamedOutput schemas
> ----------------------------------------------------------------
>
> Key: AVRO-1356
> URL: https://issues.apache.org/jira/browse/AVRO-1356
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.7.4
> Reporter: Alan Paulsen
> Assignee: Alan Paulsen
> Fix For: 1.7.5
>
> Attachments: AVRO-1356.patch
>
>
> AvroMultipleOutputs sets the MapOutputKeySchema when running a map only job,
> as follows:
> {code:java}
> boolean isMaponly = job.getNumReduceTasks() == 0;
> if (keySchema != null) {
>   if (isMaponly)
>     AvroJob.setMapOutputKeySchema(job, keySchema);
>   else
>     AvroJob.setOutputKeySchema(job, keySchema);
> }
> if (valSchema != null) {
>   if (isMaponly)
>     AvroJob.setMapOutputValueSchema(job, valSchema);
>   else
>     AvroJob.setOutputValueSchema(job, valSchema);
> }
> {code}
> Unfortunately, AvroKeyOutputFormat and AvroKeyValueOutputFormat never check
> whether the job is map-only, and use the OutputKeySchema and OutputValueSchema
> regardless.
> We can fix this by either:
> * Changing AvroKeyOutputFormat and AvroKeyValueOutputFormat to check whether
> the job is map-only and use the appropriate schema. (Seems right)
> * Changing AvroMultipleOutputs to always use the OutputKeySchema and
> OutputValueSchema.
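
The mismatch and the two proposed fixes can be sketched with a toy model. This is only an illustration under stated assumptions, not the Avro or Hadoop API: a plain Map stands in for the job configuration, schemas are plain strings, and every class name, method name, and key below is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaFixOptions {
    // Hypothetical keys, not the real AvroJob configuration property names.
    public static final String MAP_OUTPUT_KEY = "hypothetical.map.output.key.schema";
    public static final String OUTPUT_KEY = "hypothetical.output.key.schema";

    // Writer side as described in the issue: map-only jobs set the map output schema.
    public static void setAsToday(Map<String, String> conf, boolean mapOnly, String schema) {
        conf.put(mapOnly ? MAP_OUTPUT_KEY : OUTPUT_KEY, schema);
    }

    // Second option: the writer side always sets the output schema.
    public static void setAlwaysOutput(Map<String, String> conf, String schema) {
        conf.put(OUTPUT_KEY, schema);
    }

    // Reader side as described in the issue: the output format ignores map-only.
    public static String readAsToday(Map<String, String> conf) {
        return conf.get(OUTPUT_KEY);
    }

    // First option: the output format checks map-only and falls back if needed.
    public static String readMapOnlyAware(Map<String, String> conf, boolean mapOnly) {
        String schema = mapOnly ? conf.get(MAP_OUTPUT_KEY) : null;
        return schema != null ? schema : conf.get(OUTPUT_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setAsToday(conf, true, "named-output-schema");
        System.out.println(readAsToday(conf));            // prints "null": the reported bug
        System.out.println(readMapOnlyAware(conf, true)); // prints "named-output-schema"

        Map<String, String> conf2 = new HashMap<>();
        setAlwaysOutput(conf2, "named-output-schema");
        System.out.println(readAsToday(conf2));           // prints "named-output-schema"
    }
}
```

In this model either change makes the reader and writer agree. The first changes only the reading side, while the second changes only the writing side.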