[
https://issues.apache.org/jira/browse/AVRO-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616622#comment-14616622
]
Alexander Malyshevskiy commented on AVRO-1456:
----------------------------------------------
This is definitely a bug in my case. I have Avro files in HDFS and use Hadoop
Pipes to process entities in my C++ code. My input format is
AvroAsTextInputFormat, so that I can read the entities in C++. I use the
ordinary JSON deserialization methods to turn those strings into my C++ objects.
Because the strings I get are inconsistent with the Avro JSON encoding, I cannot
deserialize them, and so I cannot use unions in my case.
Could you point me to another way to get those entities into my C++ code with a
schema that uses unions?
> AvroAsTextInputFormat is inconsistent with the Avro JSON Encoding described
> in the Avro Specification
> -----------------------------------------------------------------------------------------------------
>
> Key: AVRO-1456
> URL: https://issues.apache.org/jira/browse/AVRO-1456
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.7.6
> Reporter: Jamie Olson
>
> org.apache.avro.mapred.AvroAsTextInputFormat relies on the toString() method
> rather than using org.apache.avro.generic.GenericDatumWriter.write() and
> org.apache.avro.io.JsonEncoder as in org.apache.avro.tool.DataFileReadTool.
> This results in a serialization of the data element, without the fully
> qualified name as specified in the Avro Specification's JSON Encoding section:
> http://avro.apache.org/docs/1.7.6/spec.html#json_encoding
> The specification indicates that for a union type: ["null","string","Foo"],
> data should be serialized with:
> * null as null;
> * the string "a" as {"string": "a"}; and
> * a Foo instance as {"Foo": {...}}, where {...} indicates the JSON encoding
> of a Foo instance.
> Instead, AvroAsTextInputFormat is serializing these values as:
> * null as null;
> * the string "a" as "a"; and
> * a Foo instance as {...}, where {...} indicates the JSON encoding of a Foo
> instance.
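
To make the difference between the two encodings concrete, here is a minimal
sketch in Python using only the standard json module. It is not Avro code: the
helper names spec_union_json and bare_union_json, and the Foo field "x", are
hypothetical and exist only for illustration. It shows the wrapped form the
specification describes for the union ["null","string","Foo"] next to the bare
form reported above.

```python
import json

# Hypothetical stand-in for an instance of a record type named "Foo";
# the field name "x" is made up for illustration.
FOO_INSTANCE = {"x": 1}


def spec_union_json(branch, value):
    """Encode a union value the way the JSON Encoding section describes:
    null stays null; any other branch is wrapped as {"<type name>": value}."""
    if branch == "null":
        return json.dumps(None)
    return json.dumps({branch: value})


def bare_union_json(value):
    """Encode only the value, with no branch wrapper (the reported form)."""
    return json.dumps(value)


samples = [("null", None), ("string", "a"), ("Foo", FOO_INSTANCE)]
for branch, value in samples:
    print(branch, "| spec:", spec_union_json(branch, value),
          "| bare:", bare_union_json(value))
```

A reader that expects the wrapped form looks for a single-key object naming the
branch, which is why the bare output cannot be decoded as a union.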