[ https://issues.apache.org/jira/browse/AVRO-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sachin Goyal updated AVRO-1554:
-------------------------------
Attachment: CustomEncodingUnionBug.zip
{quote}
I am confused by the handling of custom-encoded unions in the patch. How can it
be correct to discard the index on read?
{quote}
[~cutting], it appears that the current implementation has a bug: custom
encodings do not work with unions. I am attaching a simple test,
'CustomEncodingUnionBug.zip', to demonstrate this. If *ReflectData.AllowNull*
is used, unions are produced in the schema and the following exception is thrown:
{color:brown}
java.io.IOException: Invalid int encoding
at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:145)
at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423)
at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at org.apache.avro.io.ResolvingDecoder.readLong(ResolvingDecoder.java:152)
at org.apache.avro.reflect.DateAsLongEncoding.read(DateAsLongEncoding.java:50)
at org.apache.avro.reflect.DateAsLongEncoding.read(DateAsLongEncoding.java:33)
at org.apache.avro.reflect.CustomEncoding.read(CustomEncoding.java:45)
at org.apache.avro.reflect.FieldAccessUnsafe$UnsafeCustomEncodedField.read(FieldAccessUnsafe.java:353)
at org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:214)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
at org.avro.bugs.AvroEncodeUnionTest.testEncodingWithUnion(AvroEncodeUnionTest.java:53)
{color}
\\
\\
If *AllowNull* is not used, the test passes without issues.
The submitted patch fixes this issue and includes a test for it.
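
To make the question about the index concrete, here is a minimal sketch of how a value of a ["null", "long"] union (the kind of schema *AllowNull* produces for a nullable field) is laid out in Avro's binary encoding: the zero-based branch index is written first as a zigzag varint, then the value. The sketch uses only the JDK and is not the Avro code path; the class name UnionLayoutSketch and its methods are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayOutputStream;

public class UnionLayoutSketch {
    // Zigzag + variable-length encoding, as the Avro spec describes for int/long.
    static void writeLong(long n, ByteArrayOutputStream out) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Reads one zigzag varint starting at pos[0] and advances pos[0] past it.
    static long readLong(byte[] buf, int[] pos) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos[0]++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // Encodes a ["null", "long"] union value: branch index first, then the payload.
    static byte[] writeNullableLong(Long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value == null) {
            writeLong(0, out);      // branch 0 = null, which has no payload
        } else {
            writeLong(1, out);      // branch 1 = long
            writeLong(value, out);
        }
        return out.toByteArray();
    }

    // Decodes the union by consuming the branch index before the payload.
    static Long readNullableLong(byte[] buf) {
        int[] pos = {0};
        long index = readLong(buf, pos);
        return index == 0 ? null : Long.valueOf(readLong(buf, pos));
    }

    public static void main(String[] args) {
        long millis = 1404172800000L;
        byte[] bytes = writeNullableLong(millis);
        // Index consumed first, so the payload is decoded from the right offset.
        System.out.println(readNullableLong(bytes));
        // Index not consumed: the first varint (the branch index) is taken as the value.
        System.out.println(readLong(bytes, new int[] {0}));
    }
}
```

The first line printed is 1404172800000 and the second is 1, because the reader that ignores the union layout decodes the branch index as if it were the date. Whatever the custom-encoding read path does, the index and the value have to be consumed in step with how they were written.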
> Avro should have support for common constructs like UUID and Date
> -----------------------------------------------------------------
>
> Key: AVRO-1554
> URL: https://issues.apache.org/jira/browse/AVRO-1554
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.7.6
> Reporter: Sachin Goyal
> Attachments: AVRO-1554.patch, CustomEncodingUnionBug.zip
>
>
> Consider the following code:
> {code}
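> import java.io.ByteArrayOutputStream;
> import java.math.BigInteger;
> import java.util.Date;
> import java.util.UUID;
>
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileReader;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.file.SeekableByteArrayInput;
> import org.apache.avro.generic.GenericDatumReader;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
>
> public class AvroExample
> {
>     public static void main (String [] args) throws Exception
>     {
>         // Derive the schema by reflection, with every field nullable.
>         ReflectData rdata = ReflectData.AllowNull.get();
>         Schema schema = rdata.getSchema(Temp.class);
>
>         // Write a single Temp record to an in-memory Avro data file.
>         ReflectDatumWriter<Temp> datumWriter =
>             new ReflectDatumWriter<Temp> (Temp.class, rdata);
>         DataFileWriter<Temp> fileWriter =
>             new DataFileWriter<Temp> (datumWriter);
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         fileWriter.create(schema, baos);
>         fileWriter.append(new Temp());
>         fileWriter.close();
>
>         // Read the record back generically, using the schema stored in the file.
>         byte[] bytes = baos.toByteArray();
>         GenericDatumReader<GenericRecord> datumReader =
>             new GenericDatumReader<GenericRecord> ();
>         SeekableByteArrayInput avroInputStream =
>             new SeekableByteArrayInput(bytes);
>         DataFileReader<GenericRecord> fileReader =
>             new DataFileReader<GenericRecord>(avroInputStream,
>                                               datumReader);
>         schema = fileReader.getSchema();
>         GenericRecord record = null;
>         record = fileReader.next(record);
>         System.out.println (record);
>         System.out.println (record.get("id"));
>     }
> }
>
> class Temp
> {
>     UUID id = UUID.randomUUID();
>     Date date = new Date();
>     BigInteger bi = BigInteger.TEN;
> }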
> {code}
> Output from this code is:
> {code:javascript}
> {"id": {}, "date": {}, "bi": "10"}
> {code}
> UUID and Date fields are very common in Java and appear frequently in
> third-party code as well (where it may be difficult to add annotations).
> So Avro should include default serialization/deserialization support for
> such fields.
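
As a rough illustration of what default support could mean, one option is to map each type onto a primitive that Avro already handles: a UUID onto its canonical string form and a Date onto its epoch milliseconds (the DateAsLongEncoding name in the stack trace above suggests the same mapping for Date). The sketch below uses only the JDK and shows that both mappings round-trip; it is not what the attached patch does, and the class and method names are hypothetical.

```java
import java.util.Date;
import java.util.UUID;

public class PrimitiveMappingSketch {
    // UUID <-> string, using the canonical 36-character form.
    static String encodeUuid(UUID id) {
        return id.toString();
    }

    static UUID decodeUuid(String s) {
        return UUID.fromString(s);
    }

    // Date <-> long, using milliseconds since the epoch.
    static long encodeDate(Date d) {
        return d.getTime();
    }

    static Date decodeDate(long millis) {
        return new Date(millis);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        Date now = new Date();
        // Both round trips preserve equality with the original value.
        System.out.println(decodeUuid(encodeUuid(id)).equals(id));
        System.out.println(decodeDate(encodeDate(now)).equals(now));
    }
}
```

With mappings like these, the example record would carry a string for "id" and a long for "date" instead of the empty {} values shown in the output.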