[ https://issues.apache.org/jira/browse/AVRO-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sachin Goyal updated AVRO-1554:
-------------------------------

    Attachment: CustomEncodingUnionBug.zip

{quote}
I am confused by the handling of custom-encoded unions in the patch. How can it 
be correct to discard the index on read?
{quote}

[~cutting], the current implementation appears to have a bug: it does not 
support custom-encoded unions. I am attaching a simple test, 
'CustomEncodingUnionBug.zip', to demonstrate this. If *ReflectData.AllowNull* 
is used, unions are produced in the schema and the following exception is thrown:
{color:brown}
java.io.IOException: Invalid int encoding
        at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:145)
        at org.apache.avro.io.BinaryDecoder.readIndex(BinaryDecoder.java:423)
        at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
        at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
        at org.apache.avro.io.ResolvingDecoder.readLong(ResolvingDecoder.java:152)
        at org.apache.avro.reflect.DateAsLongEncoding.read(DateAsLongEncoding.java:50)
        at org.apache.avro.reflect.DateAsLongEncoding.read(DateAsLongEncoding.java:33)
        at org.apache.avro.reflect.CustomEncoding.read(CustomEncoding.java:45)
        at org.apache.avro.reflect.FieldAccessUnsafe$UnsafeCustomEncodedField.read(FieldAccessUnsafe.java:353)
        at org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:214)
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
        at org.avro.bugs.AvroEncodeUnionTest.testEncodingWithUnion(AvroEncodeUnionTest.java:53)
{color}

\\
\\
If *AllowNull* is not used, the test passes.
The submitted patch fixes this issue and includes a test for it.
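
For reference, the failure is consistent with how unions are laid out in Avro's binary encoding: a union value is written as a zig-zag varint branch index followed by the value itself. The sketch below is plain Java with no Avro dependency and is not the code from the patch. The class and method names (UnionIndexSketch, writeNullableLong, writeBareLong, readNullableLong, failsToDecode) are hypothetical, and it assumes the field schema is the union \["null", "long"\] that AllowNull would produce for a Date encoded as a long. It illustrates that a reader expecting the index cannot decode bytes that were written without it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only: hand-rolled encoding of a ["null","long"] union.
final class UnionIndexSketch {

    // Zig-zag varint encoding, as the Avro spec describes for int and long.
    static void writeLong(long n, ByteArrayOutputStream out) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    static long readLong(ByteArrayInputStream in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0 || shift > 63) {
                throw new IllegalStateException("Invalid long encoding");
            }
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    // Correct union write: branch index first (0 = null, 1 = long), then the value.
    static byte[] writeNullableLong(Long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value == null) {
            writeLong(0, out);
        } else {
            writeLong(1, out);
            writeLong(value, out);
        }
        return out.toByteArray();
    }

    // Writes only the value, as an encoder unaware of the union would.
    static byte[] writeBareLong(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(value, out);
        return out.toByteArray();
    }

    // Union-aware read: consumes the branch index before the value.
    static Long readNullableLong(byte[] bytes) {
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);
        long index = readLong(in);
        if (index == 0) {
            return null;
        }
        if (index == 1) {
            return readLong(in);
        }
        throw new IllegalStateException("Invalid union index: " + index);
    }

    // True if the union-aware reader rejects the given bytes.
    static boolean failsToDecode(byte[] bytes) {
        try {
            readNullableLong(bytes);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long millis = 1400000000000L; // an arbitrary Date value in epoch milliseconds
        System.out.println(readNullableLong(writeNullableLong(millis)));
        System.out.println(failsToDecode(writeBareLong(millis)));
    }
}
```

In this sketch the mismatch surfaces as an "Invalid union index" error. The real decoder reports "Invalid int encoding" instead, presumably because it reads the index as an int and the bytes at that position are too wide for one.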

> Avro should have support for common constructs like UUID and Date
> -----------------------------------------------------------------
>
>                 Key: AVRO-1554
>                 URL: https://issues.apache.org/jira/browse/AVRO-1554
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.6
>            Reporter: Sachin Goyal
>         Attachments: AVRO-1554.patch, CustomEncodingUnionBug.zip
>
>
> Consider the following code:
> {code}
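> import java.io.ByteArrayOutputStream;
> import java.math.BigInteger;
> import java.util.Date;
> import java.util.UUID;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileReader;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.file.SeekableByteArrayInput;
> import org.apache.avro.generic.GenericDatumReader;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
>
> public class AvroExample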
> {
>     public static void main (String [] args) throws Exception
>     {
>         ReflectData rdata = ReflectData.AllowNull.get();
>         Schema schema = rdata.getSchema(Temp.class);
>         
>         ReflectDatumWriter<Temp> datumWriter = 
>                new ReflectDatumWriter<Temp> (Temp.class, rdata);
>         DataFileWriter<Temp> fileWriter = 
>                new DataFileWriter<Temp> (datumWriter);
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         fileWriter.create(schema, baos);
>         fileWriter.append(new Temp());
>         fileWriter.close();
>         byte[] bytes = baos.toByteArray();
>         GenericDatumReader<GenericRecord> datumReader = 
>                 new GenericDatumReader<GenericRecord> ();
>         SeekableByteArrayInput avroInputStream = 
>                 new SeekableByteArrayInput(bytes);
>         DataFileReader<GenericRecord> fileReader = 
>                 new DataFileReader<GenericRecord>(avroInputStream, datumReader);
>         schema = fileReader.getSchema();
>         GenericRecord record = null;
>         record = fileReader.next(record);
>         System.out.println (record);
>         System.out.println (record.get("id"));
>     }
> }
> class Temp
> {
>     UUID id = UUID.randomUUID();
>     Date date = new Date();
>     BigInteger bi = BigInteger.TEN;
> }
> {code}
> Output from this code is:
> {code:javascript}
> {"id": {}, "date": {}, "bi": "10"}
> {code}
> UUID and Date fields are very common in Java and also appear frequently in 
> third-party code (where it may be difficult to add annotations).
> Avro should therefore include default serialization/deserialization support 
> for such fields.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
