[ https://issues.apache.org/jira/browse/AVRO-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated AVRO-1402:
----------------------------

    Attachment: AVRO-1402.patch

Following Doug's [subType suggestion|https://issues.apache.org/jira/browse/AVRO-739?focusedCommentId=13933465&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13933465] on AVRO-739, I wrote a patch to support a decimal type.

The schema declaration looks like this: {"type":"bytes", "subType":"decimal"}

The encoding is an int representing the scale, followed by a byte array 
containing the unscaled integer in the (language-neutral) big-endian 
two's-complement format described here: 
http://docs.oracle.com/javase/6/docs/api/java/math/BigInteger.html#toByteArray%28%29.
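
As a rough illustration of that layout (a sketch only, not code from the patch), the two parts can be derived from a BigDecimal as below. The class and method names (DecimalSketch, Encoded, encode, decode) are hypothetical, and the sketch stops at the logical pair of scale and bytes; how Avro then writes the int and the byte array on the wire is not shown.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, for illustration only; not part of Avro or the patch.
final class DecimalSketch {
    private DecimalSketch() {}

    // Holds the two logical parts of the proposed encoding.
    static final class Encoded {
        final int scale;        // number of digits to the right of the decimal point
        final byte[] unscaled;  // big-endian two's-complement unscaled integer

        Encoded(int scale, byte[] unscaled) {
            this.scale = scale;
            this.unscaled = unscaled;
        }
    }

    // Split a decimal into its scale and the minimal two's-complement bytes
    // of its unscaled value, as returned by BigInteger.toByteArray().
    static Encoded encode(BigDecimal value) {
        return new Encoded(value.scale(), value.unscaledValue().toByteArray());
    }

    // Rebuild the decimal from the two parts.
    static BigDecimal decode(int scale, byte[] unscaled) {
        return new BigDecimal(new BigInteger(unscaled), scale);
    }

    public static void main(String[] args) {
        // 123.45 has scale 2 and unscaled value 12345, i.e. bytes 0x30 0x39.
        Encoded e = encode(new BigDecimal("123.45"));
        System.out.println(e.scale + " " + e.unscaled.length);
        System.out.println(decode(e.scale, e.unscaled));
    }
}
```

Because the scale travels with every value, two values in the same column may carry different scales; decode simply restores whatever scale was written.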

Note that in this patch the schema does not define the precision and scale as 
part of the type, so Avro places no restriction on the decimal values that may 
be written. Instead, the burden of limiting the precision and scale falls on 
the application. Hive, for example, already has logic for ensuring that the 
precision and scale of a decimal value are consistent with the precision and 
scale set in the type definition for that decimal column. (There is more 
discussion on this point on HIVE-3976, and in particular in the 
[functional spec|https://cwiki.apache.org/confluence/download/attachments/27362075/Hive_Decimal_Precision_Scale_Support.pdf].)
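
To make the application-side burden concrete, here is a minimal sketch of the kind of check an application might run before writing a value. It is not Hive's logic (which is described in the spec linked above); the class DecimalBounds and the method fits are hypothetical, and the rule assumed here is the usual DECIMAL(precision, scale) reading: at most maxScale fractional digits and at most maxPrecision - maxScale integer digits.

```java
import java.math.BigDecimal;

// Hypothetical application-side check, for illustration only.
final class DecimalBounds {
    private DecimalBounds() {}

    // Returns true if value fits a column declared as DECIMAL(maxPrecision, maxScale)
    // without rounding. Assumes maxPrecision >= maxScale >= 0.
    static boolean fits(BigDecimal value, int maxPrecision, int maxScale) {
        // Digits to the left of the decimal point (may be <= 0 for small fractions).
        int integerDigits = value.precision() - value.scale();
        return value.scale() <= maxScale
            && integerDigits <= maxPrecision - maxScale;
    }

    public static void main(String[] args) {
        System.out.println(fits(new BigDecimal("123.45"), 5, 2)); // three integer digits, two fractional
        System.out.println(fits(new BigDecimal("1234.5"), 5, 2)); // four integer digits
    }
}
```

A real application might round or reject values that fail such a check; which of those is appropriate is a policy decision outside Avro under this proposal.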

It would be good if we could get some agreement on i) whether this is a good 
approach for adding new (optional) types to Avro and ii) what the binary 
encoding should look like. Thoughts?

In this initial patch I added the decimal subtype to GenericDatumWriter/Reader. 
I did this because Hive uses the generic API, but there is a potential 
compatibility issue: code using a GenericDatumReader would receive a BigDecimal 
instead of a ByteBuffer when reading a new schema with a decimal subtype. Any 
thoughts on how to tackle this would be gratefully received too.

> Support for DECIMAL primitive type
> ----------------------------------
>
>                 Key: AVRO-1402
>                 URL: https://issues.apache.org/jira/browse/AVRO-1402
>             Project: Avro
>          Issue Type: New Feature
>    Affects Versions: 1.7.5
>            Reporter: Mariano Dominguez
>            Priority: Minor
>              Labels: Hive
>         Attachments: AVRO-1402.patch
>
>
> Currently, Avro does not seem to support a DECIMAL type or equivalent.
> http://avro.apache.org/docs/1.7.5/spec.html#schema_primitive
> Adding DECIMAL support would be particularly interesting when converting 
> types from Avro to Hive, since DECIMAL is already a supported data type in 
> Hive (0.11.0).



