[ 
https://issues.apache.org/jira/browse/AVRO-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720966#comment-13720966
 ] 

Vincenz Priesnitz commented on AVRO-1341:
-----------------------------------------

You are right. The patch made record reading and writing take about twice as 
long. 
Here is the reflection performance on trunk: 
{noformat}
                                  test name      time  M entries/sec  M bytes/sec  bytes/cycle
                         ReflectRecordRead:   5646 ms          2.952      114.543       808498
                        ReflectRecordWrite:   3537 ms          4.711      182.822       808498
                      ReflectBigRecordRead:   6044 ms          1.654      101.558       767380
                     ReflectBigRecordWrite:   4222 ms          2.368      145.384       767380
                          ReflectFloatRead:   5519 ms          0.000      144.932      1000004
                         ReflectFloatWrite:   1210 ms          0.001      660.832      1000004
                         ReflectDoubleRead:   7310 ms          0.000      218.876      2000004
                        ReflectDoubleWrite:   2190 ms          0.000      730.585      2000004
                       ReflectIntArrayRead:   8980 ms          1.856       76.589       859709
                      ReflectIntArrayWrite:   2707 ms          6.156      254.031       859709
                      ReflectLongArrayRead:   4569 ms          1.824      140.991       805344
                     ReflectLongArrayWrite:   1781 ms          4.677      361.609       805344
                    ReflectDoubleArrayRead:   5396 ms          1.853      121.281       818144
                   ReflectDoubleArrayWrite:   1652 ms          6.051      396.060       818144
                     ReflectFloatArrayRead:   9788 ms          2.043       69.156       846172
                    ReflectFloatArrayWrite:   2309 ms          8.661      293.156       846172
               ReflectNestedFloatArrayRead:  11524 ms          1.735       58.738       846172
              ReflectNestedFloatArrayWrite:   4506 ms          4.438      150.199       846172
              ReflectNestedObjectArrayRead:   9895 ms          0.404       52.156       645104
             ReflectNestedObjectArrayWrite:   5745 ms          0.696       89.822       645104
          ReflectNestedLargeFloatArrayRead:   7262 ms          0.459      119.783      1087381
         ReflectNestedLargeFloatArrayWrite:   2006 ms          1.661      433.513      1087381
   ReflectNestedLargeFloatArrayBlockedRead:   7401 ms          0.450      119.034      1101357
  ReflectNestedLargeFloatArrayBlockedWrite:   4797 ms          0.695      183.666      1101357
{noformat}
With the patch applied: 
{noformat}
                                  test name      time  M entries/sec  M bytes/sec  bytes/cycle
                         ReflectRecordRead:   9332 ms          1.786       69.305       808498
                        ReflectRecordWrite:   7412 ms          2.248       87.252       808498
                      ReflectBigRecordRead:   9533 ms          1.049       64.392       767380
                     ReflectBigRecordWrite:   8132 ms          1.230       75.487       767380
                          ReflectFloatRead:   5432 ms          0.000      147.256      1000004
                         ReflectFloatWrite:   1172 ms          0.001      682.323      1000004
                         ReflectDoubleRead:   6885 ms          0.000      232.387      2000004
                        ReflectDoubleWrite:   2303 ms          0.000      694.613      2000004
                       ReflectIntArrayRead:   8244 ms          2.022       83.426       859709
                      ReflectIntArrayWrite:   2517 ms          6.619      273.148       859709
                      ReflectLongArrayRead:   4534 ms          1.838      142.076       805344
                     ReflectLongArrayWrite:   1729 ms          4.819      372.619       805344
                    ReflectDoubleArrayRead:   4999 ms          2.000      130.928       818144
                   ReflectDoubleArrayWrite:   1431 ms          6.985      457.167       818144
                     ReflectFloatArrayRead:   9139 ms          2.188       74.066       846172
                    ReflectFloatArrayWrite:   2401 ms          8.329      281.898       846172
               ReflectNestedFloatArrayRead:  12295 ms          1.627       55.056       846172
              ReflectNestedFloatArrayWrite:   4975 ms          4.020      136.058       846172
              ReflectNestedObjectArrayRead:  14627 ms          0.273       35.281       645104
             ReflectNestedObjectArrayWrite:  10045 ms          0.398       51.375       645104
          ReflectNestedLargeFloatArrayRead:   7315 ms          0.456      118.910      1087381
         ReflectNestedLargeFloatArrayWrite:   2029 ms          1.642      428.657      1087381
   ReflectNestedLargeFloatArrayBlockedRead:   7429 ms          0.449      118.597      1101357
  ReflectNestedLargeFloatArrayBlockedWrite:   5330 ms          0.625      165.280      1101357
{noformat}
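
To make the comparison between the two tables above concrete, here is a small sketch that computes the slowdown factor (patched time divided by trunk time) for the record and nested-object tests. The class and method names are made up for this illustration; the timings are copied from the tables.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SlowdownFactors {
    // Ratio of patched time to trunk time; 2.0 means "twice as long".
    static double slowdown(long trunkMs, long patchedMs) {
        return (double) patchedMs / trunkMs;
    }

    public static void main(String[] args) {
        // {trunk ms, patched ms}, copied from the two tables above.
        Map<String, long[]> timings = new LinkedHashMap<>();
        timings.put("ReflectRecordRead", new long[] {5646, 9332});
        timings.put("ReflectRecordWrite", new long[] {3537, 7412});
        timings.put("ReflectBigRecordRead", new long[] {6044, 9533});
        timings.put("ReflectBigRecordWrite", new long[] {4222, 8132});
        timings.put("ReflectNestedObjectArrayRead", new long[] {9895, 14627});
        timings.put("ReflectNestedObjectArrayWrite", new long[] {5745, 10045});

        for (Map.Entry<String, long[]> e : timings.entrySet()) {
            long[] t = e.getValue();
            System.out.printf("%-30s %.2fx%n", e.getKey(), slowdown(t[0], t[1]));
        }
    }
}
```

The record writes come out at roughly 2x and the record reads at roughly 1.6x, while the primitive and array tests are essentially unchanged.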
I added the proposed booleans to FieldAccessor, which brought performance 
almost back to pre-patch levels:
{noformat}
                                  test name      time  M entries/sec  M bytes/sec  bytes/cycle
                         ReflectRecordRead:   6391 ms          2.607      101.189       808498
                        ReflectRecordWrite:   4180 ms          3.987      154.712       808498
                      ReflectBigRecordRead:   6276 ms          1.593       97.812       767380
                     ReflectBigRecordWrite:   4926 ms          2.030      124.610       767380
                          ReflectFloatRead:   5580 ms          0.000      143.356      1000004
                         ReflectFloatWrite:   1285 ms          0.001      622.420      1000004
                         ReflectDoubleRead:   6847 ms          0.000      233.657      2000004
                        ReflectDoubleWrite:   2325 ms          0.000      688.114      2000004
                       ReflectIntArrayRead:   7973 ms          2.090       86.252       859709
                      ReflectIntArrayWrite:   2760 ms          6.038      249.168       859709
                      ReflectLongArrayRead:   4720 ms          1.765      136.489       805344
                     ReflectLongArrayWrite:   1762 ms          4.728      365.527       805344
                    ReflectDoubleArrayRead:   5253 ms          1.903      124.587       818144
                   ReflectDoubleArrayWrite:   1637 ms          6.107      399.693       818144
                     ReflectFloatArrayRead:   9280 ms          2.155       72.942       846172
                    ReflectFloatArrayWrite:   2182 ms          9.163      310.143       846172
               ReflectNestedFloatArrayRead:  11072 ms          1.806       61.134       846172
              ReflectNestedFloatArrayWrite:   4058 ms          4.928      166.812       846172
              ReflectNestedObjectArrayRead:  11122 ms          0.360       46.399       645104
             ReflectNestedObjectArrayWrite:   6689 ms          0.598       77.152       645104
          ReflectNestedLargeFloatArrayRead:   7320 ms          0.455      118.834      1087381
         ReflectNestedLargeFloatArrayWrite:   1837 ms          1.814      473.434      1087381
   ReflectNestedLargeFloatArrayBlockedRead:   7383 ms          0.451      119.326      1101357
  ReflectNestedLargeFloatArrayBlockedWrite:   4839 ms          0.689      182.069      1101357
{noformat}

Attached is a new patch with the improved performance.
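
For readers skimming the thread, the idea behind the booleans can be sketched roughly as follows. This is only an illustration, assuming the flags record once per field whether an annotation is present, so that the per-record read/write path checks a plain boolean instead of looking up annotations each time. The names CachedFieldFlags, Accessor, Stringable and CustomEncoded below are hypothetical stand-ins, not the classes in the patch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

public class CachedFieldFlags {
    // Hypothetical stand-ins for the annotations discussed in this issue.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface Stringable {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.FIELD)
    @interface CustomEncoded {}

    // Hypothetical accessor: annotation lookups happen once, in the constructor.
    static final class Accessor {
        final Field field;
        final boolean stringable;
        final boolean customEncoded;

        Accessor(Field field) {
            this.field = field;
            this.stringable = field.isAnnotationPresent(Stringable.class);
            this.customEncoded = field.isAnnotationPresent(CustomEncoded.class);
        }

        // Hot path: plain boolean checks, no annotation reflection per record.
        String describe() {
            if (customEncoded) return "custom";
            if (stringable) return "string";
            return "default";
        }
    }

    static class Sample {
        @Stringable java.math.BigDecimal amount;
        @CustomEncoded java.util.Date created;
        int count;
    }

    public static void main(String[] args) throws NoSuchFieldException {
        for (String name : new String[] {"amount", "created", "count"}) {
            Accessor a = new Accessor(Sample.class.getDeclaredField(name));
            System.out.println(name + " -> " + a.describe());
        }
    }
}
```

The accessor is created once per field and reused for every record, so the annotation lookup cost is paid only at setup time.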

                
> Allow controlling avro via java annotations when using reflection. 
> -------------------------------------------------------------------
>
>                 Key: AVRO-1341
>                 URL: https://issues.apache.org/jira/browse/AVRO-1341
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Vincenz Priesnitz
>            Assignee: Vincenz Priesnitz
>             Fix For: 1.7.5
>
>         Attachments: AVRO-1341.patch, AVRO-1341.patch, AVRO-1341.patch, 
> AVRO-1341.patch, AVRO-1341.patch
>
>
> It would be great if one could control avro with java annotations. As of now, 
> it is already possible to mark fields as Nullable or classes being encoded as 
> a String. I propose a bigger set of annotations to control the behavior of 
> avro on fields and classes. Such annotations have proven useful with Jackson's 
> JSON serialization and Morphia's MongoDB serialization.
> I propose the following additional annotations: 
> @AvroName("alternativeName")
> @AvroAlias(alias="alias", space="space")
> @AvroIgnore
> @AvroMeta(key="K", value="V")
> @AvroEncode(using=CustomEncoding.class)
> Java fields with the @AvroName("alternativeName") annotation will be renamed 
> in the induced schema. When reading an avro file via reflection, the 
> reflection reader will look for fields in the schema with "alternativeName". 
> For example:
> {code}
>    @AvroName("foo")
>    int bar;  
> {code}
> is serialized as
> {code}
>   { "name" : "foo", "type" : "int" } 
> {code}
> The @AvroAlias annotation will add a new alias to the induced schema of a 
> record, enum or field. The space parameter is optional and defaults to the 
> namespace of the named schema the alias is added to.
> Fields with the @AvroIgnore annotation will be treated as if they had a 
> transient modifier, i.e. they will not be written to or read from avro files. 
> The @AvroMeta(key="K", value="V") annotation allows you to store an arbitrary 
> key : value pair at every node in the schema.
> {code}
>    @AvroMeta(key="fieldKey", value="fieldValue")
>    int foo;  
> {code}
> will create the following schema
> {code}
> {"name" : "foo", "type" : "int", "fieldKey" : "fieldValue" } 
> {code}
> Fields can be custom encoded with the @AvroEncode(using=CustomEncoding.class) 
> annotation. This annotation is a generalization of the @Stringable 
> annotation. The @Stringable annotation is limited to classes with string 
> argument constructors. Some classes can be similarly reduced to a smaller 
> class or even a single primitive, but don't fit the requirements for 
> @Stringable. A prominent example is java.util.Date, whose instances can 
> essentially be described with a single long. Such classes can now be encoded 
> with a CustomEncoding, which reads directly from the decoder and writes 
> directly to the encoder. 
> One simply extends the abstract CustomEncoding class by implementing a 
> schema, a read method and a write method. A java field can then be annotated 
> like this:
> {code}
> @AvroEncode(using=DateAsLongEncoding.class)
> Date date;
> {code}
> The custom encoding implementation would look like 
> {code}
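> import java.io.IOException;
> import java.util.Date;
>
> import org.apache.avro.Schema;
> import org.apache.avro.io.Decoder;
> import org.apache.avro.io.Encoder;
> import org.apache.avro.reflect.CustomEncoding;
>
> public class DateAsLongEncoding extends CustomEncoding<Date> {
>   {
>     // A Date is stored as a single long: milliseconds since the epoch.
>     schema = Schema.create(Schema.Type.LONG);
>     schema.addProp("CustomEncoding", "DateAsLongEncoding");
>   }
>
>   @Override
>   public void write(Object datum, Encoder out) throws IOException {
>     out.writeLong(((Date) datum).getTime());
>   }
>
>   @Override
>   public Date read(Object reuse, Decoder in) throws IOException {
>     // Reuse the supplied instance only if it really is a Date.
>     if (reuse instanceof Date) {
>       ((Date) reuse).setTime(in.readLong());
>       return (Date) reuse;
>     }
>     return new Date(in.readLong());
>   }
> }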
> {code}
> I implemented said annotations and a custom encoding for java.util.Date as a 
> proof of concept and also extended the @Stringable annotations to fields.
> This issue is a followup of AVRO-1328 and AVRO-1330.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
