[ https://issues.apache.org/jira/browse/AVRO-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720966#comment-13720966 ]
Vincenz Priesnitz commented on AVRO-1341:
-----------------------------------------

You are right. The patch made record reading and writing take about twice as long.

Here is the reflection performance of the trunk:
{noformat}
test name                                     time  M entries/sec  M bytes/sec  bytes/cycle
ReflectRecordRead:                         5646 ms          2.952      114.543       808498
ReflectRecordWrite:                        3537 ms          4.711      182.822       808498
ReflectBigRecordRead:                      6044 ms          1.654      101.558       767380
ReflectBigRecordWrite:                     4222 ms          2.368      145.384       767380
ReflectFloatRead:                          5519 ms          0.000      144.932      1000004
ReflectFloatWrite:                         1210 ms          0.001      660.832      1000004
ReflectDoubleRead:                         7310 ms          0.000      218.876      2000004
ReflectDoubleWrite:                        2190 ms          0.000      730.585      2000004
ReflectIntArrayRead:                       8980 ms          1.856       76.589       859709
ReflectIntArrayWrite:                      2707 ms          6.156      254.031       859709
ReflectLongArrayRead:                      4569 ms          1.824      140.991       805344
ReflectLongArrayWrite:                     1781 ms          4.677      361.609       805344
ReflectDoubleArrayRead:                    5396 ms          1.853      121.281       818144
ReflectDoubleArrayWrite:                   1652 ms          6.051      396.060       818144
ReflectFloatArrayRead:                     9788 ms          2.043       69.156       846172
ReflectFloatArrayWrite:                    2309 ms          8.661      293.156       846172
ReflectNestedFloatArrayRead:              11524 ms          1.735       58.738       846172
ReflectNestedFloatArrayWrite:              4506 ms          4.438      150.199       846172
ReflectNestedObjectArrayRead:              9895 ms          0.404       52.156       645104
ReflectNestedObjectArrayWrite:             5745 ms          0.696       89.822       645104
ReflectNestedLargeFloatArrayRead:          7262 ms          0.459      119.783      1087381
ReflectNestedLargeFloatArrayWrite:         2006 ms          1.661      433.513      1087381
ReflectNestedLargeFloatArrayBlockedRead:   7401 ms          0.450      119.034      1101357
ReflectNestedLargeFloatArrayBlockedWrite:  4797 ms          0.695      183.666      1101357
{noformat}

With the patch applied:
{noformat}
test name                                     time  M entries/sec  M bytes/sec  bytes/cycle
ReflectRecordRead:                         9332 ms          1.786       69.305       808498
ReflectRecordWrite:                        7412 ms          2.248       87.252       808498
ReflectBigRecordRead:                      9533 ms          1.049       64.392       767380
ReflectBigRecordWrite:                     8132 ms          1.230       75.487       767380
ReflectFloatRead:                          5432 ms          0.000      147.256      1000004
ReflectFloatWrite:                         1172 ms          0.001      682.323      1000004
ReflectDoubleRead:                         6885 ms          0.000      232.387      2000004
ReflectDoubleWrite:                        2303 ms          0.000      694.613      2000004
ReflectIntArrayRead:                       8244 ms          2.022       83.426       859709
ReflectIntArrayWrite:                      2517 ms          6.619      273.148       859709
ReflectLongArrayRead:                      4534 ms          1.838      142.076       805344
ReflectLongArrayWrite:                     1729 ms          4.819      372.619       805344
ReflectDoubleArrayRead:                    4999 ms          2.000      130.928       818144
ReflectDoubleArrayWrite:                   1431 ms          6.985      457.167       818144
ReflectFloatArrayRead:                     9139 ms          2.188       74.066       846172
ReflectFloatArrayWrite:                    2401 ms          8.329      281.898       846172
ReflectNestedFloatArrayRead:              12295 ms          1.627       55.056       846172
ReflectNestedFloatArrayWrite:              4975 ms          4.020      136.058       846172
ReflectNestedObjectArrayRead:             14627 ms          0.273       35.281       645104
ReflectNestedObjectArrayWrite:            10045 ms          0.398       51.375       645104
ReflectNestedLargeFloatArrayRead:          7315 ms          0.456      118.910      1087381
ReflectNestedLargeFloatArrayWrite:         2029 ms          1.642      428.657      1087381
ReflectNestedLargeFloatArrayBlockedRead:   7429 ms          0.449      118.597      1101357
ReflectNestedLargeFloatArrayBlockedWrite:  5330 ms          0.625      165.280      1101357
{noformat}

I added the proposed booleans to FieldAccessor, and this brought performance almost back to pre-patch levels:
{noformat}
test name                                     time  M entries/sec  M bytes/sec  bytes/cycle
ReflectRecordRead:                         6391 ms          2.607      101.189       808498
ReflectRecordWrite:                        4180 ms          3.987      154.712       808498
ReflectBigRecordRead:                      6276 ms          1.593       97.812       767380
ReflectBigRecordWrite:                     4926 ms          2.030      124.610       767380
ReflectFloatRead:                          5580 ms          0.000      143.356      1000004
ReflectFloatWrite:                         1285 ms          0.001      622.420      1000004
ReflectDoubleRead:                         6847 ms          0.000      233.657      2000004
ReflectDoubleWrite:                        2325 ms          0.000      688.114      2000004
ReflectIntArrayRead:                       7973 ms          2.090       86.252       859709
ReflectIntArrayWrite:                      2760 ms          6.038      249.168       859709
ReflectLongArrayRead:                      4720 ms          1.765      136.489       805344
ReflectLongArrayWrite:                     1762 ms          4.728      365.527       805344
ReflectDoubleArrayRead:                    5253 ms          1.903      124.587       818144
ReflectDoubleArrayWrite:                   1637 ms          6.107      399.693       818144
ReflectFloatArrayRead:                     9280 ms          2.155       72.942       846172
ReflectFloatArrayWrite:                    2182 ms          9.163      310.143       846172
ReflectNestedFloatArrayRead:              11072 ms          1.806       61.134       846172
ReflectNestedFloatArrayWrite:              4058 ms          4.928      166.812       846172
ReflectNestedObjectArrayRead:             11122 ms          0.360       46.399       645104
ReflectNestedObjectArrayWrite:             6689 ms          0.598       77.152       645104
ReflectNestedLargeFloatArrayRead:          7320 ms          0.455      118.834      1087381
ReflectNestedLargeFloatArrayWrite:         1837 ms          1.814      473.434      1087381
ReflectNestedLargeFloatArrayBlockedRead:   7383 ms          0.451      119.326      1101357
ReflectNestedLargeFloatArrayBlockedWrite:  4839 ms          0.689      182.069      1101357
{noformat}

Attached is a new patch with the improved performance.

> Allow controlling avro via java annotations when using reflection.
> -------------------------------------------------------------------
>
>                 Key: AVRO-1341
>                 URL: https://issues.apache.org/jira/browse/AVRO-1341
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Vincenz Priesnitz
>            Assignee: Vincenz Priesnitz
>             Fix For: 1.7.5
>
>         Attachments: AVRO-1341.patch, AVRO-1341.patch, AVRO-1341.patch,
>                      AVRO-1341.patch, AVRO-1341.patch
>
>
> It would be great if one could control Avro with Java annotations. As of now,
> it is already possible to mark fields as Nullable or classes as being encoded
> as a String. I propose a bigger set of annotations to control the behavior of
> Avro on fields and classes. Such annotations have proven useful with Jackson's
> JSON serialization and Morphia's MongoDB serialization.
> I propose the following additional annotations:
> @AvroName("alternativeName")
> @AvroAlias(alias="alias", space="space")
> @AvroIgnore
> @AvroMeta(key="K", value="V")
> @AvroEncode(using=CustomEncoding.class)
> Java fields with the @AvroName("alternativeName") annotation will be renamed
> in the induced schema. When reading an Avro file via reflection, the
> reflection reader will look for fields in the schema with "alternativeName".
> For example:
> {code}
> @AvroName("foo")
> int bar;
> {code}
> is serialized as
> {code}
> { "name" : "foo", "type" : "int" }
> {code}
> The @AvroAlias annotation will add a new alias to the induced schema of a
> record, enum or field. The space parameter is optional and defaults to the
> namespace of the named schema the alias is added to.
> Fields with the @AvroIgnore annotation will be treated as if they had a
> transient modifier, i.e. they will not be written to or read from Avro files.
> The @AvroMeta(key="K", value="V") annotation allows you to store an arbitrary
> key : value pair at every node in the schema.
> {code}
> @AvroMeta(key="fieldKey", value="fieldValue")
> int foo;
> {code}
> will create the following schema
> {code}
> {"name" : "foo", "type" : "int", "fieldKey" : "fieldValue" }
> {code}
> Fields can be custom encoded with the @AvroEncode(using=CustomEncoding.class)
> annotation. This annotation is a generalization of the @Stringable
> annotation. The @Stringable annotation is limited to classes with string
> argument constructors. Some classes can be similarly reduced to a smaller
> class or even a single primitive, but don't fit the requirements for
> @Stringable. A prominent example is java.util.Date, whose instances can
> essentially be described with a single long. Such classes can now be encoded
> with a CustomEncoding, which reads and writes directly from the
> encoder/decoder.
> One simply extends the abstract CustomEncoding class by implementing a
> schema, a read method and a write method.
> A Java field can then be annotated like this:
> {code}
> @AvroEncode(using=DateAsLongEncoding.class)
> Date date;
> {code}
> The custom encoding implementation would look like
> {code}
> import java.io.IOException;
> import java.util.Date;
> import org.apache.avro.Schema;
> import org.apache.avro.io.Decoder;
> import org.apache.avro.io.Encoder;
> import org.apache.avro.reflect.CustomEncoding;
>
> public class DateAsLongEncoding extends CustomEncoding<Date> {
>   // Instance initializer: a Date is represented by a plain long in the schema.
>   {
>     schema = Schema.create(Schema.Type.LONG);
>     schema.addProp("CustomEncoding", "DateAsLongEncoding");
>   }
>
>   // Write the Date as its millisecond timestamp.
>   @Override
>   public void write(Object datum, Encoder out) throws IOException {
>     out.writeLong(((Date)datum).getTime());
>   }
>
>   // Read the timestamp back, reusing the passed-in Date when one is given.
>   @Override
>   public Date read(Object reuse, Decoder in) throws IOException {
>     if (reuse != null) {
>       ((Date)reuse).setTime(in.readLong());
>       return (Date)reuse;
>     }
>     else return new Date(in.readLong());
>   }
> }
> {code}
> I implemented said annotations and a custom encoding for java.util.Date as a
> proof of concept and also extended the @Stringable annotation to fields.
> This issue is a followup of AVRO-1328 and AVRO-1330.
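
As a rough illustration of how the @AvroName and @AvroIgnore rules described in the issue could be applied when walking a class via reflection, here is a minimal sketch in plain Java. It does not use Avro at all: the two annotations are hypothetical local stand-ins declared inside the example, and the names AnnotationSketch, Sample and inducedFields are made up for illustration. It only shows the renaming and skipping rules, not real schema induction.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

public class AnnotationSketch {
    // Hypothetical stand-ins for the proposed annotations (not the Avro classes).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface AvroName { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface AvroIgnore {}

    // Example class: one renamed field, one ignored, one transient, one plain.
    static class Sample {
        @AvroName("foo") int bar;
        @AvroIgnore int scratch;
        transient int cache;
        long kept;
    }

    /** Maps each serialized field name to its Java type name, sorted by name. */
    static Map<String, String> inducedFields(Class<?> type) {
        Map<String, String> fields = new TreeMap<>();
        for (Field f : type.getDeclaredFields()) {
            int mods = f.getModifiers();
            // @AvroIgnore fields are skipped exactly like transient ones.
            if (f.isSynthetic() || Modifier.isStatic(mods) || Modifier.isTransient(mods)
                    || f.isAnnotationPresent(AvroIgnore.class)) {
                continue;
            }
            // @AvroName replaces the Java field name in the induced field list.
            AvroName rename = f.getAnnotation(AvroName.class);
            String name = (rename != null) ? rename.value() : f.getName();
            fields.put(name, f.getType().getName());
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(inducedFields(Sample.class));
    }
}
```

The result is kept in a TreeMap so the printed order does not depend on the order in which reflection returns the declared fields.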