[jira] [Commented] (AVRO-2247) Improve Java reading performance with a new reader

ASF GitHub Bot (JIRA) Tue, 27 Nov 2018 22:05:04 -0800


    [ 
https://issues.apache.org/jira/browse/AVRO-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701425#comment-16701425
 ]


ASF GitHub Bot commented on AVRO-2247:
--------------------------------------

rstata commented on issue #391: AVRO-2247 - improved java reading performance 
with new reader
URL: https://github.com/apache/avro/pull/391#issuecomment-442330623
 
 
   I've run your code against `Perf.java` and uploaded the 
   [results 
here](https://github.com/apache/avro/files/2623075/AVRO-2247-Perf-results-11-27.pdf).
  This report contains two sets of results:
   
   * The "avro-2247 (calibration)" column presents the results of running the 
2247 branch against itself three different times.  These results are useful for 
understanding where the Perf.java benchmark tends to have a lot of internal 
variability.  As an example, the BooleanRead/Write tests show a lot of natural 
variability, which is something I've noticed in much of my previous performance 
testing.
   
   * The "avro-2274 (w/ custom coders) vs" column presents the results of 
running three different treatments against my avro-2274 branch.  The three 
sub-columns here are as follows: "master" is the Apache Avro master branch 
(just prior to avro-2274 being merged into it); "2247 (off)" is the 2247 
branch with fast-coder turned off; "2247 (on)" is the 2247 branch with 
fast-coder turned on.
   
   The last sub-column of the "avro-2274 (...) vs" results is the most relevant.  
What we see here is a large number of record-related cases showing speedups of 
20-30% and even more.  This is very promising.
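
   One way to read the calibration column against the treatment columns is to 
treat the spread between repeated runs of the same branch as a noise floor and 
only trust differences that exceed it.  The sketch below is illustrative only: 
the class and method names are hypothetical, it assumes lower-is-better 
timings, and it is not taken from `Perf.java` or from the report.

```java
import java.util.Arrays;

/** Hypothetical helper for reading benchmark tables; not part of Avro or Perf.java. */
final class PerfNoiseSketch {

    /** Relative spread of repeated runs of the same code: (max - min) / min. */
    static double relativeSpread(double[] runs) {
        double min = Arrays.stream(runs).min()
                .orElseThrow(() -> new IllegalArgumentException("no runs"));
        double max = Arrays.stream(runs).max()
                .orElseThrow(() -> new IllegalArgumentException("no runs"));
        return (max - min) / min;
    }

    /** Fractional reduction in time; positive means the treatment is faster. */
    static double speedup(double baselineTime, double treatmentTime) {
        return (baselineTime - treatmentTime) / baselineTime;
    }

    /** True when the measured change is larger than the run-to-run spread. */
    static boolean exceedsNoise(double speedup, double spread) {
        return Math.abs(speedup) > spread;
    }
}
```

   Under this reading, a test whose calibration runs already differ by several 
percent cannot support a claim about a change of similar size in either 
direction.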
   
   I am currently running the JMH-based benchmarks.  These do _not_ have an 
(obvious) mechanism for comparing the "before/after" performance of your 
proposed changes, but I will be interested in seeing whether they do better at 
reducing the variance between runs.
   
   I haven't inspected your code yet.  I'll do that as well, and offer some 
opinions.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Improve Java reading performance with a new reader
> --------------------------------------------------
>
>                 Key: AVRO-2247
>                 URL: https://issues.apache.org/jira/browse/AVRO-2247
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Martin Jubelgas
>            Priority: Major
>             Fix For: 1.9.0
>
>         Attachments: Perf-Comparison.md
>
>
> Complementary to AVRO-2090, I have been working on decoding of Avro objects 
> in Java and am suggesting a new implementation of a DatumReader that improves 
> read performance for both generic and specific records by approximately 20% 
> (and even more in cases of nested objects with defaults, a case I encounter a 
> lot in practical use).
> Key concept is to create a detailed execution plan once at DatumReader. This 
> execution plan contains all required defaulting/lookup values so they need 
> not be looked up during object traversal while reading.
> The reader implementation can be enabled and disabled per GenericData 
> instance. The system default is set via the system variable 
> "org.apache.avro.fastread" (defaults to "false").
> Attached is a performance comparison of the existing implementation with the 
> proposed one. Will open a pull request with respective code in a bit (not 
> including interoperability with the optimizations of AVRO-2090 yet). Please 
> let me know your opinion of whether this is worth pursuing further.
>  
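
The idea in the description above (resolve defaults and lookups once, then 
reuse the result for every record) can be sketched roughly as follows.  This is 
a simplified illustration, not the code from the pull request: `ReadPlanSketch`, 
`Step`, `buildPlan` and `read` are hypothetical names, records are modelled as 
plain maps rather than Avro schemas and decoders, and the way the 
"org.apache.avro.fastread" property is read here is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a precomputed read plan; not Avro's API. */
final class ReadPlanSketch {

    /** One precomputed step: copy a field from the input or fill in a default. */
    interface Step {
        void apply(Map<String, Object> wire, Map<String, Object> out);
    }

    /**
     * Built once per (writer fields, reader fields) pair. All decisions about
     * which fields need defaults are made here, not while reading records.
     */
    static List<Step> buildPlan(List<String> writerFields,
                                Map<String, Object> readerFieldsWithDefaults) {
        List<Step> plan = new ArrayList<>();
        for (Map.Entry<String, Object> e : readerFieldsWithDefaults.entrySet()) {
            final String name = e.getKey();
            if (writerFields.contains(name)) {
                // Field was written: copy it through.
                plan.add((wire, out) -> out.put(name, wire.get(name)));
            } else {
                // Field is missing from the writer: the default is captured now.
                final Object defaultValue = e.getValue();
                plan.add((wire, out) -> out.put(name, defaultValue));
            }
        }
        return plan;
    }

    /** Per-record work is just running the steps; no lookups remain. */
    static Map<String, Object> read(List<Step> plan, Map<String, Object> wire) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Step step : plan) {
            step.apply(wire, out);
        }
        return out;
    }

    /** Assumed way of reading the system-wide default; "false" when unset. */
    static boolean fastReadDefault() {
        return Boolean.parseBoolean(
                System.getProperty("org.apache.avro.fastread", "false"));
    }
}
```

The cost of comparing writer and reader fields is paid once in `buildPlan`, so 
the benefit grows with the number of records read and with the number of 
defaulted fields per record.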



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
