steveloughran commented on PR #3562: URL: https://github.com/apache/parquet-java/pull/3562#issuecomment-4591384114
Latest status. Ran benchmarks unattended; there is still a lot of variation, to the point where some tests which only write data are reported as serializing statistically significantly faster. That's a codepath which isn't being updated, which implies that system/other work is interfering.

Results: https://github.com/steveloughran/benchmarking-variants/tree/main/json/hardening

[final hardened json](https://github.com/steveloughran/benchmarking-variants/blob/main/json/hardening/2026-06-01-parquet-hardened-01.json) and [baseline master](https://github.com/steveloughran/benchmarking-variants/blob/main/json/hardening/2026-05-31-parquet-master-02.json)

Comparing these with [JMH tabulate](https://github.com/steveloughran/jmh-tabulate) (important: use my fork as it strictly enforces a safe version of the charting.js lib from npm):

<img width="1543" height="627" alt="Screenshot 2026-06-01 at 10 18 59" src="https://github.com/user-attachments/assets/cea32db7-e770-40f8-a552-59eadc6205b5" />

This looks like a slowdown, but filter on statistical significance and most vanish; of the three which are significant, *two are writing data, not reading it*.

<img width="1808" height="952" alt="Screenshot 2026-06-01 at 10 17 32" src="https://github.com/user-attachments/assets/1d0de7ac-59ef-4e74-8883-3599d5c0584d" />

The two `write` variants are just serializing the prebuilt variant, so they are not stressing the modified code. The reader benchmark does stress it, on a deeply nested structure, but there I'm not sure anything is showing.

<img width="1554" height="1551" alt="Screenshot 2026-06-01 at 10 39 29" src="https://github.com/user-attachments/assets/f7764715-fa34-41be-8964-a4140ab72c5a" />

Summary: even though some minor, statistically significant slowdown is being reported, the fact that speedups in unmodified codepaths are also observed tells me the results aren't reliable.
Whatever changes are being made here, they aren't actually measurable in the new dataset, and general OS/execution/JVM noise is more of a factor.

+ updated all the constructor javadocs
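
For anyone who wants to repeat the significance filtering without the charting tool, here is a minimal sketch. It assumes the two files are standard JMH JSON reports (as written by `-rf json`), where each entry carries `benchmark`, `mode`, optional `params`, and a `primaryMetric` with `score` and `scoreError`. It treats a difference as significant only when the two `score ± scoreError` intervals do not overlap. That rule is an assumption made for illustration; I have not checked that JMH tabulate applies the same test. The function names are hypothetical.

```python
import json


def load_results(path):
    """Load a JMH JSON report, which is a list of benchmark entries."""
    with open(path) as f:
        return json.load(f)


def entry_key(entry):
    """Identify a benchmark by name, mode and parameter values."""
    params = entry.get("params") or {}
    return (entry["benchmark"], entry["mode"], tuple(sorted(params.items())))


def interval(metric):
    """Return (low, high) for score +/- scoreError.

    JMH may emit "NaN" as a string; float() accepts that, and every
    comparison against NaN is False, so such rows are never flagged.
    """
    score = float(metric["score"])
    error = float(metric["scoreError"])
    return score - error, score + error


def compare(baseline, candidate):
    """Pair up entries and flag those whose intervals do not overlap.

    Assumed rule, not necessarily the one used by any particular tool.
    The sign of change_pct is the raw change in score; whether that is
    good or bad depends on the benchmark mode (thrpt vs avgt etc.).
    """
    base = {entry_key(e): e["primaryMetric"] for e in baseline}
    rows = []
    for e in candidate:
        key = entry_key(e)
        if key not in base:
            continue  # benchmark only present in one of the two runs
        b, c = base[key], e["primaryMetric"]
        b_lo, b_hi = interval(b)
        c_lo, c_hi = interval(c)
        b_score, c_score = float(b["score"]), float(c["score"])
        rows.append({
            "benchmark": e["benchmark"],
            "params": dict(key[2]),
            "change_pct": (c_score - b_score) / b_score * 100.0,
            "significant": c_lo > b_hi or c_hi < b_lo,
        })
    return rows
```

Calling `compare(load_results(baseline_path), load_results(candidate_path))` on the two JSON files and keeping only the rows with `significant` set gives a rough equivalent of the filtered view. Rows flagged for benchmarks whose codepath was not touched are a hint that run-to-run noise, not the change, is being measured.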
