gortiz opened a new pull request, #15977: URL: https://github.com/apache/pinot/pull/15977
This PR fixes the performance regression detected in https://github.com/apache/pinot/pull/15609. Unlike https://github.com/apache/pinot/pull/15967, this change does not reduce stats precision, and it may also improve performance in other situations (for example, when using pipeline breakers or the new physical planner).

The root cause of the regression in https://github.com/apache/pinot/pull/15609 is that we keep references to MultiStageOperators. Some of these operators (specifically joins, aggregates and window functions) are implemented in a way that keeps large maps on the heap as attributes. These maps are used to compute blocks while the operator is executing and are no longer used once EOS is returned, but they stay reachable for as long as the operator itself is referenced.

Local variables could replace these attributes, but that would require adding new parameters to several methods. To keep the change simple, this PR instead sets these attributes to null once they are no longer needed. In a future redesign, we may want to separate the operator class from its execution (in the same way we have plan nodes and operators in SSE).
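The idea of nulling out an attribute after EOS can be sketched as follows. This is a minimal illustration, not the code in this PR: `SketchAggregateOperator`, `nextBlock`, `holdsState` and `getNumGroups` are hypothetical names, a `null` return stands in for the EOS block, and the real MultiStageOperator API is not used.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a stateful operator; not Pinot's actual MultiStageOperator API.
final class SketchAggregateOperator {
  private final List<String> _input;
  // Potentially large state, needed only while blocks are being produced.
  private Map<String, Long> _groupCounts = new HashMap<>();
  private boolean _dataReturned;
  // Small stat that stays available after the large state is released.
  private long _numGroups;

  SketchAggregateOperator(List<String> input) {
    _input = input;
  }

  // Returns the aggregated block on the first call and null (standing in for EOS) afterwards.
  Map<String, Long> nextBlock() {
    if (!_dataReturned) {
      for (String key : _input) {
        _groupCounts.merge(key, 1L, Long::sum);
      }
      _numGroups = _groupCounts.size();
      _dataReturned = true;
      return new HashMap<>(_groupCounts);
    }
    // EOS: drop the reference so the map can be collected even if this operator stays referenced.
    _groupCounts = null;
    return null;
  }

  boolean holdsState() {
    return _groupCounts != null;
  }

  long getNumGroups() {
    return _numGroups;
  }

  public static void main(String[] args) {
    SketchAggregateOperator op = new SketchAggregateOperator(List.of("a", "b", "a"));
    System.out.println("groups in block: " + op.nextBlock().size());
    op.nextBlock(); // EOS, releases the map
    System.out.println("state held after EOS: " + op.holdsState());
    System.out.println("numGroups stat: " + op.getNumGroups());
  }
}
```

In this sketch the operator object can stay referenced (for example, to read its stats later) while the large map becomes eligible for garbage collection as soon as EOS has been returned.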
