Hi Rossi,
It is supposed to be used by every node that needs a temporary array. Other nodes don't use it yet because we haven't performed the refactor.
Sasha

> On March 9, 2023, at 21:57, Ruoxi Sun <zanmato1...@gmail.com> wrote:
>
> Hi Sasha, thanks for the kind reply. Yeah, it makes sense to use
> thread local data to reduce the vector allocation/deallocation overhead.
> However, I'm still wondering if this thread local data has to be in
> QueryContext? Specifically, there is thread local state
> <https://github.com/apache/arrow/blob/ad44e8e4e669019299dc56b37d24d2976588b648/cpp/src/arrow/compute/exec/swiss_join.cc#L2505>
> within SwissJoin already. Does it make sense to put the thread local vector
> inside SwissJoin rather than QueryContext? Or is the thread local data in
> QueryContext designed to be used inter-node?
>
> Thanks.
>
> *Rossi Sun*
>
>
> Sasha Krassovsky <krassovskysa...@gmail.com> wrote on Fri, Mar 10, 2023 at 01:54:
>
>> Hi Rossi,
>> When profiling Acero we noticed that there was a lot of overhead regarding
>> memory allocation, specifically in the creation/destruction of std::vector.
>> This thread local data in QueryContext was put there as a preparation to
>> refactor other nodes to use TempVectorStack when they need a temporary
>> block of memory.
>>
>> Hope this helps,
>> Sasha
>>
>>>> On March 9, 2023, at 09:11, Ruoxi Sun <zanmato1...@gmail.com> wrote:
>>>
>>> Hi folks,
>>>
>>> I see that the member `tld_`
>>> <https://github.com/apache/arrow/blob/0ac0f733ff61f2db45cbff54def8768b3ceb8a9d/cpp/src/arrow/compute/exec/query_context.h#L150>
>>> in class `QueryContext` is used by `BloomFilterPushdownContext`
>>> <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/hash_join_node.cc>
>>> and `SwissJoin`
>>> <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/swiss_join.cc>,
>>> both of which are parts of the hash join node.
>>>
>>> I'm wondering if there is any particular reason to design it this way. It
>>> seems reasonable to move it into the hash join node - in which there exist
>>> per-thread states - and subsequently pass it down to the
>>> `BloomFilterPushdownContext` and `SwissJoin`. This way the `QueryContext`
>>> could stay thread-local-state-agnostic.
>>>
>>> Please help. Thanks in advance.
>>>
>>> *Rossi Sun*
>>
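
For illustration, the idea discussed in this thread (replacing a fresh std::vector per call with a temporary array taken from per-thread scratch memory) can be sketched roughly as follows. This is only a minimal sketch under assumptions: `ScratchStack`, `ScratchArray`, `ThreadLocalScratch` and `SumOfSquares` are hypothetical names, not Arrow's `TempVectorStack` API, and C++ `thread_local` is used purely for brevity, which may differ from how `QueryContext` stores `tld_`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified per-thread scratch stack (NOT Arrow's API).
// The backing buffer is allocated once; Push/Pop only move an offset.
class ScratchStack {
 public:
  explicit ScratchStack(std::size_t capacity_bytes)
      : buffer_(capacity_bytes), top_(0) {}

  // Reserve num_bytes at the top of the stack; returns nullptr when full.
  std::uint8_t* Push(std::size_t num_bytes) {
    const std::size_t padded = Padded(num_bytes);
    if (padded > buffer_.size() - top_) return nullptr;
    std::uint8_t* out = buffer_.data() + top_;
    top_ += padded;
    return out;
  }

  // Release the most recent reservation; callers must release in LIFO order.
  void Pop(std::size_t num_bytes) {
    const std::size_t padded = Padded(num_bytes);
    assert(padded <= top_);
    top_ -= padded;
  }

  std::size_t used() const { return top_; }

 private:
  // Round sizes up so that consecutive reservations stay aligned.
  static std::size_t Padded(std::size_t n) {
    constexpr std::size_t kAlign = alignof(std::max_align_t);
    return (n + kAlign - 1) / kAlign * kAlign;
  }

  std::vector<std::uint8_t> buffer_;
  std::size_t top_;
};

// RAII handle for a temporary array of trivially constructible T.
// The memory goes back to the stack when the handle leaves scope.
template <typename T>
class ScratchArray {
 public:
  ScratchArray(ScratchStack* stack, std::size_t length)
      : stack_(stack),
        num_bytes_(length * sizeof(T)),
        data_(reinterpret_cast<T*>(stack->Push(num_bytes_))) {}
  ~ScratchArray() {
    if (data_ != nullptr) stack_->Pop(num_bytes_);
  }
  ScratchArray(const ScratchArray&) = delete;
  ScratchArray& operator=(const ScratchArray&) = delete;

  T* data() { return data_; }

 private:
  ScratchStack* stack_;
  std::size_t num_bytes_;
  T* data_;
};

// One scratch stack per thread, created lazily on first use.
inline ScratchStack& ThreadLocalScratch() {
  thread_local ScratchStack stack(64 * 1024);
  return stack;
}

// Example of node-like work that needs a temporary array per call but
// performs no heap allocation once the thread's stack exists.
inline std::uint64_t SumOfSquares(const std::uint32_t* values, std::size_t n) {
  ScratchArray<std::uint64_t> squares(&ThreadLocalScratch(), n);
  assert(squares.data() != nullptr && "scratch stack exhausted");
  for (std::size_t i = 0; i < n; ++i) {
    squares.data()[i] = static_cast<std::uint64_t>(values[i]) * values[i];
  }
  std::uint64_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += squares.data()[i];
  return sum;
}
```

Whether such a stack lives in a query-wide context or inside a single node is exactly the design question raised above. The sketch works the same either way, since the node only needs a pointer to the stack belonging to the current thread.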