Got it, that makes sense. Thanks for the answer, really appreciate it!
*Rossi Sun*

On Fri, Mar 10, 2023 at 14:38, Sasha Krassovsky <krassovskysa...@gmail.com> wrote:

> Hi Rossi,
> It is supposed to be used by every node that needs a temporary array. It
> is not used that way yet because we haven't performed the refactor.
>
> Sasha
>
> > On Mar 9, 2023, at 21:57, Ruoxi Sun <zanmato1...@gmail.com> wrote:
> >
> > Hi Sasha, thanks for the kind reply. Yeah, it makes sense to use
> > thread local data to reduce the vector allocation/deallocation overhead.
> > However, I'm still wondering whether this thread local data has to be in
> > QueryContext. Specifically, there is already thread local state
> > <https://github.com/apache/arrow/blob/ad44e8e4e669019299dc56b37d24d2976588b648/cpp/src/arrow/compute/exec/swiss_join.cc#L2505>
> > within SwissJoin. Does it make sense to put the thread local vector
> > inside SwissJoin rather than QueryContext? Or is the thread local data in
> > QueryContext designed to be used inter-node?
> >
> > Thanks.
> >
> > *Rossi Sun*
> >
> >
> > On Fri, Mar 10, 2023 at 01:54, Sasha Krassovsky <krassovskysa...@gmail.com> wrote:
> >
> >> Hi Rossi,
> >> When profiling Acero we noticed that there was a lot of overhead regarding
> >> memory allocation, specifically in the creation/destruction of std::vector.
> >> This thread local data in QueryContext was put there as a preparation to
> >> refactor other nodes to use TempVectorStack when they need a temporary
> >> block of memory.
> >>
> >> Hope this helps,
> >> Sasha
> >>
> >>> On Mar 9, 2023, at 09:11, Ruoxi Sun <zanmato1...@gmail.com> wrote:
> >>>
> >>> Hi folks,
> >>>
> >>> I see that the member `tld_`
> >>> <https://github.com/apache/arrow/blob/0ac0f733ff61f2db45cbff54def8768b3ceb8a9d/cpp/src/arrow/compute/exec/query_context.h#L150>
> >>> in class `QueryContext` is used by `BloomFilterPushdownContext`
> >>> <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/hash_join_node.cc>
> >>> and `SwissJoin`
> >>> <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/swiss_join.cc>,
> >>> both of which are parts of the hash join node.
> >>>
> >>> I'm wondering if there is any particular reason to design it this way. It
> >>> seems reasonable to move it into the hash join node - in which there exist
> >>> per-thread states - and subsequently pass it down to the
> >>> `BloomFilterPushdownContext` and `SwissJoin`. This way the `QueryContext`
> >>> could stay thread-local-state-agnostic.
> >>>
> >>> Please help. Thanks in advance.
> >>>
> >>> *Rossi Sun*
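
For anyone reading this thread later, the idea discussed above (giving each thread a reusable block of scratch memory so that nodes do not construct and destroy a std::vector for every temporary array) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Arrow's actual TempVectorStack or QueryContext code: `ScratchStack`, `ScratchArray`, `ThreadLocalScratch`, and `SumOfSquares` are hypothetical names, and the 8-byte alignment and fixed capacity are arbitrary choices made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical sketch; none of these names are Arrow APIs.
// A fixed-capacity bump allocator: the buffer is allocated once, and
// temporary arrays are carved out of it in LIFO order.
class ScratchStack {
 public:
  explicit ScratchStack(std::size_t capacity_bytes)
      : buffer_(capacity_bytes), top_(0) {}

  // Reserve num_bytes (rounded up to a multiple of 8) at the top of the stack.
  std::uint8_t* Push(std::size_t num_bytes) {
    const std::size_t padded = AlignUp(num_bytes);
    assert(top_ + padded <= buffer_.size() && "scratch stack overflow");
    std::uint8_t* out = buffer_.data() + top_;
    top_ += padded;
    return out;
  }

  // Release the most recent reservation of num_bytes.
  void Pop(std::size_t num_bytes) {
    const std::size_t padded = AlignUp(num_bytes);
    assert(padded <= top_);
    top_ -= padded;
  }

  std::size_t used() const { return top_; }

 private:
  static std::size_t AlignUp(std::size_t n) {
    return (n + 7) & ~static_cast<std::size_t>(7);
  }

  std::vector<std::uint8_t> buffer_;
  std::size_t top_;
};

// RAII handle for one temporary array; releases its slice on destruction.
// Handles must be destroyed in reverse order of creation (LIFO).
template <typename T>
class ScratchArray {
  static_assert(std::is_trivial<T>::value, "sketch supports trivial types only");

 public:
  ScratchArray(ScratchStack* stack, std::size_t length)
      : stack_(stack),
        length_(length),
        data_(reinterpret_cast<T*>(stack->Push(length * sizeof(T)))) {}
  ~ScratchArray() { stack_->Pop(length_ * sizeof(T)); }

  ScratchArray(const ScratchArray&) = delete;
  ScratchArray& operator=(const ScratchArray&) = delete;

  T* data() { return data_; }
  std::size_t size() const { return length_; }

 private:
  ScratchStack* stack_;
  std::size_t length_;
  T* data_;
};

// One stack per thread, indexed by a caller-supplied thread index, so
// threads never share a stack and no locking is needed.
class ThreadLocalScratch {
 public:
  ThreadLocalScratch(std::size_t num_threads, std::size_t bytes_per_thread) {
    stacks_.reserve(num_threads);
    for (std::size_t i = 0; i < num_threads; ++i) {
      stacks_.emplace_back(bytes_per_thread);
    }
  }

  ScratchStack* Get(std::size_t thread_index) { return &stacks_[thread_index]; }

 private:
  std::vector<ScratchStack> stacks_;
};

// Example caller: uses a temporary array from the calling thread's stack
// instead of allocating a std::vector on every call.
std::uint64_t SumOfSquares(ThreadLocalScratch* tls, std::size_t thread_index,
                           const std::uint32_t* values, std::size_t n) {
  ScratchArray<std::uint64_t> squares(tls->Get(thread_index), n);
  for (std::size_t i = 0; i < n; ++i) {
    squares.data()[i] = static_cast<std::uint64_t>(values[i]) * values[i];
  }
  std::uint64_t total = 0;
  for (std::size_t i = 0; i < n; ++i) total += squares.data()[i];
  return total;
}
```

In this sketch the per-thread container is a standalone object, so it could be owned by a query-wide context (shared by every node) or by a single node; which owner is appropriate is exactly the design question raised in the thread.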