Hi Rossi,

When profiling Acero, we noticed a lot of memory-allocation overhead, specifically in the creation and destruction of std::vector. The thread-local data in QueryContext was put there in preparation for refactoring other nodes to use TempVectorStack whenever they need a temporary block of memory.
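
To illustrate the general idea (this is not Arrow's actual code), here is a minimal C++ sketch of a per-thread scratch stack that hands out temporary memory without touching the heap on each call. The names ScratchStack, LocalScratch, ScratchBytes and CountAtLeast are hypothetical and made up for this example; the real TempVectorStack interface may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-thread scratch stack (illustration only, not Arrow's
// TempVectorStack). The backing buffer is allocated once and then reused.
class ScratchStack {
 public:
  explicit ScratchStack(std::size_t capacity) : buffer_(capacity) {}

  // Hand out the next `n` bytes; no heap allocation happens here.
  std::uint8_t* Push(std::size_t n) {
    assert(n <= buffer_.size() - top_);  // sketch: no overflow handling
    std::uint8_t* p = buffer_.data() + top_;
    top_ += n;
    return p;
  }

  // Give back the most recent `n` bytes (strict LIFO order).
  void Pop(std::size_t n) {
    assert(n <= top_);
    top_ -= n;
  }

  std::size_t used() const { return top_; }

 private:
  std::vector<std::uint8_t> buffer_;
  std::size_t top_ = 0;
};

// One stack per thread, created on first use (the 64 KiB size is arbitrary).
ScratchStack& LocalScratch() {
  thread_local ScratchStack stack(64 * 1024);
  return stack;
}

// RAII guard: reserves bytes on construction, releases them on destruction.
class ScratchBytes {
 public:
  explicit ScratchBytes(std::size_t n)
      : n_(n), data_(LocalScratch().Push(n)) {}
  ~ScratchBytes() { LocalScratch().Pop(n_); }
  ScratchBytes(const ScratchBytes&) = delete;
  ScratchBytes& operator=(const ScratchBytes&) = delete;

  std::uint8_t* data() { return data_; }

 private:
  std::size_t n_;
  std::uint8_t* data_;
};

// Example hot-path helper: the temporary flag array comes from the scratch
// stack instead of a freshly constructed std::vector<uint8_t>(n).
std::size_t CountAtLeast(const std::uint32_t* values, std::size_t n,
                         std::uint32_t threshold) {
  ScratchBytes flags(n);
  for (std::size_t i = 0; i < n; ++i) {
    flags.data()[i] = values[i] >= threshold ? 1 : 0;
  }
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i) {
    count += flags.data()[i];
  }
  return count;
}
```

Calling CountAtLeast repeatedly on the same thread reuses the same buffer, so the per-call cost is a pointer bump rather than an allocation and a free. The trade-off is that temporaries must be released in LIFO order and the capacity has to be sized up front.
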

Hope this helps,
Sasha

> On March 9, 2023, at 09:11, Ruoxi Sun <zanmato1...@gmail.com> wrote:
>
> Hi folks,
>
> I see that the member `tld_
> <https://github.com/apache/arrow/blob/0ac0f733ff61f2db45cbff54def8768b3ceb8a9d/cpp/src/arrow/compute/exec/query_context.h#L150>`
> in class `QueryContext` is used by `BloomFilterPushdownContext
> <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/hash_join_node.cc>`
> and `SwissJoin
> <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/swiss_join.cc>`,
> both of which are parts of hash join node.
>
> I'm wondering if there is any particular reason to design it this way. It
> seems reasonable to move it into hash join node - in which there exists
> per-thread states - and subsequently pass it down to the
> `BloomFilterPushdownContext` and `SwissJoin`. This way the `QueryContext`
> could stay thread-local-state-agnostic.
>
> Please help. Thanks in advance.
>
> *Rossi Sun*