Hi Rossi,
It is supposed to be used by every node that needs a temporary array. It is not
used more widely yet because we haven’t performed that refactor.
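
To make the idea concrete, here is a minimal sketch of a per-thread scratch
stack that hands out temporary arrays without touching the heap on each call.
It is an illustration only, not Acero's actual code: ScratchStack,
ScratchArray, Context and SumOfSquares are made-up names, and indexing the
stacks by a thread index is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a per-thread scratch stack (NOT the real
// TempVectorStack API). One buffer is allocated up front and then reused.
class ScratchStack {
 public:
  explicit ScratchStack(std::size_t capacity_bytes) : buffer_(capacity_bytes) {}

  // Reserve num_bytes at the top of the stack; no heap allocation happens here.
  std::uint8_t* Push(std::size_t num_bytes) {
    const std::size_t padded = Padded(num_bytes);
    assert(top_ + padded <= buffer_.size() && "scratch stack overflow");
    std::uint8_t* out = buffer_.data() + top_;
    top_ += padded;
    return out;
  }

  // Release the most recent reservation (strict LIFO order is required).
  void Pop(std::size_t num_bytes) {
    const std::size_t padded = Padded(num_bytes);
    assert(padded <= top_);
    top_ -= padded;
  }

  std::size_t used() const { return top_; }

 private:
  // Round sizes up to 16 bytes so successive reservations stay aligned.
  static std::size_t Padded(std::size_t n) { return (n + 15) / 16 * 16; }

  std::vector<std::uint8_t> buffer_;
  std::size_t top_ = 0;
};

// RAII handle: reserves on construction, releases on destruction.
template <typename T>
class ScratchArray {
  static_assert(std::is_trivial<T>::value, "sketch only supports trivial types");

 public:
  ScratchArray(ScratchStack* stack, std::size_t length)
      : stack_(stack),
        num_bytes_(length * sizeof(T)),
        data_(reinterpret_cast<T*>(stack->Push(num_bytes_))) {}
  ~ScratchArray() { stack_->Pop(num_bytes_); }
  ScratchArray(const ScratchArray&) = delete;
  ScratchArray& operator=(const ScratchArray&) = delete;

  T* data() { return data_; }

 private:
  ScratchStack* stack_;
  std::size_t num_bytes_;
  T* data_;
};

// Hypothetical query-wide context owning one scratch stack per thread
// (assumption: each worker only ever uses the stack at its own index).
class Context {
 public:
  Context(std::size_t num_threads, std::size_t bytes_per_thread)
      : stacks_(num_threads, ScratchStack(bytes_per_thread)) {}

  ScratchStack* GetStack(std::size_t thread_index) {
    return &stacks_[thread_index];
  }

 private:
  std::vector<ScratchStack> stacks_;
};

// Example "node" work that needs a temporary array; it borrows the memory
// from the calling thread's stack instead of creating a std::vector.
std::int64_t SumOfSquares(Context* ctx, std::size_t thread_index,
                          const std::int32_t* values, std::size_t n) {
  ScratchArray<std::int64_t> squares(ctx->GetStack(thread_index), n);
  for (std::size_t i = 0; i < n; ++i) {
    squares.data()[i] = static_cast<std::int64_t>(values[i]) * values[i];
  }
  std::int64_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += squares.data()[i];
  return sum;
}
```

Because the stacks live in the shared context rather than inside one node,
any node running on a given thread could borrow from the same buffer, which
is the point of the planned refactor.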

Sasha

> On March 9, 2023, at 21:57, Ruoxi Sun <zanmato1...@gmail.com> wrote:
> 
> Hi Sasha, thanks for the kind reply. Yes, using thread local data to
> reduce the vector allocation/deallocation overhead makes sense.
> However, I'm still wondering whether this thread local data has to be in
> QueryContext. Specifically, there is thread local state
> <https://github.com/apache/arrow/blob/ad44e8e4e669019299dc56b37d24d2976588b648/cpp/src/arrow/compute/exec/swiss_join.cc#L2505>
> within SwissJoin already; would it make sense to put the thread local vector
> inside SwissJoin rather than QueryContext? Or is the thread local data in
> QueryContext designed to be shared across nodes?
> 
> Thanks.
> 
> *Rossi Sun*
> 
> 
> Sasha Krassovsky <krassovskysa...@gmail.com> wrote on Fri, Mar 10, 2023 at 01:54:
> 
>> Hi Rossi,
>> When profiling Acero we noticed a lot of overhead from memory allocation,
>> specifically from the creation/destruction of std::vector.
>> This thread local data in QueryContext was put there in preparation for
>> refactoring other nodes to use TempVectorStack when they need a temporary
>> block of memory.
>> 
>> Hope this helps,
>> Sasha
>> 
>>> On March 9, 2023, at 09:11, Ruoxi Sun <zanmato1...@gmail.com> wrote:
>>> 
>>> Hi folks,
>>> 
>>> I see that the member `tld_`
>>> <https://github.com/apache/arrow/blob/0ac0f733ff61f2db45cbff54def8768b3ceb8a9d/cpp/src/arrow/compute/exec/query_context.h#L150>
>>> in class `QueryContext` is used by `BloomFilterPushdownContext`
>>> <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/hash_join_node.cc>
>>> and `SwissJoin`
>>> <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/swiss_join.cc>,
>>> both of which are parts of hash join node.
>>> 
>>> I'm wondering if there is any particular reason to design it this way. It
>>> seems reasonable to move it into the hash join node, which already has
>>> per-thread state, and subsequently pass it down to the
>>> `BloomFilterPushdownContext` and `SwissJoin`. This way the `QueryContext`
>>> could stay thread-local-state-agnostic.
>>> 
>>> Please help. Thanks in advance.
>>> 
>>> *Rossi Sun*
>> 
