Hi Sasha, thanks for the kind reply. Yes, it makes sense to use
thread-local data to reduce the vector allocation/deallocation overhead.
However, I'm still wondering whether this thread-local data has to live in
QueryContext. Specifically, there is already thread-local state
<https://github.com/apache/arrow/blob/ad44e8e4e669019299dc56b37d24d2976588b648/cpp/src/arrow/compute/exec/swiss_join.cc#L2505>
within SwissJoin. Would it make sense to put the thread-local vector
inside SwissJoin rather than QueryContext? Or is the thread-local data in
QueryContext designed to be used across nodes?
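
To make the question concrete, here is a minimal sketch of the two ownership
options. Every name in it (ScratchStack, PerThreadStacks, SharedContext,
JoinNodeLike, ProcessBatch) is hypothetical and only illustrates the idea; it
is not Arrow's actual TempVectorStack or QueryContext API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-thread scratch allocator. This is NOT
// Arrow's TempVectorStack, only an illustration of the idea.
class ScratchStack {
 public:
  explicit ScratchStack(std::size_t capacity) : buffer_(capacity), top_(0) {}

  // Hand out `n` bytes from the preallocated buffer, or nullptr if it is full.
  std::uint8_t* Push(std::size_t n) {
    if (n > buffer_.size() - top_) return nullptr;
    std::uint8_t* p = buffer_.data() + top_;
    top_ += n;
    return p;
  }

  // Release the most recent `n` bytes (callers must pop in LIFO order).
  void Pop(std::size_t n) { top_ -= n; }

  std::size_t used() const { return top_; }

 private:
  std::vector<std::uint8_t> buffer_;
  std::size_t top_;
};

// One scratch stack per worker thread, indexed by thread index.
struct PerThreadStacks {
  PerThreadStacks(std::size_t num_threads, std::size_t capacity)
      : stacks(num_threads, ScratchStack(capacity)) {}
  ScratchStack& Get(std::size_t thread_index) { return stacks[thread_index]; }
  std::vector<ScratchStack> stacks;
};

// Option A (hypothetical): the query-wide context owns the stacks, so any
// node in the plan can borrow the stack of the thread it is running on.
struct SharedContext {
  SharedContext(std::size_t num_threads, std::size_t capacity)
      : tld(num_threads, capacity) {}
  PerThreadStacks tld;
};

// Option B (hypothetical): a single node owns its stacks privately, and the
// query-wide context stays unaware of any thread-local state.
struct JoinNodeLike {
  JoinNodeLike(std::size_t num_threads, std::size_t capacity)
      : local(num_threads, capacity) {}
  PerThreadStacks local;
};

// Work that needs `n` temporary bytes. It borrows them from the stack instead
// of constructing and destroying a std::vector on every call. Returns the
// stack usage while the buffer was held, or 0 if the stack was too small.
std::size_t ProcessBatch(ScratchStack& stack, std::size_t n) {
  std::uint8_t* tmp = stack.Push(n);
  if (tmp == nullptr) return 0;
  for (std::size_t i = 0; i < n; ++i) tmp[i] = static_cast<std::uint8_t>(i);
  std::size_t held = stack.used();
  stack.Pop(n);
  return held;
}
```

In both options ProcessBatch looks the same; the only difference is which
object owns the stacks and therefore which other nodes can reuse them.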

Thanks.

*Rossi Sun*


On Fri, Mar 10, 2023 at 01:54, Sasha Krassovsky <krassovskysa...@gmail.com> wrote:

> Hi Rossi,
> When profiling Acero we noticed that there was a lot of overhead regarding
> memory allocation, specifically in the creation/destruction of std::vector.
> This thread local data in QueryContext was put there as a preparation to
> refactor other nodes to use TempVectorStack when they need a temporary
> block of memory.
>
> Hope this helps,
> Sasha
>
> > On March 9, 2023, at 09:11, Ruoxi Sun <zanmato1...@gmail.com> wrote:
> >
> > Hi folks,
> >
> > I see that the member `tld_
> > <https://github.com/apache/arrow/blob/0ac0f733ff61f2db45cbff54def8768b3ceb8a9d/cpp/src/arrow/compute/exec/query_context.h#L150>`
> > in class `QueryContext` is used by `BloomFilterPushdownContext
> > <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/hash_join_node.cc>`
> > and `SwissJoin
> > <https://github.com/apache/arrow/blob/main/cpp/src/arrow/compute/exec/swiss_join.cc>`,
> > both of which are part of the hash join node.
> >
> > I'm wondering if there is any particular reason to design it this way. It
> > seems reasonable to move it into the hash join node, which already holds
> > per-thread state, and then pass it down to
> > `BloomFilterPushdownContext` and `SwissJoin`. This way the `QueryContext`
> > could stay thread-local-state-agnostic.
> >
> > Please help. Thanks in advance.
> >
> > *Rossi Sun*
>
