That makes perfect sense, esp. seeing the the zero-copy fashion for slicing
the big input. Thanks Weston!

*Rossi*


Weston Pace <weston.p...@gmail.com> 于2023年7月3日周一 22:33写道:

> > is this overflow considered a bug? Or is large exec batch something that
> should be avoided?
>
> This is not a bug and it is something that should be avoided.
>
> Some of the hash-join internals expect small batches.  I actually thought
> the limit was 32Ki and not 64Ki because I think there may be some places we
> are using int16_t as an index.  The reasoning is that the hash-join is
> going to make multiple passes through the data (e.g. first to calculate the
> hashes from the key columns and then again the encode the key columns,
> etc.) and you're going to get better performance when your batches are
> small enough that they fit into the CPU cache.  [1] is often given as a
> reference for this idea.  Since this is the case there is not much need for
> operating on larger batches.
>
> > And does acero have any logic preventing that from happening
>
> Yes, in the source node we take (potentially large) batches from the I/O
> side of things and slice them into medium sized batches (about 1Mi rows) to
> distribute across threads and then each thread iterates over that medium
> sized batch in even smaller batches (32Ki rows) for actual processing.
> This all happens here[2].
>
> [1] https://db.in.tum.de/~leis/papers/morsels.pdf
> [2]
>
> https://github.com/apache/arrow/blob/6af660f48472b8b45a5e01b7136b9b040b185eb1/cpp/src/arrow/acero/source_node.cc#L120
>
> On Mon, Jul 3, 2023 at 6:50 AM Ruoxi Sun <zanmato1...@gmail.com> wrote:
>
> > Hi folks,
> >
> > I've encountered a bug when doing swiss join using a big exec batch, say,
> > larger than 65535 rows, on the probe side. It turns out to be that in the
> > algorithm, it is using `uint16_t` to represent the index within the probe
> > exec batch (the materialize_batch_ids_buf
> > <
> >
> https://github.com/apache/arrow/blob/f951f0c42040ba6f584831621864f5c23e0f023e/cpp/src/arrow/acero/swiss_join.cc#L1897C8-L1897C33
> > >),
> > and row id larger than 65535 will be silently overflow and cause the
> result
> > nonsense.
> >
> > One thing to note is that I'm not exactly using the acero "the acero
> way".
> > Instead I carve out some pieces of code from acero and run them
> > individually. So I'm just wondering that, is this overflow considered a
> > bug? Or is large exec batch something that should be avoided? (And does
> > acero have any logic preventing that from happening, e.g., some wild man
> > like me just throws it an arbitrary large exec batch?)
> >
> > Thanks.
> >
> > *Rossi*
> >
>

Reply via email to