Thanks David!

I am not yet familiar with the implementation of this kernel, so I am
hoping someone who knows the kernels better can shed some light on this. Is
this roughly the expected performance (compared to similar kernels), or
does something in the RoundTemporal implementation seem off?

Steven (who ran the test) computed around 500 CPU cycles / value, which
seems like more than should be needed, but I am not an expert on the
kernels, so I would like to hear more thoughts from the devs.
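
For anyone who wants to convert between the two ways performance is quoted
in this thread (throughput vs. cycles per value), here is a rough sketch.
The function names, the 8-byte value width, the 3 GHz clock, and the
single-core assumption are all illustrative, not taken from Steven's test.

```python
def cycles_per_value(throughput_bytes_per_s, bytes_per_value, clock_hz):
    """Estimate CPU cycles spent per value, assuming one busy core."""
    values_per_s = throughput_bytes_per_s / bytes_per_value
    return clock_hz / values_per_s


def throughput_bytes_per_s(cycles_per_val, bytes_per_value, clock_hz):
    """Inverse conversion: cycles per value back to bytes per second."""
    values_per_s = clock_hz / cycles_per_val
    return values_per_s * bytes_per_value


# Hypothetical inputs: 2e9 bytes/s, 8-byte values, 3 GHz clock.
example_cycles = cycles_per_value(2e9, 8, 3e9)
example_throughput = throughput_bytes_per_s(example_cycles, 8, 3e9)
```

The result depends heavily on the assumed clock speed, the value width, and
whether the run was single-threaded, so it is only a sanity check.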

Li

On Tue, Apr 12, 2022 at 4:19 PM David Li <lidav...@apache.org> wrote:

> While we do track benchmarks for each commit on Conbench [1] it seems we
> lack benchmarks for the temporal operations. I filed ARROW-16173 [2].
>
> They do do a bit more work than just a round (especially if they need to
> handle time zones).
>
> [1]: https://conbench.ursa.dev/
> [2]: https://issues.apache.org/jira/browse/ARROW-16173
>
> -David
>
> On Tue, Apr 12, 2022, at 15:40, Li Jin wrote:
> > Sorry I should have mentioned this is the Arrow C++ compute kernels.
> >
> > On Tue, Apr 12, 2022 at 3:39 PM Li Jin <ice.xell...@gmail.com> wrote:
> >
> >> Hello!
> >>
> >> We recently noticed unexpected performance with Arrow's temporal
> >> operation kernels (in particular, CeilTemporal). The perf we see is
> >> around 1.4-1.8 Gb/s. This seems to be much lower than adding a constant
> >> to a float column (~9 Gb/s). This is a bit unexpected because
> >> CeilTemporal is similar to a numeric round operation, so we are
> >> wondering if there are some benchmarks around this and where the issue
> >> might be?
> >>
> >> Thanks!
> >> Li
> >>
>
