Hey Li,

If your input data is already in UTC, you don't need assume_timezone [1].
You would need it if your input were America/New_York local (wall-clock)
time with no timezone attached and you wanted to convert it to a zoned
timestamp array, where the underlying data is stored in UTC and the
timezone is metadata only. The Python tests may be a useful reference [2].

Available extraction kernels are listed here: [3].

Rok

[1] https://arrow.apache.org/docs/python/generated/pyarrow.compute.assume_timezone.html
[2] https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_compute.py#L1908-L1999
[3] https://arrow.apache.org/docs/cpp/compute.html#temporal-component-extraction

On Thu, Feb 3, 2022 at 3:54 PM Li Jin <ice.xell...@gmail.com> wrote:
>
> Hello!
>
> I am new to the Arrow C++ compute engine and trying to figure out this time
> zone conversion and time extraction:
>
> t.dt.tz_convert('America/New_York').dt.time == datetime.time(11, 30, 0)
>
> So I started looking at:
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
>
> and found that these functions seem relevant:
> assume_timezone
> hour
> minute
>
> So my thinking is to figure out a way to build a plan that basically
> does these steps:
> (1) Assume timezone to New_York (input data is UTC)
> (2) Extract hour value
> (3) Extract minute value
> (4) Filter on hour and minute value
>
> I wonder what is a good way to map these functions in
> scalar_temporal_unary to an ExecPlan? (Looked under
> https://github.com/apache/arrow/tree/master/cpp/src/arrow/compute/exec but
> didn't see anything obvious)
>
> Thanks!
> Li
