> (1) if I want to cast n columns to a different type (e.g., float to int).
> What is the smallest memory overhead that I can do? (memory overhead of 1
> column, n columns or 100 columns?)
You should be able to do this with only 1 column of overhead. Though you
might need to go a little out of your w
I think you can replace the schema metadata using [1]. You can perhaps also
do the same for the field metadata, depending on where timezone metadata
may be on a timestamp array [2].
[1]:
https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.replace_schema_metadata
[2]:
ht
Oh thanks, that could be a workaround! I thought pa tables were supposed to
be immutable; is there a safe way to just change the metadata?
On Wed, Feb 15, 2023 at 5:44 PM Rok Mihevc wrote:
> Well that's suboptimal. As a workaround I suppose you could just change the
> metadata if the array is tim
Well that's suboptimal. As a workaround I suppose you could just change the
metadata if the array is timezone aware.
On Wed, Feb 15, 2023 at 10:37 PM Li Jin wrote:
> Oh found this comment:
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_cast_temporal.cc#L156
Oh found this comment:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_cast_temporal.cc#L156
On Wed, Feb 15, 2023 at 4:23 PM Li Jin wrote:
> Not sure if this is actually a bug or expected behavior - I filed
> https://github.com/apache/arrow/issues/34210
>
> On
Not sure if this is actually a bug or expected behavior - I filed
https://github.com/apache/arrow/issues/34210
On Wed, Feb 15, 2023 at 4:15 PM Li Jin wrote:
> Hmm..something feels off here - I did the following experiment on Arrow 11
> and casting timestamp-naive to int64 is much faster than cas
Hmm... something feels off here. I did the following experiment on Arrow 11,
and casting timestamp-naive to int64 is much faster than casting
timestamp-naive to timestamp-utc:
In [16]: %time table.cast(schema_int)
CPU times: user 114 µs, sys: 30 µs, total: 144 µs
*Wall time: 231 µs*
Out[16]:
pyarrow
I'm not sure about (1) but I'm pretty sure for (2) doing a cast of tz-aware
timestamp to tz-naive should be a metadata-only change.
On Wed, Feb 15, 2023 at 4:19 PM Li Jin wrote:
> Asking (2) because IIUC this is a metadata operation that could be zero
> copy but I am not sure if this is actually
Asking (2) because IIUC this is a metadata operation that could be zero
copy but I am not sure if this is actually the case.
On Wed, Feb 15, 2023 at 10:17 AM Li Jin wrote:
> Hello!
>
> I have some questions about type casting memory usage with pyarrow Table.
> Let's say I have a pyarrow Table wi
Hello!
I have some questions about type casting memory usage with pyarrow Table.
Let's say I have a pyarrow Table with 100 columns.
(1) if I want to cast n columns to a different type (e.g., float to int).
What is the smallest memory overhead that I can do? (memory overhead of 1
column, n columns