Re: Unsupported/Other Type

2024-04-10 Thread Rok Mihevc
There are JSON [1] and UUID [2] PRs open. I don't know about the former (seems to be stuck in review), but I plan to work on the UUID PR this week.
[1] https://github.com/apache/arrow/pull/13901
[2] https://github.com/apache/arrow/pull/37298
On Thu, Apr 11, 2024 at 12:31 AM James Duong wrote:
>

Re: Unsupported/Other Type

2024-04-10 Thread James Duong
It’s worth noting that this maps well to the User data type field in the XdbcTypeInfo APIs for Flight SQL.
From: David Li
Date: Wednesday, April 10, 2024 at 3:23 PM
To: dev@arrow.apache.org
Subject: Re: Unsupported/Other Type
I think this should be an extension type, yes. It could be parametri

Re: Unsupported/Other Type

2024-04-10 Thread David Li
I think this should be an extension type, yes. It could be parametrized on the storage type; the other system might at least know that one type is based on another (e.g. a user defined type). Type metadata can be preserved in the extension type's metadata. I think it would be good to have stand
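To make the idea of an extension type parametrized on its storage type concrete, here is a minimal sketch using only the Python standard library. The class `OtherType`, the extension name `example.other`, and the `vendor_type_name` parameter are hypothetical and are not part of any Arrow API or proposal. The only parts taken from the Arrow format are the two field-metadata keys `ARROW:extension:name` and `ARROW:extension:metadata`, which is where an extension type's identity and serialized parameters travel.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OtherType:
    """Hypothetical 'other' extension type; illustration only, not an Arrow API."""

    storage_type: str      # name of the underlying Arrow type, e.g. "binary"
    vendor_type_name: str  # type name used by the external system (assumed parameter)

    # Hypothetical extension name; a canonical type would use an "arrow." prefix.
    EXTENSION_NAME = "example.other"

    def field_metadata(self) -> dict:
        """Build the field metadata a producer would attach to the storage field."""
        params = {"vendor_type_name": self.vendor_type_name}
        return {
            "ARROW:extension:name": self.EXTENSION_NAME,
            "ARROW:extension:metadata": json.dumps(params, sort_keys=True),
        }

    @classmethod
    def from_field(cls, storage_type: str, metadata: dict):
        """Rebuild the type on the consumer side, or return None if not recognized."""
        if metadata.get("ARROW:extension:name") != cls.EXTENSION_NAME:
            return None
        params = json.loads(metadata["ARROW:extension:metadata"])
        return cls(storage_type, params["vendor_type_name"])
```

A consumer that does not recognize the extension name can still read the column as its plain storage type, which is why the storage type is kept as a separate parameter in this sketch.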

Re: Unsupported/Other Type

2024-04-10 Thread Wes McKinney
In the past we have discussed adding a canonical type for UUID and JSON. I still think this is a good idea and could improve ergonomics in downstream language bindings (e.g. by exposing JSON querying function or automatically boxing UUIDs in built-in UUID types, like the Python uuid library). Has a
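As an illustration of what "automatically boxing UUIDs" could look like in a Python binding, here is a minimal sketch using only the standard `uuid` module. It assumes the column's storage is a sequence of 16-byte values with `None` for nulls; the helper name `box_uuids` is hypothetical and is not an Arrow or pyarrow function.

```python
import uuid


def box_uuids(values):
    """Convert raw 16-byte values (None for nulls) into uuid.UUID objects."""
    boxed = []
    for raw in values:
        if raw is None:
            # Preserve nulls as-is.
            boxed.append(None)
        elif len(raw) != 16:
            raise ValueError(f"expected 16 bytes, got {len(raw)}")
        else:
            boxed.append(uuid.UUID(bytes=raw))
    return boxed
```

Usage might look like `box_uuids([some_uuid.bytes, None])`, which returns the original `uuid.UUID` followed by `None`.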

Re: Unsupported/Other Type

2024-04-10 Thread Micah Kornfield
Hi Norman,
Arrow has a concept of extension types [1] along with the possibility of proposing new canonical extension types [2]. This seems to cover the use-cases you mention but I might be misunderstanding?
Thanks,
Micah
[1] https://arrow.apache.org/docs/format/Columnar.html#format-metadata-ext

Unsupported/Other Type

2024-04-10 Thread Norman Jordan
Problem Description
Currently Arrow schemas can only contain columns of types supported by Arrow. In some cases an Arrow schema maps to an external schema. This can result in the Arrow schema not being able to support all the columns from the external schema. Consider an external system that c

Parquet: Legacy timestamp "adjustToUtc" conversion change in arrow 16.0

2024-04-10 Thread wish maple
The issue [1] discusses a change in Arrow's Parquet handling. In general, when reading a Parquet file with legacy timestamps that was not written by Arrow, isAdjustedToUTC would be ignored during the read, and filtering a file like this would not work. When casting from a "deprecated

Re: Upgrading Java version in build toolchain

2024-04-10 Thread Laurent Goujon
I can give it a try for sure
On Fri, Apr 5, 2024 at 10:26 AM Dane Pitkin wrote:
> I think we can revisit the discussion soon for dropping Java 8 altogether,
> since Spark will release 4.0 in ~June supporting Java 17+ at runtime.
>
> I'm curious how big of an effort it would be to get your propos

Re: [RFC] Enabling data frames in disaggregated shared memory

2024-04-10 Thread Antoine Pitrou
Hello John, Arrow IPC files can be backed quite naturally by shared memory, simply by memory-mapping them for reading. So if you have some pieces of shared memory containing Arrow IPC files, and they are reachable using a filesystem mount point, you're pretty much done. You can see an exam
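The memory-mapping approach can be sketched with only the Python standard library. The example below maps a file read-only and checks for the `ARROW1` magic bytes that open and close an Arrow IPC file; it does not parse the footer or any record batches. The helper name `has_arrow_file_magic` is hypothetical. With pyarrow installed, the usual route would instead be `pyarrow.memory_map(path)` followed by `pyarrow.ipc.open_file(...)`.

```python
import mmap
import os

ARROW_MAGIC = b"ARROW1"  # magic string at the start and end of an Arrow IPC file


def has_arrow_file_magic(path):
    """Memory-map `path` read-only and check for the Arrow IPC file magic bytes."""
    # Need room for both magic strings; this also avoids mapping an empty file.
    if os.path.getsize(path) < 2 * len(ARROW_MAGIC):
        return False
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            size = len(mapped)
            head = mapped[: len(ARROW_MAGIC)]
            tail = mapped[size - len(ARROW_MAGIC) : size]
            return head == ARROW_MAGIC and tail == ARROW_MAGIC
```

Because the mapping is read-only and backed by the file, several processes that map the same path share the same physical pages, which is the property the shared-memory use case relies on.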

Re: [RFC] Enabling data frames in disaggregated shared memory

2024-04-10 Thread Matt Topol
Hi John, I recently proposed on the mailing list an experimental extension of the Arrow IPC protocol that would make it easier to leverage disaggregated shared memory along with non-cpu memory via utilities such as UCX and libfabric [1]. I'll be putting together a more formal description of it tha

Re: Arrow community meeting April 10 at 16:00 UTC

2024-04-10 Thread Jean-Baptiste Onofré
Hi folks,
I will be there.
@Dewey, thanks for running the meeting :)
Regards
JB
On Wed, Apr 10, 2024 at 3:23 PM Dewey Dunnington wrote:
>
> Hi Ian,
>
> I'll be attending and I'm happy to run the meeting.
>
> Cheers!
>
> -dewey
>
> On Tue, Apr 9, 2024 at 9:41 PM Ian Cook wrote:
> >
> > Our nex

Re: Arrow community meeting April 10 at 16:00 UTC

2024-04-10 Thread Dewey Dunnington
Hi Ian,
I'll be attending and I'm happy to run the meeting.
Cheers!
-dewey
On Tue, Apr 9, 2024 at 9:41 PM Ian Cook wrote:
>
> Our next biweekly Arrow community meeting is tomorrow at 16:00 UTC / 12:00
> EDT.
>
> I will not be able to attend tomorrow. Could someone please volunteer to
> lead th