Hi roee,
It seems that we have both raw value and encoded value types in the Java
implementation, so there is no information loss?
In particular, we have org.apache.arrow.vector.types.pojo.FieldType#type
for the raw type
and org.apache.arrow.vector.types.pojo.FieldType#dictionary#indexType for
th
On Wed, 2021-08-25 at 21:02 +0300, roee shlomo wrote:
> This means that an API to import an ArrowSchema (in C) into a
> Field/Schema
> (in Java) is not suitable for dictionary encoded arrays because there
> is an
> information loss. Specifically, there is nothing in Field/Schema to
> indicate the
On 25/08/2021 at 20:02, roee shlomo wrote:
In Java, the dictionary vector is completely separate from the encoded
vector. Typically, a DictionaryProvider is available alongside a dictionary
encoded vector (to provide dictionaries for the vector and its children).
On the other hand, the C Data
We are currently implementing the C Data Interface in Java and have some
questions regarding dictionary-encoded arrays. We would appreciate some
help and guidance, especially from an API perspective.
In Java, the dictionary vector is completely separate from the encoded
vector. Typically, a Dictio
On 25/08/2021 at 17:27, Joris Van den Bossche wrote:
https://github.com/rapidsai/cudf/blob/be25a30ca20f3135f341c51b36cb075b376d5def/python/cudf/cudf/_lib/cpp/io/types.pxd#L9
Here they are doing `from pyarrow.includes.libarrow cimport
CRandomAccessFile` (CRandomAccessFile is the cython equiva
On Wed, 25 Aug 2021 at 17:21, Antoine Pitrou wrote:
>
> On 25/08/2021 at 17:12, Joris Van den Bossche wrote:
> > One example of a consumer of our Cython API is cudf (
> > https://github.com/rapidsai/cudf).
> > I am not very familiar with the package itself, but browsing its code, I
> > see that t
SGTM
On Wed, Aug 25, 2021 at 10:04 AM Antoine Pitrou wrote:
> +1
>
>
> On 25/08/2021 at 16:03, Ian Cook wrote:
> > In last week's Arrow sync call, I suggested that we move future sync
> > calls from Google Meet to Zoom. The primary benefit of this is that
> > Zoom meetings can be configured to
On 25/08/2021 at 17:12, Joris Van den Bossche wrote:
One example of a consumer of our Cython API is cudf (
https://github.com/rapidsai/cudf).
I am not very familiar with the package itself, but browsing its code, I
see that they do for example cimport RecordBatchReader (
https://github.com/rapi
On 25/08/2021 at 17:17, Keith Kraus wrote:
If I remember correctly the reason cuDF interacts with the Cython code for
IPC stuff is that in the past the existing IPC machinery in Arrow didn't
work correctly with GPU memory. If that is fixed I think there's a case to
remove this code entirely f
If I remember correctly the reason cuDF interacts with the Cython code for
IPC stuff is that in the past the existing IPC machinery in Arrow didn't
work correctly with GPU memory. If that is fixed I think there's a case to
remove this code entirely from cuDF and instruct users to use the higher
lev
One example of a consumer of our Cython API is cudf (
https://github.com/rapidsai/cudf).
I am not very familiar with the package itself, but browsing its code, I
see that they do for example cimport RecordBatchReader (
https://github.com/rapidsai/cudf/blob/f6d31fa95d9b8d8658301438d0f9ba22a1c131aa/pyt
+1
On 25/08/2021 at 16:03, Ian Cook wrote:
In last week's Arrow sync call, I suggested that we move future sync
calls from Google Meet to Zoom. The primary benefit of this is that
Zoom meetings can be configured to allow participants to join even if
the host is not present, thus eliminating t
In last week's Arrow sync call, I suggested that we move future sync
calls from Google Meet to Zoom. The primary benefit of this is that
Zoom meetings can be configured to allow participants to join even if
the host is not present, thus eliminating the need for any one
particular person or person w
On 20/08/2021 at 12:24, Alessandro Molina wrote:
We could argue that only what was documented explicitly should be
considered "public" and everything else can be changed, but our
documentation seems to be unclear on this point. It lists some functions
that should be considered our explicit a
Given that we didn't get many opinions on this one, I propose we move
forward with merging the open PR that moves the ipc cython implementation
and see whether any issues get opened by projects out there that were
relying on it.
It seems that ipc is a low risk module from that point of view and wil