No concerns from me either.
On Mon, Aug 19, 2019 at 5:10 AM Antoine Pitrou wrote:
No concern from me. It should probably be documented somewhere though :-)
Regards
Antoine.
On 16/08/2019 at 17:23, Joris Van den Bossche wrote:
Coming back to this older thread, I have opened a PR with a proof of
concept of the proposed protocol to convert third-party array objects to
arrow: https://github.com/apache/arrow/pull/5106
In the tests, I added the protocol to pandas' nullable integer array (which
is currently not supported in th
Hi Wes,
That indeed seems like a good fit for the pandas ExtensionArray <-> Arrow
conversion.
I will look into it starting this week.
Joris
On Fri, 17 May 2019 at 00:28, Wes McKinney wrote:
hi Joris,
Somewhat related to this, I want to also point out that we have C++
extension types [1]. As part of this, it would also be good to define
and document a public API for users to create ExtensionArray
subclasses that can be serialized and deserialized using this
machinery.
As a motivating
On Thu, 9 May 2019 at 21:38, Uwe L. Korn wrote:
> +1 to the idea of adding a protocol to let other objects define their way
> to Arrow structures. For pandas.Series I would expect that they return an
> Arrow Column.
>
> For the Arrow->pandas conversion I have somewhat mixed feelings. In the
> normal
My initial idea was to not let this protocol pass metadata around (which
indeed is not possible for arrays).
Currently, metadata are only saved at the level of a Table when converting
from a pandas DataFrame (in Table.from_pandas()). That could continue to be
the case, where Table.from_pandas both
+1 to the idea of adding a protocol to let other objects define their way to
Arrow structures. For pandas.Series I would expect that they return an Arrow
Column.
For the Arrow->pandas conversion I have somewhat mixed feelings. In the normal
Fletcher case I would expect that we don't convert anyth
Arrow arrays don't have metadata, so if you want to pass metadata around
you should at least add a hook for columns as well.
Regards
Antoine.
On 09/05/2019 at 18:10, Joris Van den Bossche wrote:
An additional question might be at which "level" to provide such a hook to
third-party packages: I proposed it for Array, but what about chunked arrays,
columns or tables? Maybe at least returning a chunked array should also be
allowed.
On Thu, 9 May 2019 at 18:06, Joris Van den Bossche <
jorisvan
The signature I had in mind is something like:
    def __arrow_array__(self, type: pyarrow.DataType = None) -> pyarrow.Array:
where the function returns a pyarrow.Array, and takes an optional data type
(in case there are multiple ways to convert to a pyarrow Array, and what
can be passed by the user i
Hi Joris,
Do you have a signature for the __arrow_array__ method in mind?
For example, let's say you want to roundtrip ExtensionArrays or other
third-party data through Arrow. How do you preserve the required metadata?
Regards
Antoine.
On 09/05/2019 at 13:29, Joris Van den Bossche wrote:
> H