Hi Wes / Arrow Dev Team,

Following up on our brief Twitter conversation <https://twitter.com/wesmckinn/status/1222647039252525057> about the Datasets functionality in R / Python.
To provide context for others: you mentioned that the API in Python / pyarrow is more developer-centric and is intended to be consumed by users through higher-level interfaces (e.g. Ibis). This was in comparison to dplyr, which in your demo had some nice analytic capabilities on top of Arrow Datasets. Seeing that demonstration made me interested in similar Arrow Datasets functionality within Python, but it doesn't seem that this is an intended capability for pyarrow, which I generally understand.

However, I am trying to understand how Gandiva ties into the Arrow project, as I understand it to be an analytic engine of sorts (maybe I'm misunderstanding). I saw this <http://blog.christianperone.com/tag/python/> implementation of Gandiva with pandas, which was quite interesting, and I was wondering whether this is the strategic goal: to have Gandiva be a component of third-party tools that use Arrow, or whether Gandiva will eventually become more of a core analytic component of Arrow.

Building on this, I was hoping to get the team's view on the likely home of dplyr-style Datasets functionality within the Python ecosystem (e.g. Ibis or something else).

Thanks to all for your work on the project!

Best,
Matthew M. Turner
Email: matthew.m.tur...@outlook.com
Phone: (908)-868-2786