Re: [DISCUSS] Move sqlparser-rs back into DataFusion project?

2024-02-27 Thread Aldrin
Maybe it would be valuable to more explicitly define "moving back into DataFusion project". I assumed it meant absorbing into the datafusion repo, but it occurs to me that may not be the case. Then, how would sqlparser-rs be "moved"?

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Aldrin
I am interested in this as well, but I haven't gotten to a point where I can have valuable input (I haven't tried other transports). I know of a third party that is interested in Arrow for HPC environments that could be interested in the proposal and I can see if they're interested in providing

R Date class lost when column used for partitioning

2024-02-27 Thread Andrew Piskorski
Hi, using the R arrow package version 14.0.2.1, I'm stumped by something seemingly simple. For date columns, I like to use R's Date class, which is stored internally as a number but prints as a YYYY-MM-DD string. In most cases arrow handles these Date columns nicely. The exception is when I part

Arrow community meeting February 28 at 17:00 UTC

2024-02-27 Thread Ian Cook
Our next biweekly Arrow community meeting is tomorrow at 17:00 UTC / 12:00 EST. Zoom meeting URL: https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09 Meeting ID: 876 4903 3008 Passcode: 958092 Meeting notes will be captured in this Google Doc: https://docs.google.com/document/d/1xr

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Paul Whalen
As a potential "end user developer," (and aspiring contributor) this immediately excited me when I first saw it. I work at a trading firm, and my team has developed an IPC mechanism for efficiently transmitting pandas dataframes both remotely via TCP and locally via shared memory, where the interf

Re: [DISCUSS] Move sqlparser-rs back into DataFusion project?

2024-02-27 Thread Chak-Pong Chung
There are cases where people need datafusion but not a SQL parser. For example, people building a composable query engine for graph or other data modality may not choose SQL as the DSL. Decoupling them seems to be a good idea. On Tue, Feb 27, 2024, 6:20 AM Mehmet Ozan Kabak wrote: > In this case

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Matt Topol
I'll continue my efforts of trying to reach out to other interested parties, but if anyone else here has any contacts or connections that they think might be interested please forward them the link to the Google doc. I really do want to get as much engagement and feedback as possible on this. Tha

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Wes McKinney
Have there been efforts to proactively reach out to other third parties that might have an interest in this or be a potential user at some point? There are a lot of interested parties in Arrow that may not actively follow the mailing list. Seems like folks from the Dask, Ray, RAPIDS (especially fo

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Antoine Pitrou
If there's no engagement, then I'm afraid it might mean that third parties have no interest in this. I don't really have any solution for generating engagement except nagging and pinging people explicitly :-) On 27/02/2024 at 19:09, Matt Topol wrote: I would like to see the same Antoine

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Matt Topol
I would like to see the same, Antoine. Currently, given the lack of engagement (both for OR against), I was going to take the silence as assent and hope for non-Voltron Data PMC members to vote in this. If anyone has any suggestions on how we could potentially generate more engagement and discussion

Re: [VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Antoine Pitrou
Hello, I'd really like to see more engagement and criticism from non-Voltron Data parties before this is formally adopted as an Arrow spec. Regards Antoine. On 27/02/2024 at 18:35, Matt Topol wrote: Hey all, I'd like to propose a vote for us to officially adopt the protocol described

[VOTE] Protocol for Dissociated Arrow IPC Transports

2024-02-27 Thread Matt Topol
Hey all, I'd like to propose a vote for us to officially adopt the protocol described in the google doc[1] for Dissociated Arrow IPC Transports. This proposal was originally discussed at [2]. Once this proposal is adopted, I will work on adding the necessary documentation to the Arrow website alon

[DataFusion] Proposed changes to the TreeNode API

2024-02-27 Thread Andrew Lamb
I would like to draw some additional attention to any DataFusion user who uses the TreeNode API heavily. There is a PR with a proposed improvement to the API in [1]. Please share any comments you may have on the PR. Andrew [1] https://github.com/apache/arrow-datafusion/pull/8891

Re: [DISCUSS] Move sqlparser-rs back into DataFusion project?

2024-02-27 Thread Mehmet Ozan Kabak
In this case, maybe we can bring sqlparser-rs into the ASF umbrella following the arrow-datafusion model? Once DataFusion becomes a top-level project, we could move it to datafusion-sqlparser-rs — it would be a quasi-independent project just like how DataFusion is today w.r.t. Arrow. But it wou

[VOTE] Flight RPC: add 'fallback' URI scheme

2024-02-27 Thread David Li
I would like to propose a 'reuse connection' URI scheme for Flight RPC. This proposal was previously discussed at [1]. A candidate implementation for C++, Java, and Go is at [2]. The vote will be open for at least 72 hours. [ ] +1 [ ] +0 [ ] -1 Do not accept this proposal because... [1]: http

Re: [DISCUSS] Move sqlparser-rs back into DataFusion project?

2024-02-27 Thread Andrew Lamb
Julian, thank you for your insight. I very much agree with it. > I think the ASF is wrong on this. I think it needs to provide a home > for medium-sized projects such as sqlparser-rs in an existing > top-level project; It could be said that DataFusion fits this model -- it isn't really an "Arrow