Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread David Li
gRPC supports headers so for Flight, we could send essentially an Accept header and perhaps a Content-Type header. David On Mon, Mar 2, 2020, 23:15 Micah Kornfield wrote: > Hi Wes, > A few thoughts on this. In general, I think it is a good idea. But before > proceeding, I think the following

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Antoine Pitrou
If we want to use an HTTP header, it would be more of an Accept-Encoding header, no? In any case, we would have to put non-standard values there (e.g. lz4), so I'm not sure how desirable it is to repurpose HTTP headers for that, rather than add some dedicated field to the Flight messages. Regards
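
For illustration only, the negotiation discussed above (the client advertises the codecs it accepts, the server picks one it supports) might be sketched as follows. The header name `x-arrow-accept-compression` and the codec lists are hypothetical, not part of Flight or of the proposal in this thread; the sketch uses plain Python and no Arrow or gRPC API.

```python
def parse_accept_header(value):
    """Split a comma-separated header value such as 'lz4, zstd' into codec names."""
    return [part.strip().lower() for part in value.split(",") if part.strip()]


def negotiate_codec(accepted, supported):
    """Return the first client-accepted codec that the server supports, else None."""
    supported_set = {name.lower() for name in supported}
    for name in accepted:
        if name in supported_set:
            return name
    return None


# Hypothetical header name and values, for illustration only.
request_headers = {"x-arrow-accept-compression": "lz4, zstd"}
server_codecs = ["zstd", "lz4"]

accepted = parse_accept_header(request_headers["x-arrow-accept-compression"])
chosen = negotiate_codec(accepted, server_codecs)
print(chosen)
```

Here the client's ordering is treated as its preference; a dedicated field in the Flight messages, as suggested above, could carry the same list without reusing HTTP header semantics.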

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Fan Liya
I am so glad to see this discussion, and I am willing to provide help from the Java side. In the proposal, I see the support for basic compression strategies (e.g. gzip, snappy). IMO, applying a single basic strategy is not likely to achieve a performance improvement for most scenarios. The optimal c

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Antoine Pitrou
Well, we shouldn't overdo this either. We are not trying to replicate the Parquet format. Regards Antoine. On 03/03/2020 at 14:36, Fan Liya wrote: > I am so glad to see this discussion, and I am willing to provide help from > the Java side. > > In the proposal, I see the support for basic

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Wes McKinney
On Tue, Mar 3, 2020, 7:36 AM Fan Liya wrote: > I am so glad to see this discussion, and I am willing to provide help from > the Java side. > > In the proposal, I see the support for basic compression strategies > (e.g. gzip, snappy). > IMO, applying a single basic strategy is not likely to achieve

[jira] [Created] (ARROW-7994) [CI][C++] Move AppVeyor MinGW builds to Github Actions

2020-03-03 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-7994: - Summary: [CI][C++] Move AppVeyor MinGW builds to Github Actions Key: ARROW-7994 URL: https://issues.apache.org/jira/browse/ARROW-7994 Project: Apache Arrow

Re: Coordinating / scheduling C++ Parquet-Arrow nested data work (ARROW-1644 and others)

2020-03-03 Thread Igor Calabria
Hi Micah, I actually got involved with another personal project and had to postpone my contribution to Arrow a bit. The good news is that I'm almost done with it, so I could help you with the read side very soon. Any ideas how we could coordinate this? On Wed, Feb 26, 2020 at 21:06, Wes McK

[jira] [Created] (ARROW-7995) [C++] IO: coalescing and caching read ranges

2020-03-03 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-7995: - Summary: [C++] IO: coalescing and caching read ranges Key: ARROW-7995 URL: https://issues.apache.org/jira/browse/ARROW-7995 Project: Apache Arrow Issue Typ

[jira] [Created] (ARROW-7996) Error serializing empty pandas DataFrame with pyarrow

2020-03-03 Thread Juan David Agudelo (Jira)
Juan David Agudelo created ARROW-7996: - Summary: Error serializing empty pandas DataFrame with pyarrow Key: ARROW-7996 URL: https://issues.apache.org/jira/browse/ARROW-7996 Project: Apache Arrow

[jira] [Created] (ARROW-7997) Schema equals method with inconsistent docs in pyarrow.

2020-03-03 Thread Otávio Vasques (Jira)
Otávio Vasques created ARROW-7997: - Summary: Schema equals method with inconsistent docs in pyarrow. Key: ARROW-7997 URL: https://issues.apache.org/jira/browse/ARROW-7997 Project: Apache Arrow

[jira] [Created] (ARROW-7998) [C++][Plasma] Make Seal requests synchronous

2020-03-03 Thread Stephanie Wang (Jira)
Stephanie Wang created ARROW-7998: - Summary: [C++][Plasma] Make Seal requests synchronous Key: ARROW-7998 URL: https://issues.apache.org/jira/browse/ARROW-7998 Project: Apache Arrow Issue Typ

Arrow sync call March 4 at 12:00 US/Eastern, 17:00 UTC

2020-03-03 Thread Neal Richardson
Hi all, Reminder that our biweekly call is coming up tomorrow/later today at https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will be sent out to the mailing list afterward. Neal

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Fan Liya
Sure. I agree with you that we should not overdo this. I am wondering if we should provide an option to allow users to plug in their customized compression strategies. Best, Liya Fan On Tue, Mar 3, 2020 at 9:47 PM Wes McKinney wrote: > On Tue, Mar 3, 2020, 7:36 AM Fan Liya wrote: > > > I am so

Re: [DISCUSS] Adding "trivial" buffer compression option to IPC protocol (ARROW-300)

2020-03-03 Thread Wes McKinney
On Tue, Mar 3, 2020, 8:11 PM Fan Liya wrote: > Sure. I agree with you that we should not overdo this. > I am wondering if we should provide an option to allow users to plug in > their customized compression strategies. > Can you provide a patch showing changes to Message.fbs (or Schema.fbs) that

Re: Coordinating / scheduling C++ Parquet-Arrow nested data work (ARROW-1644 and others)

2020-03-03 Thread Micah Kornfield
Hi Igor, If you have the time, https://issues.apache.org/jira/browse/ARROW-7960 might be a good task to pick up for this. I think it should be a relatively small amount of code, so it is probably a good contribution to the project. Once that is wrapped up we can see where we both are. Cheers, Micah