Re: [VOTE] Accept donation of Arrow Ruby bindings

2018-05-11 Thread Kouhei Sutou
Hi, Thanks for starting the vote! +1 Thanks, -- kou In "[VOTE] Accept donation of Arrow Ruby bindings" on Fri, 11 May 2018 18:47:52 -0400, Wes McKinney wrote: > Dear all, > > Arrow PMC member Kouhei Sutou has developed Ruby bindings to the GLib > C interface for Apache Arrow > > * h

Re: Question about streaming to memorymapped files

2018-05-11 Thread Wes McKinney
hi Robert, Thank you for this analysis. Having a memory map interface that supports growing the memory map sounds useful, so we would welcome this contribution to the project. best Wes On Fri, May 11, 2018 at 10:23 AM, Ambalu, Robert wrote: > Antoine, fair point. I just ran some perf stats usi

Re: Continuous benchmarking setup

2018-05-11 Thread Wes McKinney
Thanks Tom and Antoine! Since these benchmarks are literally running on a machine in my closet at home, there may be some downtime in the future. At some point we should document a process of setting up a new machine from scratch to be the nightly bare metal benchmark slave. - Wes On Fri, May 11

Re: [CI] Code coverage reports

2018-05-11 Thread Wes McKinney
hey Antoine, Looks like codecov is enabled now. I'm not sure when that happened, maybe when we made the Gitbox transition? The last time I took up this issue with ASF Infra was in 2016, I can't seem to find the ticket, but here is one where they say the "permissions are too broad" https://issues.

[VOTE] Accept donation of Arrow Ruby bindings

2018-05-11 Thread Wes McKinney
Dear all, Arrow PMC member Kouhei Sutou has developed Ruby bindings to the GLib C interface for Apache Arrow * https://github.com/red-data-tools/red-arrow * https://github.com/red-data-tools/red-arrow-gpu He is proposing to pull these projects into Apache Arrow to develop them all in the same

Import Ruby bindings

2018-05-11 Thread Kouhei Sutou
Hi, I want to import the Ruby bindings that I wrote, which are available at the following locations: * https://github.com/red-data-tools/red-arrow * https://github.com/red-data-tools/red-arrow-gpu https://github.com/apache/arrow/pull/1990a We need IP Clearance process to import the Ruby bindings but I think that I don'

[jira] [Created] (ARROW-2575) [Python] Silently exclude hidden files

2018-05-11 Thread Durmus Karatay (JIRA)
Durmus Karatay created ARROW-2575: - Summary: [Python] Silently exclude hidden files Key: ARROW-2575 URL: https://issues.apache.org/jira/browse/ARROW-2575 Project: Apache Arrow Issue Type: Bug

RE: Question about streaming to memorymapped files

2018-05-11 Thread Ambalu, Robert
Antoine, fair point. I just ran some perf stats using FileOutputStream vs my growing mmap impl. It seems in most cases you are correct, their runtimes are basically equivalent. The only time mmap beats it significantly is if there are many Flush calls. I have a parameter to control how many ro

[jira] [Created] (ARROW-2574) [CI] Collect and publish Python coverage

2018-05-11 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2574: - Summary: [CI] Collect and publish Python coverage Key: ARROW-2574 URL: https://issues.apache.org/jira/browse/ARROW-2574 Project: Apache Arrow Issue Type: I

[jira] [Created] (ARROW-2573) Field metadata is lost on serialization round-trip

2018-05-11 Thread Thomas Buhrmann (JIRA)
Thomas Buhrmann created ARROW-2573: -- Summary: Field metadata is lost on serialization round-trip Key: ARROW-2573 URL: https://issues.apache.org/jira/browse/ARROW-2573 Project: Apache Arrow I

Re: Continuous benchmarking setup

2018-05-11 Thread Antoine Pitrou
Hi again, Tom has configured the benchmarking machine to run and publish Arrow's ASV-based benchmarks. The latest results can now be seen at: https://pandas.pydata.org/speed/arrow/ I expect these are regenerated on a regular (daily?) basis. Thanks Tom :-) Regards Antoine. On Wed, 11 Apr 20

Re: Question about streaming to memorymapped files

2018-05-11 Thread Antoine Pitrou
If you write your own auto-growing memory mapped file implementation, I'd be curious about performance measurements vs. FileOutputStream (and possibly BufferedOutputStream). mremap() and truncate() calls are not free. Also, at some point you'll want to unmap data already written to prevent the m
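
For readers following this thread, here is a minimal sketch of what an auto-growing memory-mapped writer could look like. It is written in Python for brevity and is not the implementation discussed in the thread, nor any Arrow API: the class name GrowingMmapWriter, its methods, and the doubling growth policy are all assumptions made for illustration. It grows by unmapping, extending the file with ftruncate, and mapping again, which is the portable stand-in for the mremap() and truncate() calls mentioned above, and it shows why each growth step has a cost.

```python
import mmap
import os
import tempfile


class GrowingMmapWriter:
    """Hypothetical append-only writer backed by a memory map that grows on demand."""

    def __init__(self, path, initial_capacity=mmap.ALLOCATIONGRANULARITY):
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC)
        self._capacity = initial_capacity
        self._size = 0  # number of bytes actually written so far
        # A file must be at least as long as the region being mapped.
        os.ftruncate(self._fd, self._capacity)
        self._map = mmap.mmap(self._fd, self._capacity)

    def _grow(self, needed):
        # Assumed policy: double the capacity until the write fits.
        new_capacity = self._capacity
        while new_capacity < needed:
            new_capacity *= 2
        # Unmap, extend the file, and map again. These are system calls,
        # so growing too often erodes any advantage over plain file writes.
        self._map.flush()
        self._map.close()
        os.ftruncate(self._fd, new_capacity)
        self._map = mmap.mmap(self._fd, new_capacity)
        self._capacity = new_capacity

    def write(self, data):
        end = self._size + len(data)
        if end > self._capacity:
            self._grow(end)
        self._map[self._size:end] = data
        self._size = end

    def close(self):
        self._map.flush()
        self._map.close()
        # Trim the unused tail so the file length matches the bytes written.
        os.ftruncate(self._fd, self._size)
        os.close(self._fd)


def demo():
    """Write more than the initial capacity and read the file back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stream.bin")
        writer = GrowingMmapWriter(path)
        writer.write(b"header")
        writer.write(b"x" * 100_000)  # forces at least one growth step
        writer.close()
        with open(path, "rb") as f:
            return f.read()
```

Calling demo() returns the bytes written to the file, which can be compared against the input to check that growth did not lose data. A fair comparison with a buffered file stream would also need to time the growth steps themselves.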

[jira] [Created] (ARROW-2572) [Python] Add factory function to create a Table from Columns and Schema.

2018-05-11 Thread Thomas Buhrmann (JIRA)
Thomas Buhrmann created ARROW-2572: -- Summary: [Python] Add factory function to create a Table from Columns and Schema. Key: ARROW-2572 URL: https://issues.apache.org/jira/browse/ARROW-2572 Project: A

Re: [CI] Code coverage reports

2018-05-11 Thread Antoine Pitrou
Hi Wes, Le 11/05/2018 à 05:32, Wes McKinney a écrit : > > I also prefer codecov.io, but unfortunately Apache Infra does not > support it I believe due to some app hook permissions issue (there are > some similar problems preventing CircleCI from being made available to > Apache projects). I have