Re: Increasing transparency of corporate support for Apache Arrow development

2018-08-16 Thread Ted Dunning
Yes, there are several such examples. And it turned into a monstrous mess with companies bragging over lines of code changed. Oddly, the guys who did lots of reformatting did really well. There is also the problem of the very strong Apache tradition that it is individuals who contribute to project

[DISCUSS] Rust add adapter for parquet

2018-08-16 Thread Renjie Liu
Hi all, Now that the Rust component is approaching a stable state and the Rust reader for Parquet is ready, I think it may be a good time to start an adapter for Parquet, just like the adapter for ORC in C++. What do you all think? -- Liu, Renjie Software Engineer, MVAD

[jira] [Created] (ARROW-3063) [Go] move list of supported/TODO features to confluence

2018-08-16 Thread Sebastien Binet (JIRA)
Sebastien Binet created ARROW-3063: Summary: [Go] move list of supported/TODO features to confluence Key: ARROW-3063 URL: https://issues.apache.org/jira/browse/ARROW-3063 Project: Apache Arrow

Re: Increasing transparency of corporate support for Apache Arrow development

2018-08-16 Thread Julian Hyde
This is a tough one. I think we need to strike a delicate balance: we should thank companies for being benefactors, but should not put up with bragging (or as Ted puts it, genital comparisons). In Calcite, we allow committers to show their company affiliations[1]. I was initially concerned, but it

Re: [ANNOUNCE] New Arrow committers: Andy Grove and Krisztián Szűcs

2018-08-16 Thread Uwe L. Korn
Congratulations to you two! Well deserved. Uwe > On 16.08.2018 at 03:24, Renjie Liu wrote: > > Congrats Andy, Krisztian! > > On Thu, Aug 16, 2018 at 7:47 AM, Andy Grove wrote: > >> Congrats to you too, Krisztian! >> >> I'm also honored to be part of this project and look forward to >> contributing more

Re: Increasing transparency of corporate support for Apache Arrow development

2018-08-16 Thread Uwe L. Korn
What about separating committers and companies? We could have a section listing all committers as we currently do and have a separate listing of all companies that employed a committer while they were contributing. This will give individuals and companies attribution but does not make a big mat

Creating a user@ mailing list

2018-08-16 Thread Uwe L. Korn
Hello all, I would like to create a u...@arrow.apache.org mailing list. Some people are a bit confused that there is only a dev mailing list. They interpret this as a mailing list that should be used solely for Arrow development, not usage questions. This is sadly a psychological barrier for pe

Re: Creating a user@ mailing list

2018-08-16 Thread Wes McKinney
hi Uwe, This sounds like a good idea to me. I think we should go ahead and ask INFRA to set it up. We'll need to add a "Community" landing page on the website of sorts to explain the mailing lists better. - Wes On Thu, Aug 16, 2018 at 4:49 AM, Uwe L. Korn wrote: > Hello all, > > I would like t

[DISCUSS] Moving forward on the Arrow-Parquet C++ monorepo project

2018-08-16 Thread Wes McKinney
hi folks, I have just started a vote on the Parquet dev@ mailing list about merging development process of the Parquet C++ codebase + Arrow integration to a single repository, i.e. the Arrow one: https://lists.apache.org/thread.html/53f77f9f1f04b97709a0286db1b73a49b7f1541d8f8b2cb32db5c922@%3Cdev.

Re: Progress on Arrow RPC a.k.a. Arrow Flight

2018-08-16 Thread Jacques Nadeau
I'm out of town this week (vacation) and will be reviewing your feedback next week. Thanks for the feedback! On Thu, Aug 9, 2018, 8:45 PM Wes McKinney wrote: > hi folks, > > I left some feedback on this PR. If others could take a look > (particularly at the .proto service definition) that would

Re: Creating a user@ mailing list

2018-08-16 Thread Brian Hulette
Agreed. I was concerned about the plan to drop Slack because it was a place users would come to ask questions (for better or worse). I assumed that was because those users were just uncomfortable with mailing lists, but I think Uwe is right, they're probably just uncomfortable with *this* mailing l

Re: Progress on Arrow RPC a.k.a. Arrow Flight

2018-08-16 Thread Wes McKinney
To give some extra color on my personal motivation for interest in Arrow Flight: Systems that expose databases on a network frequently send data very slowly. For example, ODBC is in general extremely slow. What I would like to see is servers that can expose a "sql" action type. So, in considerati

Re: Creating a user@ mailing list

2018-08-16 Thread Wes McKinney
To give an example, dev@spark has 3008 subscribers and user@spark has 5022 subscribers. Spark is a quite different project, of course, but it shows that a user list can and will attract more subscribers. dev@arrow has 789 subscribers. - Wes On Thu, Aug 16, 2018 at 10:52 AM, Brian Hulette wrote: > Ag

Re: Increasing transparency of corporate support for Apache Arrow development

2018-08-16 Thread Wes McKinney
I'm not proposing to summarize or display contributions based on LOC changes -- this is one of the worst metrics that I know of. To give an extreme example, take a look at the contribution graph for Apache ORC: https://github.com/apache/orc/graphs/contributors I don't think you can infer anything

[VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Wes McKinney
Dear all, The developers of Gandiva, an LLVM-based vectorized expression evaluation engine for Arrow columnar memory, are proposing to donate the project to Apache Arrow at some point in the near future, as has been discussed on the dev@ mailing list [1]. The Gandiva codebase is located at: http

Re: Creating a user@ mailing list

2018-08-16 Thread Wes McKinney
https://issues.apache.org/jira/browse/INFRA-16915 On Thu, Aug 16, 2018 at 11:01 AM, Wes McKinney wrote: > To give an example, > > dev@spark has 3008 subscribers > user@spark has 5022 subscribers > > Spark is a quite different project, of course, but it shows that a > user list can and will attrac

[jira] [Created] (ARROW-3064) [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for particular unit test executables

2018-08-16 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3064: Summary: [C++] Add option to ADD_ARROW_TEST to indicate additional dependencies for particular unit test executables Key: ARROW-3064 URL: https://issues.apache.org/jira/browse/ARROW

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Julian Hyde
+1 On Thu, Aug 16, 2018 at 8:56 AM Wes McKinney wrote: > > Dear all, > > The developers of Gandiva, an LLVM-based vectorized expression > evaluation engine for Arrow columnar memory, are proposing to donate > the project to Apache Arrow at some point in the near future, as has > been discussed on

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Siddharth Teotia
+1 On Thu, Aug 16, 2018 at 9:57 AM, Julian Hyde wrote: > +1 > On Thu, Aug 16, 2018 at 8:56 AM Wes McKinney wrote: > > > > Dear all, > > > > The developers of Gandiva, an LLVM-based vectorized expression > > evaluation engine for Arrow columnar memory, are proposing to donate > > the project to

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Philipp Moritz
+1 On Thu, Aug 16, 2018, 10:02 AM Siddharth Teotia wrote: > +1 > > On Thu, Aug 16, 2018 at 9:57 AM, Julian Hyde wrote: > > > +1 > > On Thu, Aug 16, 2018 at 8:56 AM Wes McKinney > wrote: > > > > > > Dear all, > > > > > > The developers of Gandiva, an LLVM-based vectorized expression > > > evalu

Re: Progress on Arrow RPC a.k.a. Arrow Flight

2018-08-16 Thread Julian Hyde
If your use case is SQL RPC, then you are getting close to Avatica's territory. Avatica[1] is a protocol for implementing language-independent JDBC and ODBC stacks. Now, I agree that many ODBC implementations are inefficient. Some ODBC stacks make more round trips than necessary, and do more copyi

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Li Jin
+1 On Thu, Aug 16, 2018 at 1:11 PM Philipp Moritz wrote: > +1 > > On Thu, Aug 16, 2018, 10:02 AM Siddharth Teotia > wrote: > > > +1 > > > > On Thu, Aug 16, 2018 at 9:57 AM, Julian Hyde wrote: > > > > > +1 > > > On Thu, Aug 16, 2018 at 8:56 AM Wes McKinney > > wrote: > > > > > > > > Dear all, >

Re: Progress on Arrow RPC a.k.a. Arrow Flight

2018-08-16 Thread Wes McKinney
hi Julian, Thanks for chiming in. On Thu, Aug 16, 2018 at 1:16 PM, Julian Hyde wrote: > If your use case is SQL RPC, then you are getting close to Avatica's > territory. Avatica[1] is a protocol for implementing > language-independent JDBC and ODBC stacks. I'm not proposing to develop a SQL RPC

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Bryan Cutler
+1 On Thu, Aug 16, 2018, 10:18 AM Li Jin wrote: > +1 > On Thu, Aug 16, 2018 at 1:11 PM Philipp Moritz wrote: > > > +1 > > > > On Thu, Aug 16, 2018, 10:02 AM Siddharth Teotia > > wrote: > > > > > +1 > > > > > > On Thu, Aug 16, 2018 at 9:57 AM, Julian Hyde wrote: > > > > > > > +1 > > > > On Thu

[jira] [Created] (ARROW-3065) concat_tables() failing from bad Pandas Metadata

2018-08-16 Thread David Lee (JIRA)
David Lee created ARROW-3065: Summary: concat_tables() failing from bad Pandas Metadata Key: ARROW-3065 URL: https://issues.apache.org/jira/browse/ARROW-3065 Project: Apache Arrow Issue Type: Bug

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Kouhei Sutou
+1 In "Re: [VOTE] Accept donation of Gandiva to Apache Arrow" on Thu, 16 Aug 2018 14:30:28 -0700, Bryan Cutler wrote: > +1 > > On Thu, Aug 16, 2018, 10:18 AM Li Jin wrote: > >> +1 >> On Thu, Aug 16, 2018 at 1:11 PM Philipp Moritz wrote: >> >> > +1 >> > >> > On Thu, Aug 16, 2018, 10:02

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Andy Grove
+1 On Thu, Aug 16, 2018 at 9:56 AM Wes McKinney wrote: > Dear all, > > The developers of Gandiva, an LLVM-based vectorized expression > evaluation engine for Arrow columnar memory, are proposing to donate > the project to Apache Arrow at some point in the near future, as has > been discussed on

Re: [ANNOUNCE] New Arrow committers: Andy Grove and Krisztián Szűcs

2018-08-16 Thread Bryan Cutler
Congratulations Krisztian and Andy! On Thu, Aug 16, 2018, 1:30 AM Uwe L. Korn wrote: > Congratulations to you two! Well deserved. > > Uwe > > > On 16.08.2018 at 03:24, Renjie Liu wrote: > > > > Congrats Andy, Krisztian! > > > > On Thu, Aug 16, 2018 at 7:47 AM, Andy Grove wrote: > > > >> Congrats to you too,

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Phillip Cloud
+1 On Thu, Aug 16, 2018 at 9:26 PM Andy Grove wrote: > +1 > > On Thu, Aug 16, 2018 at 9:56 AM Wes McKinney wrote: > > > Dear all, > > > > The developers of Gandiva, an LLVM-based vectorized expression > > evaluation engine for Arrow columnar memory, are proposing to donate > > the project to Ap

Re: [VOTE] Accept donation of Gandiva to Apache Arrow

2018-08-16 Thread Jeff Zhang
+1 On Fri, Aug 17, 2018 at 10:59 AM, Phillip Cloud wrote: > +1 > > On Thu, Aug 16, 2018 at 9:26 PM Andy Grove wrote: > > > +1 > > > > On Thu, Aug 16, 2018 at 9:56 AM Wes McKinney > wrote: > > > > > Dear all, > > > > > > The developers of Gandiva, an LLVM-based vectorized expression > > > evaluation engine for