Re: [C++] Shall we modify the ORC reader?

2021-01-19 Thread Deepak Majeti
> >>>> Here are my proposed changes:
> >>>> 1. The ORC STRING type should be converted to the Arrow LARGE_STRING type
> >>>> instead of STRING type since it is large.
> >>>> 2. The ORC LIST type should be converted to the Arrow LARGE_LIST type
> >>>> instead of LIST type since it is large.
> >>>> 3. The ORC MAP type should be converted to the Arrow MAP type instead of
> >>>> list of structs with hardcoded field names as long as
> >>>> the offsets fit into int32. Otherwise we shouldn't return OK.
> >>>>
> >>>> Thanks,
> >>>> Ying

-- regards, Deepak Majeti
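For readers skimming the thread, the three proposed conversions quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: OrcKind, ArrowKind, and ProposedArrowKind are hypothetical stand-ins invented for this sketch, not the real ORC or Arrow C++ types, and the bool return merely mimics "return OK or not".

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for illustration; NOT the real ORC or Arrow enums.
enum class OrcKind { String, List, Map };
enum class ArrowKind { LargeString, LargeList, Map };

// Writes the proposed Arrow kind to *out and returns true, or returns false
// in the one case where the proposal says the reader should not return OK:
// a MAP whose largest offset does not fit into int32.
inline bool ProposedArrowKind(OrcKind kind, std::int64_t max_offset, ArrowKind* out) {
  switch (kind) {
    case OrcKind::String:
      *out = ArrowKind::LargeString;  // proposal item 1
      return true;
    case OrcKind::List:
      *out = ArrowKind::LargeList;    // proposal item 2
      return true;
    case OrcKind::Map:
      // Proposal item 3: Arrow MAP only while offsets fit into int32.
      if (max_offset < 0 ||
          max_offset > std::numeric_limits<std::int32_t>::max()) {
        return false;
      }
      *out = ArrowKind::Map;
      return true;
  }
  return false;  // not reached for valid enum values
}
```

Only the MAP case depends on the data (its largest offset); STRING and LIST always widen to the 64-bit-offset variants in this proposal.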

[jira] [Created] (ARROW-5538) [C++] Restrict minimum OpenSSL version to 1.0.2

2019-06-09 Thread Deepak Majeti (JIRA)
Deepak Majeti created ARROW-5538:

Summary: [C++] Restrict minimum OpenSSL version to 1.0.2
Key: ARROW-5538
URL: https://issues.apache.org/jira/browse/ARROW-5538
Project: Apache Arrow
Issue

Re: [DISCUSS] Parquet C++/Rust: Rename Parquet::LogicalType to Parquet::ConvertedType

2019-05-29 Thread Deepak Majeti
"converted"? Is there a conversion?
>
> On 29/05/2019 at 08:46, Deepak Majeti wrote:
> > Hi Everyone,
> >
> > In the early days of parquet-cpp development, the developers mapped the
> > thrift::ConvertedType to parquet::LogicalType.
> > This

[DISCUSS] Parquet C++/Rust: Rename Parquet::LogicalType to Parquet::ConvertedType

2019-05-28 Thread Deepak Majeti
objections to renaming the Parquet::LogicalType to Parquet::ConvertedType in both C++ and Rust? -- regards, Deepak Majeti

[jira] [Created] (ARROW-5241) [Python] Add option to disable writing statistics

2019-04-29 Thread Deepak Majeti (JIRA)
Deepak Majeti created ARROW-5241:

Summary: [Python] Add option to disable writing statistics
Key: ARROW-5241
URL: https://issues.apache.org/jira/browse/ARROW-5241
Project: Apache Arrow
Issue

[jira] [Created] (ARROW-5218) [C++] Improve build when third-party library locations are specified

2019-04-25 Thread Deepak Majeti (JIRA)
Deepak Majeti created ARROW-5218:

Summary: [C++] Improve build when third-party library locations are specified
Key: ARROW-5218
URL: https://issues.apache.org/jira/browse/ARROW-5218
Project: Apache

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-08-07 Thread Deepak Majeti
ng,
> >> posting and iterating on a commit and also the number of opportunities for
> >> missteps. The size of the repo and build/test times matter but are
> >> secondary so long as the workflow is simple and reliable.
> >>
> >> I don't reall

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-31 Thread Deepak Majeti
> fork. That would obviously be a bad outcome for the community.
> >
> > It doesn't look like I will be able to convince you that a monorepo is
> > a good idea; what I would ask instead is that you be willing to give
> > it a shot, and if it turns out in the wa

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-31 Thread Deepak Majeti
ed it. For example, there were 25 JIRA's in the 0.10.0
> >>> > release of arrow, many of which were holding up the release. I hope that
> >>> > seems like a reasonable compromise, and I think it will help reduce the
> >>> > complexity of the build/release tooling.
> >>> >
> >>> > On Mon, Jul 30, 2018 at 8:50 PM Ted Dunning wrote:
> >>> >
> >>> >> On Mon, Jul 30, 2018 at 5:39 PM Wes McKinney wrote:
> >>> >>
> >>> >> > > The community will be less willing to accept large
> >>> >> > > changes that require multiple rounds of patches for stability and API
> >>> >> > > convergence. Our contributions to Libhdfs++ in the HDFS community took a
> >>> >> > > significantly long time for the very same reason.
> >>> >> >
> >>> >> > Please don't use bad experiences from another open source community as
> >>> >> > leverage in this discussion. I'm sorry that things didn't go the way
> >>> >> > you wanted in Apache Hadoop but this is a distinct community which
> >>> >> > happens to operate under a similar open governance model.
> >>> >>
> >>> >> There are some more radical and community building options as well. Take
> >>> >> the subversion project as a precedent. With subversion, any Apache
> >>> >> committer can request and receive a commit bit on some large fraction of
> >>> >> subversion.
> >>> >>
> >>> >> So why not take this a bit further and give every parquet committer a
> >>> >> commit bit in Arrow? Or even make them be first class committers in Arrow?
> >>> >> Possibly even make it policy that every Parquet committer who asks will be
> >>> >> given committer status in Arrow.
> >>> >>
> >>> >> That relieves a lot of the social anxiety here. Parquet committers can't be
> >>> >> worried at that point whether their patches will get merged; they can just
> >>> >> merge them. Arrow shouldn't worry much about inviting in the Parquet
> >>> >> committers. After all, Arrow already depends a lot on parquet so why not
> >>> >> invite them in?

-- regards, Deepak Majeti

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-30 Thread Deepak Majeti
took a significantly long time for the very same reason.

On Mon, Jul 30, 2018 at 6:05 PM Wes McKinney wrote:
> hi Deepak
>
> On Mon, Jul 30, 2018 at 5:18 PM, Deepak Majeti wrote:
> > @Wes
> > My observation is that most of the parquet-cpp contributors you listed

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-30 Thread Deepak Majeti
e version to test.
> > > Tests in Python is dependent on cpp sub-repo to ensure the API still pass.
> > >
> > > This should be the best of both worlds, if sub-repo are supposed option.
> > >
> > > --Donald E. Foss

Re: [DISCUSS] Solutions for improving the Arrow-Parquet C++ development morass

2018-07-29 Thread Deepak Majeti
releases would include a coordinated snapshot of the Parquet
> implementation as it stands
>
> Continuing with the status quo has become unsatisfactory to me and as
> a result I've become less motivated to work on the parquet-cpp
> codebase.
>
> The only Parquet C++ com

[jira] [Created] (ARROW-2496) [C++] Add support for Libhdfs++

2018-04-23 Thread Deepak Majeti (JIRA)
Deepak Majeti created ARROW-2496:

Summary: [C++] Add support for Libhdfs++
Key: ARROW-2496
URL: https://issues.apache.org/jira/browse/ARROW-2496
Project: Apache Arrow
Issue Type: Improvement

Re: Next Arrow sync call

2018-03-29 Thread Deepak Majeti
Wes, Can you add me too? Thanks!

On Wed, Mar 28, 2018 at 9:52 PM, Alex Hagerman wrote:
> Hi,
>
> Can I get an invite as well?
>
> Thank you.
>
> Alex
>
> On 03/28/2018 09:28 PM, Aneesh Karve wrote:
> >> Hi Wes, please add me to the Gcal invite. Tha

[jira] [Created] (ARROW-1186) [C++] Enable option to build arrow with minimal dependencies needed to build Parquet library

2017-07-05 Thread Deepak Majeti (JIRA)
Deepak Majeti created ARROW-1186:

Summary: [C++] Enable option to build arrow with minimal dependencies needed to build Parquet library
Key: ARROW-1186
URL: https://issues.apache.org/jira/browse/ARROW-1186

[jira] [Created] (ARROW-820) [C++] Build dependencies for Parquet library without arrow support

2017-04-13 Thread Deepak Majeti (JIRA)
Deepak Majeti created ARROW-820:

Summary: [C++] Build dependencies for Parquet library without arrow support
Key: ARROW-820
URL: https://issues.apache.org/jira/browse/ARROW-820
Project: Apache Arrow