Re: New Pandas-Apache repo

2023-01-27 Thread Adesola Adedewe
Thanks for taking the time to review. It works great for my use case, as I am doing one-shot loading and data manipulation on big data, and the data is basically immutable for the rest of the lifetime of the process, just read. But I know from testing and benchmarking that it is limited by th

Re: New Pandas-Apache repo

2023-01-27 Thread Weston Pace
The new kernels are interesting. There has been some ask recently[1] for weighted averages and I think you have some of the pieces (if not all of it) here. We also recently plumbed in support for binary aggregates into Acero[2] so having more binary aggregate kernels would be nice. Outside of th

Re: New Pandas-Apache repo

2023-01-22 Thread Adesola Adedewe
Yes, I will. I haven't taken enough time to clean up the README; it was generated from my source code with ChatGPT. I will do that later in the week.
On Sun, Jan 22, 2023 at 2:36 AM Benson Muite wrote:
> On 1/22/23 13:15, Adesola Adedewe wrote:
> > I'm working on a project where big financia

Re: New Pandas-Apache repo

2023-01-22 Thread Benson Muite
On 1/22/23 13:15, Adesola Adedewe wrote:
> I'm working on a project where big financial data needs to be loaded, stored,
> and manipulated. The data is stored as Parquet. My initial version had
> Arrow just load the Parquet data, and I used a basic unordered_map, but this
> limited me to only one data

Re: New Pandas-Apache repo

2023-01-22 Thread Adesola Adedewe
I'm working on a project where big financial data needs to be loaded, stored, and manipulated. The data is stored as Parquet. My initial version had Arrow just load the Parquet data, and I used a basic unordered_map, but this limited me to only one data type. I found I could make my database more gene

Re: New Pandas-Apache repo

2023-01-22 Thread Benson Muite
On 1/22/23 11:41, Adesola Adedewe wrote:
> The project was initially meant to provide a simpler interface over Apache
> Arrow, pretty much what was done with the Python API, but it has
> evolved to be more than that, with indexing and other pandas operations
> implemented, like reindex, resample, c

Re: New Pandas-Apache repo

2023-01-22 Thread Adesola Adedewe
The project was initially meant to provide a simpler interface over Apache Arrow, pretty much what was done with the Python API, but it has evolved to be more than that, with indexing and other pandas operations implemented, like reindex, resample, concat, etc. I currently have it good enough for my

Re: New Pandas-Apache repo

2023-01-22 Thread Benson Muite
On 1/22/23 06:23, Adesola Adedewe wrote:
> Okay, thanks for your consideration.
>
> On Sat, Jan 21, 2023 at 4:49 PM Sutou Kouhei wrote:
>
>> Hi,
>>
>> I'm not sure a pandas-like API is suitable for our official
>> data frame API.
>>
>> FYI:
>>
>> * GitHub issue of this:
>> https://github.com/ap

Re: New Pandas-Apache repo

2023-01-21 Thread Adesola Adedewe
Okay, thanks for your consideration.
On Sat, Jan 21, 2023 at 4:49 PM Sutou Kouhei wrote:
> Hi,
>
> I'm not sure a pandas-like API is suitable for our official
> data frame API.
>
> FYI:
>
> * GitHub issue of this:
>   https://github.com/apache/arrow/issues/33747
> * [DISCUSS] Developing a "data fra

Re: New Pandas-Apache repo

2023-01-21 Thread Sutou Kouhei
Hi,
I'm not sure a pandas-like API is suitable for our official data frame API.
FYI:
* GitHub issue of this:
  https://github.com/apache/arrow/issues/33747
* [DISCUSS] Developing a "data frame" subproject in the Arrow C++ libraries
  https://lists.apache.org/thread/50vbmw49w83sj3km326srown64c7hlf1