Re: New Pandas-Apache repo

Benson Muite Sun, 22 Jan 2023 02:36:26 -0800

On 1/22/23 13:15, Adesola Adedewe wrote:
> I'm working on a project where big financial data needs to be loaded,
> stored, and manipulated. The data is stored as Parquet. My initial version
> had Arrow just load the Parquet data, and I used a basic unordered map, but
> this limited me to only one data type. I found I could make my database
> more generic with Arrow and get its performance benefits. Unfortunately, my
> team is mostly Python developers, so I decided to write a cleaner interface
> over Arrow, using interfaces closer to pandas. This enabled us to use
> fewer lines of code as well, and still enjoy the benefits. I will write a
> blog post later; I was mostly looking for other developers looking to
> collaborate, or who may need this as well. Not necessarily to add it to the
> main library, but I'm not opposed to that. I also implemented some
> custom kernels like covariance, correlation, cumprod, shift, and pctchange.
>


The context is very helpful. Most developers are overburdened, so
explaining your use case and how the library may help them would
encourage exploration and review of your repository. A blog post would
be a good way to alert the wider Arrow developer community to your work.
Updating the README of your repository would also encourage use.
