Re: Is it possible to add computed columns to a pyarrow dataset

2023-10-27 Thread Nate Bauernfeind
Hey David, You might consider using the source-available tool Deephaven (https://deephaven.io). It has a rich set of features: all sorts of joins (https://deephaven.io/core/docs/conceptual/choose-joins/), the ability to create logical columns which are only materialized when fetched (see https://

Re: Is it possible to add computed columns to a pyarrow dataset

2023-10-27 Thread Ian Cook
n my processing > pipeline. > > > > > > From: Chang She > Sent: Wednesday, October 25, 2023 11:19 AM > To: user@arrow.apache.org > Subject: Re: Is it possible to add computed columns to a pyarrow dataset > > > > External Email: Use caution with links and att

RE: Is it possible to add computed columns to a pyarrow dataset

2023-10-26 Thread Lee, David (PAG)
Do you already have a storage layer to persist these views or do you only need ephemeral views? Sounds interesting; curious to find out more about your use case. On Wed, Oct 25, 2023 at 2:00 PM Lee, David (PAG

Re: Is it possible to add computed columns to a pyarrow dataset

2023-10-25 Thread Chang She
Do you already have a storage layer to persist these views or do you only need ephemeral views? Sounds interesting; curious to find out more about your use case. On Wed, Oct 25, 2023 at 2:00 PM Lee, David (PAG) wrote: > Here's my ideal use case scenario: > > Create multiple datasets mapped to dif

Is it possible to add computed columns to a pyarrow dataset

2023-10-25 Thread Lee, David (PAG)
Here's my ideal use case scenario: Create multiple datasets mapped to different file directories. Create more datasets by logically generating additional computed columns using expressions. Create a joined dataset by joining datasets. Finally, run a Scanner on the joined dataset to start materializa