Re: PySpark Dynamic DataFrame for easier inheritance

2021-12-29 Thread Takuya Ueshin
I'm afraid I'm also against the proposal so far. What's wrong with going with "1. Functions" and using transform, which allows chaining functions? I'm not sure what you mean by "manage the namespaces", though.

def with_price(df, factor: float = 2.0):
    return df.withColumn("price", F.col("pri
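
For readers unfamiliar with the pattern suggested in the message above, here is a minimal sketch of chaining plain functions through `transform`. To keep it runnable without a Spark session, it uses a small hypothetical `Frame` stand-in (not part of PySpark) whose `transform` follows the same contract as `pyspark.sql.DataFrame.transform`: call the given function on the frame and return the result. The helper names `with_scaled` and `with_sum` and the column names are illustrative assumptions, not taken from the thread.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for a DataFrame: column name -> list of values."""

    columns: Dict[str, List[float]]

    def with_column(self, name: str, values: List[float]) -> "Frame":
        # Return a new Frame instead of mutating, as DataFrame.withColumn does.
        return Frame({**self.columns, name: list(values)})

    def transform(self, func: Callable[["Frame"], "Frame"]) -> "Frame":
        # Same contract as pyspark.sql.DataFrame.transform: apply func to self.
        return func(self)


def with_scaled(source: str, target: str, factor: float = 2.0) -> Callable[[Frame], Frame]:
    """Build a step that adds `target` as `source` multiplied by `factor`."""

    def inner(frame: Frame) -> Frame:
        return frame.with_column(target, [v * factor for v in frame.columns[source]])

    return inner


def with_sum(left: str, right: str, target: str) -> Callable[[Frame], Frame]:
    """Build a step that adds `target` as the element-wise sum of two columns."""

    def inner(frame: Frame) -> Frame:
        pairs = zip(frame.columns[left], frame.columns[right])
        return frame.with_column(target, [a + b for a, b in pairs])

    return inner


base = Frame({"amount": [1.0, 2.5]})

# Each step is an ordinary function, so no subclass of the frame is needed.
result = (
    base
    .transform(with_scaled("amount", "scaled", factor=3.0))
    .transform(with_sum("amount", "scaled", "total"))
)
```

With a real PySpark DataFrame, the same two `.transform(...)` calls would work unchanged as long as each step takes a DataFrame and returns a DataFrame; only the bodies of the helpers would use `withColumn` and column expressions instead of Python lists.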

Re: PySpark Dynamic DataFrame for easier inheritance

2021-12-29 Thread Maciej
On 12/29/21 16:18, Pablo Alcain wrote:
> Hey Maciej! Thanks for your answer and the comments :)
>
> On Wed, Dec 29, 2021 at 3:06 PM Maciej wrote:
>
> This seems like a lot of trouble for a not so common use case that has
> viable alternatives. Once you ass

Re: PySpark Dynamic DataFrame for easier inheritance

2021-12-29 Thread Pablo Alcain
Hey Maciej! Thanks for your answer and the comments :)

On Wed, Dec 29, 2021 at 3:06 PM Maciej wrote:
> This seems like a lot of trouble for a not so common use case that has
> viable alternatives. Once you assume that class is intended for
> inheritance (which, arguably we neither do nor imply at th

Re: PySpark Dynamic DataFrame for easier inheritance

2021-12-29 Thread Maciej
This seems like a lot of trouble for a not so common use case that has viable alternatives. Once you assume that class is intended for inheritance (which, arguably, we neither do nor imply at the moment) you're even more restricted than we are right now, according to the project policy and need for keep

Re: PySpark Dynamic DataFrame for easier inheritance

2021-12-29 Thread Pablo Alcain
Hey everyone! I'm re-sending this e-mail, now with a PR proposal (https://github.com/apache/spark/pull/35045 if you want to take a look at the code with a couple of examples). The proposed change includes only a new class that would extend only the Python API without making any change to the underl