+1

Bringing a Pandas API for PySpark into upstream Spark will benefit everyone: more eyes to use, see, fix, and improve the API, as well as better alignment with core Spark improvements. The extra weight looks manageable.
On Mon, Mar 15, 2021 at 4:45 PM Nicholas Chammas <[email protected]> wrote:
>
> On Mon, Mar 15, 2021 at 2:12 AM Reynold Xin <[email protected]> wrote:
>>
>> I don't think we should deprecate existing APIs.
>
> +1
>
> I strongly prefer Spark's immutable DataFrame API to the Pandas API. I could
> be wrong, but I wager most people who have worked with both Spark and Pandas
> feel the same way.
>
> For the large community of current PySpark users, or users switching to
> PySpark from another Spark language API, it doesn't make sense to deprecate
> the current API, even by convention.
