Sorry I am late to the discussion here -- the JIRA mentions using this extension for dealing with shuffles. Can you explain that part? I don't see how you would use this to change shuffle behavior at all.

On Tue, May 14, 2019 at 10:59 AM Thomas graves <tgra...@apache.org> wrote:

> Thanks for replying, I'll extend the vote til May 26th to allow your
> and other people feedback who haven't had time to look at it.
>
> Tom
>
> On Mon, May 13, 2019 at 4:43 PM Holden Karau <hol...@pigscanfly.ca> wrote:
> >
> > I’d like to ask this vote period to be extended, I’m interested but I
> > don’t have the cycles to review it in detail and make an informed vote
> > until the 25th.
> >
> > On Tue, May 14, 2019 at 1:49 AM Xiangrui Meng <m...@databricks.com> wrote:
> >>
> >> My vote is 0. Since the updated SPIP focuses on ETL use cases, I don't
> >> feel strongly about it. I would still suggest doing the following:
> >>
> >> 1. Link the POC mentioned in Q4. So people can verify the POC result.
> >> 2. List public APIs we plan to expose in Appendix A. I did a quick
> >>    check. Besides ColumnarBatch and ColumnarVector, we also need to make
> >>    the following public. People who are familiar with SQL internals
> >>    should help assess the risk.
> >>    * ColumnarArray
> >>    * ColumnarMap
> >>    * unsafe.types.CalendarInterval
> >>    * ColumnarRow
> >>    * UTF8String
> >>    * ArrayData
> >>    * ...
> >> 3. I still feel using Pandas UDF as the mid-term success doesn't match
> >>    the purpose of this SPIP. It does make some code cleaner. But I guess
> >>    for ETL use cases, it won't bring much value.
> >>
> > --
> > Twitter: https://twitter.com/holdenkarau
> > Books (Learning Spark, High Performance Spark, etc.):
> > https://amzn.to/2MaRAG9
> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau