Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

Imran Rashid Wed, 15 May 2019 10:16:05 -0700

Sorry I am late to the discussion here -- the JIRA mentions using these
extensions for dealing with shuffles. Can you explain that part?  I don't
see how you would use this to change shuffle behavior at all.


On Tue, May 14, 2019 at 10:59 AM Thomas graves <[email protected]> wrote:

> Thanks for replying. I'll extend the vote until May 26th to allow feedback
> from you and others who haven't had time to look at it.
>
> Tom
>
> On Mon, May 13, 2019 at 4:43 PM Holden Karau <[email protected]> wrote:
> >
> > I’d like to ask that this vote period be extended. I’m interested, but I
> don’t have the cycles to review it in detail and make an informed vote
> until the 25th.
> >
> > On Tue, May 14, 2019 at 1:49 AM Xiangrui Meng <[email protected]>
> wrote:
> >>
> >> My vote is 0. Since the updated SPIP focuses on ETL use cases, I don't
> feel strongly about it. I would still suggest doing the following:
> >>
> >> 1. Link the POC mentioned in Q4 so people can verify the POC result.
> >> 2. List public APIs we plan to expose in Appendix A. I did a quick
> check. Besides ColumnarBatch and ColumnVector, we also need to make the
> following public. People who are familiar with SQL internals should help
> assess the risk.
> >> * ColumnarArray
> >> * ColumnarMap
> >> * unsafe.types.CalendarInterval
> >> * ColumnarRow
> >> * UTF8String
> >> * ArrayData
> >> * ...
> >> 3. I still feel using Pandas UDF as the mid-term success doesn't match
> the purpose of this SPIP. It does make some code cleaner. But I guess for
> ETL use cases, it won't bring much value.
> >>
> > --
> > Twitter: https://twitter.com/holdenkarau
> > Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
