Re: [DISCUSS] Syntax for table DDL

2018-10-04 Thread Ryan Blue
Sounds good. I'll plan on adding a PR with Hive's CHANGE syntax in addition to what I've proposed here. I have all of these working in our Spark distribution, so I'm just waiting on finalizing the TableCatalog API to submit these upstream. On Wed, Oct 3, 2018 at 10:07 PM Wenchen Fan wrote: > Th

Spark SQL parser and DDL

2018-10-04 Thread Ryan Blue
Hi everyone, I’ve been working on SQL DDL statements for v2 tables lately, including the proposed additions to drop, rename, and alter columns. The most recent update I’ve added is to allow transformation functions in the PARTITION BY clause to pass to v2 data sources. This allows sources like Ice

Re: Data source V2 in spark 2.4.0

2018-10-04 Thread assaf.mendelson
Thanks for the info. I have been converting an internal data source to V2 and am now preparing it for 2.4.0. I have a couple of suggestions from my experience so far. First, I believe we are missing documentation on this. I am currently writing an internal tutorial based on what I am learning, I

Re: Data source V2 in spark 2.4.0

2018-10-04 Thread Ryan Blue
Assaf, thanks for the feedback. The InternalRow issue is one we know about. If it helps, I wrote up some docs for InternalRow as part of SPAR

Re: [VOTE] SPARK 2.4.0 (RC2)

2018-10-04 Thread Shixiong(Ryan) Zhu
-1. Found an issue in a new 2.4 Java API: https://issues.apache.org/jira/browse/SPARK-25644 We should fix it in 2.4.0 to avoid future breaking changes. Best Regards, Ryan On Mon, Oct 1, 2018 at 7:22 PM Michael Heuer wrote: > FYI I’ve opened two new issues against 2.4.0 rc2 > > https://issues.apa

Re: [VOTE] SPARK 2.4.0 (RC2)

2018-10-04 Thread shane knapp
> > Using Scala 2.12.7 is not an infra change, but a change to the build, > but again it's not even specific to 2.12.7. We should use the latest > if we can though. > > yep, exactly. we don't even have scala installed on any of our jenkins nodes. this is all taken care of via build/mvn at runtime.