FW: JDK vs JRE in Docker Images

2019-04-18 Thread Rob Vesse
Sean, thanks for the pointers. Janino specifically says it only requires a JRE - https://janino-compiler.github.io/janino/#requirements As for scalac, I can't find a specific reference anywhere; it appears to be self-contained AFAICT. Rob On 17/04/2019, 1

Re: Thoughts on dataframe cogroup?

2019-04-18 Thread Chris Martin
Yes, totally agreed with Li here. For clarity, I'm happy to do the work to implement this, but it would be good to get feedback from the community in general and some of the Spark committers in particular. thanks, Chris On Wed, Apr 17, 2019 at 9:17 PM Li Jin wrote: > I have left some comments

Re: Spark 2.4.2

2019-04-18 Thread Michael Heuer
+100 > On Apr 18, 2019, at 1:48 AM, Reynold Xin wrote: > > We should have shaded all Spark’s dependencies :( > > On Wed, Apr 17, 2019 at 11:47 PM Sean Owen > wrote: > For users that would inherit Jackson and use it directly, or whose > dependencies do. Spark itself (w

Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-04-18 Thread Jason Lowe
+1 (non-binding). Looking forward to seeing better support for processing columnar data. Jason On Tue, Apr 16, 2019 at 10:38 AM Tom Graves wrote: > Hi everyone, > > I'd like to call for a vote on SPARK-27396 - SPIP: Public APIs for > extended Columnar Processing Support. The proposal is to ex

Re: Spark 2.4.2

2019-04-18 Thread Felix Cheung
Re shading - same argument I’ve made earlier today in a PR... (Context: in many cases Spark has light or indirect dependencies, but bringing them into the process easily breaks users' code) From: Michael Heuer Sent: Thursday, April 18, 2019 6:41 AM To: Reynold Xi

Open PRs RE: Datasets Typed by Arbitrary Avro

2019-04-18 Thread Aleksander Eskilson
There are now a couple of different pull requests, each attempting to address the need for an enhancement providing typed Dataset support for Avro objects. These PRs and their respective JIRA tickets are - https://github.com/apache/spark/pull/22878 : https://issues.apache.org/jira/browse/SPARK-2

Re: [SPARK-25079] moving from python 3.4 to python 3.6.8, impacts all active branches

2019-04-18 Thread shane knapp
alrighty folks, the future is here and we'll be moving to python 3.6 monday! all three PRs are green! master PR: https://github.com/apache/spark/pull/24266 2.4 PR: https://github.com/apache/spark/pull/24379 2.3 PR: https://github.com/apache/spark/pull/24380 more detailed email coming out this

[SPARK-25079][build system] the future of python3.6 is upon us!

2019-04-18 Thread shane knapp
well, upon us on monday. :) firstly, an important note: if you have an open PR, please check to see if you need to rebase my changes on it before testing. monday @ 11am PST, i will begin. in order: 0) jenkins enters quiet mode, running PRB builds cancelled 1) existing p3k env on all workers

Re: [SPARK-25079] moving from python 3.4 to python 3.6.8, impacts all active branches

2019-04-18 Thread Bryan Cutler
Great work, thanks Shane! On Thu, Apr 18, 2019 at 2:46 PM shane knapp wrote: > alrighty folks, the future is here and we'll be moving to python 3.6 > monday! > > all three PRs are green! > master PR: https://github.com/apache/spark/pull/24266 > 2.4 PR: https://github.com/apache/spark/pull/2437

[VOTE] Release Apache Spark 2.4.2

2019-04-18 Thread Wenchen Fan
Please vote on releasing the following candidate as Apache Spark version 2.4.2. The vote is open until April 23 PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 2.4.2 [ ] -1 Do not release this package because ... To le

Re: Spark 2.4.2

2019-04-18 Thread Wenchen Fan
I've cut RC1. If people think we must upgrade Jackson in 2.4, I can cut RC2 shortly. Thanks, Wenchen On Fri, Apr 19, 2019 at 3:32 AM Felix Cheung wrote: > Re shading - same argument I’ve made earlier today in a PR... > > (Context- in many cases Spark has light or indirect dependencies but > bri