[ ] Use Spark 1 & Spark 2 Support Branch
[X] Use Spark 2 Only Branch

Spark 2 has been out for a while, so this is probably not going to offend many people.

Jacob

On Thu, Nov 16, 2017 at 5:45 AM, Neville Dipale <[email protected]> wrote:

> [ ] Use Spark 1 & Spark 2 Support Branch
> [X] Use Spark 2 Only Branch
>
> On 16 November 2017 at 15:08, Jean-Baptiste Onofré <[email protected]> wrote:
>
>> Hi guys,
>>
>> To illustrate the current discussion about Spark version support, you
>> can take a look at:
>>
>> -- Spark 1 & Spark 2 Support Branch
>>
>> https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-MODULES
>>
>> This branch contains a Spark runner common module compatible with both
>> Spark 1.x and 2.x. For convenience, we introduced spark1 & spark2
>> modules/artifacts containing just a pom.xml to define the dependency set.
>>
>> -- Spark 2 Only Branch
>>
>> https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-ONLY
>>
>> This branch is an upgrade to Spark 2.x and drops support for Spark 1.x.
>>
>> As I'm ready to merge one or the other in the PR, I would like to
>> complete the vote/discussion pretty soon.
>>
>> Correct me if I'm wrong, but it seems that the preference is to drop
>> Spark 1.x and focus only on Spark 2.x (i.e. the Spark 2 Only Branch).
>>
>> I would like to call a final vote to decide which merge I will do:
>>
>> [ ] Use Spark 1 & Spark 2 Support Branch
>> [ ] Use Spark 2 Only Branch
>>
>> This informal vote is open for 48 hours.
>>
>> Please let me know what your preference is.
>>
>> Thanks!
>> Regards
>> JB
>>
>> On 11/13/2017 09:32 AM, Jean-Baptiste Onofré wrote:
>>
>>> Hi Beamers,
>>>
>>> I'm forwarding this discussion & vote from the dev mailing list to the
>>> user mailing list. The goal is to get your feedback as users.
>>>
>>> Basically, we have two options:
>>> 1. Right now, in the PR, we support both Spark 1.x and 2.x using three
>>> artifacts (common, spark1, spark2). You, as users, pick spark1 or
>>> spark2 in your dependency set depending on the target Spark version
>>> you want.
>>> 2. The other option is to upgrade and focus only on Spark 2.x in Beam
>>> 2.3.0. If you still want to use Spark 1.x, you will be stuck at Beam
>>> 2.2.0.
>>>
>>> Thoughts?
>>>
>>> Thanks!
>>> Regards
>>> JB
>>>
>>> -------- Forwarded Message --------
>>> Subject: [VOTE] Drop Spark 1.x support to focus on Spark 2.x
>>> Date: Wed, 8 Nov 2017 08:27:58 +0100
>>> From: Jean-Baptiste Onofré <[email protected]>
>>> Reply-To: [email protected]
>>> To: [email protected]
>>>
>>> Hi all,
>>>
>>> As you might know, we are working on Spark 2.x support in the Spark
>>> runner.
>>>
>>> I'm working on a PR about that:
>>>
>>> https://github.com/apache/beam/pull/3808
>>>
>>> Today, we have something working with both Spark 1.x and 2.x from a
>>> code standpoint, but I have to deal with dependencies. This is the
>>> first step of the update, as I'm still using RDDs; the second step
>>> would be to support DataFrames (but for that, I would need PCollection
>>> elements with schemas, which is another topic that Eugene, Reuven and
>>> I are discussing).
>>>
>>> However, as all major distributions now ship Spark 2.x, I don't think
>>> it is necessary to support Spark 1.x anymore.
>>>
>>> If we agree, I will update and clean up the PR to focus only on Spark
>>> 2.x.
>>>
>>> So, that's why I'm calling for a vote:
>>>
>>> [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
>>> [ ] 0 (I don't care ;))
>>> [ ] -1, I would like to keep supporting Spark 1.x, and so have
>>> support for both Spark 1.x and 2.x (please provide a specific comment)
>>>
>>> This vote is open for 48 hours (I have the commits ready; I'm just
>>> waiting for the end of the vote to push to the PR).
>>>
>>> Thanks!
>>> Regards
>>> JB
>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
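For context on option 1 above ("pick spark1 or spark2 in your dependency set"), the user-facing choice would look roughly like the pom.xml fragment below. This is a hypothetical sketch: the thread says the spark1/spark2 modules contain just a pom.xml defining the dependency set, but it does not give their final Maven coordinates, so the artifactId names here are placeholders.

```xml
<!-- Hypothetical sketch: the artifactId values below are placeholders;
     the thread does not state the final Maven coordinates of the
     spark1/spark2 modules. -->
<dependencies>
  <!-- Pick exactly ONE of the two runner artifacts, matching your cluster: -->
  <dependency>
    <groupId>org.apache.beam</groupId>
    <!-- e.g. a "spark1" artifact for a Spark 1.x cluster,
         or a "spark2" artifact for a Spark 2.x cluster -->
    <artifactId>beam-runners-spark2</artifactId>
    <version>${beam.version}</version>
  </dependency>
</dependencies>
```

Under option 2, no such choice would exist: there would be a single Spark runner artifact targeting Spark 2.x only.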
