Re: YARN Shuffle service and its compatibility

2016-04-19 Thread Mark Grover
Great, thanks for confirming, Reynold. Appreciate it! On Tue, Apr 19, 2016 at 4:20 PM, Reynold Xin wrote: > I talked to Lianhui offline and he said it is not that big of a deal to > revert the patch. > > > On Tue, Apr 19, 2016 at 9:52 AM, Mark Grover wrote: > >> Thanks. >> >> I'm more than happ

Re: YARN Shuffle service and its compatibility

2016-04-19 Thread Reynold Xin
I talked to Lianhui offline and he said it is not that big of a deal to revert the patch. On Tue, Apr 19, 2016 at 9:52 AM, Mark Grover wrote: > Thanks. > > I'm more than happy to wait for more people to chime in here but I do feel > that most of us are leaning towards Option B anyways. So, I cr

Re: YARN Shuffle service and its compatibility

2016-04-19 Thread Mark Grover
Thanks. I'm more than happy to wait for more people to chime in here but I do feel that most of us are leaning towards Option B anyways. So, I created a JIRA (SPARK-14731) for reverting SPARK-12130 in Spark 2.0 and will file a PR shortly. Mark On Tue, Apr 19, 2016 at 7:44 AM, Tom Graves wrote: > It

Re: YARN Shuffle service and its compatibility

2016-04-19 Thread Mark Grover
On Tue, Apr 19, 2016 at 2:26 AM, Steve Loughran wrote: > > > On 18 Apr 2016, at 23:05, Marcelo Vanzin wrote: > > > > On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin > wrote: > >> The bigger problem is that it is much easier to maintain backward > >> compatibility rather than dictating forward comp

Re: YARN Shuffle service and its compatibility

2016-04-19 Thread Tom Graves
It would be nice if we could keep this compatible between 1.6 and 2.0, so I'm more for Option B at this point, since the change made seems minor and we can change to have the shuffle service do it internally, like Marcelo mentioned. Then let's try to keep compatible, but if there is a forcing function let's f

Re: YARN Shuffle service and its compatibility

2016-04-19 Thread Steve Loughran
> On 18 Apr 2016, at 23:05, Marcelo Vanzin wrote: > > On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin wrote: >> The bigger problem is that it is much easier to maintain backward >> compatibility rather than dictating forward compatibility. For example, as >> Marcin said, if we come up with a sligh

Re: YARN Shuffle service and its compatibility

2016-04-18 Thread Mark Grover
Thanks for responding, Reynold, Marcelo and Marcin. >And I think that's really what Mark is proposing. Basically, "don't >intentionally break backwards compatibility unless it's really >required" (e.g. SPARK-12130). That would allow option B to work. Yeah, that's exactly what Option B is proposin

Re: YARN Shuffle service and its compatibility

2016-04-18 Thread Marcelo Vanzin
On Mon, Apr 18, 2016 at 3:09 PM, Reynold Xin wrote: > IIUC, the reason for that PR is that they found the string comparison to > increase the size in large shuffles. Maybe we should add the ability to > support the short name to Spark 1.6.2? Is that something that really yields noticeable gains i

Re: YARN Shuffle service and its compatibility

2016-04-18 Thread Reynold Xin
Got it. So Mark is pushing for "best-effort" support. IIUC, the reason for that PR is that they found the string comparison to increase the size in large shuffles. Maybe we should add the ability to support the short name to Spark 1.6.2? On Mon, Apr 18, 2016 at 3:05 PM, Marcelo Vanzin wrote: >

Re: YARN Shuffle service and its compatibility

2016-04-18 Thread Marcelo Vanzin
On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin wrote: > The bigger problem is that it is much easier to maintain backward > compatibility rather than dictating forward compatibility. For example, as > Marcin said, if we come up with a slightly different shuffle layout to > improve shuffle performanc

Re: YARN Shuffle service and its compatibility

2016-04-18 Thread Reynold Xin
Yeah, I re-read the email. It'd work in this case. The bigger problem is that it is much easier to maintain backward compatibility rather than dictating forward compatibility. For example, as Marcin said, if we come up with a slightly different shuffle layout to improve shuffle performance, we

Re: YARN Shuffle service and its compatibility

2016-04-18 Thread Marcelo Vanzin
On Mon, Apr 18, 2016 at 1:53 PM, Reynold Xin wrote: > That's not the only one. For example, the hash shuffle manager has been off > by default since Spark 1.2, and we'd like to remove it in 2.0: > https://github.com/apache/spark/pull/12423 If I understand things correctly, Mark's option B (runnin

Re: YARN Shuffle service and its compatibility

2016-04-18 Thread Marcin Tustin
I'm good with option B at least until it blocks something utterly wonderful (like shuffles are 10x faster). On Mon, Apr 18, 2016 at 4:51 PM, Mark Grover wrote: > Hi all, > If you don't use Spark on YARN, you probably don't need to read further. > > Here's the *user scenario*: > There are going t

Re: YARN Shuffle service and its compatibility

2016-04-18 Thread Reynold Xin
That's not the only one. For example, the hash shuffle manager has been off by default since Spark 1.2, and we'd like to remove it in 2.0: https://github.com/apache/spark/pull/12423 How difficult is it to just change the package name to, say, v2? On Mon, Apr 18, 2016 at 1:51 PM, Mark Grover wrot

YARN Shuffle service and its compatibility

2016-04-18 Thread Mark Grover
Hi all, If you don't use Spark on YARN, you probably don't need to read further. Here's the *user scenario*: There are going to be folks who may be interested in running two versions of Spark (say Spark 1.6.x and Spark 2.x) on the same YARN cluster. And, here's the *problem*: That's all fine, sho