Re: [VOTE] Release Apache Spark 2.4.1 (RC2)

2019-03-12 Thread Jakub Wozniak
Hello,

Any more thoughts on this one?
Will this be allowed into 2.4.1, or not?

Thanks in advance,
Jakub


On 8 Mar 2019, at 11:26, Jakub Wozniak <jakub.wozn...@cern.ch> wrote:

Hi,

To me it is backwards compatible with older HBase versions.
The code only falls back to the newer API on an exception.
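
A minimal sketch of the "fall back on exception" pattern described above, not the
code from the linked commit. All class names here (OldClientTokenUtil,
NewClientTokenUtil, FakeConnection, TokenFallback) are hypothetical stand-ins,
and the String-based configuration is an assumption made to keep the example
self-contained. The older method signature is looked up first via reflection,
and the newer one is tried only if that lookup throws NoSuchMethodException.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for an older client: takes a configuration value.
final class OldClientTokenUtil {
    public static String obtainToken(String conf) { return "old-api:" + conf; }
}

// Hypothetical stand-in for a connection object required by a newer client.
final class FakeConnection {
    final String conf;
    FakeConnection(String conf) { this.conf = conf; }
}

// Hypothetical stand-in for a newer client: takes a connection instead.
final class NewClientTokenUtil {
    public static String obtainToken(FakeConnection conn) { return "new-api:" + conn.conf; }
}

final class TokenFallback {
    // Try the older signature first; use the newer one only if it is missing.
    static String obtainToken(Class<?> util, String conf) {
        try {
            Method method;
            Object arg;
            try {
                method = util.getMethod("obtainToken", String.class);
                arg = conf;
            } catch (NoSuchMethodException olderApiMissing) {
                method = util.getMethod("obtainToken", FakeConnection.class);
                arg = new FakeConnection(conf);
            }
            // Static method, so the receiver is null.
            return (String) method.invoke(null, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable obtainToken method", e);
        }
    }
}
```

Because the older signature is always tried first, a classpath that still carries
the older client behaves exactly as before; the second lookup only runs when the
first one fails.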

It would be great if this gets in.
Otherwise a setup with HBase 2 + Spark 2.4 gets a bit complicated, as we are 
forced to use an older version of the HBase client (1.4.9) when running on Yarn.
It is compatible in theory, but we see some performance degradation while doing 
reads from HBase with the older client (we are investigating it now).
We have had issues in the past when HBase server & client versions were not 
aligned, so this is not our preferred setup.

Thanks,
Jakub


On 8 Mar 2019, at 11:15, Jakub Wozniak <jakub.wozn...@cern.ch> wrote:

I guess it is that one:
https://github.com/apache/spark/commit/dfed439e33b7bf224dd412b0960402068d961c7b#diff-9ebb59b7b008c694a8f583b94bd24e1d

Cheers,
Jakub


On 7 Mar 2019, at 17:25, Sean Owen <sro...@gmail.com> wrote:

Do you know what change fixed it?
If it's not a regression from 2.4.0 it wouldn't necessarily go into a
maintenance release. If there were no downside, maybe; does it cause
any incompatibility with older HBase versions?
It may be that this support is targeted for Spark 3 on purpose, which
is probably due in the middle of the year.

On Thu, Mar 7, 2019 at 8:57 AM Jakub Wozniak <jakub.wozn...@cern.ch> wrote:

Hello,

I have a question regarding the 2.4.1 release.

It looks like Spark 2.4 (and 2.4.1-rc) is not fully compatible with HBase 
2.x+ in Yarn mode.
The problem is in the 
org.apache.spark.deploy.security.HBaseDelegationTokenProvider class, which 
expects a specific version of the TokenUtil class from HBase that was changed 
between HBase 1.x & 2.x.
On top of that, the HadoopDelegationTokenManager does not use the ServiceLoader 
class, so I cannot attach my own provider (providers are hardcoded).
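
To illustrate the difference between hardcoded providers and ServiceLoader-based
discovery, here is a small sketch. It is not Spark's
HadoopDelegationTokenManager; the TokenProvider interface, the ProviderRegistry
class, and the provider names are hypothetical. Only java.util.ServiceLoader
itself is a real JDK API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical provider interface, used only for this illustration.
interface TokenProvider {
    String serviceName();
}

final class ProviderRegistry {
    // Hardcoded list: adding a provider means changing this code.
    static List<TokenProvider> hardcoded() {
        List<TokenProvider> providers = new ArrayList<>();
        providers.add(() -> "hadoopfs");
        providers.add(() -> "hbase");
        return providers;
    }

    // ServiceLoader discovery: picks up any implementation registered on the
    // classpath in META-INF/services/<fully qualified interface name>.
    static List<TokenProvider> discovered() {
        List<TokenProvider> providers = new ArrayList<>();
        for (TokenProvider provider : ServiceLoader.load(TokenProvider.class)) {
            providers.add(provider);
        }
        return providers;
    }

    // Built-in providers plus whatever users have plugged in.
    static List<String> allServiceNames() {
        List<String> names = new ArrayList<>();
        for (TokenProvider p : hardcoded()) names.add(p.serviceName());
        for (TokenProvider p : discovered()) names.add(p.serviceName());
        return names;
    }
}
```

With the second approach, a user could ship a jar containing their own
implementation plus a registration file and have it picked up without patching
the manager, which is the extension point the message above says is missing.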

It seems that both problems are resolved on the Spark master branch.

Is there any reason not to include this fix in the 2.4.1 release?
If so, when do you plan to release the fix for HBase?

Or maybe there is something I’ve overlooked, please correct me if I’m wrong.

Best regards,
Jakub


On 7 Mar 2019, at 03:04, Saisai Shao <sai.sai.s...@gmail.com> wrote:

Do we have other blocker/critical issues for Spark 2.4.1, or are we waiting for 
something to be fixed? I did a rough search of JIRA, and there seem to be no 
blocker/critical issues marked for 2.4.1.

Thanks
Saisai

On Thu, Mar 7, 2019 at 4:57 AM shane knapp <skn...@berkeley.edu> wrote:

i'll be popping in to the sig-big-data meeting on the 20th to talk about stuff 
like this.

On Wed, Mar 6, 2019 at 12:40 PM Stavros Kontopoulos 
<stavros.kontopou...@lightbend.com> wrote:

Yes, it's a tough decision, and as we discussed today 
(https://docs.google.com/document/d/1pnF38NF6N5eM8DlK088XUW85Vms4V2uTsGZvSp8MNIA)
"Kubernetes support window is 9 months, Spark is two years". So we may end up 
with old client versions on still-supported branches like 2.4.x in the future.
That gives us no choice but to upgrade if we want to be on the safe side. We 
have tested 3.0.0 with 1.11 internally and it works, but I don't know what it 
means to run with old clients.


On Wed, Mar 6, 2019 at 7:54 PM Sean Owen <sro...@gmail.com> wrote:

If the old client is basically unusable with the versions of K8S
people mostly use now, and the new client still works with older
versions, I could see including this in 2.4.1.

Looking at https://github.com/fabric8io/kubernetes-client#compatibility-matrix
it seems like the 4.1.1 client is needed for 1.10 and above. However
it no longer supports 1.7 and below.
We have 3.0.x, and versions through 4.0.x of the client support the
same K8S versions, so no real middle ground here.

1.7.0 came out June 2017, it seems. 1.10 was March 2018. Minor release
branches are maintained for 9 months per
https://kubernetes.io/docs/setup/version-skew-policy/

Spark 2.4.0 came in Nov 2018. I suppose we could say it should have
used the newer client from the start as at that point (?) 1.7 and
earlier were already at least 7 months past EOL.
If we update the client in 2.4.1, versions of K8S as recently
'supported' as a year ago won't work anymore. I'm guessing there are
still 1.7 users out there? That wasn't that long ago but if the
project and users generally move fast, maybe not.
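
The EOL arithmetic above can be checked with a short sketch. The release months
(K8S 1.7 in June 2017, Spark 2.4.0 in November 2018) and the 9-month window are
taken from this message; the SupportWindow class and its method names are
hypothetical, and working at month granularity is an assumption.

```java
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;

final class SupportWindow {
    // Month in which a release leaves its maintenance window.
    static YearMonth endOfSupport(YearMonth release, int windowMonths) {
        return release.plusMonths(windowMonths);
    }

    // Whole months between the end of support and a later reference month.
    static long monthsPastEol(YearMonth release, int windowMonths, YearMonth asOf) {
        return ChronoUnit.MONTHS.between(endOfSupport(release, windowMonths), asOf);
    }
}
```

For example, SupportWindow.endOfSupport(YearMonth.of(2017, 6), 9) gives 2018-03,
and SupportWindow.monthsPastEol(YearMonth.of(2017, 6), 9, YearMonth.of(2018, 11))
gives 8, which is consistent with "at least 7 months past EOL".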

Normally I'd say, that's what the next minor release of Spark is for;
update if you want later infra. But there is no Spark 2.5.
I presume downstream distros could modify the dependency easily (?) if
needed and maybe already do. It wouldn't necessarily help end users.

Does the 3.0.x client not work at all with 1.10+, or is it just unsupported?
If it 'basically works but no guarantees' I'd favor not updating. If
it doesn't work at all, hm. That's tough. I think I'd favor updating
the client but think it's a tough call either way.

Re: [VOTE] Release Apache Spark 2.4.1 (RC2)

2019-03-12 Thread Sean Owen
I don't think we'd fail the current RC for this change, no.

On Tue, Mar 12, 2019 at 3:51 AM Jakub Wozniak wrote:
>
> Hello,
>
> Any more thoughts on this one?
> Will this be allowed into 2.4.1, or not?
>
> Thanks in advance,
> Jakub
>
>
> On 8 Mar 2019, at 11:26, Jakub Wozniak wrote:
>
> Hi,
>
> To me it is backwards compatible with older HBase versions.
> The code only falls back to the newer API on an exception.
>
> It would be great if this gets in.
> Otherwise a setup with HBase 2 + Spark 2.4 gets a bit complicated, as we are 
> forced to use an older version of the HBase client (1.4.9) when running on 
> Yarn.
> It is compatible in theory, but we see some performance degradation while doing 
> reads from HBase with the older client (we are investigating it now).
> We have had issues in the past when HBase server & client versions were not 
> aligned, so this is not our preferred setup.
>
> Thanks,
> Jakub
>
>
> On 8 Mar 2019, at 11:15, Jakub Wozniak wrote:
>
> I guess it is that one:
> https://github.com/apache/spark/commit/dfed439e33b7bf224dd412b0960402068d961c7b#diff-9ebb59b7b008c694a8f583b94bd24e1d
>
> Cheers,
> Jakub
>




[build system] jenkins web UI wedged, restarting service

2019-03-12 Thread shane knapp
i'm getting timeouts trying to connect to the site, so i'm preemptively
going to kick the service.  should be back up and running really soon.

i'll reply here w/any updates.

shane
-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


Re: [build system] jenkins web UI wedged, restarting service

2019-03-12 Thread shane knapp
alright, jenkins is back up and the UI is super snappy!

sorry for the interruption in service.

On Tue, Mar 12, 2019 at 10:51 AM shane knapp wrote:

> i'm getting timeouts trying to connect to the site, so i'm preemptively
> going to kick the service.  should be back up and running really soon.
>
> i'll reply here w/any updates.
>
> shane
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>


-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu