Hello,
Any more thoughts on this one?
Will that be let into 2.4.1, or not?
Thanks in advance,
Jakub
On 8 Mar 2019, at 11:26, Jakub Wozniak <jakub.wozn...@cern.ch> wrote:
Hi,
To me it is backwards compatible with older HBase versions.
The code actually only falls back to the newer API on exception.
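The "fall back to the newer API on exception" behaviour described above can be sketched with plain reflection. Note this is an illustrative sketch only: the class and method signatures below (`OldTokenUtil`, `NewTokenUtil`, a `String` vs. `int` argument) are hypothetical stand-ins, not the actual HBase `TokenUtil` signatures or Spark's real provider code.

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins for the two TokenUtil generations; these are
// illustrative, not the real HBase classes.
class OldTokenUtil {
    public static String obtainToken(String conf) { return "v1:" + conf; }
}

class NewTokenUtil {
    public static String obtainToken(int connection) { return "v2:" + connection; }
}

public class TokenCompat {
    // Try the old-style method first; fall back to the new-style one
    // only when the old signature is missing (NoSuchMethodException).
    static String obtainTokenCompat(Class<?> util) {
        try {
            Method m = util.getMethod("obtainToken", String.class);
            return (String) m.invoke(null, "hbase-site");
        } catch (NoSuchMethodException e) {
            try {
                Method m = util.getMethod("obtainToken", int.class);
                return (String) m.invoke(null, 42);
            } catch (ReflectiveOperationException inner) {
                throw new IllegalStateException(inner);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(obtainTokenCompat(OldTokenUtil.class)); // v1:hbase-site
        System.out.println(obtainTokenCompat(NewTokenUtil.class)); // v2:42
    }
}
```

Because the lookup is by signature at runtime, one Spark build can run against either HBase client generation without a compile-time dependency on both.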
It would be great if this gets in.
Otherwise a setup with HBase 2 + Spark 2.4 gets a bit complicated, as we are
forced to use an older version of the HBase client (1.4.9) when running on YARN.
In theory it is compatible, but we see some performance degradation when doing
reads from HBase with the older client (we are investigating it now).
We have had issues in the past when HBase server and client versions were not
aligned, so this is not our favourite setup.
Thanks,
Jakub
On 8 Mar 2019, at 11:15, Jakub Wozniak <jakub.wozn...@cern.ch> wrote:
I guess it is that one:
https://github.com/apache/spark/commit/dfed439e33b7bf224dd412b0960402068d961c7b#diff-9ebb59b7b008c694a8f583b94bd24e1d
Cheers,
Jakub
On 7 Mar 2019, at 17:25, Sean Owen <sro...@gmail.com> wrote:
Do you know what change fixed it?
If it's not a regression from 2.4.0 it wouldn't necessarily go into a
maintenance release. If there were no downside, maybe; does it cause
any incompatibility with older HBase versions?
It may be that this support is targeted for Spark 3 on purpose, which
is probably due in the middle of the year.
On Thu, Mar 7, 2019 at 8:57 AM Jakub Wozniak <jakub.wozn...@cern.ch> wrote:
Hello,
I have a question regarding the 2.4.1 release.
It looks like Spark 2.4 (and 2.4.1-rc) is not exactly compatible with HBase
2.x+ in YARN mode.
The problem is in the
org.apache.spark.deploy.security.HbaseDelegationTokenProvider class, which
expects a specific version of the TokenUtil class from HBase that changed
between HBase 1.x and 2.x.
On top of that, the HadoopDelegationTokenManager does not use the ServiceLoader
mechanism, so I cannot attach my own provider (the providers are hardcoded).
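For context, the `java.util.ServiceLoader` mechanism Jakub refers to lets user jars contribute implementations without any hardcoded list. A minimal sketch follows; the `DelegationTokenProvider` interface here is a hypothetical SPI for illustration, not Spark's actual provider API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical SPI interface; Spark's real provider interface may differ.
interface DelegationTokenProvider {
    String serviceName();
}

public class ProviderLoader {
    // ServiceLoader scans META-INF/services/DelegationTokenProvider files on
    // the classpath, so a user jar could contribute a provider without the
    // token manager hardcoding the list.
    static List<String> discoverProviders() {
        List<String> names = new ArrayList<>();
        for (DelegationTokenProvider p : ServiceLoader.load(DelegationTokenProvider.class)) {
            names.add(p.serviceName());
        }
        return names;
    }

    public static void main(String[] args) {
        // No provider registration file exists in this sketch, so nothing is found.
        System.out.println(discoverProviders()); // []
    }
}
```

A provider jar would opt in simply by shipping a `META-INF/services/DelegationTokenProvider` resource naming its implementation class.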
It seems that both problems are resolved on the Spark master branch.
Is there any reason not to include this fix in the 2.4.1 release?
If so, when do you plan to release it (the fix for HBase)?
Or maybe there is something I’ve overlooked; please correct me if I’m wrong.
Best regards,
Jakub
On 7 Mar 2019, at 03:04, Saisai Shao <sai.sai.s...@gmail.com> wrote:
Do we have any other blocker/critical issues for Spark 2.4.1, or are we waiting
for something to be fixed? I roughly searched JIRA; it seems there are no
blocker/critical issues marked for 2.4.1.
Thanks
Saisai
shane knapp <skn...@berkeley.edu> wrote on Thu, 7 Mar 2019 at 4:57 AM:
i'll be popping in to the sig-big-data meeting on the 20th to talk about stuff
like this.
On Wed, Mar 6, 2019 at 12:40 PM Stavros Kontopoulos
<stavros.kontopou...@lightbend.com> wrote:
Yes, it's a tough decision, and as we discussed today
(https://docs.google.com/document/d/1pnF38NF6N5eM8DlK088XUW85Vms4V2uTsGZvSp8MNIA):
"Kubernetes' support window is 9 months, Spark's is two years". So we may end up
with old client versions on still-supported branches like 2.4.x in the future.
That gives us no choice but to upgrade if we want to be on the safe side. We
have tested 3.0.0 with 1.11 internally and it works, but I don't know what it
means to run with old clients.
On Wed, Mar 6, 2019 at 7:54 PM Sean Owen <sro...@gmail.com> wrote:
If the old client is basically unusable with the versions of K8S
people mostly use now, and the new client still works with older
versions, I could see including this in 2.4.1.
Looking at https://github.com/fabric8io/kubernetes-client#compatibility-matrix
it seems like the 4.1.1 client is needed for 1.10 and above. However,
it no longer supports 1.7 and below.
We have 3.0.x, and versions through 4.0.x of the client support the
same K8S versions, so no real middle ground here.
1.7.0 came out June 2017, it seems. 1.10 was March 2018. Minor release
branches are maintained for 9 months per
https://kubernetes.io/docs/setup/version-skew-policy/
Spark 2.4.0 came out in Nov 2018. I suppose we could say it should have
used the newer client from the start, as at that point (?) 1.7 and
earlier were already at least 7 months past EOL.
If we update the client in 2.4.1, versions of K8S as recently
'supported' as a year ago won't work anymore. I'm guessing there are
still 1.7 users out there? That wasn't that long ago but if the
project and users generally move fast, maybe not.
Normally I'd say, that's what the next minor release of Spark is for;
update if you want later infra. But there is no Spark 2.5.
I presume downstream distros could modify the dependency easily (?) if
needed and maybe already do. It wouldn't necessarily help end users.
Does the 3.0.x client not work at all with 1.10+, or is it just unsupported?
If it 'basically works but with no guarantees', I'd favor not updating. If
it doesn't work at all, hm, that's tough. I think I'd favor updating
the client, but it's a tough call both ways.