Thanks, Steve. I should have tested this theory before spamming the list:
having now tried it, I haven't been able to get anything working.

I'll hit up the Spark dev mailing list and try to garner enough interest to get 
some JIRAs filed.

I really appreciate the feedback; thanks, everyone.

From: Steve Loughran [mailto:ste...@hortonworks.com]
Sent: Monday, June 29, 2015 10:32 AM
To: Dave Ariens
Cc: Tim Chen; Marcelo Vanzin; Olivier Girardot; user@spark.apache.org
Subject: Re: Accessing Kerberos Secured HDFS Resources from Spark on Mesos


On 29 Jun 2015, at 14:18, Dave Ariens <dari...@blackberry.com> wrote:

I'd like to toss out another idea that doesn't involve a complete end-to-end 
Kerberos implementation.  Essentially, have the driver authenticate to  
Kerberos, instantiate a Hadoop file system, and serialize/cache it for the 
executors to use instead of them having to instantiate their own.

- Driver authenticates to Kerberos via 
UserGroupInformation.loginUserFromKeytab(principal, keytab)
- Driver instantiates a Hadoop configuration via hdfs-site.xml and core-site.xml
- Driver instantiates the Hadoop file system from a path based on the Hadoop 
root URI (hdfs://hadoop-cluster.site.org/) and hadoop config
- Driver makes this file system available to all future executors
- Executors first check for an existing/cached file system object before 
instantiating their own


Hadoop automatically caches filesystems loaded with FileSystem.get(), unless 
you disable that with fs.NAME.impl.disable.cache=true (NAME being the URI 
scheme, e.g. hdfs), so all follow-up FileSystem.get() calls in the same JVM 
get the same instance automatically.
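
To illustrate the caching behaviour, here is a minimal toy model in plain Java.
It is only a sketch: ToyFileSystem and ToyFileSystemCache are hypothetical
names, not Hadoop classes, and the cache key is simplified to scheme plus
authority (Hadoop's real key also takes the current user into account). The
point is that the cache is an ordinary static map, so it exists once per JVM.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a filesystem handle; NOT org.apache.hadoop.fs.FileSystem.
final class ToyFileSystem {
    final URI uri;
    ToyFileSystem(URI uri) { this.uri = uri; }
}

final class ToyFileSystemCache {
    // A static field lives inside one JVM; other processes never see it.
    private static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

    // Models the idea behind FileSystem.get(): reuse the cached instance
    // unless caching has been disabled for this lookup.
    static synchronized ToyFileSystem get(URI uri, boolean disableCache) {
        if (disableCache) {
            return new ToyFileSystem(uri);
        }
        // Simplified key (assumption): scheme and authority only.
        String key = uri.getScheme() + "://" + uri.getAuthority();
        return CACHE.computeIfAbsent(key, k -> new ToyFileSystem(uri));
    }

    public static void main(String[] args) {
        URI root = URI.create("hdfs://hadoop-cluster.site.org/");
        URI file = URI.create("hdfs://hadoop-cluster.site.org/data/part-0");
        // Same scheme and authority: the cached instance is reused.
        System.out.println(get(root, false) == get(file, false));
        // Caching disabled: every call builds a fresh instance.
        System.out.println(get(root, true) == get(root, true));
    }
}
```

An executor on another machine starts with its own empty map, so nothing the
driver cached is visible there.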

...But you can't share that instance across JVMs or machines, at least in 
my experience. The non-keytab login happens in the depths of the JVM; the 
keytab login goes through the Hadoop codebase and some JVM-brittle 
introspection into Kerberos implementation classes, code that doesn't directly 
offer shareability.

Delegation tokens are the workaround: the driver creates those tokens and 
hands them off to the executors. That's essentially what YARN client apps are 
expected to do; there's nothing to stop the Mesos code doing the same thing. 
It's just a matter of implementation and (worse) testing.


-Steve
