On 22 Oct 2015, at 19:32, Chester Chen <ches...@alpinenow.com<mailto:ches...@alpinenow.com>> wrote:
Steve, you summarized it mostly correctly, but there are a couple of points I want to emphasize.

Not every cluster has the Hive service enabled, so the YARN client shouldn't try to get the Hive delegation token just because security mode is enabled.

I agree, but it shouldn't be failing with a stack trace. Log, yes; fail, no.

The YARN client code can check whether the service is enabled (possibly by checking whether the Hive metastore URI or other hive-site.xml elements are present). If the Hive service is not enabled, then we don't need to get a Hive delegation token, and hence we avoid the exception. If we still try to get the Hive delegation token regardless of whether the Hive service is enabled (as the current code does), then the code should still launch the YARN container and the Spark job, since the user could simply be running a job against HDFS without accessing Hive. Of course, accessing Hive would then fail.

That's exactly what should be happening: the token is only needed if the code tries to talk to Hive. The problem is that the YARN client doesn't know whether that's the case, so it tries every time. It shouldn't be failing, though. I've created an issue to cover this; I'll see what reflection it takes. I'll also pull the code out into a method that can be tested standalone: we shouldn't have to wait for a run in UGI.isSecure() mode.

https://issues.apache.org/jira/browse/SPARK-11265

Meanwhile, for the curious, these slides include an animation of what goes on when a YARN app is launched in a secure cluster, to help explain why things seem a bit complicated:

http://people.apache.org/~stevel/kerberos/2015-09-kerberos-the-madness.pptx

The third point is that I'm not sure why org.spark-project.hive's hive-exec and org.apache.hadoop.hive's hive-exec behave differently for the same method.

Chester

On Thu, Oct 22, 2015 at 10:18 AM, Charmee Patel <charm...@gmail.com<mailto:charm...@gmail.com>> wrote:

A similar issue occurs when interacting with Hive secured by Sentry.

https://issues.apache.org/jira/browse/SPARK-9042

By changing how the HiveContext instance is created, this issue might also be resolved.

On Thu, Oct 22, 2015 at 11:33 AM Steve Loughran <ste...@hortonworks.com<mailto:ste...@hortonworks.com>> wrote:

On 22 Oct 2015, at 08:25, Chester Chen <ches...@alpinenow.com<mailto:ches...@alpinenow.com>> wrote:

Doug,

We are not trying to compile against a different version of Hive. The 1.2.1.spark hive-exec is specified in the Spark 1.5.2 POM file. We are moving from Spark 1.3.1 to 1.5.1 and simply trying to supply the needed dependency. The rest of the application (besides Spark) simply uses Hive 0.13.1.

Yes, we are using the YARN client directly; there are many functions we need and have modified that are not provided in the YARN client. The Spark launcher in its current form does not satisfy our requirements (at least as of the last time I looked at it); there is a discussion thread from several months ago. From Spark 1.x to 1.3.1 we forked the YARN client to achieve these goals (YARN listener callbacks, killApplications, YARN capacity callbacks, etc.). In the current integration for 1.5.1, to avoid forking Spark, we simply subclass the YARN client and override a few methods, but by doing this we lost the resource capacity callback and estimation.

This is a bit off the original topic, but I still think there is a bug related to the Spark YARN client in the case of Kerberos + the Spark hive-exec dependency.

Chester

I think I understand what's being implied here:

1. In a secure cluster, a Spark app needs a Hive delegation token to talk to Hive.
2. The Spark YARN client (org.apache.spark.deploy.yarn.Client) uses reflection to get the delegation token.
3. The reflection doesn't work, and a ClassNotFoundException is logged.
4. The app should still launch, but it will be without a Hive token, so attempting to work with Hive will fail.

I haven't seen this, because while I do test runs against a Kerberos cluster, I wasn't talking to Hive from the deployed app.
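The behaviour argued for above — check whether a metastore is configured before bothering, and log rather than fail when the token can't be obtained — might be sketched like this. This is a minimal illustration, not the actual Spark Client code; `hive.metastore.uris` is the real hive-site.xml property, but every class and method name below is hypothetical:

```java
import java.util.Optional;
import java.util.Properties;

// Illustrative sketch only: none of these names come from Spark itself.
public class HiveTokenSketch {

    // "hive.metastore.uris" is the standard hive-site.xml property naming
    // the metastore; if it is absent, no Hive service is configured.
    static boolean hiveServiceConfigured(Properties hiveSite) {
        return !hiveSite.getProperty("hive.metastore.uris", "").trim().isEmpty();
    }

    // Stand-in for the reflective token fetch that currently throws
    // when hive-exec classes are missing or incompatible.
    static String fetchHiveToken(boolean hiveOnClasspath) throws ClassNotFoundException {
        if (!hiveOnClasspath) {
            throw new ClassNotFoundException("org.apache.hadoop.hive.ql.metadata.Hive");
        }
        return "hive-delegation-token";
    }

    // Best-effort: skip when Hive isn't configured, and log rather than
    // fail when the fetch itself blows up, so the app still launches.
    static Optional<String> obtainTokenBestEffort(Properties hiveSite, boolean hiveOnClasspath) {
        if (!hiveServiceConfigured(hiveSite)) {
            return Optional.empty(); // no Hive service, no token needed
        }
        try {
            return Optional.of(fetchHiveToken(hiveOnClasspath));
        } catch (ClassNotFoundException e) {
            System.out.println("WARN: unable to get Hive delegation token: " + e);
            return Optional.empty(); // launch anyway; only Hive access will fail later
        }
    }

    public static void main(String[] args) {
        Properties withHive = new Properties();
        withHive.setProperty("hive.metastore.uris", "thrift://metastore-host:9083");
        Properties noHive = new Properties();

        System.out.println(obtainTokenBestEffort(noHive, true).isPresent());    // false: no metastore configured
        System.out.println(obtainTokenBestEffort(withHive, true).isPresent());  // true: token obtained
        System.out.println(obtainTokenBestEffort(withHive, false).isPresent()); // false, but no exception escapes
    }
}
```

Being able to call a method like `obtainTokenBestEffort` directly is also what makes the "testable standalone, without a run in UGI.isSecure() mode" point above possible.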
It sounds like this workaround works because the Hive RPC protocol is compatible enough with 0.13 that a 0.13 client can ask Hive for the token, though then your remote classpath is stuck on 0.13.

Looking at the Hive class, the metastore has now made the Hive constructor private and gone to a factory method (public static Hive get(HiveConf c) throws HiveException) to get an instance. The reflection code would need to be updated. I'll file a bug with my name next to it.
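The factory-method change means the reflection code can no longer instantiate Hive directly; it has to look up and invoke the static get instead. A self-contained illustration of that pattern, using stand-in classes rather than the real hive-exec ones (only the quoted `get(HiveConf)` signature is taken from the thread; the token-method shape is assumed):

```java
import java.lang.reflect.Method;

// Stand-ins for hive-exec's Hive/HiveConf, loaded by name the way the
// Spark YARN client loads the real classes via reflection.
public class FactoryReflection {

    // Stand-in for org.apache.hadoop.hive.conf.HiveConf.
    public static class FakeConf { }

    // Stand-in for org.apache.hadoop.hive.ql.metadata.Hive: private
    // constructor, public static factory, as described above.
    public static class FakeHive {
        private FakeHive() { }
        public static FakeHive get(FakeConf c) { return new FakeHive(); }
        public String getDelegationToken(String owner, String renewer) {
            return "token-for-" + owner;
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> hiveClass = Class.forName("FactoryReflection$FakeHive");
        Class<?> confClass = Class.forName("FactoryReflection$FakeConf");

        // Invoke the static factory instead of the (now private) constructor;
        // a static method is invoked with a null receiver.
        Method factory = hiveClass.getMethod("get", confClass);
        Object conf = confClass.getDeclaredConstructor().newInstance();
        Object hive = factory.invoke(null, conf);

        Method getToken = hiveClass.getMethod("getDelegationToken", String.class, String.class);
        System.out.println(getToken.invoke(hive, "alice", "yarn"));
    }
}
```

Attempting `hiveClass.getConstructor(confClass)` here would throw NoSuchMethodException, which is exactly why the existing constructor-based reflection breaks against the newer hive-exec.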