I have been in a similar world of pain. Basically, I tried to use an external Hive to get user access controls with a Spark engine. In the end, I realized it was a better idea to use Apache Tez instead of a Spark engine for my particular case. But the journey is what I want to share with you.

The big data Apache tools and libraries such as Hive, Tez, Spark, Hadoop, Parquet, etc. are not as interchangeable as we would like to think. Only a very limited set of combinations of very specific versions actually works together. This is why tools like Ambari can be useful: Ambari pins combinations of versions known to work, and the dirty work is done under the UI. More often than not, when you try a combination that few people have tried, you will get error messages that derail you and cause you to waste a lot of time.

In addition, this group, as well as many other Apache big data user groups, provides extremely poor support for users. The answers you usually get are not even hints toward a solution. They usually translate, in many cryptic ways, to "there is nothing I am willing to do about your problem; if I did, I should get paid." If you ask your question to the Spark group they will send you to the Hive group, and vice versa (I can almost guarantee it based on previous experience). In hindsight, people who work on these kinds of things typically make more money than average developers; if it pays more $$s, it makes sense that learning this stuff is supposed to be harder.

Conclusion: don't try it. Or try Tez/Hive instead of Spark/Hive if you are querying large files.
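For reference, switching Hive between execution engines is a one-line setting, either per session or in hive-site.xml. This is a minimal sketch assuming Tez (or Spark) is already installed and on Hive's classpath, which is exactly the version-matching problem described above:

```sql
-- Per-session: switch the execution engine for the current Hive CLI / Beeline session.
-- Valid values are 'mr' (classic MapReduce), 'tez', and 'spark'.
SET hive.execution.engine=tez;

-- Show the current value to confirm the change took effect.
SET hive.execution.engine;
```

To make it the default for all sessions, the same property goes into hive-site.xml; either way, Hive only *dispatches* to the engine, so a mismatched Spark or Tez build still fails at runtime just as in the thread below.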
On Friday, March 17, 2017 11:33 AM, Stephen Sprague <sprag...@gmail.com> wrote:

:( gettin' no love on this one. any SME's know if Spark 2.1.0 will work with Hive 2.1.0? That JavaSparkListener class looks like a deal breaker to me, alas. thanks in advance.

Cheers,
Stephen.

On Mon, Mar 13, 2017 at 10:32 PM, Stephen Sprague <sprag...@gmail.com> wrote:

hi guys, wondering where we stand with Hive on Spark these days? i'm trying to run Spark 2.1.0 with Hive 2.1.0 (purely coincidental versions) and running up against this class not found:

java.lang.NoClassDefFoundError: org/apache/spark/JavaSparkListener

searching the Cyber i find this:

1. http://stackoverflow.com/questions/41953688/setting-spark-as-default-execution-engine-for-hive which pretty much describes my situation too, and it references this:
2. https://issues.apache.org/jira/browse/SPARK-17563 which indicates a "won't fix" - but does reference this:
3. https://issues.apache.org/jira/browse/HIVE-14029 which looks to be fixed in Hive 2.2 - which is not released yet.

so if i want to use Spark 2.1.0 with Hive am i out of luck - until Hive 2.2?

thanks,
Stephen.