Querying Hive from R via SparkR

Thomas Achache Thu, 04 Feb 2016 02:18:28 -0800

Hello everyone,

We run a Hive cluster in a Kerberos environment, which we usually access 
via ssh from our local Windows machines. I would like to query Hive directly 
from R on those same Windows machines using the SparkR package 
(https://spark.apache.org/docs/1.5.2/sparkr.html), as described here:
https://spark.apache.org/docs/1.5.2/sparkr.html#from-hive-tables


Does anyone here have experience doing that? I suspect this might not be 
the best mailing list for this question, so don't hesitate to redirect me if 
needed.

I would like to know:
a) whether what I'm trying to achieve is even possible (maybe it only works 
if R is installed directly on the Hive cluster and we use RStudio Server?)
b) whether anyone can point me to a web resource that explains how to set up 
the SparkContext and/or the HiveContext
c) whether there is a simpler way to query Hive from R (for instance, we also 
use a JDBC connection to Vertica, and it works just fine in the setup 
described above)

Sorry if this is a bit off-topic, but I'm completely lost in the 
documentation and I don't know where to ask for help :(

All the best,

Thomas
