On 6/10/15 1:55 AM, James Pirz wrote:
I am trying to use Spark 1.3 (Standalone) against Hive 1.2 running on
Hadoop 2.6.
I looked at the ThriftServer2 logs and realized that the server was not
starting properly, because of a failure in creating a server socket. In
fact, I had passed the URI to my HiveServer2 service, launched from
Hive, a...
Thanks Ayan, I used beeline in Spark to connect to the HiveServer2 that
I started from my Hive installation. So, as you said, it is really
interacting with Hive as a typical 3rd-party application, and it is NOT
using the Spark execution engine. I was thinking that it gets metastore
info from Hive but uses Spark to exec...
On 6/9/15 8:42 AM, James Pirz wrote:
Thanks for the help!
I am actually trying Spark SQL to run queries against tables that I've
defined in Hive.
I follow these steps:
- I start hiveserver2, and in Spark I start Spark's Thrift server by:
$SPARK_HOME/sbin/start-thriftserver.sh --master spark://spark-master-node-ip:7077
- and I start...
I am afraid you are going the other way around :) If you want to use
Hive in Spark, you'd need a HiveContext with the Hive config files on
the Spark cluster (every node). This way Spark can talk to the Hive
metastore. Then you can write queries on Hive tables using
hiveContext's sql method and Spark will run it (...
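
For what it's worth, here is a minimal sketch of what that looks like with the
Spark 1.3 API, assuming hive-site.xml has already been copied into
$SPARK_HOME/conf on every node; the app name and the table name "sales" are
made up for illustration:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Picks up hive-site.xml from Spark's conf dir, so Spark can reach the Hive metastore
val sc = new SparkContext(new SparkConf().setAppName("hive-on-spark"))
val hiveContext = new HiveContext(sc)

// The query is planned and executed by Spark, not by Hive's own engine
// ("sales" is a hypothetical Hive table)
val df = hiveContext.sql("SELECT count(*) FROM sales")
df.show()

In spark-shell the SparkContext is already available as sc, so there you would
only need the HiveContext line and the query.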
On 6/6/15 9:06 AM, James Pirz wrote:
I am pretty new to Spark. Using Spark 1.3.1, I am trying to use 'Spark
SQL' to run some SQL scripts on the cluster. I realized that for better
performance, it is a good idea to use Parquet files. I have 2 questions
regarding that:
1) If I wanna use Spark SQL against *partitioned & bucketed* ...
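
The question is cut off here, but since the thread is on Spark 1.3, a minimal
sketch of the basic Parquet round trip with that API may help; the table name
"logs" and the HDFS path below are hypothetical, not taken from the thread:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("parquet-sketch"))
val hiveContext = new HiveContext(sc)

// Read an existing Hive table and write it back out as Parquet files
val df = hiveContext.table("logs")
df.saveAsParquetFile("hdfs:///user/spark/logs_parquet")

// Load the Parquet data and query it with Spark SQL
val parquetDF = hiveContext.parquetFile("hdfs:///user/spark/logs_parquet")
parquetDF.registerTempTable("logs_parquet")
hiveContext.sql("SELECT count(*) FROM logs_parquet").show()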