Re: Running SparkSql against Hive tables

2015-06-10 Thread Cheng Lian
On 6/10/15 1:55 AM, James Pirz wrote: I am trying to use Spark 1.3 (Standalone) against Hive 1.2 running on Hadoop 2.6. I looked at the ThriftServer2 logs and realized that the server was not starting properly because of a failure in creating a server socket. In fact, I had passed the URI to m

Re: Running SparkSql against Hive tables

2015-06-09 Thread James Pirz
I am trying to use Spark 1.3 (Standalone) against Hive 1.2 running on Hadoop 2.6. I looked at the ThriftServer2 logs and realized that the server was not starting properly because of a failure in creating a server socket. In fact, I had passed the URI to my Hiveserver2 service, launched from Hive, a
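A "failure in creating a server socket" typically means the Spark Thrift Server's default port (10000) is already held by HiveServer2. A minimal sketch of how one might diagnose and work around this, assuming the default ports and a standalone master URL like the one used later in this thread:

```shell
# Check whether HiveServer2 already holds the default Thrift port 10000
# (10000 is the shared Spark/Hive default; adjust if your deployment differs)
netstat -tlnp | grep 10000

# Start Spark's Thrift Server on an alternative port so the two services
# do not collide; start-thriftserver.sh honors HIVE_SERVER2_THRIFT_PORT
export HIVE_SERVER2_THRIFT_PORT=10001
$SPARK_HOME/sbin/start-thriftserver.sh --master spark://spark-master-node-ip:7077
```

Equivalently, the port can be passed as `--hiveconf hive.server2.thrift.port=10001` to the start script.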

Re: Running SparkSql against Hive tables

2015-06-09 Thread James Pirz
Thanks Ayan, I used beeline in Spark to connect to the Hiveserver2 that I started from my Hive. So as you said, it is really interacting with Hive as a typical 3rd-party application, and it is NOT using the Spark execution engine. I was thinking that it gets metastore info from Hive, but uses Spark to exec

Re: Running SparkSql against Hive tables

2015-06-08 Thread Cheng Lian
On 6/9/15 8:42 AM, James Pirz wrote: Thanks for the help! I am actually trying Spark SQL to run queries against tables that I've defined in Hive. I follow these steps: - I start hiveserver2, and in Spark I start Spark's Thrift server by: $SPARK_HOME/sbin/start-thriftserver.sh --master spark

Re: Running SparkSql against Hive tables

2015-06-08 Thread ayan guha
I am afraid you are going the other way around :) If you want to use Hive in Spark, you'd need a HiveContext with the Hive config files on every node of the Spark cluster. This way Spark can talk to the Hive metastore. Then you can write queries on Hive tables using HiveContext's sql method, and Spark will run it (
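Ayan's suggestion can be sketched in Spark 1.3-era Scala. The table name below is hypothetical, and `hive-site.xml` is assumed to be on the classpath of every node so the metastore can be located:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveQueryExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HiveQueryExample"))
    // HiveContext reads hive-site.xml from the classpath to find the metastore
    val hiveContext = new HiveContext(sc)
    // The query executes on Spark's engine; only metadata comes from Hive
    val result = hiveContext.sql("SELECT count(*) FROM my_hive_table") // hypothetical table
    result.collect().foreach(println)
    sc.stop()
  }
}
```

This is the key distinction in the thread: connecting beeline to HiveServer2 runs queries on Hive's engine, while a HiveContext (or Spark's own Thrift Server) runs them on Spark.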

Re: Running SparkSql against Hive tables

2015-06-08 Thread James Pirz
Thanks for the help! I am actually trying Spark SQL to run queries against tables that I've defined in Hive. I follow these steps: - I start hiveserver2, and in Spark I start Spark's Thrift server by: $SPARK_HOME/sbin/start-thriftserver.sh --master spark://spark-master-node-ip:7077 - and I start
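The steps above can be sketched end to end. The crucial detail (per the later replies) is that beeline must point at Spark's Thrift Server, not at HiveServer2, or queries will run on Hive's engine instead of Spark's. Host names and ports below are assumptions based on the defaults:

```shell
# Start Spark's Thrift Server against the standalone master
$SPARK_HOME/sbin/start-thriftserver.sh \
  --master spark://spark-master-node-ip:7077

# Connect beeline to the *Spark* Thrift Server (default port 10000).
# If HiveServer2 is also listening on 10000, move one of them to
# another port first, otherwise the Spark server fails to bind.
$SPARK_HOME/bin/beeline -u jdbc:hive2://spark-master-node-ip:10000
```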

Re: Running SparkSql against Hive tables

2015-06-07 Thread Cheng Lian
On 6/6/15 9:06 AM, James Pirz wrote: I am pretty new to Spark, and using Spark 1.3.1 I am trying to use 'Spark SQL' to run some SQL scripts on the cluster. I realized that for better performance it is a good idea to use Parquet files. I have 2 questions regarding that: 1) If I wanna us

Running SparkSql against Hive tables

2015-06-05 Thread James Pirz
I am pretty new to Spark, and using Spark 1.3.1 I am trying to use 'Spark SQL' to run some SQL scripts on the cluster. I realized that for better performance it is a good idea to use Parquet files. I have 2 questions regarding that: 1) If I wanna use Spark SQL against *partitioned & bucketed
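On the Parquet question, a minimal Spark 1.3 sketch: an existing Hive table can be persisted as Parquet and read back through the HiveContext. Table name and HDFS path are hypothetical, and note that Spark SQL 1.3 understands Hive-style *partitioning* for Parquet but not Hive *bucketing*:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ParquetExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ParquetExample"))
    val hiveContext = new HiveContext(sc)
    // Read an existing Hive table and persist it as Parquet (1.3 API;
    // later versions use df.write.parquet instead)
    val df = hiveContext.sql("SELECT * FROM my_hive_table")           // hypothetical table
    df.saveAsParquetFile("hdfs:///user/data/my_table_parquet")        // hypothetical path
    // Load it back; the schema is read from the Parquet footers
    val parquetDF = hiveContext.parquetFile("hdfs:///user/data/my_table_parquet")
    parquetDF.registerTempTable("my_table_parquet")
    hiveContext.sql("SELECT count(*) FROM my_table_parquet").show()
    sc.stop()
  }
}
```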