Hi

I used this, though it's using an embedded driver, which is not a good
approach. It works, and you can configure some other metastore type as
well. I have not tried the metastore URIs.

<configuration>

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/opt/bigdata/spark-1.2.0/metastore_db;create=true</value>
  <description>URL for the DB</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
</property>

<!-- <property>
  <name>hive.metastore.uris</name>
  <value>thrift://x.x.x.x:10000</value>
  <description>IP address (or fully-qualified domain name) and port of the
  metastore host</description>
</property> -->

</configuration>
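For reference, the thrift server picks this file up from Spark's conf
directory; one way to set that up (paths are illustrative for my install,
adjust to yours):

```shell
# Place the hive-site.xml above into Spark's conf directory so the
# thrift server reads it on startup (path is only an example).
cp hive-site.xml /opt/bigdata/spark-1.2.0/conf/

# Start the thrift server; it will use the metastore settings
# from conf/hive-site.xml.
/opt/bigdata/spark-1.2.0/sbin/start-thriftserver.sh
```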

On Wed, Feb 11, 2015 at 3:59 PM, Todd Nist <tsind...@gmail.com> wrote:

> Hi Arush,
>
> So yes I want to create the tables through Spark SQL.  I have placed the
> hive-site.xml file inside of the $SPARK_HOME/conf directory; I thought that
> was all I needed to do to have the thriftserver use it.  Perhaps my
> hive-site.xml is wrong; it currently looks like this:
>
> <configuration>
> <property>
>   <name>hive.metastore.uris</name>
>   <!-- Ensure that the following statement points to the Hive Metastore
> URI in your cluster -->
>   <value>thrift://sandbox.hortonworks.com:9083</value>
>   <description>URI for client to contact metastore server</description>
> </property>
> </configuration>
>
> Which leads me to believe the thriftserver is going to pull from the
> Hortonworks metastore?  I will go look at the docs to see if this is right;
> it is what Hortonworks says to do.  Do you have an example hive-site.xml by
> chance that works with Spark SQL?
>
> I am using 8.3 of tableau with the SparkSQL Connector.
>
> Thanks for the assistance.
>
> -Todd
>
> On Wed, Feb 11, 2015 at 2:34 AM, Arush Kharbanda <
> ar...@sigmoidanalytics.com> wrote:
>
>> BTW what tableau connector are you using?
>>
>> On Wed, Feb 11, 2015 at 12:55 PM, Arush Kharbanda <
>> ar...@sigmoidanalytics.com> wrote:
>>
>>>  I am a little confused here: why do you want to create the tables in
>>> Hive? You want to create the tables in Spark SQL, right?
>>>
>>> If you are not able to find the same tables through Tableau, then thrift
>>> is connecting to a different metastore than your spark-shell.
>>>
>>> One way to specify a metastore to thrift is to provide the path to
>>> hive-site.xml while starting thrift using --files hive-site.xml.
>>>
>>> Similarly, you can specify the same metastore to your spark-submit or
>>> spark-shell using the same option.
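For example, one way the --files approach looks in practice (paths are
illustrative, adjust to your install):

```shell
# Start the thrift server with an explicit hive-site.xml so it
# connects to the intended metastore (path below is only an example).
./sbin/start-thriftserver.sh --files /path/to/hive-site.xml

# Pass the same file to spark-shell (or spark-submit) so both
# point at the same metastore.
./bin/spark-shell --files /path/to/hive-site.xml
```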
>>>
>>>
>>>
>>> On Wed, Feb 11, 2015 at 5:23 AM, Todd Nist <tsind...@gmail.com> wrote:
>>>
>>>> Arush,
>>>>
>>>> As for #2 do you mean something like this from the docs:
>>>>
>>>> // sc is an existing SparkContext.
>>>> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>>>
>>>> sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>>> sqlContext.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
>>>>
>>>> // Queries are expressed in HiveQL.
>>>> sqlContext.sql("FROM src SELECT key, value").collect().foreach(println)
>>>>
>>>> Or did you have something else in mind?
>>>>
>>>> -Todd
>>>>
>>>>
>>>> On Tue, Feb 10, 2015 at 6:35 PM, Todd Nist <tsind...@gmail.com> wrote:
>>>>
>>>>> Arush,
>>>>>
>>>>> Thank you will take a look at that approach in the morning.  I sort of
>>>>> figured the answer to #1 was NO and that I would need to do 2 and 3 thanks
>>>>> for clarifying it for me.
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Tue, Feb 10, 2015 at 5:24 PM, Arush Kharbanda <
>>>>> ar...@sigmoidanalytics.com> wrote:
>>>>>
>>>>>> 1.  Can the connector fetch or query schemaRDDs saved to Parquet or
>>>>>> JSON files? No.
>>>>>> 2.  Do I need to do something to expose these via hive / metastore
>>>>>> other than creating a table in hive? Create a table in Spark SQL to
>>>>>> expose it via Spark SQL.
>>>>>> 3.  Does the thriftserver need to be configured to expose these in
>>>>>> some fashion? This is related to question 2: you would need to
>>>>>> configure thrift to read from the metastore you expect it to read
>>>>>> from - by default it reads from the metastore_db directory present
>>>>>> in the directory used to launch the thrift server.
>>>>>>  On 11 Feb 2015 01:35, "Todd Nist" <tsind...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm trying to understand how and what the Tableau connector to
>>>>>>> SparkSQL is able to access.  My understanding is that it needs to
>>>>>>> connect to the thriftserver, but I am not sure whether it exposes
>>>>>>> Parquet, JSON, and schemaRDDs, or only schemas defined in the
>>>>>>> metastore / Hive.
>>>>>>>
>>>>>>>
>>>>>>> For example, I do the following from the spark-shell which generates
>>>>>>> a schemaRDD from a csv file and saves it as a JSON file as well as a
>>>>>>> parquet file.
>>>>>>>
>>>>>>> import org.apache.spark.sql.SQLContext
>>>>>>> import com.databricks.spark.csv._
>>>>>>>
>>>>>>> val sqlContext = new SQLContext(sc)
>>>>>>> val test = sqlContext.csvFile("/data/test.csv")
>>>>>>> test.toJSON.saveAsTextFile("/data/out.json")
>>>>>>> test.saveAsParquetFile("/data/out.parquet")
>>>>>>>
>>>>>>> When I connect from Tableau, the only thing I see is the "default"
>>>>>>> schema and nothing in the tables section.
>>>>>>>
>>>>>>> So my questions are:
>>>>>>>
>>>>>>> 1.  Can the connector fetch or query schemaRDD's saved to Parquet or
>>>>>>> JSON files?
>>>>>>> 2.  Do I need to do something to expose these via hive / metastore
>>>>>>> other than creating a table in hive?
>>>>>>> 3.  Does the thriftserver need to be configured to expose these in
>>>>>>> some fashion, sort of related to question 2.
>>>>>>>
>>>>>>> TIA for the assistance.
>>>>>>>
>>>>>>> -Todd
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>


-- 


*Arush Kharbanda* || Technical Teamlead

ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
