Thanks Mohammed, I was aware of Calliope, but haven't used it since the spark-cassandra-connector project was released. I was not aware of CalliopeServer2; cool, thanks for sharing that one.
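For what it's worth, here is roughly how I would wire the spark-cassandra-connector into the startWithContext approach quoted below. This is only a sketch I have not actually run; it assumes the Spark 1.2 SchemaRDD API and a 1.2.x connector, and the keyspace (my_keyspace), table (words), and the Word case class are placeholders for whatever your schema looks like:

import com.datastax.spark.connector._
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// assumes the shell/app was launched with the connector on the classpath
// and --conf spark.cassandra.connection.host=<one of your C* nodes>
val sqlContext = new HiveContext(sc)
import sqlContext.createSchemaRDD // implicit RDD[Product] => SchemaRDD in Spark 1.2

// hypothetical schema: my_keyspace.words (word text, count int)
case class Word(word: String, count: Int)
val words = sc.cassandraTable[Word]("my_keyspace", "words")

// register it so JDBC clients (beeline, and in theory Tableau) see it as "words"
words.registerTempTable("words")

// start the Thrift server against this context rather than a fresh one
HiveThriftServer2.startWithContext(sqlContext)

The tempTable caveat from the thread below still applies: the table lives only as long as this context does.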
I would appreciate it if you could let me know how you decide to proceed with this; I can see this coming up on my radar in the next few months. Thanks.

-Todd

On Fri, Apr 3, 2015 at 5:53 PM, Mohammed Guller <moham...@glassbeam.com> wrote:

> Thanks, Todd.
>
> It is an interesting idea; worth trying.
>
> I think the cash project is old. The tuplejump guy has created another
> project called CalliopeServer2, which works like a charm with BI tools
> that use JDBC, but unfortunately Tableau throws an error when it connects
> to it.
>
> Mohammed
>
> *From:* Todd Nist [mailto:tsind...@gmail.com]
> *Sent:* Friday, April 3, 2015 11:39 AM
> *To:* pawan kumar
> *Cc:* Mohammed Guller; user@spark.apache.org
> *Subject:* Re: Tableau + Spark SQL Thrift Server + Cassandra
>
> Hi Mohammed,
>
> Not sure if you have tried this or not. You could try using the API below
> to start the Thrift server with an existing context:
>
> https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala#L42
>
> The one thing that Michael Armbrust @ Databricks recommended was this:
>
> You can start a JDBC server with an existing context. See my answer here:
> http://apache-spark-user-list.1001560.n3.nabble.com/Standard-SQL-tool-access-to-SchemaRDD-td20197.html
>
> So, something like this, based on an example from Cheng Lian:
>
> *Server*
>
> import org.apache.spark.sql.hive.HiveContext
> import org.apache.spark.sql.catalyst.types._
>
> val sparkContext = sc
> import sparkContext._
> val sqlContext = new HiveContext(sparkContext)
> import sqlContext._
>
> makeRDD((1, "hello") :: (2, "world") :: Nil)
>   .toSchemaRDD.cache().registerTempTable("t")
> // replace the above with C* + the spark-cassandra-connector to generate
> // a SchemaRDD and registerTempTable
>
> import org.apache.spark.sql.hive.thriftserver._
> HiveThriftServer2.startWithContext(sqlContext)
>
> Then start up beeline:
>
> ./bin/beeline -u jdbc:hive2://localhost:10000/default
> 0: jdbc:hive2://localhost:10000/default> select * from t;
>
> I have not tried this yet from Tableau. My understanding is that the
> tempTable is only valid as long as the sqlContext is, so if one terminates
> the code representing the *Server* and then restarts the standard thrift
> server, sbin/start-thriftserver ..., the table won't be available.
>
> Another possibility is to perhaps use the tuplejump cash project,
> https://github.com/tuplejump/cash.
>
> HTH.
>
> -Todd
>
> On Fri, Apr 3, 2015 at 11:11 AM, pawan kumar <pkv...@gmail.com> wrote:
>
> Thanks, Mohammed. Will give it a try today. We would also need the
> Spark SQL piece, as we are migrating our data store from Oracle to C* and
> it would be easier to maintain all the reports rather than recreating
> each one from scratch.
>
> Thanks,
> Pawan Venugopal.
>
> On Apr 3, 2015 7:59 AM, "Mohammed Guller" <moham...@glassbeam.com> wrote:
>
> Hi Todd,
>
> We are using Apache C* 2.1.3, not DSE. We got Tableau to work directly
> with C* using the ODBC driver, but now would like to add Spark SQL to the
> mix. I haven’t been able to find any documentation on how to make this
> combination work.
>
> We are using the Spark-Cassandra-Connector in our applications, but
> haven’t been able to figure out how to get the Spark SQL Thrift Server to
> use it and connect to C*. That is the missing piece. Once we solve that
> piece of the puzzle, Tableau should be able to see the tables in C*.
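A quick note from me on the "missing piece" above: if the goal is the stock sbin/start-thriftserver.sh rather than startWithContext, I believe you can at least get the connector onto its classpath and pointed at C* at launch time. A rough, untested sketch; the assembly jar path and host are placeholders:

./sbin/start-thriftserver.sh \
  --jars /path/to/spark-cassandra-connector-assembly.jar \
  --conf spark.cassandra.connection.host=127.0.0.1

As far as I know that alone does not expose any C* tables to JDBC clients; the tables still have to be registered against the server's own context, which is why the startWithContext approach above seems like the way to go.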
> Hi Pawan,
>
> Tableau + C* is pretty straightforward, especially if you are using DSE.
> Create a new DSN in Tableau using the ODBC driver that comes with DSE.
> Once you connect, Tableau allows you to use a C* keyspace as a schema and
> column families as tables.
>
> Mohammed
>
> *From:* pawan kumar [mailto:pkv...@gmail.com]
> *Sent:* Friday, April 3, 2015 7:41 AM
> *To:* Todd Nist
> *Cc:* user@spark.apache.org; Mohammed Guller
> *Subject:* Re: Tableau + Spark SQL Thrift Server + Cassandra
>
> Hi Todd,
>
> Thanks for the link. I would be interested in this solution. I am using
> DSE for Cassandra. Could you provide me with info on connecting with DSE,
> either through Tableau or Zeppelin? The goal here is to query Cassandra
> through Spark SQL so that I can perform joins and group-bys in my
> queries. Are you able to perform Spark SQL queries with Tableau?
>
> Thanks,
> Pawan Venugopal
>
> On Apr 3, 2015 5:03 AM, "Todd Nist" <tsind...@gmail.com> wrote:
>
> What version of Cassandra are you using? Are you using DSE or the stock
> Apache Cassandra version? I have connected it with DSE, but have not
> attempted it with the standard Apache Cassandra version.
>
> FWIW,
> http://www.datastax.com/dev/blog/datastax-odbc-cql-connector-apache-cassandra-datastax-enterprise
> provides an ODBC driver for accessing C* from Tableau. Granted, it does
> not provide all the goodness of Spark. Are you attempting to leverage the
> spark-cassandra-connector for this?
>
> On Thu, Apr 2, 2015 at 10:20 PM, Mohammed Guller <moham...@glassbeam.com>
> wrote:
>
> Hi –
>
> Is anybody using Tableau to analyze data in Cassandra through the Spark
> SQL Thrift Server?
>
> Thanks!
>
> Mohammed