Felix,

My goal is to use the Spark SQL JDBC Thriftserver to access HBase tables using just SQL. I have been able to CREATE tables with the statement below in the past:

CREATE TABLE <table-name>
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:postgresql://<hostname>:<port>/dm?user=<username>&password=<password>",
  dbtable "dim.dimension_acamp"
);

After doing this, I can access the PostgreSQL table through the Spark SQL JDBC Thriftserver with plain SQL statements (SELECT, UPDATE, INSERT, etc.). I want to do the same with HBase tables.
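Ideally, the HBase equivalent would look something like the statement below. To be clear, this is only a sketch of what I am after, not something I have working: it assumes an HBase data source for Spark (for example the Hortonworks SHC connector) is on the Thriftserver classpath, and the data source package name and catalog mapping are lifted from SHC's README, so the details may well be wrong or vary by version:

CREATE TABLE <table-name>
USING org.apache.spark.sql.execution.datasources.hbase
OPTIONS (
  catalog '{"table":{"namespace":"default","name":"<hbase-table>"},"rowkey":"key","columns":{"key":{"cf":"rowkey","col":"key","type":"string"},"value":{"cf":"cf1","col":"value","type":"string"}}}'
);

If a statement like that worked, everything downstream of the Thriftserver would stay pure SQL, exactly as it does for PostgreSQL today.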
We did try the same thing with Hive and HiveServer2, but the response times are just too long.

Thanks,
Ben

> On Oct 8, 2016, at 10:53 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> Ben,
>
> I'm not sure I'm following completely.
>
> Is your goal to use Spark to create or access tables in HBase? If so, the
> link below and several packages out there support that by providing an
> HBase data source for Spark. There are some examples of what the Spark
> code looks like at that link as well. On that note, you should also be
> able to use the HBase data source from a pure SQL (Spark SQL) query, which
> should work in the case of the Spark SQL JDBC Thrift Server (with USING,
> http://spark.apache.org/docs/latest/sql-programming-guide.html#tab_sql_10).
>
>
> _____________________________
> From: Benjamin Kim <bbuil...@gmail.com>
> Sent: Saturday, October 8, 2016 10:40 AM
> Subject: Re: Spark SQL Thriftserver with HBase
> To: Felix Cheung <felixcheun...@hotmail.com>
> Cc: <user@spark.apache.org>
>
>
> Felix,
>
> The only alternative I see is to create the equivalent of a stored
> procedure (a UDF, in database terms) that runs Spark Scala code
> underneath. That way, I can use the Spark SQL JDBC Thriftserver to execute
> it with SQL, passing in the keys and values I want to UPSERT. I wonder if
> this is possible, since I cannot CREATE a wrapper table on top of an HBase
> table in Spark SQL.
>
> What do you think? Is this the right approach?
>
> Thanks,
> Ben
>
> On Oct 8, 2016, at 10:33 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> HBase has released support for Spark:
> http://hbase.apache.org/book.html#spark
>
> And if you search, you should find several alternative approaches.
>
>
> On Fri, Oct 7, 2016 at 7:56 AM -0700, "Benjamin Kim" <bbuil...@gmail.com> wrote:
>
> Does anyone know if Spark can work with HBase tables using Spark SQL? I
> know that in Hive we are able to create tables on top of an underlying
> HBase table that can be accessed using MapReduce jobs. Can the same be
> done using HiveContext or SQLContext? We are trying to set up a way to GET
> and POST data to and from an HBase table using the Spark SQL JDBC
> Thriftserver from our RESTful API endpoints and/or HTTP web farms. If we
> can get this to work, then we can load-balance the Thriftservers. In
> addition, this will give us a way to abstract the data storage layer away
> from the presentation-layer code; there is a chance that we will swap out
> the storage technology in the future. We are currently experimenting with
> Kudu.
>
> Thanks,
> Ben
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
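P.S. To make the UDF idea from my earlier message concrete, below is a rough, untested sketch of what I have in mind. Everything here is hypothetical: the function name (hbase_upsert), the table name, and the column family are placeholders, and it assumes the Thriftserver is started programmatically with HiveThriftServer2.startWithContext so the registered function is visible to JDBC sessions (Spark 1.x-style code; in 2.x the context types differ):

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

object HBaseUpsertServer {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hbase-upsert-udf"))
    val hiveContext = new HiveContext(sc)

    // Register the "stored procedure": a UDF that writes one cell to HBase.
    // An HBase Put is already insert-or-overwrite, i.e. an upsert.
    hiveContext.udf.register("hbase_upsert",
      (rowKey: String, qualifier: String, value: String) => {
        // Opens a connection per call -- acceptable for a proof of
        // concept only; a shared connection would be needed in practice.
        val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
        try {
          val table = conn.getTable(TableName.valueOf("dim_table"))
          try {
            val put = new Put(Bytes.toBytes(rowKey))
            put.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes(qualifier),
              Bytes.toBytes(value))
            table.put(put)
          } finally table.close()
        } finally conn.close()
        "OK" // a UDF must return a value
      })

    // Expose this context (and its UDF) over JDBC.
    HiveThriftServer2.startWithContext(hiveContext)
  }
}

A JDBC client could then, in theory, run something like SELECT hbase_upsert('row1', 'col1', 'some value'); though I have not verified that a UDF registered this way is visible to every Thriftserver session.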