Felix,

The only alternative I can think of is to create the equivalent of a stored procedure (a UDF, in database terms) that runs Spark Scala code underneath. That way I could use the Spark SQL JDBC Thriftserver to execute it with plain SQL, passing in the keys and values I want to UPSERT. I wonder whether this is even possible, since I cannot CREATE a wrapper table on top of an HBase table in Spark SQL.
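Roughly what I have in mind is sketched below (a rough sketch only: the UDF name "hbase_upsert", the HBase table "my_table", and the column family "cf" are placeholders, and it assumes the Thriftserver is started from the same session so the UDF is visible to JDBC clients):

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// One lazily created HBase connection per JVM; the HBase Connection class is
// not serializable, so it must not be captured directly by the UDF closure.
object HBaseConn {
  lazy val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
}

object UpsertThriftServer {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hbase-upsert-thriftserver")
      .enableHiveSupport()
      .getOrCreate()

    // The "stored procedure": a UDF that performs an HBase Put (an upsert, in
    // HBase terms) for a single row key, qualifier, and value.
    spark.udf.register("hbase_upsert", (rowKey: String, qualifier: String, value: String) => {
      val table = HBaseConn.conn.getTable(TableName.valueOf("my_table"))
      try {
        val put = new Put(Bytes.toBytes(rowKey))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes(qualifier), Bytes.toBytes(value))
        table.put(put)
        "OK"
      } finally {
        table.close()
      }
    })

    // Start the JDBC Thriftserver against this session so the UDF is callable
    // from SQL clients such as beeline.
    HiveThriftServer2.startWithContext(spark.sqlContext)
  }
}

A client would then call it over JDBC with something like: SELECT hbase_upsert('row1', 'colA', 'value1');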
What do you think? Is this the right approach?

Thanks,
Ben

> On Oct 8, 2016, at 10:33 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> HBase has released support for Spark:
> hbase.apache.org/book.html#spark
>
> And if you search you should find several alternative approaches.
>
>
> On Fri, Oct 7, 2016 at 7:56 AM -0700, "Benjamin Kim" <bbuil...@gmail.com> wrote:
>
> Does anyone know if Spark can work with HBase tables using Spark SQL? I know
> that in Hive we are able to create tables on top of an underlying HBase table
> that can be accessed using MapReduce jobs. Can the same be done using
> HiveContext or SQLContext? We are trying to set up a way to GET and POST data
> to and from the HBase table using the Spark SQL JDBC Thriftserver from our
> RESTful API endpoints and/or HTTP web farms. If we can get this to work, then
> we can load balance the Thriftservers. In addition, this will benefit us by
> giving us a way to abstract the data storage layer away from the presentation
> layer code. There is a chance that we will swap out the data storage
> technology in the future. We are currently experimenting with Kudu.
>
> Thanks,
> Ben
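P.S. For reference, the Hive wrapper table I was referring to is created with DDL along the lines below (the table names and column mapping are examples only). It works from Hive itself; the open question is whether the same STORED BY storage handler can be created through HiveContext/Spark SQL, which is what I have not been able to do:

import org.apache.spark.sql.SparkSession

// Assumes Hive support is enabled in the Spark session.
val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

// The Hive-on-HBase mapping: in Hive this DDL creates a queryable wrapper
// over an existing HBase table named "my_table". Whether Spark SQL accepts
// the STORED BY handler is exactly what is in question here.
spark.sql("""
  CREATE EXTERNAL TABLE hbase_wrapper (rowkey STRING, col_a STRING)
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:colA")
  TBLPROPERTIES ("hbase.table.name" = "my_table")
""")

// If the wrapper table were accepted, reads would then go through plain SQL:
spark.sql("SELECT rowkey, col_a FROM hbase_wrapper LIMIT 10").show()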