Thanks all,

Perhaps moving to 2.0.0 will be the answer. We are trying to move to Spark SQL, but I wasn't sure how much of Hive the HiveContext supports – the external table API, for example.
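[For readers following the pushdown discussion in this thread: a Hive storage handler opts into predicate pushdown through the HiveStoragePredicateHandler interface, whose decomposePredicate method lets the handler claim the parts of a WHERE clause it can evaluate natively and hand the remainder back to the engine as a residual predicate. The following is a minimal, self-contained sketch of that split only — Condition and Decomposed are simplified stand-ins for Hive's ExprNodeDesc and DecomposedPredicate types, not the real API.]

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model of predicate decomposition: the storage handler keeps
// ("pushes") the conditions it can evaluate natively and returns the rest to
// the engine as a residual. Stand-ins only; not Hive's actual classes.
public class PredicatePushdownSketch {

    static class Condition {
        final String column, op, value;
        Condition(String column, String op, String value) {
            this.column = column;
            this.op = op;
            this.value = value;
        }
        @Override public String toString() {
            return column + " " + op + " " + value;
        }
    }

    static class Decomposed {
        final List<Condition> pushed = new ArrayList<>();   // evaluated by the storage handler
        final List<Condition> residual = new ArrayList<>(); // evaluated by Hive after the scan
    }

    // This toy handler can only push equality predicates on indexed columns;
    // everything else stays in the residual for the engine to apply.
    static Decomposed decompose(List<Condition> predicate, Set<String> indexedColumns) {
        Decomposed d = new Decomposed();
        for (Condition c : predicate) {
            if (indexedColumns.contains(c.column) && c.op.equals("=")) {
                d.pushed.add(c);
            } else {
                d.residual.add(c);
            }
        }
        return d;
    }

    public static void main(String[] args) {
        List<Condition> where = Arrays.asList(
                new Condition("device_id", "=", "42"),
                new Condition("ts", ">", "1000"));
        Decomposed d = decompose(where, new HashSet<>(Arrays.asList("device_id")));
        System.out.println("pushed:   " + d.pushed);   // [device_id = 42]
        System.out.println("residual: " + d.residual); // [ts > 1000]
    }
}
```

[The symptom described below — predicates not reaching the storage handler — corresponds to the engine never calling the decomposition step, so the full scan plus engine-side filtering is what runs instead.]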
The problem I encountered with Spark SQL 1.6 was that query predicates are not being pushed down to the storage handler. I've seen some reports that Spark has a defect involving the metastore that could be driving this issue, but nothing definitive.

Re 1:1 mapping: the product I'm working on has a backing store with the same tables across different databases. We present a single table over ODBC and use the storage handler to query the different keyspaces without the user knowing. (No, I can't change the architecture.)

Thanks all,

~ Shawn M Lavelle

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Tuesday, July 19, 2016 1:26 AM
To: user <user@hive.apache.org>
Subject: Re: Hive External Storage Handlers

"Do not use a self-compiled Hive or Spark version, but only the ones supplied by distributions (Cloudera, Hortonworks, Bigtop...). You will face performance problems, strange errors etc. when building and testing your code using self-compiled versions."

This comment does not make sense and is meaningless without any evidence. Either provide evidence that you have done this work and encountered these errors, or better not mention it. It sounds like scaremongering.

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On 19 July 2016 at 06:51, Jörn Franke <jornfra...@gmail.com> wrote:

Do not use a self-compiled Hive or Spark version, but only the ones supplied by distributions (Cloudera, Hortonworks, Bigtop...).
You will face performance problems, strange errors etc. when building and testing your code using self-compiled versions.

If you use the Hive APIs, then the engine should not be relevant for your storage handler. Nevertheless, the APIs of the storage handler might have changed. However, I wonder why a 1:1 mapping does not work for you.

On 18 Jul 2016, at 22:46, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

Hi,

You can move up to Hive 2, which works fine and is pretty stable. You can opt for Hive 1.2.1 if you wish. If you want to use Spark (the replacement for Shark) as the execution engine for Hive, then the version that I have managed to make work with Hive is Spark 1.3.1, which you will need to build from source. It works and it is stable.

Otherwise you may decide to use the Spark Thrift Server (STS), which allows JDBC access to Spark SQL (through beeline, Squirrel, Zeppelin) and has a Hive SQL context built into it, as if you were using the Hive Thrift Server (HSS).

HTH

Dr Mich Talebzadeh

On 18 July 2016 at 21:38, Lavelle, Shawn <shawn.lave...@osii.com> wrote:

Hello,

I am working with an external storage handler written for Hive 0.11 and run on the Shark execution engine. I'd like to move forward and upgrade to Hive 1.2.1 on Spark 1.6, or even 2.0. This storage handler needs to run queries across tables existing in different databases in the external data store, so existing drivers that map Hive to external storage in 1-to-1 mappings are insufficient.
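[The cross-database requirement above — one logical table fanned out over the same table in several backing databases — is what rules out the 1-to-1 handlers. A minimal, self-contained sketch of that fan-out idea; every name here (KeyspaceClient, scanLogicalTable) is a hypothetical illustration, not a Hive or OSI API:]

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a single logical table fronting the same physical table
// in several backing databases ("keyspaces"). A scan of the logical table fans
// out to every keyspace and unions the rows, so the client never sees the
// underlying partitioning. KeyspaceClient stands in for a real connection.
public class FanOutTableSketch {

    interface KeyspaceClient {
        List<String> scan(String table);
    }

    static List<String> scanLogicalTable(String table, Map<String, KeyspaceClient> keyspaces) {
        List<String> rows = new ArrayList<>();
        for (KeyspaceClient client : keyspaces.values()) {
            rows.addAll(client.scan(table)); // union across backing databases
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, KeyspaceClient> ks = new LinkedHashMap<>();
        ks.put("siteA", t -> Arrays.asList("siteA/" + t + "/row1"));
        ks.put("siteB", t -> Arrays.asList("siteB/" + t + "/row1"));
        System.out.println(scanLogicalTable("measurements", ks));
        // [siteA/measurements/row1, siteB/measurements/row1]
    }
}
```

[A stock 1-to-1 handler binds each Hive table to exactly one backing table, which is why this routing has to live inside a custom storage handler rather than in the table mapping.]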
I have attempted this upgrade already, but found that predicate pushdown was not occurring. Was this changed in 1.2? Can I update and use the same storage handler in Hive, or has this concept been replaced by the RDD and DataFrame APIs? Are these questions better suited for the Spark list?

Thank you,

~ Shawn M Lavelle

Shawn Lavelle
Software Development
4101 Arrowhead Drive
Medina, Minnesota 55340-9457
Phone: 763 551 0559
Fax: 763 551 0750
Email: shawn.lave...@osii.com
Website: www.osii.com