Re: Hive metadata on Hbase

2016-10-25 Thread Furcy Pin
Hi Mich, No, I am not using HBase as a metastore now, but I am eager for it to become production ready and released in CDH and HDP. Concerning locks, I think HBase would do fine because it is ACID at the row level. It only appends data on HDFS, but it works by keeping regions in RAM, plus a write

Re: Hive metadata on Hbase

2016-10-25 Thread Mich Talebzadeh
Hi Furcy, Having used Hbase for part of the batch layer in a Lambda Architecture, I have come to the conclusion that it is a very good product, even though its cryptic nature means it is not much loved or appreciated. However, it may be useful to have a Hive metastore skin on top of Hbase table

Re: Hive metadata on Hbase

2016-10-25 Thread Furcy Pin
Hi Mich, I mostly agree with you, but I would comment on the part about using HBase as a maintenance-free core product: I would say that most medium-sized companies using Hadoop rely on Hortonworks or Cloudera, both of which provide a pre-packaged HBase installation. It would probably make sense for them to

Re: Hive metadata on Hbase

2016-10-24 Thread Mich Talebzadeh
Thanks, Alan, for the detailed explanation. Please bear in mind that with any tool that needs to work with some repository (Oracle TimesTen IMDB has its metastore on Oracle classic, SAP Replication Server has its repository RSSD on SAP ASE, and others), the first thing they do is go and cache those tables and k

Re: Hive metadata on Hbase

2016-10-24 Thread Alan Gates
Some thoughts on this: First, there’s no plan to remove the option to use an RDBMS such as Oracle as your backend. Hive’s RawStore interface is built such that various implementations of the metadata storage can easily coexist. Obviously different users will make different choices about what

Re: Hive metadata on Hbase

2016-10-24 Thread Mich Talebzadeh
Hi Furcy, Thanks for the updates. Transactional tables create issues for us. When many updates are done, they create many delta files that require compaction. This by itself is not an issue for Hive. However, Spark fails to read these delta files, so the job crashes. Regards, Mich Dr Mich Talebzade

Re: Hive metadata on Hbase

2016-10-24 Thread Furcy Pin
Hi Mich, the umbrella JIRA for this gives a few reasons: https://issues.apache.org/jira/browse/HIVE-9452 (with even more details in the attached PDF: https://issues.apache.org/jira/secure/attachment/12697601/HBaseMetastoreApproach.pdf ). In my experience, Hive tables with a lot of partitions (> 10 0

Re: Hive metadata on Hbase

2016-10-24 Thread Mich Talebzadeh
Hive 2.0.1 Subversion git://reznor-mbp-2.local/Users/sergey/git/hivegit -r e3cfeebcefe9a19c5055afdcbb00646908340694 Compiled by sergey on Tue May 3 21:03:11 PDT 2016 From source with checksum 5a49522e4b572555dbbe5dd4773bc7c2 Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?

Re: Hive metadata on Hbase

2016-10-24 Thread Per Ullberg
What version of Hive are you running? /Pelle On Monday, October 24, 2016, Mich Talebzadeh wrote: > @Per > > We run full transactional enabled Hive metadb on an Oracle DB. > > I don't have statistics now but will collect from AWR reports no problem. > > @Jorn, > > The primary reason Oracle was c

Re: Hive metadata on Hbase

2016-10-24 Thread Mich Talebzadeh
@Per We run a fully transactional-enabled Hive metadb on an Oracle DB. I don't have statistics now, but I can collect them from AWR reports, no problem. @Jorn, The primary reason Oracle was chosen is that the company has global licenses for Oracle + MSSQL + SAP and they are classified as Enterprise Gra

Re: Hive metadata on Hbase

2016-10-23 Thread Per Ullberg
I thought the main gain was to get ACID on Hive performant enough. @Mich: Do you run with ACID-enabled tables? How many Create/Update/Deletes do you do per second? best regards /Pelle On Mon, Oct 24, 2016 at 7:39 AM, Jörn Franke wrote: > I think the main gain is more about getting rid of a ded

Re: Hive metadata on Hbase

2016-10-23 Thread Jörn Franke
I think the main gain is more about getting rid of a dedicated database including maintenance and potential license cost. For really large clusters and a lot of users this might be even more beneficial. You can avoid clustering the database etc. > On 24 Oct 2016, at 00:46, Mich Talebzadeh wrot

Hive metadata on Hbase

2016-10-23 Thread Mich Talebzadeh
A while back there were some notes on having the Hive metastore on Hbase as opposed to conventional RDBMSs. I am currently involved with some hefty work with Hbase and Phoenix for batch ingestion of trade data. As long as you define your Hbase table through Phoenix and with secondary Phoenix indexes on