> On Jan 29, 2018, at 9:29 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
> On Mon, Jan 29, 2018 at 12:10 PM, Owen O'Malley <owen.omal...@gmail.com> wrote:
> You should really look at what the Netflix guys are doing on Iceberg.
>
> https://github.com/Netflix/iceberg
>
> They have put a lot of thought into how to efficiently handle tabular data in S3. They put all of the metadata in S3 except for a single link to the name of the table's root metadata file.
>
> Other advantages of their design:
> - Efficient atomic addition and removal of files in S3.
> - Consistent schema evolution across formats.
> - More flexible partitioning and bucketing.
>
> .. Owen
>
> On Sun, Jan 28, 2018 at 12:02 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> All,
>
> I have been bouncing around the earth for a while and have had the privilege of working at 4-5 places. On arrival, each place was at a different point in its Hadoop journey.
>
> One large company I was at had a ~200 TB Hadoop cluster. They actually ran Pig, and their ops group REFUSED to support Hive, even though they had written thousands of lines of Pig macros to deal with selecting from a partition, or a Pig script file you would import so you would know what the columns of the data at location /x/y/z were.
>
> In another lifetime I was at a shop that used Scalding. Again, lots of custom effort there with Avro and Parquet, all to do things that Hive would do out of the box. Again, the biggest challenge was the Thrift service and the metastore.
>
> In the cloud, many people will use a bootstrap script
> https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hadoop-script.html
> or 'msck repair'.
>
> The "rise of the cloud" has changed us all: the metastore being a database is a hard paradigm to support. Imagine, for example, that I created data in an S3 bucket with Hive, and another group in my company requires read-only access to this data for an ephemeral request. Sharing the data is easy, since S3 access can be granted; sharing the metastore and Thrift services is much more complicated.
>
> So let's think out of the box:
>
> https://www.datastax.com/2011/03/brisk-is-here-hadoop-and-cassandra-together-at-last
>
> DataStax was able to build a platform where the filesystem and the metastore were baked into Cassandra. Even though an HBase user would not want that, the novel thing about that approach is that the metastore was not "some extra thing in a database" that you had to deal with.
>
> What I am thinking is that, for the user of S3, the metastore should be in S3, probably in hidden files inside the warehouse/table directory(ies).
>
> Think of it as msck repair "on the fly":
> https://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.5/com.ibm.swg.im.infosphere.biginsights.commsql.doc/doc/biga_msckrep.html
>
> The implementation could be something like this:
>
> On startup, read hive.metastore.warehouse.dir and look for "_warehouse". That would help us locate the databases; in the databases we can locate tables, and with the tables we can locate partitions.
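A minimal sketch of what that startup scan might look like, going through the Hadoop FileSystem API so the same walk runs against s3a:// as well as HDFS. The "_warehouse" marker and the database/table/partition directory layout are assumptions taken from the paragraph above, not existing Hive behavior:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WarehouseScan {
      public static void main(String[] args) throws Exception {
        Path warehouse = new Path(args[0]);   // e.g. s3a://bucket/warehouse
        FileSystem fs = warehouse.getFileSystem(new Configuration());

        // proposed marker file that tags the directory as a warehouse root
        if (!fs.exists(new Path(warehouse, "_warehouse"))) {
          System.err.println("no _warehouse marker under " + warehouse);
          return;
        }
        for (FileStatus db : fs.listStatus(warehouse)) {
          if (!db.isDirectory()) continue;                  // database dirs
          for (FileStatus tbl : fs.listStatus(db.getPath())) {
            if (!tbl.isDirectory()) continue;               // table dirs
            for (FileStatus part : fs.listStatus(tbl.getPath())) {
              if (part.isDirectory()) {                     // partition dirs
                System.out.println(db.getPath().getName() + "."
                    + tbl.getPath().getName() + " => "
                    + part.getPath().getName());
              }
            }
          }
        }
      }
    }

The cost is one listing round trip per directory, which is exactly why this would fall over at the partition counts mentioned next.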
> This will of course scale horribly across tables with 90,000,000 partitions, but that would not be our use case. For all the people with "msck repair" in their bootstrap, this gives a much cleaner way of using Hive.
>
> The implementations could even be "stacked": files first, metastore lookback second.
>
> It would also be wise to have a tool available in the CLI, "metastore <table> toJson", making it drop-dead simple to export the schema definitions.
>
> Thoughts?
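On the "stacked" idea above, the lookup could be as simple as trying the file-based definition first and falling back to the Thrift client. IMetaStoreClient is Hive's existing metastore client interface; readTableFromMarkerFile() is hypothetical:

    import org.apache.hadoop.hive.metastore.IMetaStoreClient;
    import org.apache.hadoop.hive.metastore.api.Table;

    public class StackedResolver {
      private final IMetaStoreClient metastore;   // second layer of the stack

      public StackedResolver(IMetaStoreClient metastore) {
        this.metastore = metastore;
      }

      public Table resolve(String db, String tbl) throws Exception {
        Table fromFiles = readTableFromMarkerFile(db, tbl);
        if (fromFiles != null) {
          return fromFiles;                       // files win when present
        }
        return metastore.getTable(db, tbl);       // metastore lookback
      }

      // Would parse the hidden metadata file under the table directory;
      // returns null when no file-based definition exists.
      private Table readTableFromMarkerFile(String db, String tbl) {
        return null;                              // placeholder
      }
    }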
> Close!
>
> They ultimately have many concepts right, but the dealbreaker is that they have their own file format. This ultimately will be a downfall. Hive needs to continue working with a variety of formats. This seems like a non-starter, as everyone is already divided into camps on not-invented-here file formats.

They define a different layout, but they use Avro, ORC, or Parquet for the data.

> Potentially we could implement this as a StorageHandler; this interface has been flexible and has had success:
> https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage
> A storage handler can delegate to Iceberg or something else.
>
> I was thinking of this problem as more of a "docker" type solution. For example, let's say you have built a 40 GB dataset divided into partitions by day. Imagine we build a Docker image; the image would launch with an embedded Derby DB (read only) and a start script that completely describes the data and the partitions. (You need some way to connect it to your processing.) But now we have a one-shot "shippable" Hive.
>
> Another approach: we have a JSON format, with files that live in each of the 40 partitions. If you are running a Hive metastore and your system admins are smart, you can run:
>
> hive> scan /data/sent/to/me/data.bundle
>
> The above command would scan and import that data into your datastore. It could be a wizard, it could be headless. But now I can share datasets on clouds and use them easily.
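On the StorageHandler route: Hive's extension point is org.apache.hadoop.hive.ql.metadata.HiveStorageHandler, and a skeleton is small. In this sketch the Avro classes are stand-in delegates only; nothing here is an actual Iceberg binding:

    import org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat;
    import org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat;
    import org.apache.hadoop.hive.ql.metadata.DefaultStorageHandler;
    import org.apache.hadoop.hive.serde2.AbstractSerDe;
    import org.apache.hadoop.hive.serde2.avro.AvroSerDe;
    import org.apache.hadoop.mapred.InputFormat;
    import org.apache.hadoop.mapred.OutputFormat;

    // Delegates the bytes-on-disk work to an existing format; the interesting
    // part for this thread would be overriding getMetaHook() so that table
    // create/drop reads and writes the hidden S3 metadata files instead of
    // rows in the RDBMS-backed metastore.
    public class S3MetadataStorageHandler extends DefaultStorageHandler {
      @Override
      public Class<? extends InputFormat> getInputFormatClass() {
        return AvroContainerInputFormat.class;    // stand-in delegate
      }

      @Override
      public Class<? extends OutputFormat> getOutputFormatClass() {
        return AvroContainerOutputFormat.class;   // stand-in delegate
      }

      @Override
      public Class<? extends AbstractSerDe> getSerDeClass() {
        return AvroSerDe.class;                   // stand-in delegate
      }
    }

A table would pick this up through the usual CREATE TABLE ... STORED BY clause, the same mechanism the MongoDB handler linked above uses.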
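For the JSON-bundle variant, the per-partition file could stay tiny. Every field name below is invented purely to make the idea concrete:

    {
      "table": "web_logs",
      "schema": [
        {"name": "ts",     "type": "timestamp"},
        {"name": "url",    "type": "string"},
        {"name": "status", "type": "int"}
      ],
      "partition": {"ds": "2018-01-28"},
      "format": "parquet",
      "files": ["part-00000.parquet", "part-00001.parquet"]
    }

A scan would walk the bundle, parse these files, and replay them as CREATE TABLE and ALTER TABLE ... ADD PARTITION calls against the local metastore, which is the headless-wizard behavior described above.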