Hello,

At the 2018 DataWorks conference in Berlin, Hotels.com presented Waggle Dance <https://github.com/HotelsDotCom/waggle-dance>, a tool for federating multiple Hive clusters and providing the illusion of a unified data catalog from disparate instances.
We've been running Waggle Dance in production for well over a year, and it has become a critical part of our data platform architecture and infrastructure.

We believe that this type of functionality will be of increasing importance as Hadoop and Hive workloads migrate to the cloud. While Waggle Dance is one solution, significant benefits could be realized if these kinds of abilities were an integral part of the Hive platform.

If this sounds of interest, I've created a proposal on the Hive wiki. It outlines why we think such a feature is needed in Hive, the benefits gained by offering it as a built-in feature, and a possible implementation. Our proposed implementation draws inspiration from the remote table features present in some traditional RDBMSes, which may already be familiar to you.

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452092

Feedback gratefully accepted,

Elliot.

Senior Engineer
Big Data Platform Team
Hotels.com