Hi, it is from Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem (Addison-Wesley Data & Analytics Series) (Kindle Locations 735-738):

Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summary, ad hoc queries, and the analysis of large data sets using an SQL-like language called HiveQL. Hive transparently translates queries into MapReduce jobs that are executed in HBase. Hive is considered the de facto standard for interactive SQL queries over petabytes of data.

On Tuesday, 10 November 2015, 13:03, Binglin Chang <decst...@gmail.com> wrote:

"Hive transparently translates queries into MapReduce jobs that are executed in HBase"

I think this is not correct. Are you sure it is from some book?

On Tue, Nov 10, 2015 at 6:56 PM, Ashok Kumar <ashok34...@yahoo.com> wrote:

Hi,

I have read a book about Hadoop that says:

Apache Hive is a data warehouse infrastructure built on top of Hadoop for providing data summary, ad hoc queries, and the analysis of large data sets using an SQL-like language called HiveQL. Hive transparently translates queries into MapReduce jobs that are executed in HBase. Hive is considered the de facto standard for interactive SQL queries over petabytes of data.

What is the relation between Hive and HBase? I always thought that HBase is an independent database. Is it correct that Hive itself uses the MapReduce engine, which in turn uses HBase as the database? I always thought that Hive is a data warehouse database, or am I missing something?

Thanking you