This simply does not work, but we need to make Hive use external indexes. This is a must.

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 20 April 2016 at 19:37, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi,
>
> If I may, I would also like to see where the Hive optimizer shows that it
> is used with explain ... or other means. It will be interesting.
>
> HTH
>
> On 20 April 2016 at 19:20, Marcin Tustin <mtus...@handybook.com> wrote:
>
>> Could you expand on this? This sounds like something that would be great
>> to know, and probably fold into the wiki.
>>
>> On Wed, Apr 20, 2016 at 11:57 AM, Jörn Franke <jornfra...@gmail.com>
>> wrote:
>>
>>> Hive has working indexes. However, many people overlook that a block is
>>> usually much larger than in a relational database and thus do not use them
>>> right.
>>>
>>> On 19 Apr 2016, at 09:31, Mich Talebzadeh <mich.talebza...@gmail.com>
>>> wrote:
>>>
>>> The issue is that Hive has indexes (not an index store) but they don't
>>> work, so there we go. Maybe in later releases we can make use of these
>>> indexes for faster queries. Hive even allows bitmap indexes on a fact
>>> table, but they are never used by the CBO.
>>>
>>> show indexes on sales;
>>>
>>> +-------------------+----------+------------+-----------------------------------------+----------+---------+
>>> | idx_name          | tab_name | col_names  | idx_tab_name                            | idx_type | comment |
>>> +-------------------+----------+------------+-----------------------------------------+----------+---------+
>>> | sales_cust_bix    | sales    | cust_id    | oraclehadoop__sales_sales_cust_bix__    | bitmap   |         |
>>> | sales_channel_bix | sales    | channel_id | oraclehadoop__sales_sales_channel_bix__ | bitmap   |         |
>>> | sales_prod_bix    | sales    | prod_id    | oraclehadoop__sales_sales_prod_bix__    | bitmap   |         |
>>> | sales_promo_bix   | sales    | promo_id   | oraclehadoop__sales_sales_promo_bix__   | bitmap   |         |
>>> | sales_time_bix    | sales    | time_id    | oraclehadoop__sales_sales_time_bix__    | bitmap   |         |
>>> +-------------------+----------+------------+-----------------------------------------+----------+---------+
>>>
>>> On 18 April 2016 at 23:51, Marcin Tustin <mtus...@handybook.com> wrote:
>>>
>>>> We use a hive with ORC setup now. Queries may take thousands of seconds
>>>> with joins, and potentially tens of seconds with selects on very large
>>>> tables.
>>>>
>>>> My understanding is that the goal of hbase is to provide much lower
>>>> latency for queries. Obviously, this comes at the cost of not being able to
>>>> perform joins. I don't actually use hbase, so I hesitate to say more about
>>>> it.
>>>>
>>>> On Mon, Apr 18, 2016 at 6:48 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>
>>>>> Thanks Marcin.
>>>>>
>>>>> What is the definition of low latency here? Are you referring to the
>>>>> performance of SQL against HBase tables compared to Hive? As I understand
>>>>> it, HBase is a columnar database. Would it be possible to use Hive
>>>>> against ORC to achieve the same?
>>>>>
>>>>> On 18 April 2016 at 23:43, Marcin Tustin <mtus...@handybook.com>
>>>>> wrote:
>>>>>
>>>>>> HBase has a different use case - it's for low-latency querying of big
>>>>>> tables. If you combined it with Hive, you might have something nice for
>>>>>> certain queries, but I wouldn't think of them as direct competitors.
>>>>>>
>>>>>> On Mon, Apr 18, 2016 at 6:34 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I notice that Impala is rarely mentioned these days. I may be
>>>>>>> missing something. However, I gather it is coming to an end now, as I
>>>>>>> don't recall many use cases for it (or customers asking for it). In
>>>>>>> contrast, Hive has held its ground with the new addition of Spark and
>>>>>>> Tez as execution engines, support for ACID and ORC, and new stuff in
>>>>>>> Hive 2. In addition, provided a good choice is made for its metastore,
>>>>>>> it scales well.
>>>>>>>
>>>>>>> If Hive had the (organic) ability to support local variables and
>>>>>>> stored procedures, then it would be a top-notch data warehouse. Given
>>>>>>> its metastore, I don't see any technical reason why it cannot support
>>>>>>> these constructs.
>>>>>>>
>>>>>>> I was recently asked to comment on migration from commercial DWs to
>>>>>>> Big Data (primarily for TCO reasons) and really could not recall any
>>>>>>> better candidate than Hive. Is HBase a viable alternative? Obviously,
>>>>>>> whatever one decides, there is still HDFS, a good engine for Hive
>>>>>>> (sounds like many prefer Tez, although I am a Spark fan) and the
>>>>>>> ubiquitous YARN.
>>>>>>>
>>>>>>> Let me know your thoughts.