Could you expand on this? This sounds like something that would be great to know, and probably worth folding into the wiki.
On Wed, Apr 20, 2016 at 11:57 AM, Jörn Franke <jornfra...@gmail.com> wrote:

> Hive has working indexes. However, many people overlook that a block is
> usually much larger than in a relational database and thus do not use them
> right.
>
> On 19 Apr 2016, at 09:31, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
> The issue is that Hive has indexes (not index store) but they don't work,
> so there we go. Maybe in later releases we can make use of these indexes
> for faster queries. Hive even allows bitmap indexes on a fact table, but they
> are never used by the CBO.
>
> show indexes on sales;
>
> +-------------------+----------+------------+-----------------------------------------+----------+---------+
> | idx_name          | tab_name | col_names  | idx_tab_name                            | idx_type | comment |
> +-------------------+----------+------------+-----------------------------------------+----------+---------+
> | sales_cust_bix    | sales    | cust_id    | oraclehadoop__sales_sales_cust_bix__    | bitmap   |         |
> | sales_channel_bix | sales    | channel_id | oraclehadoop__sales_sales_channel_bix__ | bitmap   |         |
> | sales_prod_bix    | sales    | prod_id    | oraclehadoop__sales_sales_prod_bix__    | bitmap   |         |
> | sales_promo_bix   | sales    | promo_id   | oraclehadoop__sales_sales_promo_bix__   | bitmap   |         |
> | sales_time_bix    | sales    | time_id    | oraclehadoop__sales_sales_time_bix__    | bitmap   |         |
> +-------------------+----------+------------+-----------------------------------------+----------+---------+
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 18 April 2016 at 23:51, Marcin Tustin <mtus...@handybook.com> wrote:
>
>> We use a Hive
>> with ORC setup now. Queries may take thousands of seconds
>> with joins, and potentially tens of seconds with selects on very large
>> tables.
>>
>> My understanding is that the goal of HBase is to provide much lower
>> latency for queries. Obviously, this comes at the cost of not being able to
>> perform joins. I don't actually use HBase, so I hesitate to say more about
>> it.
>>
>> On Mon, Apr 18, 2016 at 6:48 PM, Mich Talebzadeh
>> <mich.talebza...@gmail.com> wrote:
>>
>>> Thanks Marcin.
>>>
>>> What is the definition of low latency here? Are you referring to the
>>> performance of SQL against HBase tables compared to Hive? As I understand it,
>>> HBase is a columnar database. Would it be possible to use Hive against ORC
>>> to achieve the same?
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 18 April 2016 at 23:43, Marcin Tustin <mtus...@handybook.com> wrote:
>>>
>>>> HBase has a different use case - it's for low-latency querying of big
>>>> tables. If you combined it with Hive, you might have something nice for
>>>> certain queries, but I wouldn't think of them as direct competitors.
>>>>
>>>> On Mon, Apr 18, 2016 at 6:34 PM, Mich Talebzadeh
>>>> <mich.talebza...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I notice that Impala is rarely mentioned these days. I may be missing
>>>>> something. However, I gather it is coming to an end now, as I don't recall
>>>>> many use cases for it (or customers asking for it). In contrast, Hive has
>>>>> held its ground with the new addition of Spark and Tez as execution engines,
>>>>> support for ACID and ORC and new stuff in Hive 2. In addition, provided a
>>>>> good choice for its metastore, it scales well.
>>>>>
>>>>> If Hive had the (organic) ability to support local variables and stored
>>>>> procedures, then it would be a top-notch data warehouse. Given its
>>>>> metastore, I don't see any technical reason why it cannot support these
>>>>> constructs.
>>>>>
>>>>> I was recently asked to comment on migration from commercial DWs to
>>>>> Big Data (primarily for TCO reasons) and really could not recall any better
>>>>> candidate than Hive. Is HBase a viable alternative? Obviously, whatever one
>>>>> decides, there is still HDFS, a good engine for Hive (sounds like many
>>>>> prefer Tez, although I am a Spark fan) and the ubiquitous YARN.
>>>>>
>>>>> Let me know your thoughts.
>>>>>
>>>>> Dr Mich Talebzadeh
>>>>
>>>> Want to work at Handy? Check out our culture deck and open roles
>>>> <http://www.handy.com/careers>
>>>> Latest news <http://www.handy.com/press> at Handy
>>>> Handy just raised $50m
>>>> <http://venturebeat.com/2015/11/02/on-demand-home-service-handy-raises-50m-in-round-led-by-fidelity/>
>>>> led by Fidelity