I don’t think Index on hive (as a separate entity) adds any value  although you 
can create one 

 

You can create an ORC table which will have characteristics that can simulate 
index like behaviour

 

CLUSTERED BY (object_id) INTO 256 BUCKETS

STORED AS ORC

TBLPROPERTIES ( "orc.compress"="SNAPPY",

"orc.create.index"="true",

"orc.bloom.filter.columns"="object_id",

 

That improves query response

 

 

HTH

 

Dr Mich Talebzadeh

 

LinkedIn  
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

 

Sybase ASE 15 Gold Medal Award 2008

A Winning Strategy: Running the most Critical Financial Data on ASE 15

http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf

Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", 
ISBN 978-0-9563693-0-7. 

co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 
978-0-9759693-0-4

Publications due shortly:

Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one 
out shortly

 

http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/> 

 

NOTE: The information in this email is proprietary and confidential. This 
message is for the designated recipient only, if you are not the intended 
recipient, you should destroy it immediately. Any information in this message 
shall not be understood as given or endorsed by Peridale Technology Ltd, its 
subsidiaries or their employees, unless expressly so stated. It is the 
responsibility of the recipient to ensure that this email is virus free, 
therefore neither Peridale Ltd, its subsidiaries nor their employees accept any 
responsibility.

 

From: Ting(Goden) Yao [mailto:t...@pivotal.io] 
Sent: 05 January 2016 18:18
To: user@hive.apache.org
Subject: Is Hive Index officially not recommended?

 

Hi,

 

We hit an issue when doing Hive testing to rebuild index on Tez.

We were told by our Hadoop distro vendor that it's not recommended (or should 
avoid) using index with Hive.

 

But I don't see an official message on Hive wiki 
<https://cwiki.apache.org/confluence/display/Hive/IndexDev>  or documentation.

Can someone confirm that so we'll ask our users to avoid indexing.

 

Thanks.

-Goden

 

==Exceptions (if you're interested in details) ==

Exception:

2015-12-08 22:55:30,263 FATAL [AsyncDispatcher event handler] 
event.AsyncDispatcher: Error in dispatcher thread
org.apache.tez.dag.api.TezUncheckedException: Unable to instantiate class with 
1 arguments: org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator
    at 
org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:80)
    at 
org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:98)
    at 
org.apache.tez.dag.app.dag.RootInputInitializerManager.createInitializer(RootInputInitializerManager.java:137)
    at 
org.apache.tez.dag.app.dag.RootInputInitializerManager.runInputInitializers(RootInputInitializerManager.java:114)
    at 
org.apache.tez.dag.app.dag.impl.VertexImpl.setupInputInitializerManager(VertexImpl.java:3943)
    at 
org.apache.tez.dag.app.dag.impl.VertexImpl.access$3900(VertexImpl.java:180)
    at 
org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.handleInitEvent(VertexImpl.java:2956)
    at 
org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2906)
    at 
org.apache.tez.dag.app.dag.impl.VertexImpl$InitTransition.transition(VertexImpl.java:2887)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at 
org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1556)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:179)
    at 
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1764)
    at 
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1750)
    at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at 
org.apache.tez.common.ReflectionUtils.getNewInstance(ReflectionUtils.java:69)
    ... 20 more
Caused by: java.lang.NullPointerException
    at 
org.apache.hadoop.hive.ql.exec.tez.DynamicPartitionPruner.initialize(DynamicPartitionPruner.java:154)
    at 
org.apache.hadoop.hive.ql.exec.tez.DynamicPartitionPruner.<init>(DynamicPartitionPruner.java:110)
    at 
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:95)
    ... 25 more
2015-12-08 22:55:30,266 ERROR [AsyncDispatcher event handler] impl.VertexImpl: 
Can't handle Invalid event V_START on vertex Map 1 with vertexId 
vertex_1449613300943_0002_1_00 at current state NEW
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
V_START at NEW
    at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at 
org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1556)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:179)
    at 
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1764)
    at 
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1750)
    at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
    at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
    at java.lang.Thread.run(Thread.java:745)
2015-12-08 22:55:30,267 ERROR [AsyncDispatcher event handler] impl.VertexImpl: 
Invalid event V_INTERNAL_ERROR on Vert

Reply via email to