No indexing in Hive.
On Sunday, May 13, 2012, Ranjith wrote:

> Indexes can be built on tables managed by Hive. For external tables I do
> not believe that to be true. Please feel free to correct me if I am wrong.
>
> Thanks,
> Ranjith
>
> On May 12, 2012, at 9:24 PM, Nanda Vijaydev <nanda.vijay...@gmail.com>
> wrote:
>
> In Hive, the raw data is in HDFS and there is a metadata layer that
> defines the structure of the raw data. A table is usually a reference to
> metadata, probably in a MySQL server, and it contains a reference to the
> location of the data in HDFS, the type of delimiter or SerDe to use, and
> so on.
> 1. With Hive managed tables, when you drop a table, both the metadata in
>    MySQL and the raw data on the cluster get deleted.
> 2. With external tables, when you drop a table, just the metadata gets
>    deleted and the raw data continues to exist on the cluster.
>
>
> On Thu, May 10, 2012 at 3:02 PM, David Kulp <dk...@fiksu.com> wrote:
>
>> It's simpler than this. All files look the same -- and are often very
>> simple delimited text -- whether managed or external. The only difference
>> is that the files associated with a managed table are dropped when the
>> table is dropped and files that are loaded into a managed table are moved
>> into Hive's private path. External tables never move or remove files.
>> Performance is the same.
>>
>> On May 10, 2012, at 5:52 PM, kulkarni.swar...@gmail.com wrote:
>>
>> > I am pretty new to Hive and was trying to clearly understand the
>> > difference between a managed and an external table.
>> >
>> > As my current understanding stands, a managed table is a table whose
>> > data is completely owned by Hive, whereas an external table is usually
>> > created to have a Hive frontend for data managed in external systems. I
>> > would suppose this means that a query on an external table goes out to
>> > fetch data from the given external table, deserializes it according to
>> > the given/suitable SerDe, and then shows the output of the query in Hive
>> > format.
>> >
>> > So does this mean that the cost of using external tables is much higher
>> > than the native ones? Or is there some caching that comes into play that
>> > I am not seeing right now?
>> >
>> > Thanks for the help.
>> >
>> > --
>> > Swarnim
>>
>

--
Raja Thiruvathuru
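
The drop and load behavior described in the thread can be sketched with a toy Python model. This is not Hive code: `ToyWarehouse`, its methods, and the `/warehouse` path are hypothetical names made up for illustration, a dict stands in for HDFS, and the model only covers what the thread states (loading into a managed table moves the file into Hive's path, and dropping an external table leaves its files alone).

```python
class ToyWarehouse:
    """Toy stand-in for a metastore plus HDFS; every name here is hypothetical."""

    def __init__(self, warehouse_dir="/warehouse"):
        self.warehouse_dir = warehouse_dir
        self.files = {}     # path -> contents, stands in for HDFS
        self.metadata = {}  # table name -> {"location": ..., "external": ...}

    def create_table(self, name, external=False, location=None):
        # A managed table gets a directory under the warehouse path;
        # an external table only records a location owned by something else.
        if not external:
            location = f"{self.warehouse_dir}/{name}"
        elif location is None:
            raise ValueError("an external table needs a location")
        self.metadata[name] = {"location": location, "external": external}

    def load(self, name, src_path):
        # Per the thread: files loaded into a managed table are moved
        # into the table's directory. External loads are not modeled.
        table = self.metadata[name]
        if table["external"]:
            raise ValueError("toy model: load is only modeled for managed tables")
        dest = f"{table['location']}/{src_path.rsplit('/', 1)[-1]}"
        self.files[dest] = self.files.pop(src_path)
        return dest

    def drop_table(self, name):
        # Metadata always goes away; data goes away only for managed tables.
        table = self.metadata.pop(name)
        if not table["external"]:
            prefix = table["location"] + "/"
            for path in [p for p in self.files if p.startswith(prefix)]:
                del self.files[path]


wh = ToyWarehouse()
wh.files["/staging/a.csv"] = "1,foo\n"
wh.files["/data/logs/b.csv"] = "2,bar\n"
wh.create_table("managed_t")
wh.create_table("external_t", external=True, location="/data/logs")
wh.load("managed_t", "/staging/a.csv")
wh.drop_table("managed_t")
wh.drop_table("external_t")
print(sorted(wh.files))  # ['/data/logs/b.csv']
```

After both drops, only the file behind the external table is left. Nothing in the model makes reads from one kind of table slower than the other, which matches the "performance is the same" point above.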