It's simpler than this.  The data files look the same whether the table is 
managed or external, and they are often very simple delimited text.  There are 
only two differences: files loaded into a managed table are moved into Hive's 
private path, and the files associated with a managed table are deleted when 
the table is dropped.  External tables never move or remove files.  Query 
performance is the same.
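
To make the file lifecycle concrete, here is a toy model in Python.  It is 
only a sketch of the move/drop behavior described above, not Hive's API: the 
ToyWarehouse class, its method names, and the file names are all made up for 
illustration, and a temporary directory stands in for Hive's private path.

```python
import shutil
import tempfile
from pathlib import Path


class ToyWarehouse:
    """Toy model of the file lifecycle only; hypothetical, not Hive's API."""

    def __init__(self, root: Path):
        self.root = root      # stands in for Hive's private path
        self.tables = {}      # table name -> (data directory, is_managed)

    def create_managed(self, name: str) -> None:
        path = self.root / name
        path.mkdir(parents=True)
        self.tables[name] = (path, True)

    def create_external(self, name: str, location: Path) -> None:
        # An external table only records where the files already live.
        self.tables[name] = (location, False)

    def load(self, name: str, src: Path) -> None:
        path, managed = self.tables[name]
        if not managed:
            raise ValueError("toy model: load is only modeled for managed tables")
        # Loading into a managed table moves the file into the private path.
        shutil.move(str(src), str(path / src.name))

    def drop(self, name: str) -> None:
        path, managed = self.tables.pop(name)
        if managed:
            shutil.rmtree(path)   # managed: files go away with the table
        # external: only the table entry is forgotten, files are untouched


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        wh = ToyWarehouse(tmp / "warehouse")

        staged = tmp / "staged.txt"
        staged.write_text("1\tfoo\n")
        external_dir = tmp / "external"
        external_dir.mkdir()
        (external_dir / "ext.txt").write_text("2\tbar\n")

        wh.create_managed("m")
        wh.load("m", staged)
        moved = (not staged.exists()) and (wh.root / "m" / "staged.txt").exists()

        wh.create_external("e", external_dir)
        wh.drop("m")
        wh.drop("e")

        return {
            "moved_on_load": moved,
            "managed_files_dropped": not (wh.root / "m").exists(),
            "external_files_kept": (external_dir / "ext.txt").exists(),
        }


if __name__ == "__main__":
    print(demo())
```

Nothing in the model touches how the files are read, which matches the point 
that reads work the same way for both kinds of table.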

On May 10, 2012, at 5:52 PM, kulkarni.swar...@gmail.com wrote:

> I am pretty new to hive and was trying to clearly understand the difference 
> between a managed and an external table. 
> 
> As my current understanding stands, a managed table is a table whose data is 
> completely owned by hive whereas an external table is usually created to have 
> a hive frontend for the data managed in external systems. I would suppose this 
> would mean that a query on an external table goes out to fetch data from the 
> given external table, deserialize according to the given/suitable SerDe and 
> then show the output of the query in hive format.
> 
> So does this mean that cost of using external tables is much higher than the 
> native ones? Or is there some caching that comes into play that I am not 
> seeing right now.
> 
> Thanks for the help.
> 
> -- 
> Swarnim
