The only actual difference is:

If you drop a managed table, the LOCATION it refers to will be deleted.
If you drop an external table, the LOCATION it refers to will not be deleted.

Confusion happens because when Hive creates a managed table, its location defaults to:

fs.default.name+/user/hive/warehouse/+tablename
e.g.
hdfs://myserver:9091/user/hive/warehouse/tablename
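
To check where a given table actually lives, something like the following
should work (a sketch; tablename is just the placeholder used above). The
output includes a Location line and a Table Type of MANAGED_TABLE or
EXTERNAL_TABLE.

```sql
-- Show table metadata, including its Location and Table Type
DESCRIBE FORMATTED tablename;
```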

So people make the leap that EXTERNAL tables have a location and
managed tables do not, but MANAGED tables can have a location outside
the warehouse and EXTERNAL tables could have a location inside the
warehouse, depending on how the tables/partitions were defined.
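
A rough illustration of both cases. The table names, columns, and the
/data/elsewhere path are made up for the example, and the warehouse path
assumes the default /user/hive/warehouse:

```sql
-- Managed table whose data lives OUTSIDE the warehouse directory
CREATE TABLE managed_outside (id INT, msg STRING)
LOCATION '/data/elsewhere/managed_outside';

-- External table whose data lives INSIDE the warehouse directory
CREATE EXTERNAL TABLE external_inside (id INT, msg STRING)
LOCATION '/user/hive/warehouse/external_inside';

-- Dropping the managed table removes the metadata and deletes its LOCATION
DROP TABLE managed_outside;

-- Dropping the external table removes only the metadata; the files stay put
DROP TABLE external_inside;
```

Either way, the EXTERNAL keyword only changes what DROP TABLE does to the
files, not where they are stored.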


On Thu, May 10, 2012 at 5:52 PM, kulkarni.swar...@gmail.com
<kulkarni.swar...@gmail.com> wrote:
> I am pretty new to hive and was trying to clearly understand the difference
> between a managed and an external table.
>
> As my current understanding stands, a managed table is a table whose data is
> completely owned by hive whereas an external table is usually created to
> have a hive frontend for the data managed in external systems. I would
> suppose this would mean that a query on an external table goes out to fetch
> data from the given external table, deserialize according to the
> given/suitable SerDe and then show the output of the query in hive format.
>
> So does this mean that cost of using external tables is much higher than the
> native ones? Or is there some caching that comes into play that I am not
> seeing right now.
>
> Thanks for the help.
>
> --
> Swarnim
